RE: Replication problem of HDFS

Stu Hood Wed, 05 Sep 2007 21:46:00 -0700

ChaoChun,

Since you set 'replication = 1' for the file, only one copy of each of the
file's blocks will be stored in HDFS. If you want all 5 machines to hold a
copy of each block, set 'replication = 5' for the file.


The default for replication is 3.
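
To put rough numbers on this, here is a small Python sketch of how many block replicas a 2GB file would produce at different replication settings. It assumes a 64 MB block size, which is not stated in this thread; check the block size configured on your cluster. The function name block_replicas is made up for illustration.

```python
import math

# Assumption: 64 MB block size (not stated in this thread; check your config).
BLOCK_SIZE_MB = 64


def block_replicas(file_size_mb, replication, block_size_mb=BLOCK_SIZE_MB):
    """Return (number of blocks, total block replicas stored in the cluster)."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return blocks, blocks * replication


# A 2GB file at replication 1, the default of 3, and 5.
for rep in (1, 3, 5):
    blocks, replicas = block_replicas(2 * 1024, rep)
    print(f"replication={rep}: {blocks} blocks, {replicas} replicas in total")
```

Under that assumption the file is 32 blocks, so 'replication = 1' stores 32 replicas in the whole cluster, while 'replication = 5' stores 160, enough for one copy of each block on each of the 5 machines.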

Thanks,
Stu



-----Original Message-----
From: ChaoChun Liang 
Sent: Wednesday, September 5, 2007 9:26pm
To: [email protected]
Subject: RE: Replication problem of HDFS


Yes, you are right. In my environment the namenode and a datanode run on the
same machine, and I upload data into HDFS from that machine. I assumed HDFS
would distribute the blocks across all the other datanodes (according to the
HDFS reference), but that is not what actually happens.

>> In this case, the only replica of the file will reside on the Datanode that
>> is local to the client.
So, does this conflict with the HDFS reference? (A file in HDFS will be split
into one or more blocks, and these blocks are stored in a set of Datanodes.)

How should I upload so that the data/files are stored across all the
datanodes (not just a single one)?
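
The behaviour described in this thread can be illustrated with a toy model in Python. This is only a sketch under stated assumptions, not HDFS's actual placement code: it assumes the first replica goes to the writer's own node when the writer is a datanode, and that any remaining replicas go to other nodes picked at random. The names place_block, blocks_per_node, and node1..node5 are hypothetical.

```python
import random
from collections import Counter


def place_block(datanodes, replication, writer=None, rng=random):
    """Pick target datanodes for one block (toy model, not HDFS source code)."""
    targets = []
    # Assumed rule: a writer that is itself a datanode keeps the first replica.
    if replication > 0 and writer in datanodes:
        targets.append(writer)
    # Remaining replicas go to distinct other nodes, chosen at random.
    others = [n for n in datanodes if n not in targets]
    rng.shuffle(others)
    targets.extend(others[: max(0, replication - len(targets))])
    return targets


def blocks_per_node(datanodes, num_blocks, replication, writer=None, rng=random):
    """Count how many block replicas each node ends up holding."""
    counts = Counter({n: 0 for n in datanodes})
    for _ in range(num_blocks):
        counts.update(place_block(datanodes, replication, writer, rng))
    return dict(counts)


nodes = ["node1", "node2", "node3", "node4", "node5"]
rng = random.Random(0)  # fixed seed so repeated runs give the same spread

# Writer runs on node1, replication=1: every block stays on node1.
print(blocks_per_node(nodes, 32, 1, writer="node1", rng=rng))
# Writer is outside the cluster, replication=1: blocks spread over the nodes.
print(blocks_per_node(nodes, 32, 1, writer=None, rng=rng))
# Writer on node1, replication=5: every node holds every block.
print(blocks_per_node(nodes, 32, 5, writer="node1", rng=rng))
```

In this model, a single-replica upload from a machine that is also a datanode leaves every block on that machine, which matches what I observed. Uploading from a machine that is not a datanode, or raising the replication factor, spreads the blocks.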

ChaoChun



Dhruba Borthakur wrote:
> 
> Hi ChaoChun,
> 
> I do not fully understand your problem. I am guessing that you are running a
> Datanode on the same machine as the Namenode. I am also guessing that you
> are using the Namenode machine as a client to upload a file into HDFS. In
> this case, the only replica of the file will reside on the Datanode that is
> local to the client.
> 
> Thanks,
> dhruba
> 
> -----Original Message-----
> From: ChaoChun Liang [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, September 05, 2007 1:58 AM
> To: [email protected]
> Subject: Replication problem of HDFS
> 
> 
> According to the HDFS reference
> (http://lucene.apache.org/hadoop/hdfs_design.html), a file in HDFS will be
> split into one or more blocks, and these blocks are stored in a set of
> Datanodes.
> 
> I put a 2GB data set (with replication=1) onto a 5-node cluster, but found
> that only the namenode's block count increased; the other nodes kept the
> same value. This means all blocks were copied to the namenode and none to
> the datanodes. Is that correct?
> 
> ChaoChun
> -- 
> View this message in context:
> http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12494269
> Sent from the Hadoop Users mailing list archive at Nabble.com.
> 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12514090
Sent from the Hadoop Users mailing list archive at Nabble.com.
