According to the HDFS reference, that sounds right.
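For what it's worth, the placement behavior discussed below can be illustrated with a small, self-contained sketch. This is plain Python, not actual HDFS code; the block size, node names, and round-robin placement policy here are simplifying assumptions (real HDFS uses a rack-aware placement policy and a much larger block size):

```python
import itertools

# Simplified illustration of HDFS-style block placement (NOT real HDFS
# code): a file is split into fixed-size blocks, and each block is
# assigned to `replication` distinct datanodes. Round-robin placement
# is an assumption for clarity; real HDFS is rack-aware.

BLOCK_SIZE = 4  # bytes per block for this toy example; HDFS uses MBs


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks, datanodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of datanodes")
    placement = {}
    rotation = itertools.cycle(range(len(datanodes)))
    for block_id in range(num_blocks):
        start = next(rotation)
        placement[block_id] = [
            datanodes[(start + r) % len(datanodes)]
            for r in range(replication)
        ]
    return placement


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4", "node5"]
    blocks = split_into_blocks(b"0123456789abcdef")  # 4 blocks
    # replication = 1: each block lives on exactly one node,
    # and different blocks can land on different nodes
    print(place_blocks(len(blocks), nodes, replication=1))
    # replication = 5: every node ends up with a copy of every block
    print(place_blocks(len(blocks), nodes, replication=5))
```

Note that even with replication = 1, the blocks of one file are not required to sit on a single node, which is the point being made in the quoted messages below.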

ChaoChun


Earney, Billy C. wrote:
> 
> ChaoChun,
> 
> I'm new to Hadoop, but my understanding is that the data is divided into
> blocks, and that not all blocks need to be on the same node.  So if a file
> has 2 blocks, the first block could be on node 1 and the second block
> could be on node 2.  From the link below, it seems that for each block,
> the client contacts the namenode and requests one or more datanodes to
> store the block on.
> 
> http://lucene.apache.org/hadoop/hdfs_design.html#Replication+Pipelining
> 
> Is my understanding of the documentation correct?
> 
> 
> -----Original Message-----
> From: ChaoChun Liang [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, September 06, 2007 9:23 PM
> To: [email protected]
> Subject: RE: Replication problem of HDFS
> 
> 
> So, the upload process (from the local file system to HDFS) will store
> all blocks (the M blocks split from the dataset) on a single node
> (depending on which client you use to put the file), not across all
> datanodes. And "replication" means replicating to N clients (if
> replication = N), each of which owns a complete copy of all M blocks.
> If I am wrong, please correct me. Thanks.
> 
> ChaoChun
> 
> 
> Stu Hood-2 wrote:
>> 
>> ChaoChun,
>> 
>> Since you set 'replication = 1' for the file, only 1 copy of the
>> file's blocks will be stored in Hadoop. If you want all 5 machines
>> to have copies of each block, then you would set 'replication = 5'
>> for the file.
>> 
>> The default for replication is 3.
>> 
>> Thanks,
>> Stu
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12534839
> Sent from the Hadoop Users mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12607024
Sent from the Hadoop Users mailing list archive at Nabble.com.
