ChaoChun,

Since you set 'replication = 1' for the file, only one copy of each of the file's blocks will be stored in Hadoop. If you want all 5 machines to hold a copy of each block, you would set 'replication = 5' for the file.
The default for replication is 3.

Thanks,
Stu

-----Original Message-----
From: ChaoChun Liang
Sent: Wednesday, September 5, 2007 9:26pm
To: [email protected]
Subject: RE: Replication problem of HDFS

Yes, you are right. In my environment the namenode and a datanode are on the same machine, and I upload data into HDFS from that same machine. I expected HDFS to distribute these blocks to all the other datanodes (according to the HDFS reference), but it actually does not.

>>In this case, the only replica of the file will reside on the Datanode that is
>>local to the client.

So, does it conflict with the HDFS reference? (A file in the HDFS will be split into one or more blocks and these blocks are stored in a set of Datanodes.)
What kind of upload would let all the data/files be stored across the datanodes (not on a single one)?

ChaoChun

Dhruba Borthakur wrote:
>
> Hi ChaoChun,
>
> I do not fully understand your problem. I am guessing that you are running a
> Datanode on the same machine as the Namenode. I am also guessing that you
> are using the Namenode machine as a client to upload a file into HDFS. In
> this case, the only replica of the file will reside on the Datanode that is
> local to the client.
>
> Thanks,
> dhruba
>
> -----Original Message-----
> From: ChaoChun Liang [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, September 05, 2007 1:58 AM
> To: [email protected]
> Subject: Replication problem of HDFS
>
>
> According to the HDFS reference
> (http://lucene.apache.org/hadoop/hdfs_design.html),
> a file in the HDFS will be split into one or more blocks and these blocks
> are stored in a set of Datanodes.
>
> I put (with replication=1) a 2GB data set onto a 5-node cluster, but found
> that only the namenode's block count increased; the other nodes kept the
> same value. It means all blocks were copied to the namenode, none to the
> datanodes. Is that correct?
>
> ChaoChun
> --
> View this message in context:
> http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12494269
> Sent from the Hadoop Users mailing list archive at Nabble.com.
>

--
View this message in context: http://www.nabble.com/Replication-problem-of-HDFS-tf4382878.html#a12514090
Sent from the Hadoop Users mailing list archive at Nabble.com.
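
The behaviour discussed in this thread can be illustrated with a small toy model. This is only a sketch, not HDFS code: the function name place_blocks, the node names, and the 64 MB block size are assumptions made for illustration, and the model ignores rack awareness and free-space balancing. It encodes just the two points made above: each block gets 'replication' copies, and the first copy goes to the datanode local to the client when there is one.

```python
import random


def place_blocks(file_size, block_size, replication, datanodes,
                 local_node=None, seed=0):
    """Toy model of block placement; returns {datanode: block count}.

    Assumption (from the thread): if the client runs on a datanode,
    the first replica of every block is written to that local node.
    Remaining replicas go to other nodes chosen at random.
    """
    rng = random.Random(seed)
    num_blocks = -(-file_size // block_size)  # ceiling division
    counts = {node: 0 for node in datanodes}
    for _ in range(num_blocks):
        targets = []
        if local_node in counts:
            targets.append(local_node)  # first replica stays local
        others = [n for n in datanodes if n not in targets]
        rng.shuffle(others)
        targets.extend(others)
        # Keep at most `replication` distinct nodes per block.
        for node in targets[:replication]:
            counts[node] += 1
    return counts


if __name__ == "__main__":
    GB, MB = 1024 ** 3, 1024 ** 2
    nodes = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names

    # Client on node1, replication = 1: every block lands on node1.
    print(place_blocks(2 * GB, 64 * MB, 1, nodes, local_node="node1"))

    # Replication = 5: every node holds a copy of every block.
    print(place_blocks(2 * GB, 64 * MB, 5, nodes, local_node="node1"))

    # Client outside the cluster, replication = 1: blocks are spread out.
    print(place_blocks(2 * GB, 64 * MB, 1, nodes, local_node=None))
```

Under these assumptions a 2GB file is split into 32 blocks. With replication = 1 and the upload run from a machine that is also a datanode, all 32 blocks stay on that machine, which matches what ChaoChun observed. Uploading from a machine that is not a datanode, or raising the replication (the cluster-wide default is the dfs.replication property), spreads the blocks across the cluster.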
