To test the block distribution, run the same put command once from the NameNode and once from a DataNode, and check where the blocks ended up in HDFS after each run. In my case, a 2GB file was spread mostly evenly across the datanodes when put was run on the NameNode, but when put was run on a DataNode, all of its blocks were written only to that DataNode.
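One way to do the check is to save the output of `hadoop fsck /filename -files -blocks -locations` after each put and count how many block replicas each datanode holds. Below is a minimal sketch of that counting step. The function name `replicas_per_datanode`, the sample text and its IP addresses are made up for illustration, and the block-line layout (a `blk_` id followed by a bracketed list of `host:port` locations) is an assumption about what fsck prints on your version, so adjust the pattern if yours differs.

```python
import re
from collections import Counter

# Matches IPv4 host:port pairs such as 10.0.0.11:50010.
_LOCATION = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\b")


def replicas_per_datanode(fsck_output):
    """Count block replicas per datanode host in saved fsck output.

    Assumes each block line contains a "blk_" id followed by a
    bracketed, comma-separated list of host:port locations.
    """
    counts = Counter()
    for line in fsck_output.splitlines():
        # Skip summary lines and anything without a location list.
        if "blk_" not in line or "[" not in line:
            continue
        locations = line[line.index("["):]
        for host, _port in _LOCATION.findall(locations):
            counts[host] += 1
    return counts


# Illustrative sample only; not output from a real cluster.
SAMPLE = """\
0. blk_1001_1 len=67108864 repl=2 [10.0.0.11:50010, 10.0.0.12:50010]
1. blk_1002_1 len=67108864 repl=2 [10.0.0.11:50010, 10.0.0.13:50010]
2. blk_1003_1 len=67108864 repl=2 [10.0.0.11:50010, 10.0.0.12:50010]
"""

if __name__ == "__main__":
    for host, n in sorted(replicas_per_datanode(SAMPLE).items()):
        print(host, n)
```

If one host holds a replica of every block after the DataNode-side put while the NameNode-side put shows a more even spread, that matches the behaviour described above.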
On Tue, Jul 13, 2010 at 9:32 AM, C.V.Krishnakumar <[email protected]> wrote:

> Hi,
> I am a newbie. I am curious to know how you discovered that all the blocks
> are written to datanode's hdfs? I thought the replication by namenode was
> transparent. Am I missing something?
> Thanks,
> Krishna
>
> On Jul 12, 2010, at 4:21 PM, Nathan Grice wrote:
>
> > We are trying to load data into hdfs from one of the slaves and when the put
> > command is run from a slave(datanode) all of the blocks are written to the
> > datanode's hdfs, and not distributed to all of the nodes in the cluster. It
> > does not seem to matter what destination format we use ( /filename vs
> > hdfs://master:9000/filename) it always behaves the same.
> > Conversely, running the same command from the namenode distributes the files
> > across the datanodes.
> >
> > Is there something I am missing?
> >
> > -Nathan
