It's normal. The default block placement policy stores the first replica on the local node (when the client runs on a DataNode) for write performance, then chooses a second node at random on another rack, then a third node on the same rack as the second. Using a replication factor of 1 is not advised if you value your data. However, if you want a better distribution of blocks with a single replica, consider uploading your files from a host that is not running a DataNode.
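If you want to confirm where the blocks of a file actually landed, fsck will list the block locations (the path below is just a placeholder for your own file):

    hadoop fsck /user/razen/bigfile -files -blocks -locations

Running the same put from a host that is only an HDFS client (no local DataNode) should spread even single-replica blocks across random nodes, e.g.:

    hadoop fs -put bigfile /user/razen/bigfile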
Daryn

On Jun 10, 2013, at 8:36 AM, Razen Al Harbi wrote:

> Hello,
>
> I have deployed Hadoop on a cluster of 20 machines. I set the replication
> factor to one. When I put a file (larger than the HDFS block size) into HDFS,
> all the blocks are stored on the machine where the Hadoop put command is
> invoked.
>
> For higher replication factors, I see the same behavior, but the replicated
> blocks are stored randomly on all the other machines.
>
> Is this normal behavior? If not, what would be the cause?
>
> Thanks,
>
> Razen
