It's normal. The default block placement policy stores the first replica on the local node (when the client runs on a DataNode) for write performance, then chooses a second node at random on another rack, then a third node on the same rack as the second. Using a replication factor of 1 is not advised if you value your data. However, if you want a better distribution of blocks with a single replica, consider uploading your files from a host that is not running a DataNode.
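If you want to confirm where the blocks of a file actually landed, fsck will list the block locations (the path below is just a placeholder for your own file):

    hadoop fsck /user/razen/bigfile -files -blocks -locations

Running the same put from a host that is only an HDFS client (no local DataNode) should spread even single-replica blocks across random nodes, e.g.:

    hadoop fs -put bigfile /user/razen/bigfile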
Daryn

On Jun 10, 2013, at 8:36 AM, Razen Al Harbi wrote:

> Hello,
>
> I have deployed Hadoop on a cluster of 20 machines. I set the replication
> factor to one. When I put a file (larger than the HDFS block size) into HDFS,
> all the blocks are stored on the machine where the Hadoop put command is
> invoked.
>
> For higher replication factors, I see the same behavior, but the replicated
> blocks are stored randomly on all the other machines.
>
> Is this normal behavior? If not, what would be the cause?
>
> Thanks,
>
> Razen
