Hi guys, when a file is copied to HDFS, it seems that HDFS always writes
the first copy of each block to the data node running on the machine that invoked
the copy, and the data nodes for the remaining replicas are selected evenly from the
other data nodes. So, for example, on a 5-node cluster with the replication
factor set to 2, if I copy an N-byte file from node 1, then node 1 will use up
N bytes and nodes 2, 3, 4, and 5 will use up N/4 bytes each.
Is this a known issue, or is there any way to configure HDFS so that the blocks
are distributed evenly (so that each node uses up 2*N/5 bytes in this case)?
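
To make the numbers concrete, here is a small Java sketch of the arithmetic (just my own illustration, not actual HDFS code; the 1 GB file size is made up) comparing the behaviour I'm seeing with the even distribution I'd like:

// Expected per-node usage on a 5-node cluster, replication factor 2,
// when an N-byte file is copied from node 1.
public class PlacementMath {
    public static void main(String[] args) {
        long n = 1000000000L;  // hypothetical N = 1 GB
        int nodes = 5;
        int replication = 2;

        // Observed: the first replica of every block stays on the node
        // that ran the copy; the remaining replicas are spread evenly
        // over the other nodes.
        long copyingNode = n;
        long eachOtherNode = n * (replication - 1) / (nodes - 1);  // N/4

        // Hoped for: all replicas spread evenly over all nodes.
        long eachNodeEven = n * replication / nodes;               // 2*N/5

        System.out.println("observed: node 1 = " + copyingNode
                + " bytes, nodes 2-5 = " + eachOtherNode + " bytes each");
        System.out.println("even:     every node = " + eachNodeEven + " bytes");
    }
}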
  thanks,