Hi,
Our current cluster runs 22 data nodes, each with 4TB of storage.
We are going to add new data nodes to this existing cluster, but each of them
will have 8TB of storage capacity.
I am wondering how the namenode will distribute the blocks. It is my
understanding that the replica placement policy chooses data nodes at random,
so each node should receive roughly the same amount of data regardless of its
capacity. Eventually the smaller nodes will fill up while the larger nodes
are only at about 50%, at which point the small nodes will become unusable.
Am I correct?
Is there a recommended practice in this case? Would running the balancer
periodically help?
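
To make the arithmetic behind my concern concrete, here is a rough sketch. It
assumes placement is purely uniform, so every node receives the same amount of
data, and it ignores rack awareness and any free-space checks the namenode may
apply. The function name and its parameters are my own, not anything from
Hadoop.

```python
def fill_levels(small_tb, large_tb, data_per_node_tb):
    """Return (small, large) utilization as fractions of capacity.

    Assumes every node has received the same amount of data, which is
    only an approximation of random replica placement.
    """
    small = min(data_per_node_tb / small_tb, 1.0)
    large = min(data_per_node_tb / large_tb, 1.0)
    return small, large


# Once each node holds 4TB, the 4TB nodes are full and the 8TB nodes half full.
small_util, large_util = fill_levels(4.0, 8.0, 4.0)
print(f"4TB nodes: {small_util:.0%}, 8TB nodes: {large_util:.0%}")
# prints "4TB nodes: 100%, 8TB nodes: 50%"
```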