Dear Sai Sai,

"Hadoop: The Definitive Guide" says the following about default replica placement:

- The first replica is placed on the same node as the client (lowest bandwidth
penalty).
- The second replica is placed off-rack, on a random node in a different rack
(avoiding busy racks).
- The third replica is placed on a different, randomly chosen node in the same
rack as the second replica.
- Any further replicas are placed on random nodes in the cluster (avoiding
busy racks).

If the client is not running on a node in the cluster, the first replica is
placed on a random node (avoiding busy racks).
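
In case it helps, the rules above can be sketched as a small toy simulation in
Python. This is only an illustration and not Hadoop's actual code: the
function name place_replicas, the rack and node names, and the topology
dictionary are all made up for the example, and the "avoiding busy racks" part
is left out entirely (nodes are picked uniformly at random).

```python
import random


def place_replicas(topology, client=None, replication=3, rng=random):
    """Toy model of the placement rules; NOT Hadoop's implementation.

    topology: hypothetical dict mapping rack name -> list of node names.
    client:   node name of the writer, or None / unknown if off-cluster.
    Load ("busy racks") is ignored; picks are uniformly random.
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    all_nodes = list(rack_of)
    if replication < 1 or not all_nodes:
        return []

    chosen = []

    def pick(candidates):
        # Choose a random node that does not hold a replica yet.
        free = [n for n in candidates if n not in chosen]
        if not free:
            return False
        chosen.append(rng.choice(free))
        return True

    # First replica: the client's own node, or a random node if the
    # client is not part of the cluster.
    if client in rack_of:
        chosen.append(client)
    else:
        pick(all_nodes)

    # Second replica: a random node on a different rack than the first.
    # Falling back to any node is an assumption for single-rack setups.
    if len(chosen) < replication:
        first_rack = rack_of[chosen[0]]
        off_rack = [n for n in all_nodes if rack_of[n] != first_rack]
        pick(off_rack) or pick(all_nodes)

    # Third replica: a different node on the same rack as the second.
    if len(chosen) == 2 and replication > 2:
        second_rack = rack_of[chosen[1]]
        pick(topology[second_rack]) or pick(all_nodes)

    # Further replicas: random nodes anywhere in the cluster.
    while len(chosen) < replication and pick(all_nodes):
        pass

    return chosen


# Example with a made-up two-rack cluster; the client runs on node "n1".
example_topology = {
    "rack1": ["n1", "n2", "n3"],
    "rack2": ["n4", "n5", "n6"],
}
print(place_replicas(example_topology, client="n1"))
```

With the client on n1, the first entry is always n1 and the next two entries
are two different nodes from rack2; which ones exactly varies from run to run.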

Best regards,


Jens
