Phantom wrote:
Here is the scenario I was concerned about. Consider three nodes in the system, A, B and C, which are placed, say, in different racks. Suppose the disk on A fries today. My understanding (which could be wrong) is that the blocks that were stored on A are not going to be re-replicated, either to some other node or to the new disk with which you would bring A back.
That's incorrect. When a datanode fails to send a heartbeat to the namenode in a timely manner, its data is assumed missing and is re-replicated from the remaining replicas. And when block corruption is detected, corrupt replicas are removed and non-corrupt replicas are re-replicated to maintain the desired level of replication.
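
To make the behaviour concrete, here is a toy Python model of the bookkeeping described above. It is only a sketch, not HDFS code: the class ToyNameNode, its methods, the 30-second timeout and the replication target of 2 are all made up for illustration, and real placement policy (rack awareness, load) is ignored.

```python
from collections import defaultdict

HEARTBEAT_TIMEOUT = 30   # seconds; hypothetical value, not an HDFS default
TARGET_REPLICATION = 2   # hypothetical desired replica count per block


class ToyNameNode:
    """Toy stand-in for a namenode: tracks heartbeats and replica locations."""

    def __init__(self):
        self.last_heartbeat = {}                  # node -> time of last heartbeat
        self.block_locations = defaultdict(set)   # block -> nodes holding a replica

    def heartbeat(self, node, now):
        self.last_heartbeat[node] = now

    def add_replica(self, block, node):
        self.block_locations[block].add(node)

    def report_corrupt(self, block, node):
        # A corrupt replica is dropped; the next check restores the count.
        self.block_locations[block].discard(node)

    def check_replication(self, now):
        """Return the (block, source, target) copies needed to restore replication."""
        live = {n for n, t in self.last_heartbeat.items()
                if now - t <= HEARTBEAT_TIMEOUT}
        copies = []
        for block, holders in self.block_locations.items():
            # Replicas on nodes that missed their heartbeat are assumed missing.
            holders.intersection_update(live)
            if not holders:
                continue  # no good replica left to copy from
            missing = TARGET_REPLICATION - len(holders)
            source = min(holders)
            for target in sorted(live - holders)[:max(missing, 0)]:
                copies.append((block, source, target))
                holders.add(target)
        return copies


nn = ToyNameNode()
for node in ("A", "B", "C"):
    nn.heartbeat(node, now=0)
nn.add_replica("blk_1", "A"); nn.add_replica("blk_1", "B")
nn.add_replica("blk_2", "A"); nn.add_replica("blk_2", "C")

# A's disk fries: B and C keep sending heartbeats, A goes silent.
nn.heartbeat("B", now=90)
nn.heartbeat("C", now=90)
copies = nn.check_replication(now=100)

# Later, the replica of blk_1 on C is found to be corrupt.
nn.report_corrupt("blk_1", "C")
copies_after_corruption = nn.check_replication(now=100)
```

In this run, the first check schedules blk_1 to be copied from B to C and blk_2 from C to B, so both blocks are back at two replicas without A. The second check schedules a fresh copy of blk_1 from B to C to replace the corrupt replica.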
Doug
