Hi, this is about the case where HDFS has some data blocks that are over-replicated.
The scenario is as follows. If one of the datanodes goes down, the namenode will see some blocks as under-replicated and will start replicating them to bring their replication level back to the expected value. If the datanode that was down then comes back up without any data loss, some blocks will have more replicas than expected.

Does the namenode itself take care of removing the extra replicas, or do we need to schedule the balancer for that?

-Ajit