Synopsis:

* After shutting down a datanode in a cluster, fsck declares the filesystem CORRUPT with missing blocks.
* I restore/restart the datanode and fsck soon declares things healthy.
* But dfsadmin -report says a small number of blocks have corrupt replicas, and an even smaller number of blocks are under-replicated.
* After a couple of days, the numbers of corrupt replicas and under-replicated blocks stay the same.

Full Story:

My goal is to rebalance blocks across the 3 drives in each of 2 datanodes in a 9-datanode (Replication=3) cluster running hadoop 0.20.1. (EBS volumes were added to the datanodes over time, so one disk had 95% usage and the others had significantly less.) The plan was to decommission the nodes, wipe the disks, and then add them back into the cluster.

Before I started I ran fsck and all was healthy. (Unfortunately I did not really look at the dfsadmin -report output at that time, so I can't be sure there were no blocks with corrupt replicas at that point.)

I put two nodes into the decommission process, and after about 36 hours neither had finished decommissioning. So I decided to throw caution to the wind and shut down one of them. (Before that I had taken the node I was shutting down out of the dfs.exclude.file file, removed the 2nd node from the dfs.exclude.file as well, and ran dfsadmin -refreshNodes, but kept the 2nd node live.)

After shutting down one node, running fsck showed about 400 blocks as missing. So I brought the shut-down node back up (it took a while, as I had to restore it from an EBS snapshot), and fsck quickly went back to healthy, but with a significant number of over-replicated blocks.

I put that node back into the decommissioning state (put just that one node back in the dfs.exclude.file and ran dfsadmin -refreshNodes). After another day or so, it's still in decommissioning mode. Fsck says the cluster is healthy but still shows 37% over-replicated blocks.

But the thing that concerns me is that dfsadmin -report says:

Under replicated blocks: 18
Blocks with corrupt replicas: 34

So really two questions:

* Is there a way to force these corrupt replicas and under-replicated blocks to get fixed?
* Is there a way to speed up the decommissioning process (without restarting the cluster)?

I presume that it's not safe for me to take down this node until the decommissioning completes and/or the corrupt replicas are fixed.
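
For anyone who wants to check whether those two counters actually move over time, here is a minimal Python sketch, not a definitive tool. It assumes the report text was saved beforehand (for example with `hadoop dfsadmin -report > report.txt`) and that the counter lines look exactly like the two quoted above; the function names `parse_report` and `changed` are made up for illustration.

```python
import re

# Counter labels exactly as quoted from the dfsadmin -report output above.
COUNTERS = ("Under replicated blocks", "Blocks with corrupt replicas")


def parse_report(text):
    """Return {label: int} for each known counter found in the report text."""
    found = {}
    for label in COUNTERS:
        match = re.search(re.escape(label) + r"\s*:\s*(\d+)", text)
        if match:
            found[label] = int(match.group(1))
    return found


def changed(previous, current):
    """Return the labels whose value differs between two parsed reports."""
    return sorted(k for k in current if previous.get(k) != current[k])


if __name__ == "__main__":
    # Sample text built from the two lines quoted in this message.
    sample = "Under replicated blocks: 18\nBlocks with corrupt replicas: 34\n"
    print(parse_report(sample))
```

Running `parse_report` on two reports saved a day apart and passing the results to `changed` shows which of the two counters, if any, moved in between.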

And finally, is there a better way to accomplish the original task of rebalancing disks on a datanode?

Thanks!
Rob
__________________
Robert J Berger - CTO
Runa Inc.
+1 408-838-8896
http://blog.ibd.com