Synopsis:
* After shutting down a datanode in a cluster, fsck declares the filesystem 
CORRUPT with missing blocks
* I restore/restart the datanode and fsck soon declares things healthy
* But dfsadmin -report says a small number of blocks have corrupt replicas and 
an even smaller number of blocks are under replicated
* After a couple of days the number of corrupt replicas and under replicated 
blocks stays the same

Full Story:
My goal is to rebalance blocks across the 3 drives in each of 2 datanodes in a 
9-datanode (Replication=3) cluster running hadoop 0.20.1.
(EBS volumes were added to the datanodes over time, so one disk had 95% usage 
and the others had significantly less.)

The plan was to decommission the nodes, wipe the disks, and then add them back 
into the cluster.

Before I started I ran fsck and all was healthy. (Unfortunately I did not 
really look at the dfsadmin -report at that time, so I can't be sure whether 
there were already blocks with corrupt replicas at that point.)

I put two nodes into the decommission process, and after about 36 hours 
neither had finished decommissioning. So I decided to throw caution to the 
wind and shut one of them down. (Before that I had taken the node I was 
shutting down out of the dfs.exclude.file file, also removed the 2nd node from 
the dfs.exclude.file, and ran dfsadmin -refreshNodes, but kept the 2nd node 
live.)

After shutting down one node, running fsck showed about 400 blocks as missing.

So I brought the shut-down node back up (it took a while as I had to restore 
it from an EBS snapshot) and fsck quickly went back to healthy, but with a 
significant number of over-replicated blocks.

I put that node back into the decommissioning state (put just that one node 
back in the dfs.exclude.file and ran dfsadmin -refreshNodes).
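
For anyone following along, the exclude-file step can be scripted. This is only a minimal sketch, assuming the exclude file is plain text with one hostname per line; the helper name add_to_exclude, the file path, and the hostname are made up for illustration and are not my actual setup.

```python
import os
import tempfile


def add_to_exclude(exclude_path, hostname):
    """Add hostname to the exclude file unless it is already listed.

    Returns True if the file was changed, False otherwise.
    Assumes one hostname per line (an assumption, not verified here).
    """
    hosts = []
    if os.path.exists(exclude_path):
        with open(exclude_path) as f:
            hosts = [line.strip() for line in f if line.strip()]
    if hostname in hosts:
        return False
    hosts.append(hostname)
    with open(exclude_path, "w") as f:
        f.write("\n".join(hosts) + "\n")
    return True


# Demo against a throwaway file; the hostname is hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "exclude")
print(add_to_exclude(demo_path, "datanode-07.example.com"))
print(add_to_exclude(demo_path, "datanode-07.example.com"))
```

The namenode only picks up the change after dfsadmin -refreshNodes is run, so the script alone does not start a decommission.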

After another day or so, it's still in decommissioning mode. Fsck says the 
cluster is healthy but still shows 37% over-replicated blocks.

But the thing that concerns me is that dfsadmin -report says:

Under replicated blocks: 18
Blocks with corrupt replicas: 34
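
To see whether these two numbers ever move, the report text can be parsed and logged periodically. A minimal sketch, assuming the summary lines look exactly like the two above ("Label: integer"); the function name parse_report_counters is hypothetical and the sample text is just those two lines, not a full report.

```python
import re


def parse_report_counters(report_text):
    """Extract the two block counters from dfsadmin -report style text."""
    wanted = ("Under replicated blocks", "Blocks with corrupt replicas")
    counters = {}
    for line in report_text.splitlines():
        # Assumed line shape: "<label>: <integer>"
        match = re.match(r"^\s*(.+?):\s*(\d+)\s*$", line)
        if match and match.group(1) in wanted:
            counters[match.group(1)] = int(match.group(2))
    return counters


sample = """Under replicated blocks: 18
Blocks with corrupt replicas: 34
"""
print(parse_report_counters(sample))
```

Feeding it the saved output of dfsadmin -report once an hour would show whether the counters are actually stuck or just draining slowly.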

So really two questions:

* Is there a way to force these corrupt replicas and under replicated blocks to 
get fixed?
* Is there a way to speed up the decommissioning process (without restarting 
the cluster)?

I presume that it's not safe for me to take down this node until the 
decommissioning completes and/or the corrupt replicas are fixed.

And finally, is there a better way to accomplish the original task of 
rebalancing disks on a datanode?

Thanks!
Rob
__________________
Robert J Berger - CTO
Runa Inc.
+1 408-838-8896
http://blog.ibd.com


