I am sorry to hear this. I don’t think we currently have any tool that can 
fix that layout issue, and I don’t know enough about the volume-balancer 
tool to comment on other options.

If you are okay with losing some of your blocks (since other nodes are in a bad 
state too), you can decommission the node, re-add it, and wait for the 
cluster to heal itself.
We have been working on a tool to address the disk balancing issue; if you are 
interested, you can follow its progress in HDFS-1312.

—Anu

P.S. Just out of curiosity, may I ask what prompted you to run this tool? 
Did you replace a disk, or were you running out of space on one disk on that 
node?

From: David Watzke <[email protected]>
Date: Saturday, March 5, 2016 at 6:47 AM
To: [email protected]
Subject: datanode directory structure mess-up


Hi list,


I ran into trouble because I accidentally used this tool, 
https://github.com/killerwhile/volume-balancer, with Hadoop 2.6.0 (just as 
that page warns you not to; I had used it successfully before and didn't think 
to check the page before using it again). It messed up my datadirs because, as 
I understand it, the software now makes invalid assumptions about which 
directory moves it can do. Now the datanode logs are filled with these:

WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while 
finding block BP-680964103-A.B.C.D-1375882473930:blk_5822441067008155275_0 on 
volume /xyz/dfs/dn
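
A minimal Python sketch, assuming warnings shaped like the one above, for 
listing which block IDs each volume is complaining about. The function name 
`affected_blocks` and the regular expression are illustrative assumptions, not 
part of Hadoop, and the pattern is inferred only from this single log line, so 
other Hadoop versions may need adjustments.

```python
import re
from collections import defaultdict

# Pattern inferred from the single WARN line quoted above (an assumption).
# \s+ between words tolerates lines that were wrapped when pasted.
VOLUME_SCANNER_WARNING = re.compile(
    r"VolumeScanner:\s+I/O\s+error\s+while\s+finding\s+block\s+"
    r"(?P<pool>BP-[^\s:]+):(?P<block>blk_-?\d+)_(?P<genstamp>\d+)"
    r"\s+on\s+volume\s+(?P<volume>\S+)"
)


def affected_blocks(log_text):
    """Map each volume path to a sorted list of distinct block names."""
    blocks_by_volume = defaultdict(set)
    for match in VOLUME_SCANNER_WARNING.finditer(log_text):
        blocks_by_volume[match.group("volume")].add(match.group("block"))
    return {vol: sorted(names) for vol, names in blocks_by_volume.items()}
```

Calling `affected_blocks(open(path).read())` on a datanode log (where `path` is 
whatever your log file is) gives a per-volume list of block names, which can 
then be searched for under the data directories.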

What can I do to fix this? I don't know which files/dirs were moved or from 
where, but is there a reasonable way out of this? For example, would editing 
the VERSION file to a previous version while the DN is down make it fix the 
layout by itself?

Please note that I've lost the other replica due to a filesystem error so I 
can't just ignore it. This is literally my only option to recover some missing 
blocks.

Thanks,

--
David Watzke
