I believe I've encountered some curious HDFS behavior using Hadoop 2.7.1. Unfortunately I'm in a situation where I need to manually migrate the contents of two volumes used by HDFS onto a single new volume on each node. After doing so, a few files from the two original volumes conflict with each other: the top-level VERSION file, the scanner.cursor file, and the "dfsUsed" file. If the dfsUsed file is deleted, then when the cluster is restarted the blocks on each DataNode are erased completely and a new dfsUsed file is generated, this time showing that the volume is nearly empty.
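For context, the merge was roughly equivalent to the sketch below (a simplified illustration only, not the literal commands I ran; the volume paths are placeholders for my real mount points):

import os
import shutil

# Placeholders for illustration -- not my real mount points.
SRC_VOLUMES = ["/data1/hdfs", "/data2/hdfs"]   # the two original dfs.datanode.data.dir entries
DST_VOLUME = "/data-new/hdfs"                  # the single new volume

# The per-volume metadata files that exist on both sources and therefore collide.
CONFLICTS = {"VERSION", "scanner.cursor", "dfsUsed"}

for src in SRC_VOLUMES:
    for root, dirs, files in os.walk(os.path.join(src, "current")):
        rel = os.path.relpath(root, src)
        dst_dir = os.path.join(DST_VOLUME, rel)
        os.makedirs(dst_dir, exist_ok=True)
        for name in files:
            dst_file = os.path.join(dst_dir, name)
            if name in CONFLICTS and os.path.exists(dst_file):
                # Both source volumes carry their own copy of these files,
                # so the second volume's copy conflicts with the first's.
                continue
            shutil.copy2(os.path.join(root, name), dst_file)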
I understand that the "dfsUsed" file is an important piece of metadata for HDFS, but I would expect that if it disappeared (a very rare corner case, I admit), HDFS could simply regenerate it by verifying the blocks on disk against what the NameNode expects. More importantly, I wouldn't expect HDFS to actually delete valid blocks from disk just because that one text file went missing.

Immediately after starting the cluster I ran "hdfs fsck /" and it reported that every block was missing and therefore corrupt. Prior to running "start-dfs.sh" and the fsck I had successfully copied ~32 TB; immediately afterward, all 32 TB of blocks across 10 nodes had disappeared from each node's filesystem.

Is this expected behavior? If I'm going to *manually* migrate blocks from two source volumes to a new destination volume, is there a "safe" way to do it? For example, can I generate a new, valid "dfsUsed" file by hand, along the lines sketched below? And what about the VERSION files, which contain unique storageIDs? Should I ask about this on the developer list?
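Concretely, this is the kind of thing I had in mind for regenerating dfsUsed, though I'm only guessing that the file is a single line of the form "<usedBytes> <timestampMillis>", and the block-pool path shown is a placeholder, not my real one:

import os
import time

def write_dfs_used(bp_current_dir):
    # Sum the on-disk size of everything under the block pool's current/
    # directory. (The DataNode presumably uses du, so this is an approximation.)
    total = 0
    for root, dirs, files in os.walk(bp_current_dir):
        for name in files:
            if name == "dfsUsed":
                continue  # don't count the cache file itself
            total += os.path.getsize(os.path.join(root, name))
    # ASSUMPTION: a single line of "<usedBytes> <timestampMillis>".
    with open(os.path.join(bp_current_dir, "dfsUsed"), "w") as f:
        f.write("%d %d" % (total, int(time.time() * 1000)))
    return total

# Placeholder path: <dfs.datanode.data.dir>/current/<block pool ID>/current
write_dfs_used("/data-new/hdfs/current/BP-1234567890-10.0.0.1-1400000000000/current")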

Thanks, Joe Naegele