Renaud Delbru wrote:
We are using HBase 0.2.0-dev, Hudson Build #208.
Then it's a bug. In the branch we made it so that we scream in the logs
but then keep going, figuring this is the general preference rather than
having the cluster stuck cycling through deploying/failing/redeploying/etc.
the horked region.
What should have happened instead is that the region would deploy
successfully, but with the following left in the log:
LOG.warn("Mapfile " + mapfile.toString() + " has empty data. " +
"Deleting. Continuing...Probable DATA LOSS!!! See HBASE-646.");
Would you mind trying the patch in
https://issues.apache.org/jira/browse/HBASE-766? It 'fixes' TRUNK so it
does the above.
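
For illustration only, here is a rough sketch of the "warn, delete, keep going" behavior described above. It is not the code from the HBASE-766 patch: it uses the local filesystem through java.nio instead of the HDFS FileSystem API, and the class name EmptyDataFileCheck and the method dropIfEmpty are made up for this example. It assumes the usual MapFile layout of a directory holding a 'data' file and an 'index' file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Logger;
import java.util.stream.Stream;

// Hypothetical helper, not part of HBase: illustrates the policy only.
class EmptyDataFileCheck {
    private static final Logger LOG =
        Logger.getLogger(EmptyDataFileCheck.class.getName());

    /**
     * If the MapFile directory has a zero-length 'data' file, log a loud
     * warning, delete the whole MapFile directory, and return true so the
     * caller can carry on opening the region. Otherwise return false and
     * leave everything untouched.
     */
    static boolean dropIfEmpty(Path mapfileDir) throws IOException {
        Path data = mapfileDir.resolve("data");
        boolean empty = Files.exists(data) && Files.size(data) == 0;
        if (!empty) {
            return false;
        }
        LOG.warning("Mapfile " + mapfileDir + " has empty data. "
            + "Deleting. Continuing...Probable DATA LOSS!!! See HBASE-646.");
        // A MapFile directory is assumed to be flat (just 'data' and 'index'),
        // so deleting its direct children and then the directory is enough.
        Path[] children;
        try (Stream<Path> listing = Files.list(mapfileDir)) {
            children = listing.toArray(Path[]::new);
        }
        for (Path child : children) {
            Files.delete(child);
        }
        Files.delete(mapfileDir);
        return true;
    }
}
```

A caller would run this over each MapFile while opening a store and skip any file for which it returns true, so the region still deploys instead of cycling through failed opens.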
That we're 'losing' the 'data' file from StoreFiles/MapFiles in times of
'stress' is disconcerting. In my experience, it happened here once when
there was a storm in HDFS. We lost more than one data file (if we lose
the MapFile index, HBase repairs it by reconstructing it). To
debug, we would need to run with DEBUG enabled on HDFS, but no one likes
doing that on the off-chance that there'll be an incident because of the
sheer volume of logs generated. We need to somehow work out the
particular sequence that can provoke these losses. I've opened an issue
for now -- HBASE-767 -- to track loss of 'data' files.
Thanks,
St.Ack