Renaud Delbru wrote:
Andrew Purtell wrote:
Did I understand you correctly that the corruption appeared to happen
during the period of time when your cluster and DFS were unstable due
to excessive load?
Yes, that is correct. These error messages appeared after "hard reboots"
of HBase (by hard reboot, I mean a kill signal, since the HRegionServer
process was stuck), when the cluster was not stable.
These errors do not seem to interfere with the "normal operations"
of HBase. We are still able to query and upload data. The only issue
is that HBase seems to be stuck in a loop, trying to read these
regions, and fills the log with this error message.
To second Andrew's notion, I've seen the FileNotFoundException when trying
to pick up the data side of a StoreFile/MapFile on our cluster after an
episode where our HDFS went awry (I have no particulars; I came upon
the scene after the firefighters had left).
Renaud, did you say what version of HBase you're running? I thought
I'd added handlers for this kind of situation after encountering the
above crash on our internal cluster.
St.Ack