Always Skip Errors during Log Recovery
--------------------------------------

                 Key: HBASE-2933
                 URL: https://issues.apache.org/jira/browse/HBASE-2933
             Project: HBase
          Issue Type: Bug
            Reporter: Nicolas Spiegelberg
            Assignee: Nicolas Spiegelberg


While testing a cluster, we hit the following error during region 
assignment.  We were killing the master during a long run of splits.  We think 
the HMaster was killed while splitting a log, then came back up and split it 
again.  If this happens, we end up with 2 files: 1 partially written and 1 
complete.  Since encountering partial log splits after a Master failure is 
considered normal behavior, we should continue at the RS level if we encounter 
an EOFException (as opposed to a filesystem-level exception), even with 
skip.errors == false.
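
A minimal sketch of the proposed behavior, not the actual HRegion code: the 
class RecoveredEditsReplay, its replay() and sampleLog() methods, and the 
length-prefixed record format are all hypothetical and only stand in for the 
real recovered-edits reader.  It shows an EOFException being treated as the 
end of a partially written file regardless of the skipErrors flag, while any 
other IOException still fails the replay unless skipErrors is true.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the region server's recovered-edits replay.
public class RecoveredEditsReplay {

    // Reads length-prefixed records (an assumed format) until the stream ends.
    static List<byte[]> replay(InputStream in, boolean skipErrors) throws IOException {
        List<byte[]> edits = new ArrayList<>();
        DataInputStream data = new DataInputStream(in);
        try {
            while (true) {
                int length = data.readInt();      // EOFException at end of file or in a cut-off header
                if (length < 0) {
                    throw new IOException("corrupt record length: " + length);
                }
                byte[] edit = new byte[length];
                data.readFully(edit);             // EOFException if the record body is cut off
                edits.add(edit);
            }
        } catch (EOFException eof) {
            // Clean end of file or a partially written file: keep the edits
            // read so far and continue, even when skipErrors is false.
        } catch (IOException ioe) {
            // Any other I/O failure still aborts unless errors are skipped.
            if (!skipErrors) {
                throw ioe;
            }
        }
        return edits;
    }

    // Builds two complete records followed by a truncated third one.
    static byte[] sampleLog() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String edit : new String[] {"put-1", "put-2"}) {
            byte[] payload = edit.getBytes("UTF-8");
            out.writeInt(payload.length);
            out.write(payload);
        }
        out.writeInt(100);                        // header promises 100 bytes...
        out.write(new byte[] {1, 2, 3});          // ...but only 3 were written
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> edits = replay(new ByteArrayInputStream(sampleLog()), false);
        System.out.println("replayed edits: " + edits.size());
    }
}
```

With skipErrors set to false, the truncated third record ends the replay 
quietly and the two complete edits are kept.  Note that this sketch does not 
distinguish a clean end of file from a truncated one.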

2010-08-20 16:59:07,718 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening MailBox_dsanduleac,57db45276ece7ce03ef7e8d9969eb189:[email protected],1280960828959.7c542d24d4496e273b739231b01885e6.
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.io.SequenceFile$Reader.readRecordLength(SequenceFile.java:1902)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1932)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1837)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1883)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:121)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:113)
        at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:1981)
        at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:1956)
        at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:1915)
        at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:344)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1490)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1437)
        at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1345)
        at java.lang.Thread.run(Thread.java:619)
2010-08-20 16:59:07,719 ERROR org.apache.hadoop.hbase.regionserver.RSZookeeperUpdater: Aborting open of region 7c542d24d4496e273b739231b01885e6

