[ https://issues.apache.org/jira/browse/HBASE-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914596#action_12914596 ]

Jean-Daniel Cryans commented on HBASE-2643:
-------------------------------------------

While testing 0.89.20100924, I got this:

{noformat}
Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 33.948 sec <<< FAILURE!
testEOFisIgnored(org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit)  Time elapsed: 0.179 sec  <<< ERROR!
java.io.IOException: hdfs://localhost:62668/hbase/hlog/hlog.dat.0, pos=1012, edit=9
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.addFileInfoToException(SequenceFileLogReader.java:165)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:137)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:122)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.parseHLog(HLog.java:1564)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.splitLog(HLog.java:1323)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.splitLog(HLog.java:1210)
        at org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit.testEOFisIgnored(TestHLogSplit.java:317)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
        at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:59)
        at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:115)
        at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:140)
        at org.apache.maven.surefire.Surefire.run(Surefire.java:109)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:290)
        at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1017)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1937)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1837)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1883)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:135)
        ... 35 more
{noformat}

Looking at the code, it doesn't check whether the cause of the IOException is an EOFException. I don't see this failing on trunk; was it handled in the scope of another JIRA?
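A minimal sketch of the kind of check I mean, assuming we just walk the cause chain of the wrapped exception (the class and method names here are illustrative, not the actual SequenceFileLogReader/HLog code):

{noformat}
import java.io.EOFException;
import java.io.IOException;

public class EofCheckSketch {
  /**
   * Returns true if the given IOException, or anything in its cause
   * chain, is really an EOFException, i.e. we ran off the end of the
   * log rather than hitting some other I/O problem.
   */
  static boolean isEof(IOException ioe) {
    Throwable t = ioe;
    while (t != null) {
      if (t instanceof EOFException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }
}
{noformat}

The split loop could then treat an EOF-caused IOException from next() as "end of this log" instead of aborting the whole split.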

> Figure how to deal with eof splitting logs
> ------------------------------------------
>
>                 Key: HBASE-2643
>                 URL: https://issues.apache.org/jira/browse/HBASE-2643
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.89.20100621
>            Reporter: stack
>            Assignee: Nicolas Spiegelberg
>            Priority: Blocker
>             Fix For: 0.89.20100924, 0.90.0
>
>         Attachments: ch03s02.html, HBASE-2643.patch
>
>
> When splitting the WAL and encountering EOF, it's not clear what to do. 
> Initial discussion of this started in http://review.hbase.org/r/74/ - 
> summarizing here for brevity:
> We can get an EOFException while splitting the WAL in the following cases:
> - The writer died after creating the file but before even writing the header 
> (or crashed halfway through writing the header)
> - The writer died in the middle of flushing some data - sync() guarantees 
> that we can see _at least_ the last edit, but we may see half of an edit that 
> was being written out when the RS crashed (especially for large rows)
> - The data was actually corrupted somehow (e.g. a length field got changed to 
> be too long and thus points past EOF)
> Ideally we would know when we see EOF whether it was really the last record, 
> and in that case, simply drop that record (it wasn't synced, so therefore we 
> don't need to split it). Some open questions:
>   - Currently we ignore empty files. Is it ok to ignore an empty log file if 
> it's not the last one?
>   - Similarly, do we ignore an EOF mid-record if it's not the last log file?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
