[
https://issues.apache.org/jira/browse/HBASE-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914596#action_12914596
]
Jean-Daniel Cryans commented on HBASE-2643:
-------------------------------------------
While testing 0.89.20100924, I got this:
{noformat}
Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 33.948 sec <<< FAILURE!
testEOFisIgnored(org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit) Time elapsed: 0.179 sec <<< ERROR!
java.io.IOException: hdfs://localhost:62668/hbase/hlog/hlog.dat.0, pos=1012, edit=9
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.addFileInfoToException(SequenceFileLogReader.java:165)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:137)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:122)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.parseHLog(HLog.java:1564)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.splitLog(HLog.java:1323)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.splitLog(HLog.java:1210)
    at org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit.testEOFisIgnored(TestHLogSplit.java:317)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:59)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:115)
    at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:140)
    at org.apache.maven.surefire.Surefire.run(Surefire.java:109)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:290)
    at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1017)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1937)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1837)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1883)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:135)
    ... 35 more
{noformat}
Looking at the code, it doesn't check whether the source of the IOE is an EOF.
I don't see this test failing on trunk; was it handled in the scope of another JIRA?
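
For illustration, here is a minimal sketch of the kind of check that seems to be missing: walk the cause chain of the IOException and see whether an EOFException sits underneath it, as it does in the trace above. The class and method names (EofCauseCheck, isCausedByEof) are hypothetical and not part of HBase, and what the split code should do once it knows the cause is an EOF is a separate decision.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper for illustration only; not HBase code.
final class EofCauseCheck {
  private EofCauseCheck() {}

  /** Returns true if t, or any exception in its cause chain, is an EOFException. */
  static boolean isCausedByEof(Throwable t) {
    // Bound the walk so a cyclic cause chain cannot loop forever.
    for (int depth = 0; t != null && depth < 32; depth++) {
      if (t instanceof EOFException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  public static void main(String[] args) {
    // Mimic the shape of the exception in the trace: an IOException that
    // carries file info and wraps the original EOFException.
    IOException wrapped =
        new IOException("hlog.dat.0, pos=1012, edit=9", new EOFException());
    System.out.println(isCausedByEof(wrapped));
    System.out.println(isCausedByEof(new IOException("unrelated failure")));
  }
}
```

Checking the cause chain rather than the top-level type matters here because the reader wraps the EOFException in a plain IOException to add the file name and position.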
> Figure how to deal with eof splitting logs
> ------------------------------------------
>
> Key: HBASE-2643
> URL: https://issues.apache.org/jira/browse/HBASE-2643
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.89.20100621
> Reporter: stack
> Assignee: Nicolas Spiegelberg
> Priority: Blocker
> Fix For: 0.89.20100924, 0.90.0
>
> Attachments: ch03s02.html, HBASE-2643.patch
>
>
> When splitting the WAL and encountering EOF, it's not clear what to do.
> Initial discussion of this started in http://review.hbase.org/r/74/ -
> summarizing here for brevity:
> We can get an EOFException while splitting the WAL in the following cases:
> - The writer died after creating the file but before even writing the header
>   (or crashed halfway through writing the header)
> - The writer died in the middle of flushing some data - sync() guarantees
>   that we can see _at least_ the last edit, but we may see half of an edit that
>   was being written out when the RS crashed (especially for large rows)
> - The data was actually corrupted somehow (e.g. a length field got changed to
>   be too long and thus points past EOF)
> Ideally we would know when we see EOF whether it was really the last record,
> and in that case, simply drop that record (it wasn't synced, so we
> don't need to split it). Some open questions:
> - Currently we ignore empty files. Is it OK to ignore an empty log file if
>   it's not the last one?
> - Similarly, do we ignore an EOF mid-record if it's not the last log file?
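
To make the "drop the unsynced last record" idea concrete, here is a rough sketch under stated assumptions. It uses a made-up record layout (a 4-byte length followed by the payload), which is NOT the SequenceFile format, and the names (TruncatedLogSketch, readCompleteRecords) are hypothetical. It keeps every complete record and silently drops a partial one at the end of the input. Note that it treats a length field pointing past EOF exactly like a truncated write, so it cannot tell the corruption case from the crash case, and it does not answer the open questions about logs that are not the last one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the record layout is an assumption, not SequenceFile.
final class TruncatedLogSketch {
  private TruncatedLogSketch() {}

  /** Reads [int length][payload] records and drops a partial record at EOF. */
  static List<byte[]> readCompleteRecords(byte[] log) throws IOException {
    List<byte[]> records = new ArrayList<>();
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
    while (true) {
      try {
        int length = in.readInt(); // throws EOFException if fewer than 4 bytes remain
        if (length < 0) {
          throw new IOException("negative record length: " + length);
        }
        // available() is exact for an in-memory stream; a real reader would
        // need another way to know how many bytes are left.
        if (length > in.available()) {
          throw new EOFException("record runs past end of input");
        }
        byte[] payload = new byte[length];
        in.readFully(payload);
        records.add(payload);
      } catch (EOFException eof) {
        // Either a clean end of input or a partial trailing record: stop here.
        return records;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    out.writeInt(1);
    out.write(new byte[] {10});
    out.writeInt(3);
    out.write(new byte[] {20, 21, 22});
    byte[] full = bos.toByteArray();
    // Simulate a writer that died two bytes before finishing the second record.
    byte[] truncated = Arrays.copyOf(full, full.length - 2);
    System.out.println(readCompleteRecords(full).size());      // prints 2
    System.out.println(readCompleteRecords(truncated).size()); // prints 1
  }
}
```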