[
https://issues.apache.org/jira/browse/HBASE-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719022#comment-13719022
]
stack commented on HBASE-9022:
------------------------------
So here is a local run where testIOEOnOutputThread completes a couple of
seconds:
{code}
2013-07-24 15:32:10,936 INFO [main] wal.HLogSplitter(334): Finishing writing
output logs and closing down.
2013-07-24 15:32:10,936 INFO [main] wal.HLogSplitter$OutputSink(925): Waiting
for split writer threads to finish
2013-07-24 15:32:10,962 INFO [IPC Server handler 8 on 65470]
namenode.FSNamesystem(169): ugi=stack ip=/127.0.0.1 cmd=mkdirs
src=/hbase/t1/bbb/recovered.edits dst=null
perm=stack:supergroup:rwxr-xr-x
2013-07-24 15:32:10,963 INFO [IPC Server handler 6 on 65470]
namenode.FSNamesystem(169): ugi=stack ip=/127.0.0.1 cmd=mkdirs
src=/hbase/t1/ccc/recovered.edits dst=null
perm=stack:supergroup:rwxr-xr-x
2013-07-24 15:32:11,078 DEBUG [WriterThread-1]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer
path=/hbase/t1/bbb/recovered.edits/0000000000000000001.temp region=bbb
2013-07-24 15:32:11,078 DEBUG [WriterThread-2]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer
path=/hbase/t1/ccc/recovered.edits/0000000000000000002.temp region=ccc
2013-07-24 15:32:11,082 FATAL [WriterThread-2]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234): Got while writing log
entry to log
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,082 FATAL [WriterThread-1]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234): Got while writing log
entry to log
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,082 ERROR [WriterThread-1]
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,082 ERROR [WriterThread-2]
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,087 INFO [split-log-closeStream-2]
wal.HLogSplitter$LogRecoveredEditsOutputSink$2(1042): Closed path
/hbase/t1/ccc/recovered.edits/0000000000000000002.temp (wrote 0 edits in 0ms)
2013-07-24 15:32:11,087 INFO [split-log-closeStream-1]
wal.HLogSplitter$LogRecoveredEditsOutputSink$2(1042): Closed path
/hbase/t1/bbb/recovered.edits/0000000000000000001.temp (wrote 0 edits in 0ms)
{code}
It is not easy to read but notice how we do the 'Creating writer' lines one
after the other.
Here is an example fail.
{code}
2013-07-24 20:52:06,520 INFO [Thread-297] wal.HLogSplitter(334): Finishing
writing output logs and closing down.
2013-07-24 20:52:06,520 INFO [Thread-297] wal.HLogSplitter$OutputSink(925):
Waiting for split writer threads to finish
2013-07-24 20:52:06,529 DEBUG [WriterThread-2]
wal.HLogSplitter$WriterThread(799): Writer thread
Thread[WriterThread-2,5,main]: starting
2013-07-24 20:52:06,542 INFO [IPC Server handler 9 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/hbase/t1/bbb
dst=null perm=null
2013-07-24 20:52:06,543 INFO [IPC Server handler 1 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo src=/hbase/t1/ccc
dst=null perm=null
2013-07-24 20:52:06,544 INFO [IPC Server handler 3 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo
src=/hbase/t1/bbb/recovered.edits dst=null perm=null
2013-07-24 20:52:06,546 INFO [IPC Server handler 4 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo
src=/hbase/t1/bbb/recovered.edits dst=null perm=null
2013-07-24 20:52:06,546 INFO [IPC Server handler 2 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo
src=/hbase/t1/ccc/recovered.edits dst=null perm=null
2013-07-24 20:52:06,548 INFO [IPC Server handler 5 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=mkdirs
src=/hbase/t1/bbb/recovered.edits dst=null
perm=ec2-user:supergroup:rwxr-xr-x
2013-07-24 20:52:06,549 INFO [IPC Server handler 8 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo
src=/hbase/t1/ccc/recovered.edits dst=null perm=null
2013-07-24 20:52:06,550 INFO [IPC Server handler 6 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo
src=/hbase/t1/bbb/recovered.edits/0000000000000000001.temp dst=null
perm=null
2013-07-24 20:52:06,552 INFO [IPC Server handler 0 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=mkdirs
src=/hbase/t1/ccc/recovered.edits dst=null
perm=ec2-user:supergroup:rwxr-xr-x
2013-07-24 20:52:06,552 DEBUG [WriterThread-1]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer
path=/hbase/t1/bbb/recovered.edits/0000000000000000001.temp region=bbb
2013-07-24 20:52:06,553 FATAL [WriterThread-1]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234): Got while writing log
entry to log
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,553 INFO [IPC Server handler 7 on 50391]
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true ugi=ec2-user
(auth:SIMPLE) ip=/127.0.0.1 cmd=getfileinfo
src=/hbase/t1/ccc/recovered.edits/0000000000000000002.temp dst=null
perm=null
2013-07-24 20:52:06,553 ERROR [WriterThread-1]
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,554 DEBUG [WriterThread-2]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer
path=/hbase/t1/ccc/recovered.edits/0000000000000000002.temp region=ccc
2013-07-24 20:52:06,555 FATAL [WriterThread-2]
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234): Got while writing log
entry to log
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,555 ERROR [WriterThread-2]
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,557 INFO [split-log-closeStream-1]
wal.HLogSplitter$LogRecoveredEditsOutputSink$2(1042): Closed path
/hbase/t1/bbb/recovered.edits/0000000000000000001.temp (wrote 0 edits in 0ms)
{code}
There is a bit of extra ipc logging going on but you can see that the 'Creating
writers' is staggered and that we fail writing the first writer before we open
the second. Failed write is FATAL so we are shutting down and on our way out.
We are leaving the just-opened writer around on our exit.
Let me try and get a thread dump in here. It is tough figuring out what we are
blocked on.
> TestHLogSplit.testIOEOnOutputThread fails
> -----------------------------------------
>
> Key: HBASE-9022
> URL: https://issues.apache.org/jira/browse/HBASE-9022
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Attachments: 9022.txt, 9022.txt, 9022.txt
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/6428//testReport/org.apache.hadoop.hbase.regionserver.wal/TestHLogSplit/testIOEOnOutputThread/
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira