[ 
https://issues.apache.org/jira/browse/HBASE-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719022#comment-13719022
 ] 

stack commented on HBASE-9022:
------------------------------

So here is a local run where testIOEOnOutputThread completes a couple of 
seconds:

{code}
2013-07-24 15:32:10,936 INFO  [main] wal.HLogSplitter(334): Finishing writing 
output logs and closing down.
2013-07-24 15:32:10,936 INFO  [main] wal.HLogSplitter$OutputSink(925): Waiting 
for split writer threads to finish
2013-07-24 15:32:10,962 INFO  [IPC Server handler 8 on 65470] 
namenode.FSNamesystem(169): ugi=stack     ip=/127.0.0.1   cmd=mkdirs      
src=/hbase/t1/bbb/recovered.edits       dst=null        
perm=stack:supergroup:rwxr-xr-x
2013-07-24 15:32:10,963 INFO  [IPC Server handler 6 on 65470] 
namenode.FSNamesystem(169): ugi=stack     ip=/127.0.0.1   cmd=mkdirs      
src=/hbase/t1/ccc/recovered.edits       dst=null        
perm=stack:supergroup:rwxr-xr-x
2013-07-24 15:32:11,078 DEBUG [WriterThread-1] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer 
path=/hbase/t1/bbb/recovered.edits/0000000000000000001.temp region=bbb
2013-07-24 15:32:11,078 DEBUG [WriterThread-2] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer 
path=/hbase/t1/ccc/recovered.edits/0000000000000000002.temp region=ccc
2013-07-24 15:32:11,082 FATAL [WriterThread-2] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234):  Got while writing log 
entry to log
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,082 FATAL [WriterThread-1] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234):  Got while writing log 
entry to log
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,082 ERROR [WriterThread-1] 
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,082 ERROR [WriterThread-2] 
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 15:32:11,087 INFO  [split-log-closeStream-2] 
wal.HLogSplitter$LogRecoveredEditsOutputSink$2(1042): Closed path 
/hbase/t1/ccc/recovered.edits/0000000000000000002.temp (wrote 0 edits in 0ms)
2013-07-24 15:32:11,087 INFO  [split-log-closeStream-1] 
wal.HLogSplitter$LogRecoveredEditsOutputSink$2(1042): Closed path 
/hbase/t1/bbb/recovered.edits/0000000000000000001.temp (wrote 0 edits in 0ms)

{code}

It is not easy to read but notice how we do the 'Creating writer' lines one 
after the other.

Here is an example fail.

{code}
2013-07-24 20:52:06,520 INFO  [Thread-297] wal.HLogSplitter(334): Finishing 
writing output logs and closing down.
2013-07-24 20:52:06,520 INFO  [Thread-297] wal.HLogSplitter$OutputSink(925): 
Waiting for split writer threads to finish
2013-07-24 20:52:06,529 DEBUG [WriterThread-2] 
wal.HLogSplitter$WriterThread(799): Writer thread 
Thread[WriterThread-2,5,main]: starting
2013-07-24 20:52:06,542 INFO  [IPC Server handler 9 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo src=/hbase/t1/bbb       
dst=null        perm=null
2013-07-24 20:52:06,543 INFO  [IPC Server handler 1 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo src=/hbase/t1/ccc       
dst=null        perm=null
2013-07-24 20:52:06,544 INFO  [IPC Server handler 3 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo 
src=/hbase/t1/bbb/recovered.edits       dst=null        perm=null
2013-07-24 20:52:06,546 INFO  [IPC Server handler 4 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo 
src=/hbase/t1/bbb/recovered.edits       dst=null        perm=null
2013-07-24 20:52:06,546 INFO  [IPC Server handler 2 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo 
src=/hbase/t1/ccc/recovered.edits       dst=null        perm=null
2013-07-24 20:52:06,548 INFO  [IPC Server handler 5 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=mkdirs      
src=/hbase/t1/bbb/recovered.edits       dst=null        
perm=ec2-user:supergroup:rwxr-xr-x
2013-07-24 20:52:06,549 INFO  [IPC Server handler 8 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo 
src=/hbase/t1/ccc/recovered.edits       dst=null        perm=null
2013-07-24 20:52:06,550 INFO  [IPC Server handler 6 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo 
src=/hbase/t1/bbb/recovered.edits/0000000000000000001.temp      dst=null        
perm=null
2013-07-24 20:52:06,552 INFO  [IPC Server handler 0 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=mkdirs      
src=/hbase/t1/ccc/recovered.edits       dst=null        
perm=ec2-user:supergroup:rwxr-xr-x
2013-07-24 20:52:06,552 DEBUG [WriterThread-1] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer 
path=/hbase/t1/bbb/recovered.edits/0000000000000000001.temp region=bbb
2013-07-24 20:52:06,553 FATAL [WriterThread-1] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234):  Got while writing log 
entry to log
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,553 INFO  [IPC Server handler 7 on 50391] 
namenode.FSNamesystem$DefaultAuditLogger(5567): allowed=true      ugi=ec2-user 
(auth:SIMPLE)      ip=/127.0.0.1   cmd=getfileinfo 
src=/hbase/t1/ccc/recovered.edits/0000000000000000002.temp      dst=null        
perm=null
2013-07-24 20:52:06,553 ERROR [WriterThread-1] 
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,554 DEBUG [WriterThread-2] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1200): Creating writer 
path=/hbase/t1/ccc/recovered.edits/0000000000000000002.temp region=ccc
2013-07-24 20:52:06,555 FATAL [WriterThread-2] 
wal.HLogSplitter$LogRecoveredEditsOutputSink(1234):  Got while writing log 
entry to log
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,555 ERROR [WriterThread-2] 
wal.HLogSplitter$WriterThread(793): Error in log splitting write thread
java.io.IOException: Injected
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1225)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:829)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:821)
        at 
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:791)
2013-07-24 20:52:06,557 INFO  [split-log-closeStream-1] 
wal.HLogSplitter$LogRecoveredEditsOutputSink$2(1042): Closed path 
/hbase/t1/bbb/recovered.edits/0000000000000000001.temp (wrote 0 edits in 0ms)

{code}

There is a bit of extra ipc logging going on but you can see that the 'Creating 
writers' is staggered and that we fail writing the first writer before we open 
the second.  Failed write is FATAL so we are shutting down and on our way out.

We are leaving the just-opened writer around on our exit.

Let me try and get a thread dump in here.  It is tough figuring out what we are 
blocked on.
                
> TestHLogSplit.testIOEOnOutputThread fails
> -----------------------------------------
>
>                 Key: HBASE-9022
>                 URL: https://issues.apache.org/jira/browse/HBASE-9022
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>         Attachments: 9022.txt, 9022.txt, 9022.txt
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/6428//testReport/org.apache.hadoop.hbase.regionserver.wal/TestHLogSplit/testIOEOnOutputThread/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to