[ 
https://issues.apache.org/jira/browse/HBASE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963144#comment-13963144
 ] 

Jimmy Xiang commented on HBASE-10922:
-------------------------------------

[~jeffreyz], yes, I should have uploaded the whole log file. Thanks to Stack 
for putting it up here. Would you like to take this issue, since you'd like to 
enhance the status/markComplete too?

[~stack], I meant this IOE:

{code}
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region d900a9b5ab41c96cf9ad0fb5f039fa8b is not online on e1418.halxg.cloudera.com,36020,1396841264248
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2454)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2436)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1295)
{code}

My understanding is that e1120 was splitting the log, but it took so long that 
the task was taken over by another worker, so the HLogSplitter tried to error 
out. While closing the output sink, it got this IOE, so it never had a chance 
to close the status. The IOE was logged in the caller, SplitLogWorker.
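
To illustrate the failure mode described above, here is a minimal sketch of 
closing the status in a finally block, so that an IOE thrown while closing 
the output sink cannot skip it. This is not the actual HLogSplitter code: the 
class SplitStatusSketch and the Status, FailingSink, and split names are all 
hypothetical stand-ins made up for this example.

```java
import java.io.Closeable;
import java.io.IOException;

public class SplitStatusSketch {

    /** Hypothetical stand-in for the status tracked while splitting a log. */
    static class Status {
        boolean closed = false;
        String state = "RUNNING";

        void markComplete(String msg) { state = "COMPLETE: " + msg; }
        void abort(String msg) { state = "ABORTED: " + msg; }
        void cleanup() { closed = true; }
    }

    /** Hypothetical output sink whose close() fails, like the IOE above. */
    static class FailingSink implements Closeable {
        @Override
        public void close() throws IOException {
            throw new IOException("Region is not online");
        }
    }

    /**
     * Closes the sink and always closes the status, even when closing the
     * sink throws. The IOException still propagates to the caller.
     */
    static boolean split(Closeable sink, Status status) throws IOException {
        boolean done = false;
        try {
            sink.close();   // may throw, as in the stack trace above
            done = true;
            return true;
        } finally {
            // Runs on both the normal and the exceptional path.
            if (done) {
                status.markComplete("split finished");
            } else {
                status.abort("split failed");
            }
            status.cleanup();
        }
    }

    public static void main(String[] args) {
        Status status = new Status();
        try {
            split(new FailingSink(), status);
        } catch (IOException e) {
            // The caller still sees and logs the exception.
            System.out.println("caller logged: " + e.getMessage());
        }
        System.out.println(status.state + ", closed=" + status.closed);
    }
}
```

In this sketch the caller still receives the IOE and can log it, as 
SplitLogWorker does, but the status ends up aborted and closed rather than 
left open.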

> Log splitting status should always be closed
> --------------------------------------------
>
>                 Key: HBASE-10922
>                 URL: https://issues.apache.org/jira/browse/HBASE-10922
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>            Reporter: Jimmy Xiang
>            Priority: Minor
>         Attachments: log-splitting_hang.png, master-log-grep.txt, 
> master.log.gz
>
>
> With distributed log replay enabled by default, I ran into an issue where 
> log splitting had not completed after 13 hours. It seems to hang somewhere.



