[ https://issues.apache.org/jira/browse/HBASE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963144#comment-13963144 ]
Jimmy Xiang commented on HBASE-10922:
-------------------------------------
[~jeffreyz], yes, I should have uploaded the whole log file. Thanks, Stack,
for putting it up here. Would you like to take this issue, since you'd like
to enhance the status/markComplete too?
[~stack], I meant this IOE:
{code}
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region d900a9b5ab41c96cf9ad0fb5f039fa8b is not online on e1418.halxg.cloudera.com,36020,1396841264248
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2454)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2436)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.replay(RSRpcServices.java:1295)
{code}
My understanding is that e1120 was splitting the log. However, it took some
time, and the task was taken over by another worker, so the HLogSplitter
wanted to error out. While closing the output sink, it got this IOE, so it
didn't get a chance to close the status. The IOE was logged in the caller,
SplitLogWorker.
> Log splitting status should always be closed
> --------------------------------------------
>
> Key: HBASE-10922
> URL: https://issues.apache.org/jira/browse/HBASE-10922
> Project: HBase
> Issue Type: Bug
> Components: wal
> Reporter: Jimmy Xiang
> Priority: Minor
> Attachments: log-splitting_hang.png, master-log-grep.txt,
> master.log.gz
>
>
> With distributed log replay enabled by default, I ran into an issue where
> log splitting had not completed after 13 hours. It seems to hang somewhere.