[ https://issues.apache.org/jira/browse/HBASE-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-8246:
--------------------------
Hadoop Flags: Reviewed
Status: Patch Available (was: In Progress)
> Backport HBASE-6318 to 0.94 where SplitLogWorker exits due to
> ConcurrentModificationException
> ---------------------------------------------------------------------------------------------
>
> Key: HBASE-8246
> URL: https://issues.apache.org/jira/browse/HBASE-8246
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.94.6
> Reporter: Jeffrey Zhong
> Assignee: Ted Yu
> Fix For: 0.94.7
>
> Attachments: 8246-0.94.txt, 8246-0.94-v2.txt
>
>
> Today we found the following error in our tests. Later I found that we had
> already fixed the issue in trunk. I think we should backport the fix because
> the impact of the issue is high and the fix isn't complicated.
> {code}
> 2013-04-01 21:23:21,864 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker ip-10-143-160-121.ec2.internal,60020,1364849529986 done with task /hbase/splitlog/hdfs%3A%2F%2Fip-10-137-16-140.ec2.internal%3A8020%2Fapps%2Fhbase%2Fdata%2F.logs%2Fip-10-137-20-188.ec2.internal%2C60020%2C1364849530779-splitting%2Fip-10-137-20-188.ec2.internal%252C60020%252C1364849530779.1364865556657 in 67129ms
> 2013-04-01 21:23:21,864 ERROR org.apache.hadoop.hbase.regionserver.SplitLogWorker: unexpected error
> java.util.ConcurrentModificationException
> at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
> at java.util.TreeMap$ValueIterator.next(TreeMap.java:1145)
> at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.closeLogWriters(HLogSplitter.java:1279)
> at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.finishWritingAndClose(HLogSplitter.java:1170)
> at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:475)
> at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:403)
> at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:111)
> at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:264)
> at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:195)
> at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:163)
> at java.lang.Thread.run(Thread.java:662)
> 2013-04-01 21:23:21,865 INFO org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker ip-10-143-160-121.ec2.internal,60020,1364849529986 exiting
> {code}
> The impact of this issue is that SplitLogWorker exits, and with it the
> region server recovery mechanism of HBase. If any RS fails after all
> SplitLogWorkers in the cluster have exited due to this issue, the log
> splitting job will hang and the failed RS won't be recovered.
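
The stack trace above shows a TreeMap value iterator failing inside closeLogWriters. As a rough illustration of that failure mode (not the actual HBASE-6318 patch), the sketch below first reproduces the exception in a single thread by modifying a TreeMap mid-iteration, then shows one common way to avoid it: wrap the map with Collections.synchronizedMap and take a snapshot of the values while holding the map's lock. The class name LogWriterCloseSketch, the String stand-ins for writers, and the method names are hypothetical and do not come from HLogSplitter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; Strings stand in for the real log writer objects.
public class LogWriterCloseSketch {

    // Reproduces the failure deterministically: a structural change to the
    // TreeMap after the iterator was created makes the next call to next() throw.
    static boolean unsafeIterationFails() {
        Map<String, String> writers = new TreeMap<>();
        writers.put("region-a", "writer-a");
        writers.put("region-b", "writer-b");

        Iterator<String> it = writers.values().iterator();
        it.next();
        // In the reported bug this kind of change would come from another thread.
        writers.put("region-c", "writer-c");
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // One possible mitigation (an assumption, not the committed fix): copy the
    // values while holding the synchronized map's lock, then iterate the copy.
    static List<String> closeAll(Map<String, String> writers) {
        List<String> snapshot;
        synchronized (writers) {
            snapshot = new ArrayList<>(writers.values());
        }
        List<String> closed = new ArrayList<>();
        for (String writer : snapshot) {
            // A real implementation would call close() on the writer here.
            closed.add(writer);
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println("unsafe iteration throws: " + unsafeIterationFails());

        Map<String, String> writers =
            Collections.synchronizedMap(new TreeMap<String, String>());
        writers.put("region-a", "writer-a");
        writers.put("region-b", "writer-b");
        System.out.println("closed: " + closeAll(writers));
    }
}
```

Iterating a snapshot keeps the lock hold time short, at the cost of missing writers added after the copy is taken; whether that trade-off is acceptable depends on how the caller guarantees that no new writers appear during shutdown.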