[ https://issues.apache.org/jira/browse/HBASE-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462897#comment-13462897 ]
nkeywal commented on HBASE-6736:
--------------------------------

There are multiple synchronization issues. One of them is

{code}
  @Override
  protected void chore() {
    // [...]
    for (Map.Entry<String, Task> e : tasks.entrySet()) {
{code}

Because we are iterating over a set that can be modified during the iteration, we can have reliability issues; cf. the javadoc: "If the map is modified while an iteration over the set is in progress (except through the iterator's own remove operation, or through the setValue operation on a map entry returned by the iterator) the results of the iteration are undefined."

> Distributed Split: a split tasks can be mark as DONE but keep unassigned
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6736
>                 URL: https://issues.apache.org/jira/browse/HBASE-6736
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>
> Real cluster, scenario mentioned on HBASE-5843.
> Got it once out of 5 tests on 0.96.
> Didn't get it on 0.94 after 3 tests.
> It seems we have a race condition on split logs: the task was nearly simultaneously marked as done and resubmitted. Then it remained in the unassigned state.
> 2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 0
> 2012-09-04 17:27:06,237 INFO org.apache.hadoop.hbase.master.SplitLogManager: resubmitted 1 out of 1 tasks
> 2012-09-04 17:27:06,237 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2FBOX1%3A9000%2Fhbase%2F.logs%2FBOX0%2C60020%2C1346772046399-splitting%2FBOX0%252C60020%252C1346772046399.1346772046609 ver = 7
> 2012-09-04 17:27:06,314 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN0000000002 entered state: DONE BOX1,60000,1346771990737
> 2012-09-04 17:27:06,337 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/RESCAN0000000002
> 2012-09-04 17:27:06,337 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in memory state /hbase/splitlog/RESCAN0000000002
> 2012-09-04 17:27:07,226 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1 unassigned = 1
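
To illustrate the iteration hazard raised in the comment on chore(): one common way to avoid iterating over a map that is being modified is to copy it under a lock and iterate over the copy. The sketch below is only an illustration, not the SplitLogManager code. The class name, the method names, the String task states and the use of a plain HashMap are all assumptions made for the example; this message does not show how `tasks` is actually declared.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a task registry; not the HBase implementation.
public final class SnapshotIterationSketch {
    // Assumption: a plain HashMap guarded by the object's monitor.
    private final Map<String, String> tasks = new HashMap<>();

    public synchronized void setState(String path, String state) {
        tasks.put(path, state);
    }

    public synchronized void delete(String path) {
        tasks.remove(path);
    }

    public synchronized int size() {
        return tasks.size();
    }

    // Copy the live map while holding the lock, so the copy is consistent.
    private synchronized Map<String, String> snapshot() {
        return new HashMap<>(tasks);
    }

    // Periodic pass: drops DONE tasks and counts UNASSIGNED ones.
    // The loop walks the snapshot, so deleting from the live map
    // (here or from another thread) cannot disturb the iteration.
    public int chore() {
        int unassigned = 0;
        for (Map.Entry<String, String> e : snapshot().entrySet()) {
            if ("DONE".equals(e.getValue())) {
                delete(e.getKey());
            } else if ("UNASSIGNED".equals(e.getValue())) {
                unassigned++;
            }
        }
        return unassigned;
    }
}
```

The trade-off is that the pass works on a possibly stale view: a task that changes state after the snapshot is taken is only seen on the next run, so any decision based on the snapshot would still need to be re-checked against the live state before acting on it.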