Enis Soztutar created HBASE-7172:
------------------------------------

             Summary: TestSplitLogManager.testVanishingTaskZNode() fails when 
run individually and is flaky
                 Key: HBASE-7172
                 URL: https://issues.apache.org/jira/browse/HBASE-7172
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.96.0, 0.94.4
            Reporter: Enis Soztutar
            Assignee: Enis Soztutar


TestSplitLogManager.testVanishingTaskZNode fails when run individually (run 
just that test case from Eclipse). I've also noticed that it is flaky on 
Windows.

The reason is a rare race condition, which for some reason occurs far less 
often when the whole class is run.

The sequence of events is something like this:
 - we create 1 log file to split
 - we call splitLogDistributed() in its own thread. 
 - splitLogDistributed() blocks in waitForSplittingCompletion(); since there 
are no splitlogworkers, it keeps waiting.
 - we delete the task znode from zk
 - SplitLogManager receives the zk callback from GetDataAsyncCallback, which 
will call setDone() and mark the task as success. 
 - However, in the meantime the waitForSplittingCompletion() loop sees that 
remainingInZK == 0 and returns, concurrently with the above. 
 - on return from waitForSplittingCompletion(), splitLogDistributed() fails 
because the znode delete callback has not completed yet. 
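
The interleaving above can be sketched as a small standalone Java program. 
This is only an illustration, not the SplitLogManager code: the Batch class, 
its fields, and the run() helper are hypothetical stand-ins, and the latches 
exist only to force the ordering that the real race produces by chance.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class VanishingTaskRaceSketch {
    // Hypothetical stand-in for the batch bookkeeping described above.
    static final class Batch {
        final int installed = 1;                                   // one log file to split
        final AtomicInteger done = new AtomicInteger();            // tasks marked as success
        final AtomicInteger remainingInZk = new AtomicInteger(1);  // task znodes still present
    }

    /**
     * Returns true if the waiter saw done == installed at the moment it returned.
     * callbackFinishesFirst selects which of the two interleavings is forced.
     */
    static boolean run(boolean callbackFinishesFirst) {
        Batch batch = new Batch();
        CountDownLatch callbackDone = new CountDownLatch(1);
        CountDownLatch waiterReturned = new CountDownLatch(1);
        boolean[] consistent = new boolean[1];

        // The test deletes the task znode.
        batch.remainingInZk.set(0);

        // Stand-in for the zk callback that marks the task as success.
        Thread callback = new Thread(() -> {
            try {
                if (!callbackFinishesFirst) {
                    waiterReturned.await(); // unlucky ordering: waiter returns first
                }
                batch.done.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                callbackDone.countDown();
            }
        });

        // Stand-in for the waiting loop followed by the caller's check.
        Thread waiter = new Thread(() -> {
            try {
                if (callbackFinishesFirst) {
                    callbackDone.await(); // lucky ordering: callback completes first
                }
                if (batch.remainingInZk.get() == 0) {
                    consistent[0] = batch.done.get() == batch.installed;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                waiterReturned.countDown();
            }
        });

        callback.start();
        waiter.start();
        try {
            callback.join();
            waiter.join(); // join() makes the write to consistent[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consistent[0];
    }

    public static void main(String[] args) {
        System.out.println("callback finishes late:  consistent=" + run(false)); // false
        System.out.println("callback finishes first: consistent=" + run(true));  // true
    }
}
```

In the forced "late" ordering the waiter returns with done == 0 while 
installed == 1, which corresponds to the failure seen in the test.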

This race only happens when the last task is deleted from zk, and normally only 
the SplitLogManager deletes the task znodes, after processing them, so I don't 
think this is a production issue.
