Enis Soztutar created HBASE-7172:
------------------------------------
Summary: TestSplitLogManager.testVanishingTaskZNode() fails when
run individually and is flaky
Key: HBASE-7172
URL: https://issues.apache.org/jira/browse/HBASE-7172
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0, 0.94.4
Reporter: Enis Soztutar
Assignee: Enis Soztutar
TestSplitLogManager.testVanishingTaskZNode fails when run individually (run
just that test case from Eclipse). I've also noticed that it is flaky on
Windows.
The cause is a race condition that, for some reason, rarely shows up when the
whole class is run.
The sequence of events is something like this:
- we create 1 log file to split
- we call splitLogDistributed() in its own thread.
- splitLogDistributed() waits in waitForSplittingCompletion(); since there
are no SplitLogWorkers, it keeps waiting.
- we delete the task znode from zk
- SplitLogManager receives the zk callback from GetDataAsyncCallback, which
will call setDone() and mark the task as success.
- However, meanwhile the waitForSplittingCompletion() loop sees that
remainingInZK == 0 and returns, concurrently with the above.
- on return from waitForSplittingCompletion(), splitLogDistributed() fails
because the znode delete callback has not completed yet.
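
The interleaving above can be modeled with a small standalone sketch. This is
not the HBase code: the class, the method name, and the plain counters standing
in for the batch bookkeeping and for remainingInZK are all hypothetical, and a
latch is used only to force the ordering that the real test hits by chance.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of the race; none of this is HBase's actual code.
final class VanishingTaskRaceSketch {

    // Returns whether the simulated split counts as successful when the
    // waiter returns before the callback has marked the task done.
    static boolean splitSucceedsUnderForcedRace() {
        AtomicInteger installed = new AtomicInteger(1);     // one log file to split
        AtomicInteger done = new AtomicInteger(0);          // tasks marked done
        AtomicInteger remainingInZK = new AtomicInteger(1); // task znodes still in zk
        CountDownLatch waiterReturned = new CountDownLatch(1);

        // Stand-in for the zk event thread: the znode is already gone, but the
        // setDone()-style update only lands after the waiter has returned.
        Thread zkCallback = new Thread(() -> {
            remainingInZK.set(0);
            try {
                waiterReturned.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            done.incrementAndGet();
        });
        zkCallback.start();

        // Stand-in for the waitForSplittingCompletion() loop: it gives up
        // waiting as soon as no task znodes remain.
        while (done.get() != installed.get()) {
            if (remainingInZK.get() == 0) {
                break;
            }
            Thread.yield();
        }

        // Stand-in for the check made on return: the task is not done yet.
        boolean success = done.get() == installed.get();

        // Let the late callback finish so the thread is cleaned up.
        waiterReturned.countDown();
        try {
            zkCallback.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return success;
    }
}
```

Calling VanishingTaskRaceSketch.splitSucceedsUnderForcedRace() returns false,
which corresponds to the failure seen in the test: the waiter leaves because
nothing remains in zk, while the done count is only updated afterwards.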
This race only happens when the last task is deleted from zk, and normally only
the SplitLogManager deletes the task znodes, after processing them, so I don't
think this is a production issue.