kpm1985 opened a new pull request #1004: FLUO-1000 OracleServer race conditions
URL: https://github.com/apache/fluo/pull/1004

This pull request is motivated by issue #1000 and is a work in progress. I have identified two main issues so far, both in OracleServer.

1) isLeader, a volatile field, has a race condition, so I now set the flag at the beginning of the LeaderSelector callback method takeLeadership.

2) There are two Curator frameworks in OracleServer. One comes from sharedResources and doesn't seem to cause any issues, but the one created in the start method does. Specifically, when takeLeadership is called, the curatorFramework may be in a state other than CuratorFrameworkState.STARTED. One would think blockUntilConnected() would resolve this, but digging into the Curator code shows that it does not check for the STARTED state. To be clear, blockUntilConnected() does not solve the problem. I have found that spinning until the state is CuratorFrameworkState.STARTED makes these exceptions disappear.

I'd welcome some analysis when everyone has a little time. In the meantime I'll continue to post on #1000 and leave this pull request for the code changes.
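To make the spinning workaround from point 2 concrete, here is a minimal sketch of a bounded wait for a target state. It is not the code in this pull request: the StartedStateWait class, its awaitState method, and the State enum are hypothetical names, and the enum is only a stand-in for Curator's CuratorFrameworkState so that the sketch compiles without Curator on the classpath.

```java
/** Hypothetical helper for illustration; not part of Fluo or Curator. */
final class StartedStateWait {

  /** Stand-in for Curator's CuratorFrameworkState (assumption, so no Curator dependency is needed). */
  enum State { LATENT, STARTED, STOPPED }

  private StartedStateWait() {}

  /**
   * Polls the supplied state until it equals the wanted state or the timeout elapses.
   *
   * @return true if the wanted state was observed, false on timeout or interruption
   */
  static <S> boolean awaitState(java.util.function.Supplier<S> current, S wanted,
      long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!wanted.equals(current.get())) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up: the state never became the wanted one
      }
      try {
        Thread.sleep(pollMillis); // back off instead of busy-spinning
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

With Curator available, the same helper could presumably be called as awaitState(curatorFramework::getState, CuratorFrameworkState.STARTED, timeoutMillis, pollMillis) before the leadership work begins. Where exactly that call belongs in OracleServer is an assumption here, and the timeout keeps a framework that never starts from hanging the caller forever.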
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
