jiajunwang opened a new issue #1748: URL: https://github.com/apache/helix/issues/1748
### Describe the bug

In a race condition, the RoutingTableProvider shutdown process can deadlock. In detail, consider the following two threads.

Thread A: the application shuts down the provider.

A.1. Shut down all the provider threads.
A.2. Lock _lastSeenSessions in the updateCurrentStatesListeners method.
A.3. Call manager.removeListener, which locks the HelixManager.
A.4. Unlock the HelixManager.
A.5. Unlock _lastSeenSessions.
A.6. Continue the remaining shutdown logic.

Thread B (event processing thread): processes a Current State change event and triggers onLiveInstanceChange() of the RoutingTableProvider.

B.1. Lock the HelixManager to invoke the listener (RoutingTableProvider.onLiveInstanceChange).
B.2. Lock _lastSeenSessions in the updateCurrentStatesListeners method.
B.3. Call manager.removeListener, which locks the HelixManager.
B.4. Unlock the HelixManager.
B.5. Unlock _lastSeenSessions.
B.6. Continue the remaining event processing logic.

Note that at A.3 and B.2, Thread A holds the _lastSeenSessions lock and requests the HelixManager lock, while Thread B holds the HelixManager lock and requests the _lastSeenSessions lock. Each thread waits for a lock that the other one holds, which is a textbook lock-order deadlock.

### To Reproduce

I found this bug while investigating why the testCurrentStatePathLeakingByAsycRemoval() test is unstable. The deadlock only happens in a rare race condition, which is also why the test does not fail consistently.

To reproduce the issue reliably, we need to temporarily suspend the RoutingTableProvider.onLiveInstanceChange method after the HelixManager is locked, and then call RoutingTableProvider.shutdown(). Once the shutdown() thread is blocked, we resume the onLiveInstanceChange event handling thread. The two threads, and therefore the test case, will then be blocked forever.

### Expected behavior

The deadlock should be removed.

### Additional context

This bug directly causes https://github.com/apache/helix/issues/1284. I will close issue #1284 and use this issue to track the fix.
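For reference, the lock-order inversion between Thread A and Thread B described under "Describe the bug" can be modeled in isolation. The sketch below is a hypothetical model, not Helix code: `LockOrderDemo`, `managerLock`, and `sessionsLock` are made-up names, and two `ReentrantLock` objects stand in for the HelixManager and _lastSeenSessions monitors. It uses `tryLock` with a timeout so the inverted case reports the problem instead of hanging. The second method shows consistent lock ordering, which is one common remedy and is not necessarily the fix that will be applied here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the issue's two locks; none of this is Helix code.
class LockOrderDemo {

  // Inverted order: thread A takes sessions then manager, thread B takes
  // manager then sessions. Returns how many threads timed out on their
  // second lock (a plain lock() call would block forever instead).
  static int runInverted() {
    ReentrantLock managerLock = new ReentrantLock();
    ReentrantLock sessionsLock = new ReentrantLock();
    CountDownLatch bothHoldFirst = new CountDownLatch(2);
    CountDownLatch bothTried = new CountDownLatch(2);
    AtomicInteger timeouts = new AtomicInteger();
    Thread a = new Thread(() ->
        holdThenTry(sessionsLock, managerLock, bothHoldFirst, bothTried, timeouts));
    Thread b = new Thread(() ->
        holdThenTry(managerLock, sessionsLock, bothHoldFirst, bothTried, timeouts));
    return runBoth(a, b) ? timeouts.get() : -1;
  }

  // Consistent order: both threads take manager first, then sessions.
  // Returns how many threads finished their critical section.
  static int runOrdered() {
    ReentrantLock managerLock = new ReentrantLock();
    ReentrantLock sessionsLock = new ReentrantLock();
    AtomicInteger completed = new AtomicInteger();
    Runnable body = () -> {
      managerLock.lock();
      try {
        sessionsLock.lock();
        try {
          completed.incrementAndGet();
        } finally {
          sessionsLock.unlock();
        }
      } finally {
        managerLock.unlock();
      }
    };
    return runBoth(new Thread(body), new Thread(body)) ? completed.get() : -1;
  }

  private static void holdThenTry(ReentrantLock first, ReentrantLock second,
      CountDownLatch bothHoldFirst, CountDownLatch bothTried, AtomicInteger timeouts) {
    first.lock();
    try {
      bothHoldFirst.countDown();
      bothHoldFirst.await(); // both threads now hold their first lock
      if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
        second.unlock();
      } else {
        timeouts.incrementAndGet();
      }
      bothTried.countDown();
      bothTried.await(); // keep holding `first` until the peer has tried too
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      first.unlock();
    }
  }

  // Starts both threads and waits up to 5 seconds for each; true if both finished.
  private static boolean runBoth(Thread a, Thread b) {
    a.start();
    b.start();
    try {
      a.join(5000);
      b.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !a.isAlive() && !b.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("inverted order, threads timed out: " + runInverted());
    System.out.println("consistent order, threads completed: " + runOrdered());
  }
}
```

In this model, `runInverted()` returns 2 because each thread holds the lock the other one needs until both attempts have timed out, which mirrors the A.3 / B.2 situation. `runOrdered()` returns 2 because no thread ever holds the second lock while waiting for the first.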
