J-HowHuang opened a new pull request, #17494: URL: https://github.com/apache/pinot/pull/17494
## Description

The current implementation of `PinotHelixTaskResourceManager` has many synchronized methods, which creates potential global lock contention on the manager instance across all minion-task-related API resources (endpoints). See the comments:
https://github.com/apache/pinot/blob/c23a1e971dceafae424ccd1d6c5b829ecf3c8e78/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotHelixTaskResourceManager.java#L77-L81

An API resource that calls any of these synchronized methods can become long-running if any other synchronized method is blocked. For example:
https://github.com/apache/pinot/blob/c23a1e971dceafae424ccd1d6c5b829ecf3c8e78/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/PinotHelixTaskResourceManager.java#L284-L296

This method calls `_taskDriver.enqueueJob`, which writes to ZK with retries; the retry mechanism is implemented by Helix. Issues have been spotted with retries lasting 24 hours and with indefinite retries. While such a retry is in progress, every call to one of `PinotHelixTaskResourceManager`'s synchronized methods is blocked. Requests to these API endpoints can therefore exhaust the thread pool that the controller uses to handle API requests. In this case, this thread pool:
https://github.com/apache/pinot/blob/8333688e4ecec030b334f4d3239b6737ef17fdea/pinot-core/src/main/java/org/apache/pinot/core/util/ListenerConfigUtil.java#L239-L241

The result is a totally unresponsive controller, which cannot even serve requests from the Pinot UI. We therefore need to isolate these long-running API resources from other crucial API resources.

## Change

* Create a separate thread pool dedicated to potentially long-running minion-task-related API resources.
* The size of this thread pool is the same as that of our current HTTP handler thread pool (`grizzly-http-server-%d`).
* Run the handlers of these resources asynchronously, using Jersey's `@Suspended AsyncResponse`.

-- 
This is an automated message from the Apache Git Service.
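The isolation described under "Change" can be illustrated with a small, self-contained sketch. It is not the code from this pull request: `IsolatedPoolSketch`, `BlockingManager`, and `uiResponds` are hypothetical names, the pool size of 2 is arbitrary, and a plain `CompletableFuture` stands in for resuming a Jersey `AsyncResponse` so that the example runs with only the JDK. The manager simulates a synchronized method stuck in a retry loop, and the sketch checks whether a lightweight "UI" request still gets a handler thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class IsolatedPoolSketch {

  /** Hypothetical stand-in for a manager whose methods all share one lock. */
  static class BlockingManager {
    private final CountDownLatch _release = new CountDownLatch(1);

    /** Simulates an enqueue whose write is stuck retrying until release() is called. */
    synchronized String enqueueJob() {
      try {
        _release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return "enqueued";
    }

    void release() {
      _release.countDown();
    }
  }

  /** Returns true if a cheap request is served while task requests are blocked. */
  static boolean uiResponds(boolean isolate) throws Exception {
    int poolSize = 2; // arbitrary; both pools get the same size, as in the PR description
    ExecutorService httpPool = Executors.newFixedThreadPool(poolSize);
    ExecutorService taskPool = Executors.newFixedThreadPool(poolSize);
    BlockingManager manager = new BlockingManager();
    try {
      // Send more blocking task requests than the HTTP pool has threads.
      for (int i = 0; i < poolSize * 2; i++) {
        if (isolate) {
          // Handler hands the work to the dedicated pool and returns immediately.
          httpPool.execute(() -> CompletableFuture.supplyAsync(manager::enqueueJob, taskPool));
        } else {
          // Handler runs the blocking call on the HTTP thread itself.
          httpPool.execute(manager::enqueueJob);
        }
      }
      // A cheap, unrelated request, such as one issued by the UI.
      Future<String> ui = httpPool.submit(() -> "ui-ok");
      try {
        return "ui-ok".equals(ui.get(1, TimeUnit.SECONDS));
      } catch (TimeoutException e) {
        return false;
      }
    } finally {
      // Unblock the simulated retry so both pools can drain and shut down.
      manager.release();
      httpPool.shutdown();
      httpPool.awaitTermination(5, TimeUnit.SECONDS);
      taskPool.shutdown();
      taskPool.awaitTermination(5, TimeUnit.SECONDS);
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("isolated: UI responds = " + uiResponds(true));
    System.out.println("shared:   UI responds = " + uiResponds(false));
  }
}
```

With the dedicated pool, the HTTP threads are free again as soon as they hand off the work, so the UI request should be served; with a shared pool, the blocked calls occupy every HTTP thread and the UI request should time out. Note that the dedicated pool itself can still fill up with blocked calls; the isolation only keeps unrelated endpoints responsive.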
