jeffkbkim opened a new pull request, #13267: URL: https://github.com/apache/kafka/pull/13267
`RPCProducerIdManager` initiates an async request to the controller to grab a block of producer IDs and then blocks waiting for the controller's response. This happens on the request handler threads while holding a global lock. As a result, if many producers are requesting producer IDs and the controller is slow to respond, many request handler threads can get stuck waiting for the lock.

This patch aims to:
* resolve the deadlock scenario mentioned above by not waiting for a new block and returning an error immediately
* remove synchronization usages in `RPCProducerIdManager.generateProducerId()`
* handle errors returned from `generateProducerId()`
* confirm that producers back off before retrying

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
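
To illustrate the non-blocking approach described above, here is a minimal Java sketch. It is not the code in this patch: the class names (`ProducerIdSketch`, `Block`, `Manager`), the `sendAsyncRequest` callback, and the use of an empty `OptionalLong` to stand in for "return an error immediately" are all assumptions made for illustration. The idea is that a handler thread either claims an ID from the current block with an atomic increment or, if the block is exhausted, triggers at most one async fetch and returns right away without holding a lock.

```java
import java.util.OptionalLong;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; names and structure are assumptions, not Kafka's actual classes.
final class ProducerIdSketch {

    /** A contiguous block of IDs covering [first, first + size). */
    static final class Block {
        private final long end;        // exclusive upper bound of the block
        private final AtomicLong next; // next ID to hand out

        Block(long first, int size) {
            this.end = first + size;
            this.next = new AtomicLong(first);
        }

        /** Claims the next ID without locking, or returns empty once the block is exhausted. */
        OptionalLong claim() {
            long id = next.getAndIncrement();
            return id < end ? OptionalLong.of(id) : OptionalLong.empty();
        }
    }

    static final class Manager {
        // Starts with an empty block, so the first call triggers a fetch.
        private final AtomicReference<Block> current = new AtomicReference<>(new Block(0L, 0));
        private final AtomicBoolean requestInFlight = new AtomicBoolean(false);
        // Assumed callback that sends the async block request to the controller.
        private final Runnable sendAsyncRequest;

        Manager(Runnable sendAsyncRequest) {
            this.sendAsyncRequest = sendAsyncRequest;
        }

        /**
         * Returns an ID if the current block has one. Otherwise triggers at most one
         * async fetch and returns empty immediately; the caller would map this to a
         * retriable error so that the producer backs off and retries.
         */
        OptionalLong generateProducerId() {
            OptionalLong id = current.get().claim();
            if (id.isPresent()) {
                return id;
            }
            if (requestInFlight.compareAndSet(false, true)) {
                sendAsyncRequest.run();
            }
            return OptionalLong.empty();
        }

        /** Invoked from the (assumed) response handler when the controller returns a new block. */
        void onBlockReceived(long first, int size) {
            current.set(new Block(first, size));
            requestInFlight.set(false);
        }
    }
}
```

In this sketch a slow controller costs callers only a fast failure and a retry, rather than a parked request handler thread. A real implementation would also need to handle failed or timed-out block requests (for example, by clearing the in-flight flag so a later call can retry), which is omitted here.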