lucasbru opened a new pull request, #18765: URL: https://github.com/apache/kafka/pull/18765
The exception stack trace shown in the ticket can occur when the producer is being closed because of an error while a regular close runs concurrently. This is not a bug in the test, but a real race condition in the production code. The sequence is:

1. Thread1: Enter PENDING_ERROR
2. Thread2: Check if state is already ERROR
3. Thread1: Transition to ERROR
4. Thread2: Check if state is already PENDING_ERROR
5. Thread2: Transition to PENDING_SHUTDOWN

One idea to fix this would be to synchronize the sequence performed by Thread1 using the state lock (this seems to be suggested in the ticket). However, this would require larger changes, since we cannot use the normal state transition method `setState` while owning the lock, as it calls user-defined callbacks. Calling a user-defined callback while owning the lock may create deadlocks. The code would also become more complex, since we cannot wait for state transitions while owning the lock. It is not as simple as putting a `synchronized(stateLock)` around everything.

To avoid adding more synchronization, this change proposes to fix it by _first_ attempting to transition to PENDING_SHUTDOWN, and _then_, if the state transition fails, checking whether another thread is already attempting to shut down (states PENDING_SHUTDOWN, PENDING_ERROR, ERROR, NOT_RUNNING). This way, we can react correctly to an application that is already shutting down. `setState` acts as a test-and-set operation that prevents multiple threads from entering the critical section which starts the shutdown helper.

### Committer Checklist (excluded from commit message)

- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
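For illustration, the "transition first, check afterwards" idea from the description can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual Kafka Streams code: the class `ShutdownSketch`, the reduced `State` enum, the transition table, the `helperStarts` counter, and a `setState` that returns `false` on an invalid transition are all assumptions made for the example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified model of the shutdown state machine (not the real implementation).
class ShutdownSketch {
    enum State { RUNNING, PENDING_SHUTDOWN, PENDING_ERROR, ERROR, NOT_RUNNING }

    // Assumed transition table for this sketch only.
    private static final Map<State, Set<State>> VALID = new EnumMap<>(State.class);
    static {
        VALID.put(State.RUNNING, EnumSet.of(State.PENDING_SHUTDOWN, State.PENDING_ERROR));
        VALID.put(State.PENDING_SHUTDOWN, EnumSet.of(State.NOT_RUNNING));
        VALID.put(State.PENDING_ERROR, EnumSet.of(State.ERROR));
        VALID.put(State.ERROR, EnumSet.noneOf(State.class));
        VALID.put(State.NOT_RUNNING, EnumSet.noneOf(State.class));
    }

    // States that mean "some thread is shutting down or has already shut down".
    private static final Set<State> SHUTTING_DOWN = EnumSet.of(
        State.PENDING_SHUTDOWN, State.PENDING_ERROR, State.ERROR, State.NOT_RUNNING);

    private final Object stateLock = new Object();
    private State state = State.RUNNING;
    private final AtomicInteger helperStarts = new AtomicInteger();

    // Test-and-set: only one thread can win any given transition.
    boolean setState(State newState) {
        synchronized (stateLock) {
            if (!VALID.get(state).contains(newState)) {
                return false;
            }
            state = newState;
        }
        // User-defined callbacks would be invoked here, after the lock is released.
        return true;
    }

    State state() {
        synchronized (stateLock) {
            return state;
        }
    }

    int helperStarts() {
        return helperStarts.get();
    }

    // Regular close: returns true if this call started the shutdown helper.
    boolean close() {
        if (setState(State.PENDING_SHUTDOWN)) {
            helperStarts.incrementAndGet(); // stands in for starting the shutdown helper
            setState(State.NOT_RUNNING);
            return true;
        }
        // The transition failed, so check whether another shutdown is already under way.
        State current = state();
        if (SHUTTING_DOWN.contains(current)) {
            return false; // skip: nothing left to do for this caller
        }
        throw new IllegalStateException("Cannot shut down from state " + current);
    }

    // Error-triggered close, following the same pattern.
    boolean closeToError() {
        if (setState(State.PENDING_ERROR)) {
            helperStarts.incrementAndGet();
            setState(State.ERROR);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ShutdownSketch app = new ShutdownSketch();
        app.closeToError();
        System.out.println("close started helper: " + app.close() + ", state: " + app.state());
    }
}
```

In this sketch there is no window between "check" and "transition": a caller either wins the transition inside `setState` or observes a state that some other thread has already set, so the stand-in for the shutdown helper is started at most once.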