jolshan opened a new pull request, #13579: URL: https://github.com/apache/kafka/pull/13579
[KAFKA-14561](https://github.com/apache/kafka/commit/56dcb837a2f1c1d8c016cfccf8268a910bb77a36) added verification to transactional produce requests to confirm that a transaction is ongoing.

There is an edge case where the transaction has been added but the coordinator is still writing that state to the log. In this case, verification returns CONCURRENT_TRANSACTIONS and the request is retried. However, the next in-flight batch often succeeds, because the coordinator's write has completed by the time it is verified.

When a partition has no entry in the PSM (ProducerStateManager), it accepts any sequence number. The later batch is therefore appended first, and when we retry the first write to the partition (or the first write in a while), it can never be written: every attempt fails with an OutOfOrderSequence exception. This is a known issue.

Since verification makes this case more common, I propose allowing verification to succeed when the transaction is in the pending ongoing state. This can leave hanging transactions if the coordinator crashes before its writes complete, but that is better than endless out-of-order exceptions and better than not verifying at all. It is the best compromise available.

The good news is that part 2 of KIP-890 will allow us to enforce that the first write for a transaction has sequence 0, at which point this issue will go away entirely.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
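
For illustration only, the sequence-number edge case described in this PR can be reproduced with a toy model. This is a minimal Python sketch, not Kafka code: `ToyProducerState`, `OutOfOrderSequence`, and `simulate` are hypothetical names, and the model assumes only the behavior stated in the description (no entry means any sequence is accepted; otherwise the next sequence must follow the last one).

```python
class OutOfOrderSequence(Exception):
    """Toy stand-in for the broker's out-of-order sequence error."""


class ToyProducerState:
    """Hypothetical, simplified model of per-partition producer state."""

    def __init__(self):
        # producer_id -> last appended sequence number
        self.last_seq = {}

    def append(self, producer_id, seq):
        last = self.last_seq.get(producer_id)
        # With no entry for the producer, any sequence number is accepted.
        if last is not None and seq != last + 1:
            raise OutOfOrderSequence(f"expected {last + 1}, got {seq}")
        self.last_seq[producer_id] = seq


def simulate(retries=3):
    state = ToyProducerState()
    outcomes = []
    # Batch with sequence 0 failed verification (CONCURRENT_TRANSACTIONS),
    # so nothing was appended for it. The next in-flight batch (sequence 1)
    # passes verification and is appended first.
    state.append("p1", 1)
    outcomes.append("seq 1 appended")
    # Every retry of the first batch is now rejected.
    for _ in range(retries):
        try:
            state.append("p1", 0)
            outcomes.append("seq 0 appended")
        except OutOfOrderSequence:
            outcomes.append("seq 0 rejected")
    return outcomes
```

Calling `simulate()` shows the retried first batch being rejected on every attempt once the later batch has been appended, which is the endless out-of-order behavior the proposed change is meant to avoid.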