urbandan opened a new pull request, #12392: URL: https://github.com/apache/kafka/pull/12392
…orting when a delivery timeout is encountered

When a transactional batch hits its delivery timeout, it can still be in flight. If the transaction is aborted at that point, the abort marker might be appended to the log before the in-flight batch. This can block the LSO (last stable offset) of a partition indefinitely, or violate the processing guarantees.

To avoid this, on a delivery timeout the transactional producer should skip the abort (EndTxnRequest) and bump the epoch instead. When the expected PRODUCER_FENCED error is received, the producer increases its epoch by 1 and retries InitProducerId. If the second init succeeds, the producer can continue with the increased epoch. Otherwise, the producer was fenced.

To support this new use of the bump, the producer needs to make sure that the producer epoch is not close to exhaustion. The first bump, which aborts the ongoing transaction, always ends with a PRODUCER_FENCED error, so the producer must bump a second time. If the epoch is exhausted, the producer cannot increase it any further, because the coordinator has to generate a new producer id for the transactional id.

To avoid this, whenever the producer bumps the epoch without an error, it checks whether the epoch is close to exhaustion. If it is, the producer bumps the epoch enough times to force a newly generated producer id with the epoch reset. This ensures that when the next delivery timeout occurs, a single additional bump is enough for the producer to continue working.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
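
The bump-and-retry flow in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in this PR: `ToyCoordinator`, `ToyProducer`, `HEADROOM` and all method names are hypothetical, and the coordinator behaviour (bump by exactly one per call, rotate the producer id once the 16-bit epoch is used up) is a simplification of what the description says.

```java
public class EpochBumpSketch {
    // Producer epochs are 16-bit in Kafka; every other name here is hypothetical.
    static final int MAX_EPOCH = Short.MAX_VALUE;

    enum Status { OK, PRODUCER_FENCED }

    /** Reply of the toy coordinator: status plus the (producerId, epoch) it now holds. */
    static final class InitResult {
        final Status status; final long producerId; final int epoch;
        InitResult(Status status, long producerId, int epoch) {
            this.status = status; this.producerId = producerId; this.epoch = epoch;
        }
    }

    /** Toy stand-in for the transaction coordinator (assumed behaviour, not broker code). */
    static final class ToyCoordinator {
        long producerId = 1000L;
        int epoch = 0;
        boolean txnOngoing = false;

        InitResult initProducerId(long clientPid, int clientEpoch) {
            if (clientPid != producerId || clientEpoch != epoch) {
                // Caller holds a stale id or epoch: it is really fenced.
                return new InitResult(Status.PRODUCER_FENCED, producerId, epoch);
            }
            if (txnOngoing) {
                // Assumed: the bump aborts the open transaction and fences the old epoch.
                txnOngoing = false;
                advance();
                return new InitResult(Status.PRODUCER_FENCED, producerId, epoch);
            }
            advance();
            return new InitResult(Status.OK, producerId, epoch);
        }

        private void advance() {
            // An exhausted epoch forces a new producer id with the epoch reset.
            if (epoch >= MAX_EPOCH) { producerId++; epoch = 0; } else { epoch++; }
        }
    }

    /** Toy producer that bumps instead of sending EndTxn on a delivery timeout. */
    static final class ToyProducer {
        static final int HEADROOM = 2; // hypothetical "close to exhaustion" margin
        final ToyCoordinator coordinator;
        long producerId;
        int epoch;

        ToyProducer(ToyCoordinator coordinator) {
            this.coordinator = coordinator;
            this.producerId = coordinator.producerId;
            this.epoch = coordinator.epoch;
        }

        /** Returns true if the producer may continue, false if it was really fenced. */
        boolean onDeliveryTimeout() {
            InitResult first = coordinator.initProducerId(producerId, epoch);
            if (first.status == Status.PRODUCER_FENCED) {
                epoch++; // expected fence: assume the coordinator bumped by exactly one
                InitResult second = coordinator.initProducerId(producerId, epoch);
                if (second.status == Status.PRODUCER_FENCED) {
                    return false; // second init failed: another producer took over
                }
                adopt(second);
            } else {
                adopt(first);
            }
            ensureHeadroom();
            return true;
        }

        /** Keep bumping while the epoch is too close to MAX_EPOCH, until the id rotates. */
        void ensureHeadroom() {
            while (epoch > MAX_EPOCH - HEADROOM) {
                InitResult r = coordinator.initProducerId(producerId, epoch);
                if (r.status != Status.OK) {
                    throw new IllegalStateException("fenced while forcing a new producer id");
                }
                adopt(r);
            }
        }

        private void adopt(InitResult r) { producerId = r.producerId; epoch = r.epoch; }
    }

    public static void main(String[] args) {
        ToyCoordinator coordinator = new ToyCoordinator();
        ToyProducer producer = new ToyProducer(coordinator);
        coordinator.txnOngoing = true; // a transaction is open when the timeout hits
        boolean canContinue = producer.onDeliveryTimeout();
        System.out.println(canContinue + " pid=" + producer.producerId + " epoch=" + producer.epoch);
    }
}
```

In this model a delivery timeout costs two epochs (the fencing bump plus the successful retry), which is why the headroom check runs after every successful bump: if the epoch were already at its maximum when the timeout hit, the producer's local `epoch++` could no longer match the coordinator and the retry would fail.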
