kirktrue commented on code in PR #17022:
URL: https://github.com/apache/kafka/pull/17022#discussion_r2014904591


##########
clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java:
##########
@@ -667,14 +667,23 @@ public synchronized void maybeTransitionToErrorState(RuntimeException exception)
     }
 
     synchronized void handleFailedBatch(ProducerBatch batch, RuntimeException exception, boolean adjustSequenceNumbers) {
-        maybeTransitionToErrorState(exception);
+        boolean isStaleBatch = batch.producerId() == producerIdAndEpoch.producerId && batch.producerEpoch() < producerIdAndEpoch.epoch;

Review Comment:
   Thanks for the feedback @jolshan!
   
   > I'm wondering if there are any cases where producerIdAndEpoch could have a race -- or is case there the ID and epoch are the same but the issue still happens
   
   There are a couple of bug reports with logs. I'll dig through those to see if it's happened in the wild.
   
   > btw -- maybe not super common, but could the overflow case be missed here? (new producer id and epoch resets due to epoch reaching max value)
   
   Sounds super rare ;)
   
   If an epoch overflowed, wouldn't that just be interpreted as 'not equal' to the last known epoch, and thus trigger the "stale batch" logic? Perhaps my understanding of staleness is too naive?
   
   Thanks!
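
For illustration, the overflow question can be probed by evaluating the predicate from the diff in isolation. The sketch below is standalone and is not Kafka code: `StaleBatchSketch`, `isStaleBatch`, and its flat parameters are hypothetical stand-ins for `batch.producerId()`, `batch.producerEpoch()`, and `producerIdAndEpoch`, and the ID and epoch values are made up. It assumes, as the quoted question describes, that an epoch overflow gives the producer a new producer ID with the epoch reset.

```java
public class StaleBatchSketch {

    // Mirrors the predicate in the diff: same producer ID, strictly older epoch.
    // The flat parameters are hypothetical stand-ins for the real Kafka accessors.
    public static boolean isStaleBatch(long batchProducerId, short batchEpoch,
                                       long currentProducerId, short currentEpoch) {
        return batchProducerId == currentProducerId && batchEpoch < currentEpoch;
    }

    public static void main(String[] args) {
        // Ordinary epoch bump: batch from epoch 4, producer now at epoch 5.
        System.out.println(isStaleBatch(1000L, (short) 4, 1000L, (short) 5)); // true

        // Same ID and same epoch: the predicate does not flag the batch.
        System.out.println(isStaleBatch(1000L, (short) 5, 1000L, (short) 5)); // false

        // Assumed overflow scenario: the batch was created at the maximum epoch,
        // then the producer received a new ID (1001, made up) with epoch 0.
        System.out.println(isStaleBatch(1000L, Short.MAX_VALUE, 1001L, (short) 0)); // false
    }
}
```

Under these assumptions the overflow case fails the producer ID comparison before the epochs are compared, so the predicate as written reports the pre-overflow batch as not stale. Whether that is a problem depends on how the rest of `handleFailedBatch` treats a batch that is not flagged.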