Re: [PR] [WIP] KAFKA-14830: Illegal state error in transactional producer [kafka]

via GitHub Wed, 26 Mar 2025 12:56:32 -0700


kirktrue commented on code in PR #17022:
URL: https://github.com/apache/kafka/pull/17022#discussion_r2014904591



##########
clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java:
##########
@@ -667,14 +667,23 @@ public synchronized void maybeTransitionToErrorState(RuntimeException exception)
     }
 
     synchronized void handleFailedBatch(ProducerBatch batch, RuntimeException exception, boolean adjustSequenceNumbers) {
-        maybeTransitionToErrorState(exception);
+        boolean isStaleBatch = batch.producerId() == producerIdAndEpoch.producerId && batch.producerEpoch() < producerIdAndEpoch.epoch;

Review Comment:
   Thanks for the feedback @jolshan!
   
   > I'm wondering if there are any cases where producerIdAndEpoch could have a race -- or is there a case where the ID and epoch are the same but the issue still happens
   
   There are a couple of bug reports with logs. I'll dig through those to see if it's happened in the wild.
   
   > btw -- maybe not super common, but could the overflow case be missed here? (new producer id and epoch resets due to epoch reaching max value)
   
   Sounds super rare ;)
   
   If an epoch overflowed, wouldn't that just be interpreted as 'not equal' to the last known epoch, and thus trigger the "stale batch" logic? Perhaps my understanding of staleness is too naive?
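
To make the overflow question concrete, here is a minimal standalone sketch of the predicate from the diff above. It is not the `TransactionManager` code: the class name `StaleBatchSketch`, the flattened parameters, and the producer ID values 100 and 101 are made up for illustration, and it assumes the reset described in the quoted comment (a new producer ID with the epoch back at 0).

```java
public class StaleBatchSketch {

    // Mirrors the predicate in the diff: same producer ID, strictly older epoch.
    // The four-argument shape is hypothetical; the real code reads these values
    // from the batch and from producerIdAndEpoch.
    static boolean isStaleBatch(long batchProducerId, short batchEpoch,
                                long currentProducerId, short currentEpoch) {
        return batchProducerId == currentProducerId && batchEpoch < currentEpoch;
    }

    public static void main(String[] args) {
        // Ordinary epoch bump: same producer ID, batch carries the older epoch.
        System.out.println(isStaleBatch(100L, (short) 4, 100L, (short) 5)); // true

        // Same producer ID and same epoch: not stale by this predicate.
        System.out.println(isStaleBatch(100L, (short) 5, 100L, (short) 5)); // false

        // Assumed overflow-style reset: new producer ID, epoch back at 0.
        // The producer ID equality check fails, so the result is false.
        System.out.println(isStaleBatch(100L, Short.MAX_VALUE, 101L, (short) 0)); // false
    }
}
```

Under those assumptions, an in-flight batch from before the reset would not be classified as stale by this predicate alone, because the producer IDs differ. Whether that matters in practice depends on how the rest of `handleFailedBatch` treats a batch whose producer ID no longer matches.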
   
   Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
