kirktrue commented on code in PR #13591:
URL: https://github.com/apache/kafka/pull/13591#discussion_r1173167778


##########
clients/src/main/java/org/apache/kafka/clients/producer/internals/TransactionManager.java:
##########
@@ -968,13 +968,31 @@ private void transitionTo(State target) {
     }
 
     private void transitionTo(State target, RuntimeException error) {
+        transitionTo(target, error, false);

Review Comment:
   > I think changing the default behavior to not throw can cause issues in 
some calls:
   > 
   > 1. TransactionManager.InitProducerIdHandler#handleResponse on line 1303 - 
lastError is explicitly set to null (which shouldn't be done at all, as 
transitionTo already does that if the state transition is valid), which will 
clear the latest error. I think to make this work, that lastError = null should 
be removed from line 1303
   
   I've removed that line. I am not sure why it's there, but `git blame` says 
it's been there since 2017 😬 
   
   > 2. This is a call chain where we transition on direct user action, 
shouldn't this be throwing? KafkaProducer.send -> KafkaProducer.doSend -> 
maybeTransitionToErrorState -> transitionToAbortableError -> transitionTo
   
   Since `maybeTransitionToErrorState` is being called from inside a `catch` 
block, if we did throw an exception, it would mask the root issue 
(`ApiException`), right?
   
   > 3. In TransactionManager.TxnOffsetCommitHandler#handleResponse, there are 
multiple
   > 
   > ```
   > abortableError(...);
   > break;
   > ```
   > 
   > blocks. If abortableError does not throw on invalid state transition 
anymore, the txn commit will be retried, even when in a failed state, which 
doesn't seem correct.
   
   After the `abortableError` call, if the state transition was invalid, then 
`currentState` would be `FATAL_ERROR`. That has the same effect as the last two 
branches of the `if` statement that call the `fatalError` method, right?
   
   Would simply skipping the reenqueue on fatal errors be sufficient?
   
   ```
   if (result.isCompleted()) {
       pendingTxnOffsetCommits.clear();
   } else if (pendingTxnOffsetCommits.isEmpty()) {
       result.done();
   } else {
       if (!hasFatalError()) {
           // Retry the commits which failed with a retriable error
           reenqueue();
       }
   }
   ```
   
   I don't yet understand why we'd want to re-enqueue after those calls to 
`fatalError` anyway 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to