Jason Gustafson created KAFKA-10080:
---------------------------------------
Summary: IllegalStateException after duplicate CompleteCommit append to transaction log
Key: KAFKA-10080
URL: https://issues.apache.org/jira/browse/KAFKA-10080
Project: Kafka
Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson
We noticed this exception in the logs:
{code}
java.lang.IllegalStateException: TransactionalId foo completing transaction state transition while it does not have a pending state
    at kafka.coordinator.transaction.TransactionMetadata.$anonfun$completeTransitionTo$1(TransactionMetadata.scala:357)
    at kafka.coordinator.transaction.TransactionMetadata.completeTransitionTo(TransactionMetadata.scala:353)
    at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$3(TransactionStateManager.scala:595)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at kafka.coordinator.transaction.TransactionMetadata.inLock(TransactionMetadata.scala:188)
    at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$15$adapted(TransactionStateManager.scala:587)
    at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:126)
    at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:70)
    at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:107)
    at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:121)
    at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:378)
    at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:280)
    at kafka.cluster.DelayedOperations.checkAndCompleteAll(Partition.scala:122)
    at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:1023)
    at kafka.cluster.Partition.updateFollowerFetchState(Partition.scala:740)
{code}
After inspection, we found two CompleteCommit entries in the transaction
state log, which explains the failed transition. The logic for writing the
CompleteCommit message does appear to be prone to race conditions.
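To illustrate why a duplicate completion would fail in this way, here is a minimal sketch of a "pending state" guard. It is not Kafka's TransactionMetadata: the class PendingTransitionSketch, its State enum, and its method bodies are hypothetical simplifications, and the sketch assumes that each successful log append triggers one completion callback. Under that assumption, a second CompleteCommit append for a single prepared transition finds no pending state and throws.

```java
import java.util.Optional;

// Hypothetical, simplified model of a pending-state guard.
// This is NOT Kafka's TransactionMetadata; names and states are illustrative.
public class PendingTransitionSketch {
    enum State { PREPARE_COMMIT, COMPLETE_COMMIT }

    private State state = State.PREPARE_COMMIT;
    private Optional<State> pendingState = Optional.empty();

    // Record the intended target state before the log append is issued.
    synchronized void prepareTransitionTo(State target) {
        pendingState = Optional.of(target);
    }

    // Assumed to run once per successful append; consumes the pending state.
    synchronized void completeTransitionTo(State target) {
        State pending = pendingState.orElseThrow(() -> new IllegalStateException(
            "completing transaction state transition while it does not have a pending state"));
        if (pending != target) {
            throw new IllegalStateException("pending state " + pending + " does not match " + target);
        }
        state = target;
        pendingState = Optional.empty();
    }

    synchronized State state() {
        return state;
    }

    // Simulates two completions for one prepared transition.
    // Returns true if the second completion is rejected.
    static boolean secondCompletionFails() {
        PendingTransitionSketch metadata = new PendingTransitionSketch();
        metadata.prepareTransitionTo(State.COMPLETE_COMMIT);
        metadata.completeTransitionTo(State.COMPLETE_COMMIT); // first append callback succeeds
        try {
            metadata.completeTransitionTo(State.COMPLETE_COMMIT); // duplicate callback
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second completion rejected: " + secondCompletionFails());
    }
}
```

The sketch only shows the symptom. It does not model the race that allows two CompleteCommit appends to be issued in the first place.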