[ https://issues.apache.org/jira/browse/KAFKA-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045170#comment-16045170 ]
ASF GitHub Bot commented on KAFKA-5415:
---------------------------------------

GitHub user apurvam opened a pull request:

    https://github.com/apache/kafka/pull/3286

    KAFKA-5415: Remove timestamp check in completeTransitionTo

    This assertion is hard to get right because the system time can roll
    backward on a host due to NTP (as shown in the ticket), and also because
    a transaction can start on one host and complete on another. Getting
    precise clock times across hosts is virtually impossible, and this check
    makes things fragile.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apurvam/kafka KAFKA-5415-avoid-timestamp-check-in-completeTransition

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3286.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3286

----
commit ccf5217d5a5985e7e88b2794c5fe43ff5b1d8a58
Author: Apurva Mehta <apu...@confluent.io>
Date:   2017-06-09T22:51:31Z

    Remove timestamp check in completeTransitionTo

----

> TransactionCoordinator doesn't complete transition to PrepareCommit state
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-5415
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5415
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>            Assignee: Apurva Mehta
>            Priority: Blocker
>              Labels: exactly-once
>             Fix For: 0.11.0.0
>
>         Attachments: 6.tgz
>
>
> This has been revealed by the system test failures on jenkins.
> The transaction coordinator seems to get into a path during the handling of
> the EndTxnRequest where it returns an error (possibly a NOT_COORDINATOR or
> COORDINATOR_NOT_AVAILABLE error, to be revealed by
> https://github.com/apache/kafka/pull/3278) to the client. However, due to
> network instability, the producer is disconnected before it receives this
> error.
> As a result, the transaction remains in a `PrepareXX` state, and future
> `EndTxn` requests sent by the client after reconnecting result in a
> `CONCURRENT_TRANSACTION` error code. Hence the client gets stuck and the
> transaction never finishes, as expiration isn't done from a PrepareXX state.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
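
The reasoning in the pull request description (wall-clock time can move backward, and a transaction may start on one host and complete on another) can be illustrated with a small sketch. This is not Kafka's actual code: the class `TimestampCheckSketch`, the `TxnMetadata` holder, and both `completeTransition*` methods are hypothetical names chosen for illustration, and the strict variant only assumes that the removed check compared the new update timestamp against the previously recorded one.

```java
public class TimestampCheckSketch {

    /** Hypothetical stand-in for per-transaction metadata; not Kafka's real class. */
    static final class TxnMetadata {
        String state = "Ongoing";
        long lastUpdateMs;

        TxnMetadata(long startMs) {
            this.lastUpdateMs = startMs;
        }
    }

    /**
     * Strict variant (assumed shape of a timestamp check): refuses the
     * transition when the new wall-clock timestamp is older than the
     * previously recorded one.
     */
    static boolean completeTransitionStrict(TxnMetadata txn, String newState, long updateMs) {
        if (updateMs < txn.lastUpdateMs) {
            return false; // clock skew or an NTP step makes a valid transition look invalid
        }
        txn.state = newState;
        txn.lastUpdateMs = updateMs;
        return true;
    }

    /** Lenient variant: records the timestamp without assuming it is monotonic. */
    static boolean completeTransitionLenient(TxnMetadata txn, String newState, long updateMs) {
        txn.state = newState;
        txn.lastUpdateMs = updateMs;
        return true;
    }

    public static void main(String[] args) {
        // Transaction started on host A at wall-clock time 1000 ms.
        TxnMetadata strictTxn = new TxnMetadata(1_000L);
        TxnMetadata lenientTxn = new TxnMetadata(1_000L);

        // The transition is completed on host B, whose clock is 10 ms behind host A.
        long hostBNowMs = 990L;

        boolean strictOk = completeTransitionStrict(strictTxn, "PrepareCommit", hostBNowMs);
        boolean lenientOk = completeTransitionLenient(lenientTxn, "PrepareCommit", hostBNowMs);

        System.out.println("strict:  completed=" + strictOk + ", state=" + strictTxn.state);
        System.out.println("lenient: completed=" + lenientOk + ", state=" + lenientTxn.state);
    }
}
```

In the strict variant, a 10 ms skew between hosts is enough to leave the transaction in its old state, which is the kind of fragility the pull request description argues against. Within a single JVM, `System.nanoTime()` is the usual choice for measuring elapsed time, but it does not help when timestamps are compared across hosts.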