Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11070 )
Change subject: WIP: Atomicize OpId and Timestamp assignment
......................................................................


Patch Set 8:

(1 comment)

Noting the root of our timeouts.

http://gerrit.cloudera.org:8080/#/c/11070/8/src/kudu/tablet/transactions/transaction_driver.cc
File src/kudu/tablet/transactions/transaction_driver.cc:

http://gerrit.cloudera.org:8080/#/c/11070/8/src/kudu/tablet/transactions/transaction_driver.cc@354
PS8, Line 354:     return s;
The changes from PS6 to PS8 are due to the fact that:
- returning any sort of error causes TransactionDriver::ExecuteAsync() to call HandleFailure() on it externally
- if the ReplicationFinished() callback is called between Replicate() and L372, the replication state copy may be either:
  - REPLICATED, in which case ApplyAsync applies the txn, or
  - REPLICATION_FAILED (e.g. because the tablet is shutting down), in which case ApplyAsync handles the error, unregistering the txn from the tracker

This meant that if ReplicationFinished() was called but set the replication state to REPLICATION_FAILED, we were left with a transaction on the transaction tracker, and upon waiting for all transactions to finish (e.g. when stopping or deleting the tablet), we would wait forever.


-- 
To view, visit http://gerrit.cloudera.org:8080/11070
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d369423dd5c96b653b424c29744507edd874357
Gerrit-Change-Number: 11070
Gerrit-PatchSet: 8
Gerrit-Owner: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Comment-Date: Wed, 01 Aug 2018 00:19:54 +0000
Gerrit-HasComments: Yes
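
The hang described in the comment above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in transaction_driver.cc: ToyTracker, FinishTxn and the release_on_failure flag are hypothetical names invented for illustration, and the "wait" is modeled as a non-blocking count of pending transactions. It shows why a REPLICATION_FAILED path that never unregisters the txn would leave a waiter stuck.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical, simplified replication states (not Kudu's actual enum).
enum class ReplicationState { REPLICATING, REPLICATED, REPLICATION_FAILED };

// Hypothetical stand-in for a transaction tracker: it only remembers
// which transaction ids are still registered.
class ToyTracker {
 public:
  void Add(int txn_id) { pending_.insert(txn_id); }

  void Release(int txn_id) {
    // Releasing an unknown id would mean the failure was handled twice.
    assert(pending_.count(txn_id) == 1);
    pending_.erase(txn_id);
  }

  // A real tracker would block until this reaches zero; here we only
  // report the count so the example never hangs.
  std::size_t NumPending() const { return pending_.size(); }

 private:
  std::set<int> pending_;
};

// Finishes a transaction based on a snapshot (copy) of its replication state.
// 'release_on_failure' switches between a leaky failure path (false) and one
// that unregisters the txn from the tracker (true).
// Returns true only if the transaction was applied.
bool FinishTxn(ToyTracker* tracker, int txn_id, ReplicationState state_copy,
               bool release_on_failure) {
  switch (state_copy) {
    case ReplicationState::REPLICATED:
      // Success path: apply, then unregister.
      tracker->Release(txn_id);
      return true;
    case ReplicationState::REPLICATION_FAILED:
      // Failure path: the txn must still be unregistered, otherwise anyone
      // waiting for the tracker to drain waits forever.
      if (release_on_failure) {
        tracker->Release(txn_id);
      }
      return false;
    case ReplicationState::REPLICATING:
      // Still in flight; a later callback is expected to finish it.
      return false;
  }
  return false;
}
```

In this model, calling FinishTxn with REPLICATION_FAILED and release_on_failure set to false leaves NumPending() at 1, which corresponds to the "wait forever" case when the tablet is stopped or deleted; with the flag set to true the tracker drains to 0.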
