Hello Tidy Bot, Mike Percy, David Ribeiro Alves, Kudu Jenkins, Todd Lipcon,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/7439
to look at the new patch set (#22).
Change subject: mvcc: allow tablet shutdown without completing txs
......................................................................
mvcc: allow tablet shutdown without completing txs
Currently, the only way to stop an Applying transaction is to wait for
it to finish and Commit it. This constraint was put in place to
guarantee on-disk correctness, but is sometimes too strict. E.g. if the
tablet is shutting down, the Apply doesn't need to finish.
This patch adds the ability to "stop" a tablet by freezing its MVCC
manager. Once this happens, new Applies will return without moving to the
Commit phase, and any methods waiting for the tablet's Applies to Commit
(e.g. new snapshot scans, FlushMRS) will return an error
immediately. Applies that are already underway may still Commit, but
these Committed operations are inconsequential w.r.t. consistency;
having some in-flight transactions Commit and others not is consistent
with the server crashing in between the Commits of two transactions.
Additionally, once the MVCC manager is frozen, new transactions will
abort immediately before even reaching the Prepare phase.
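The freezing behavior described above can be illustrated with a minimal sketch. It is not the code in this patch: `ToyMvcc`, `WaitResult`, and every method name below are hypothetical stand-ins, and the real MvccManager in src/kudu/tablet/mvcc.h has a different interface. The sketch only assumes that a closed manager rejects new work and wakes its waiters with an error, while Applies already underway may still Commit.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>

// Hypothetical result type: what a waiter learns when it wakes up.
enum class WaitResult { kAllCommitted, kAborted };

// Hypothetical toy manager; it is NOT Kudu's MvccManager.
class ToyMvcc {
 public:
  // Registers a transaction as Applying. Returns false once the
  // manager has been closed, so new work is rejected up front.
  bool StartApplying(int64_t ts) {
    std::lock_guard<std::mutex> l(mu_);
    if (closed_) return false;
    applying_.insert(ts);
    return true;
  }

  // Commits an Applying transaction. This is still allowed after Close(),
  // mirroring "Applies that are already underway may still Commit".
  void Commit(int64_t ts) {
    std::lock_guard<std::mutex> l(mu_);
    std::size_t erased = applying_.erase(ts);
    assert(erased == 1);  // the transaction must have been Applying
    (void)erased;
    cv_.notify_all();
  }

  // Blocks until every Applying transaction has Committed, or until the
  // manager is closed, in which case the waiter gets an error right away.
  WaitResult WaitForApplyingToCommit() {
    std::unique_lock<std::mutex> l(mu_);
    cv_.wait(l, [this] { return closed_ || applying_.empty(); });
    return closed_ ? WaitResult::kAborted : WaitResult::kAllCommitted;
  }

  // Freezes the manager and wakes every waiter.
  void Close() {
    {
      std::lock_guard<std::mutex> l(mu_);
      closed_ = true;
    }
    cv_.notify_all();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::set<int64_t> applying_;
  bool closed_ = false;
};
```

In this sketch the wait predicate checks the closed flag as well as the in-flight set, so a waiter that arrives after Close() returns immediately instead of blocking on transactions that will never Commit.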
This will be particularly useful in handling disk failures: if a tablet
needs to be shut down due to disk failure, its MVCC manager can be
frozen immediately, allowing currently-Applying transactions to abort
without Committing.
This patch only covers the behavior when shutting down a tablet, on
the assumption that a tablet is only shut down when it is being
deleted, in which case it does not matter much whether its in-flight
transactions Commit. Code paths that previously crashed Kudu if Applies
did not succeed will now log a warning instead of crashing if the MVCC
manager is frozen.
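The change from crashing to warning can be sketched as follows. This is an illustration under stated assumptions, not the patch's code: `HandleApplyFailure` and its parameters are hypothetical, and a plain `std::ostream` stands in for the logging facility so the example stays self-contained.

```cpp
#include <cassert>
#include <cstdlib>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: decides what to do when an Apply did not succeed.
// If the MVCC manager is frozen (mvcc_closed == true), the failure is
// expected during shutdown, so a warning is logged and true is returned.
// Otherwise the failure is treated as fatal, as it was before.
bool HandleApplyFailure(bool mvcc_closed, const std::string& reason,
                        std::ostream& log) {
  if (mvcc_closed) {
    log << "WARNING: Apply did not complete; MVCC is closed: " << reason
        << '\n';
    return true;
  }
  log << "FATAL: Apply failed: " << reason << '\n';
  std::abort();  // pre-existing behavior: crash the process
}
```

A caller would pass the frozen state of the tablet's MVCC manager and continue shutting down when the helper returns.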
Testing is done by adding the following:
- a test in mvcc-test that shuts down MVCC and deletes an Applying
  transaction, ensuring that there are no errors when it leaves scope.
- a test in mvcc-test that waits on an Applying transaction, shuts down
  MVCC, and ensures that any waiters return with an error.
- a new test, stop_tablet-itest, that ensures stopped leaders don't
  complete writes and stopped followers do, and that stopped tablets
  don't prevent fault-tolerant scans.
Change-Id: I983620f27e7226806a2cca253db7619731914d42
---
M src/kudu/integration-tests/CMakeLists.txt
A src/kudu/integration-tests/stop_tablet-itest.cc
M src/kudu/tablet/local_tablet_writer.h
M src/kudu/tablet/mvcc-test.cc
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tserver/tablet_service.cc
15 files changed, 507 insertions(+), 80 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/39/7439/22
--
To view, visit http://gerrit.cloudera.org:8080/7439
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I983620f27e7226806a2cca253db7619731914d42
Gerrit-Change-Number: 7439
Gerrit-PatchSet: 22
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>