Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/7030
to look at the new patch set (#10).
Change subject: WIP disk failure: coordinate disk failure handling
......................................................................
WIP disk failure: coordinate disk failure handling
This patch adds the logic required to prevent a crash on disk failure.
Disk failure handling happens in a few places:
- block/container-level functions that call env functions that may
hit a disk failure can run callbacks to fail or shut down the tablets
in the parent data dir
- tablet-level functions that CHECK for failures now exit early if the
tablet is known to have data on a bad disk
- transactions can now be canceled to force a shutdown of a tablet
replica instead of waiting for transactions to complete
- tablets in FAILED or the new FAILED_AND_SHUTDOWN state will trigger
replication
- failure at startup (IO to instance files) is covered in a later patch
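
To illustrate the first bullet, here is a minimal sketch of how a block-level
error path could run a callback that fails every tablet in the affected data
dir. This is not the code in this patch: DiskErrorNotifier, TabletState, and
all method names below are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical tablet states, for illustration only.
enum class TabletState { RUNNING, FAILED };

// Hypothetical registry that maps each data dir to the tablets with data in
// it and runs a callback for each of those tablets when the dir fails.
class DiskErrorNotifier {
 public:
  using FailTabletCb = std::function<void(const std::string& tablet_id)>;

  explicit DiskErrorNotifier(FailTabletCb cb) : cb_(std::move(cb)) {
    assert(cb_ && "a failure callback must be provided");
  }

  // Records that 'tablet_id' has data in 'data_dir'.
  void RegisterTablet(const std::string& data_dir,
                      const std::string& tablet_id) {
    tablets_by_dir_[data_dir].insert(tablet_id);
  }

  // Meant to be called by block/container-level code when an env call
  // reports a disk error. Runs the callback once per tablet in the dir and
  // returns the number of tablets notified. Repeat calls for a dir that has
  // already been marked failed are no-ops.
  int NotifyDiskFailure(const std::string& data_dir) {
    if (!failed_dirs_.insert(data_dir).second) return 0;
    auto it = tablets_by_dir_.find(data_dir);
    if (it == tablets_by_dir_.end()) return 0;
    int notified = 0;
    for (const auto& tablet_id : it->second) {
      cb_(tablet_id);
      ++notified;
    }
    return notified;
  }

  // Lets tablet-level code check for a bad disk and exit early.
  bool IsDirFailed(const std::string& data_dir) const {
    return failed_dirs_.count(data_dir) > 0;
  }

 private:
  FailTabletCb cb_;
  std::map<std::string, std::set<std::string>> tablets_by_dir_;
  std::set<std::string> failed_dirs_;
};
```

In this sketch the callback would move a tablet to a failed state, and
tablet-level code would consult IsDirFailed() before doing further IO. The
sketch is single-threaded; a real implementation would need locking around
the maps and would likely run the callbacks outside that lock.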
A set of basic tests is added in ts_disk_failure-itest.
TODO:
- crash if tablet metadata dir is bad
Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
---
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/fs/error_manager.h
M src/kudu/fs/file_block_manager.cc
M src/kudu/fs/log_block_manager.cc
M src/kudu/master/sys_catalog.cc
M src/kudu/tablet/delta_tracker.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/mvcc.cc
M src/kudu/tablet/mvcc.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_replica.cc
M src/kudu/tablet/tablet_replica.h
M src/kudu/tablet/tablet_replica_mm_ops.cc
M src/kudu/tablet/transactions/alter_schema_transaction.h
M src/kudu/tablet/transactions/transaction.h
M src/kudu/tablet/transactions/transaction_driver.cc
M src/kudu/tablet/transactions/transaction_driver.h
M src/kudu/tablet/transactions/transaction_tracker-test.cc
M src/kudu/tablet/transactions/transaction_tracker.cc
M src/kudu/tablet/transactions/transaction_tracker.h
M src/kudu/tablet/transactions/write_transaction.cc
M src/kudu/tablet/transactions/write_transaction.h
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/CMakeLists.txt
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
A src/kudu/tserver/ts_disk_failure-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
M src/kudu/util/status.h
32 files changed, 511 insertions(+), 101 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/7030/10
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia03bfb711a1b022d7516f4adb37fe9fb28ec949c
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>