Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/7440
to look at the new patch set (#4).
Change subject: disk failure: reassign failed tablets
......................................................................
disk failure: reassign failed tablets
Tablets are marked tablet::FAILED in a number of places (e.g. when they
fail to bootstrap) and will be marked tablet::FAILED_AND_SHUTDOWN upon
disk failure.
Currently, a tablet in either state is left alone and responds to
heartbeats with TABLET_NOT_RUNNING messages, which indicate that the
replica is responsive even though the tablet itself is not used for
anything.
This patch changes this behavior so that tablets in either state respond
with the new TABLET_FAILED response, which does not indicate that the
tablet is responsive and therefore promotes eviction.
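
The response selection described above can be sketched roughly as
follows. This is a minimal illustration, not the code in this patch: the
enums and the functions ErrorCodeForState and CountsAsResponsive are
hypothetical stand-ins, and only the state and error names come from the
commit message.

```cpp
// Hypothetical stand-ins for the tablet states named in the commit
// message; the real definitions live in the Kudu source tree.
enum class TabletState { RUNNING, BOOTSTRAPPING, FAILED, FAILED_AND_SHUTDOWN };

// Hypothetical stand-ins for the error codes a replica can report.
enum class TabletErrorCode { OK, TABLET_NOT_RUNNING, TABLET_FAILED };

// Replica side: pick the error code to report for the current state.
// Both failed states map to TABLET_FAILED; other non-running states
// keep reporting TABLET_NOT_RUNNING.
TabletErrorCode ErrorCodeForState(TabletState state) {
  switch (state) {
    case TabletState::RUNNING:
      return TabletErrorCode::OK;
    case TabletState::FAILED:
    case TabletState::FAILED_AND_SHUTDOWN:
      return TabletErrorCode::TABLET_FAILED;
    case TabletState::BOOTSTRAPPING:
      return TabletErrorCode::TABLET_NOT_RUNNING;
  }
  return TabletErrorCode::TABLET_NOT_RUNNING;  // Unreachable fallback.
}

// Leader side (assumed behavior): a TABLET_FAILED response is not
// treated as a sign of life, so the replica becomes eligible for
// eviction; any other response still counts as responsive.
bool CountsAsResponsive(TabletErrorCode code) {
  return code != TabletErrorCode::TABLET_FAILED;
}
```

Under these assumptions, a replica in either failed state stops counting
as responsive, while a replica that is merely bootstrapping still does.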
Additionally, prior to this patch, tablets were set to FAILED when they
failed to delete metadata. This is no longer the case, which ensures
that tablets that were meant to be deleted are not reassigned. Since
error statuses during deletion are only returned during IO to the
metadata directory, and because the metadata directory is a single point
of failure, failures in this codepath are made fatal for now. Once the
metadata directory is no longer a single point of failure, these
failures should be made benign, as proper error handling should make
files on the failed metadata directory unreachable.
This patch is a part of a series of patches to handle disk failure. See
section 2.5 in this doc:
https://docs.google.com/document/d/1zZk-vb_ETKUuePcZ9ZqoSK2oPvAAaEV1sjDXes8Pxgk/edit
Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
---
M src/kudu/client/scanner-internal.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/tserver.proto
8 files changed, 76 insertions(+), 40 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/40/7440/4
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f61585b02fbe270d215bf7f49c0d390ceee3345
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>