Hello David Ribeiro Alves, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6066

to look at the new patch set (#18).

Change subject: KUDU-1330: Add a tool to unsafely recover from loss of majority 
replicas
......................................................................

KUDU-1330: Add a tool to unsafely recover from loss of majority replicas

This patch adds an API that allows unsafe config changes via an external
recovery tool, 'kudu remote_replica unsafe_change_config'.

This tool lets us replace an N-replica config on a tablet server with a
new config containing N or fewer replicas. This is particularly useful
when a majority of the replicas are down and, for some reason, we cannot
bring the tablet back online using other recovery tools like
'kudu remote_replica copy'. We can use this tool to force a new config on
the surviving replica by supplying the UUIDs of all replicas in the new
config on the command line. As a result of the forced config change,
leader election kicks in automatically via the usual Raft mechanisms, and
the master triggers re-replication (if the tablet is under-replicated) to
bring the tablet's replica count back up to N.
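
For example, an invocation might look like the following (the argument
order shown here is an assumption for illustration; the tool's built-in
help is authoritative):

    kudu remote_replica unsafe_change_config <tserver_address> <tablet_id> \
        <peer1_uuid> [<peer2_uuid> ...]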

How the tool brings the tablet back online with the new config:
a) The tool acts as a 'fake' leader and generates a consensus update with
   a bumped-up term along with the new config (see the sketch below this
   list). The surviving node (leader or follower) accepts and replicates
   the request, then goes through a pre-election phase in which a leader
   is elected among the nodes listed in the config. If the new config
   provides enough VOTERs to win an election, the leader election
   succeeds and the new config is committed. The master eventually
   recognizes this consensus state change and, if it finds the tablet
   under-replicated, re-replicates it back to a healthy count.
b) The assumption is that the dead nodes do not come back during this
   recovery, so the master will very likely choose healthy live servers
   for re-replication if needed. If the dead nodes come back after the
   master has been updated with the unsafely forced config, the master
   deletes the replicas on those nodes via the DeleteTablet RPC, because
   they are no longer part of the tablet config.
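
To make step (a) concrete, here is a minimal, purely illustrative C++
sketch of the shape of the forced update: a request carrying a term
higher than the replica's current term plus a config listing only the
surviving peers. The type and field names below are hypothetical and do
not correspond to Kudu's actual classes or protobufs:

    #include <cstdint>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the request the tool sends; the real
    // implementation uses Kudu's consensus protobufs instead.
    struct ForcedConfigUpdate {
      std::string tablet_id;
      int64_t term;                     // replica's current term + 1
      std::vector<std::string> voters;  // UUIDs of surviving replicas only
    };

    ForcedConfigUpdate BuildForcedUpdate(
        const std::string& tablet_id,
        int64_t current_term,
        const std::vector<std::string>& survivors) {
      // The bumped-up term makes the surviving replica accept the tool as
      // a newer leader; the new config lists only peers that can still
      // vote, so a real election among them can succeed afterwards.
      return ForcedConfigUpdate{tablet_id, current_term + 1, survivors};
    }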

The UnsafeChangeConfig() API also adds a flag that allows appending
another change_config op while there is already a pending config in the
log. This flag lifts the safety check around pending configs, which
states that a given replica can have at most one pending config.

This patch is the first in a series for unsafe config changes, and it
assumes that the dead servers do not come back while the new config
change is taking effect. Future revisions should weaken this assumption
and build stronger safety guarantees around dead nodes coming back
during unsafe config change operations on the cluster.

Tests associated with this patch:
- Unsafe config change when there is one follower survivor in the cluster.
- Unsafe config change when there is one leader survivor in the cluster.
- Unsafe config change when the unsafe config contains 2 replicas.
- Unsafe config change on a 5-replica config with 2 replicas in the new config.
- Unsafe config change when there is a pending config on the surviving leader.
- Unsafe config change when there is a pending config on a surviving follower.
- Unsafe config change when there are back-to-back pending configs in
  the WAL, verifying that the tablet bootstraps fine.
- Back-to-back unsafe config changes when multiple pending configs are
  present on the replica, verifying that the one with a 'sane' new
  config brings the tablet back online.

TODO:
1) Test exercising all the error cases in the UnsafeChangeConfig API.
2) Test the UnsafeChangeConfig RPC directly, without going through the
   external tool.

Change-Id: I908d8c981df74d56dbd034e72001d379fb314700
---
M src/kudu/consensus/consensus.h
M src/kudu/consensus/consensus.proto
M src/kudu/consensus/consensus_meta.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/metadata.proto
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_state.cc
M src/kudu/consensus/time_manager.h
M src/kudu/integration-tests/cluster_itest_util.cc
M src/kudu/integration-tests/cluster_itest_util.h
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/tools/kudu-admin-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_remote_replica.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
17 files changed, 1,216 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/66/6066/18
-- 
To view, visit http://gerrit.cloudera.org:8080/6066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I908d8c981df74d56dbd034e72001d379fb314700
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat <dinesha...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <dral...@apache.org>
Gerrit-Reviewer: Dinesh Bhat <dinesha...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Tidy Bot
