Mike Percy has posted comments on this change. Change subject: KUDU-1330: Add a tool to unsafely recover from loss of majority replicas ......................................................................
Patch Set 14: (11 comments) Mostly minor feedback, still looking carefully at the tests but wanted to give some earlier feedback. http://gerrit.cloudera.org:8080/#/c/6066/14/src/kudu/consensus/metadata.proto File src/kudu/consensus/metadata.proto: Line 93: optional bool unsafe_config_change = 4; do we need to give this a default of false? http://gerrit.cloudera.org:8080/#/c/6066/14/src/kudu/consensus/raft_consensus.cc File src/kudu/consensus/raft_consensus.cc: Line 616: if (!new_config.unsafe_config_change()) { I think we need a default for this field or we first need to check has_unsafe_config_change() everywhere we use it. PS14, Line 1612: Peer uuid $0 is not found on original Peer with UUID $0 is not in the committed PS14, Line 1613: the node this replica Line 1617: if (!IsRaftConfigVoter(peer_uuid, committed_config)) { I don't think this voter check is necessary. What we should do here is check that the *local* replica is a voter in a final 'new_config' that we construct below. PS14, Line 1653: This ensures that current node replicates the new config proposed, This makes this request appear to come from a new leader that the local replica doesn't know about yet. PS14, Line 1654: node local replica PS14, Line 1654: it would step down before replicating this will cause it to step down http://gerrit.cloudera.org:8080/#/c/6066/14/src/kudu/consensus/raft_consensus_state.cc File src/kudu/consensus/raft_consensus_state.cc: PS14, Line 257: FLAGS_fault_crash_before_append_commit Don't reuse this unrelated fault flag for this test. That flag is for the WAL. Just define a new one. Also, do we really need the fault flag to be inside this conditional for the test or can we remove the 'if' around this? If so, I think a good place to put it would be in ConsensusMetadata::Flush() and we could call it fault_crash_before_flush_consensus_meta perhaps. http://gerrit.cloudera.org:8080/#/c/6066/14/src/kudu/consensus/time_manager.h File src/kudu/consensus/time_manager.h: Line 132: friend class RaftConsensus; Intentional? http://gerrit.cloudera.org:8080/#/c/6066/14/src/kudu/tools/tool_action_remote_replica.cc File src/kudu/tools/tool_action_remote_replica.cc: PS14, Line 431: Enforce raft config on the tablet server with a new config Using the word enforce in this context sounds strange to me. How about: Force the specified replica to adopt a new Raft config. The members of the new Raft config must be a subset of (or the same as) the members of the existing committed Raft config on that replica. -- To view, visit http://gerrit.cloudera.org:8080/6066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I908d8c981df74d56dbd034e72001d379fb314700 Gerrit-PatchSet: 14 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dinesh Bhat <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Dinesh Bhat <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
