Dinesh Bhat has posted comments on this change.

Change subject: WIP KUDU-1330: Add a tool to unsafely recover from loss of 
majority replicas
......................................................................


Patch Set 2:

(12 comments)

I am still addressing some refactoring/testing related to David's earlier 
feedbacks, so this update has addressed only Mike's review comments so far.

http://gerrit.cloudera.org:8080/#/c/6066/2//COMMIT_MSG
Commit Message:

PS2, Line 13: perticularly
> particularly
Done


Line 22: a) The API/tool acts as a fake leader mimicking the actual leader the 
node
> As mentioned in comments on raft_consensus.cc the tool should not impersona
I will probably post an update here after a quick testing, is the concern here 
that we may be picking an old leader if we rely on survivor's committed 
config's 'voted_for' field ? Even if we did, since we are bumping up the term, 
we know that this node is going to be elected as leader (assuming dead nodes 
not coming back).


Line 26: a) Assumption is that, the dead nodes are not coming back with a 
higher term,
> s/a/b/
Actually, by 'retained' I meant the the elected leader after election triggered 
as a result of config change in step a). I reworded this now.


Line 43: 2) Add a test case for when the node has a pending config change,
> +1 on this
I hadn't updated this list after my last commit, pls note that some of the test 
cases are already added, I updated the commit msg now to reflect current state.


http://gerrit.cloudera.org:8080/#/c/6066/2/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

PS2, Line 1559: client_cb
> No need to plumb in a callback if this is not an asynchronous method
Done


PS2, Line 1560: error_code
> Would be nice to use this
Done


Line 1600:   consensus_req.set_caller_uuid(leader_uuid);
> This should be the client UUID or a special value to indicate that this was
Do you have a concern on this approach ? Whether survivor is a single follower 
or a single leader, both of them would have had voted for the this leader. This 
approach works in both use cases.  I thought this is a simpler idea than using 
a new/fake uuid from tool. Note that whole change assumes that the dead nodes 
aren't coming back while this is in progress.


Line 1601:   consensus_req.set_caller_term(current_term);
> Use current term + 1
I didn't bump this to single leader use case, because the existing leader steps 
down and bails out from pre-election here without committing a new config: 
https://github.com/apache/kudu/blob/master/src/kudu/consensus/raft_consensus.cc#L401
However, I need to debug little more what other changes are needed if we were 
to bump up the term for single leader survivor. Do you think it's necessary to 
bump up the term for single leader survivor ?


Line 1603:   consensus_req.set_committed_index(last_op_id.index() + 1);
> Don't increment the committed index; leave it at whatever value it previous
Done


PS2, Line 1613: last_op_id.index() + 1
> This is repeated a few times in this function. Let's extract a variable to 
Done


PS2, Line 1615: time_manager_->GetSafeTime().value()
> Hmm. I would like David's input on this
I will poke David here, I don't exactly recall why I had to set here, perhaps 
due a CHECK failure down the line in UpdateReplica. It somewhere has a check 
for timestamp on a consensus request.


http://gerrit.cloudera.org:8080/#/c/6066/2/src/kudu/consensus/time_manager.cc
File src/kudu/consensus/time_manager.cc:

Line 121
> I don't think you need to remove this
In the case of single leader leader propgating the new config, he calls 
UpdateReplica on himself, hence I had to remove couple of DCHECKs as part of 
that. I also removed one on consensus_queue.cc


-- 
To view, visit http://gerrit.cloudera.org:8080/6066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I908d8c981df74d56dbd034e72001d379fb314700
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dinesh Bhat <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Dinesh Bhat <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to