Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16830 )

Change subject: [consensus] KUDU-2181 Allow sending status msgs to FAILED peers
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16830/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16830/4//COMMIT_MSG@28
PS4, Line 28: - Shutdown the new ma
> My earlier plan was to split the Raft change and the test procedure. But th
I haven't looked in much depth, but I grepped around for FAILED_UNRECOVERABLE 
and some some tests in consensus_queue-test that test this state. Did you look 
into testing it there? Were those unit tests not a good fit for the targeted 
consensus-related functionality we're introducing here?


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/consensus/consensus_queue.cc@74
PS4, Line 74:
            : DEFINE_bool(consensus_allow_status_msg_for_failed_peer, false,
            :             "Allows status only Raft messages to be sent to a 
peer in FAILED_UNRECOVERABLE state.");
            : TAG_FLAG(consensus_allow_status_msg_for_failed_peer, 
experimental);
            : TAG_FLAG(consensus_allow_status_msg_for_failed_peer, runtime)
> Is the intention to restrict turning on this flag only for masters?

Yes, or more strongly, don't expose it as a flag at all, and instead pass it as 
an argument to RaftConsensus, and have only masters set that argument.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc
File src/kudu/master/dynamic_multi_master-test.cc:

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@139
PS4, Line 139:   // has been GC'ed.
> With external mini cluster the test will be very similar to the actual proc
Ah I see. Yeah I think the difficulty with IMC is providing different flags to 
each process. I do think it's possible in that both mini cluster types have 
Shutdown() and Restart() methods, but the flags may be more difficult to 
wrangle -- especially the runtime ones.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@154
PS4, Line 154:
> Can't use ASSERT_OK() with a lambda that returns a non-void data type.
Ah, indeed I overlooked that this was a lambda.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@416
PS4, Line 416:  new_master_->Shutdown();
             :     cluster_.reset();
             :
             :     LOG(INFO) << "Bringing up the migrated cluster";
             :     opts_.num_masters = orig_num_masters_ + 1;
             :     opts_.master_rpc_addresses = master_hps;
             :     ExternalMiniCluster migrated_cluster(opts_);
             :     ASSERT_OK(migrated_cluster.Start());
             :     for (int i = 0; i < migrated_cluster.num_masters(); i++) {
             :       
ASSERT_OK(migrated_cluster.master(i)->WaitForCatalogManager());
             :     }
             :
             :     // Verify the cluster still has the same 3 masters.
             :     {
             :       ListMastersResponsePB resp;
             :       NO_FATALS(RunListMaste
> This verification function is called after verifying that the new master ha
Right, though this iteration of the test isn't checking any non-system tablets 
either. Maybe I missed it -- why not use the ClusterVerifier? This code seems 
like it could be replaced with ClusterVerifier::CheckRowCount().



--
To view, visit http://gerrit.cloudera.org:8080/16830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I142c1dec442ec72c38c5be9d62cdf270e441d6e3
Gerrit-Change-Number: 16830
Gerrit-PatchSet: 5
Gerrit-Owner: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 15 Dec 2020 21:56:15 +0000
Gerrit-HasComments: Yes

Reply via email to