Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16830 )

Change subject: [master][consensus] Procedure for copying system catalog
......................................................................


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/16830/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16830/4//COMMIT_MSG@28
PS4, Line 28: One functional change
> nit: it seems this patch should focus more on this aspect, rather than bein
My earlier plan was to split the Raft change and the test procedure, but the 
test procedure looks like the best way to exercise the Raft change.
I'll update the summary and description to highlight the Raft change.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/consensus/consensus_queue.cc@74
PS4, Line 74:
            : DEFINE_bool(consensus_allow_status_msg_for_failed_peer, false,
            :             "Allows status only Raft messages to be sent to a peer in FAILED_UNRECOVERABLE state.");
            : TAG_FLAG(consensus_allow_status_msg_for_failed_peer, hidden);
            : TAG_FLAG(consensus_allow_status_msg_for_failed_peer, unsafe);
> Do we actually intend on using this flag for our documented procedure to 
> recover masters? If so, it shouldn't be unsafe. Also mark it runtime?

Ack.

> Alternatively, would it make sense to pass some option to the Raft 
> implementation from the master that enables this?

I didn't understand this. Is the intention to restrict turning on this flag 
to masters only?
One way I can think of doing that is honoring the flag only when running in a 
master, or clobbering it to false when not running in a master.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc
File src/kudu/master/dynamic_multi_master-test.cc:

http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@139
PS4, Line 139:   void StartClusterWithSysCatalogGCed(vector<HostPort>* master_hps,
> I seem to recall discussing the need to GC in order to have the replica _no
With the external mini cluster, the test closely mirrors the actual procedure 
for adding/removing masters. Also, when I started working on the test, things 
like adding a new master once the IMC is up and restarting a subset of 
processes didn't look straightforward with the IMC.

So apart from the roundabout way of forcing GC, the external mini cluster 
overall looks like the better option.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@154
PS4, Line 154: CHECK_OK
> nit: ASSERT_OK?
Can't use ASSERT_OK() with a lambda that returns a non-void type.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@171
PS4, Line 171:     int64_t time_left_to_sleep_msecs = 2000;
> nit: more idiomatic to use a MonoTime as a deadline and use MonoTime::Now()
Done


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@177
PS4, Line 177: ASSERT_GT(time_left_to_sleep_msecs, 0)
             :       << "Timed out waiting for system catalog WAL to be GC'ed";
> nit: how about assert on the GC count? Since, as long as we GCed, it's not
Done


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@343
PS4, Line 343:     ASSERT_EVENTUALLY([&] {
> nit: I typically find it cleaner to have callers do the ASSERT_EVENTUALLY()
Done


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@416
PS4, Line 416:  LOG(INFO) << "Verifying the first table";
             :     // Verify the table created before adding the new master.
             :     shared_ptr<KuduTable> table;
             :     ASSERT_OK(client->OpenTable(kTableName, &table));
             :     KuduScanner scanner1(table.get());
             :     size_t row_count = 0;
             :     ASSERT_OK(client::CountRowsWithRetries(&scanner1, &row_count));
             :     ASSERT_EQ(0, row_count);
             :
             :     LOG(INFO) << "Creating and verifying the second table";
             :     // Perform an operation that requires replication to masters.
             :     ASSERT_OK(CreateTable(&migrated_cluster, "second_table"));
             :     ASSERT_OK(client->OpenTable("second_table", &table));
             :     KuduScanner scanner2(table.get());
             :     ASSERT_OK(client::CountRowsWithRetries(&scanner2, &row_count));
             :     ASSERT_EQ(0, row_count);
> We're still not necessarily testing that the new master is caught up with t
This verification function is called after verifying that the new master has 
been promoted to a VOTER. So perhaps not completely caught up, but good enough 
from a Raft point of view.

I took a look at ClusterVerifier; the log verifier step currently only checks 
non-system tablets on tablet servers, so it could potentially be extended.

From the user's point of view it's important that the new master can become 
leader, so I instead opted for transferring leadership to the new master.


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@433
PS4, Line 433:     LOG(INFO) << "Pausing and resuming individual masters";
> nit: move this into the if block?
Done


http://gerrit.cloudera.org:8080/#/c/16830/4/src/kudu/master/dynamic_multi_master-test.cc@588
PS4, Line 588:   SKIP_IF_SLOW_NOT_ALLOWED();
             :
             :   vector<HostPort> master_hps;
             :   NO_FATALS(StartClusterWithSysCatalogGCed(&master_hps,
             :                                            {"--consensus_allow_status_msg_for_failed_peer"}));
             :   ASSERT_OK(CreateTable(cluster_.get(), kTableName));
             :   // Bring up the new master and add to the cluster.
             :   master_hps.emplace_back(reserved_hp_);
             :   NO_FATALS(StartNewMaster(master_hps));
             :   ASSERT_OK(AddMasterToCluster(reserved_hp_));
             :
             :   // Newly added master will be added to the master Raft config but won't be caught up
             :   // from the WAL and hence remain as a NON_VOTER.
             :   ListMastersResponsePB list_resp;
             :   NO_FATALS(RunListMasters(&list_resp));
             :   ASSERT_EQ(orig_num_masters_ + 1, list_resp.masters_size());
             :
             :   for (const auto& master : list_resp.masters()) {
             :     ASSERT_EQ(1, master.registration().rpc_addresses_size());
             :     auto hp = HostPortFromPB(master.registration().rpc_addresses(0));
             :     if (hp == reserved_hp_) {
             :       ASSERT_EQ(consensus::RaftPeerPB::NON_VOTER, master.member_type());
             :       ASSERT_TRUE(master.role() == consensus::RaftPeerPB::LEARNER);
             :     }
             :   }
             :
             :   // Double check by directly contacting the new master.
             :   NO_FATALS(VerifyNewMasterDirectly({ consensus::RaftPeerPB::LEARNER },
             :                                     consensus::RaftPeerPB::NON_VOTER));
             :
             :   // Verify new master is in FAILED_UNRECOVERABLE state.
             :   NO_FATALS(VerifyNewMasterInFailedUnrecoverableState());
> If this is for the most part a copy TestAddMasterCatchupFromWALNotPossible,
That's a good point; TestAddMasterCatchupFromWALNotPossible can now be 
removed.



--
To view, visit http://gerrit.cloudera.org:8080/16830
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I142c1dec442ec72c38c5be9d62cdf270e441d6e3
Gerrit-Change-Number: 16830
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 15 Dec 2020 21:11:46 +0000
Gerrit-HasComments: Yes
