Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14222 )

Change subject: KUDU-2069 p4: stop replication from failed servers in 
maintenance mode
......................................................................


Patch Set 3:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/14222/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14222/2//COMMIT_MSG@10
PS2, Line 10: llowed to be in a "ba
> nit: maybe 'can be in a bad state (e.g. due to in maintenance mode)'? To ex
Done

Also worth noting that maintenance mode is not itself a bad state. It is 
protection against being considered a bad state.


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util-test.cc
File src/kudu/consensus/quorum_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util-test.cc@636
PS2, Line 636:     EXPECT_FALSE(ShouldAddReplica(config, 5, { "A", "B" }));
> What about tests scenarios for ShouldRemoveReplica() when some of the table
This is somewhat white-box knowledge, but the "whitelist" isn't even part of 
the `ShouldAddReplica()` API, so I'm not sure testing it here makes sense.


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util-test.cc@648
PS2, Line 648: ig, "B", V, '+
> server with unknown health
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util-test.cc@658
PS2, Line 658: // should honor the replacement and try to add
> nit: should this comment be moved to L664?
Ack


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util-test.cc@689
PS2, Line 689:     EXPECT_FALSE(ShouldAddReplica(config, 5, { "A", "B" }));
> Does it make sense to add the following cases for replication factor 3:
Sure, I'll add those cases.

Added some coverage for non-voters.


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.h
File src/kudu/consensus/quorum_util.h:

http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.h@122
PS2, Line 122: set the UUID of the best candidate
> there is a quorum
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.h@132
PS2, Line 132:
> nit: would '{}' suffice here instead of 'std::unordered_set<std::string>()'
I get an error when I try that: "chosen constructor is explicit in 
copy-initialization".


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.cc
File src/kudu/consensus/quorum_util.cc:

http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/consensus/quorum_util.cc@456
PS2, Line 456: ) << Substitute("ignor
> Ah, just another observation.  Probably, I should have mentioned that earli
I updated my design doc with some notes on this:

Option 1: the rebalancing tool and the move tool will continue to work even 
during maintenance mode
REPLACE replicas _will_ continue to count towards underreplication when 
deciding to add replicas, even when in maintenance mode.
* There would be no changes to determining whether we should evict. In this 
way, maintenance mode does not affect the move tool at all.
* The downside here is that it may mean replication traffic. Using maintenance 
mode is generally a signal that something is wrong or needs to be fixed in a 
cluster/tserver. It thus behooves us to make sure there are fewer moving parts 
where we can.

Option 2: the rebalancing tool and the move tool will set REPLACE and PROMOTE, 
but those will not be honored by the master -- the master err on the side of 
not evicting and not adding replicas.
* When determining whether to add replicas, replicas marked REPLACE will not 
count towards being underreplicated, while those marked PROMOTE will count if 
healthy.
* When determining whether to remove replicas, replicas marked REPLACE will be 
prioritized as healthy (or even lower, so we truly never evict nodes in 
maintenance mode)
* The major downside here is that while the goal itself is clear, the exact 
semantics aren’t quite clear to articulate, especially when considering whether 
to evict (see ShouldEvictReplica() in quorum_util.cc)

Verdict: There is a clearer implementation path towards Option 1, and the 
semantics seem less surprising, when considering that operators may run tools 
manually. If we really want to avoid rebalancing during maintenance mode, it 
doesn’t seem unreasonable to implement checks in the rebalancer to check for 
maintenance mode.


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc
File src/kudu/integration-tests/maintenance_mode-itest.cc:

http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@19
PS2, Line 19: #include <memory>
> nit: remove the extra line
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@87
PS2, Line 87: pts.num_
> nit: 'Perform' to align with the rest.
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@99
PS2, Line 99:     TServerStateChangePB* state_change = req.mutable_change();
> This is a proxy for the question "did we rereplicate anything", right? Woul
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@151
PS2, Line 151:         for (int p = 0; p < committed_config.peers_size(); p++) {
> Don't need this?
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@156
PS2, Line 156:
> Does it make sense to add a scenario to verify that it's possible to move r
Changed to a different name.


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@156
PS2, Line 156:
> To clarify, my comment was about the manual movement of replicas from the n
Ack


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@195
PS2, Line 195:   NO_FATALS(SetUpCluster(3));
> Another reasons for re-replication are:
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@208
PS2, Line 208:   SKIP_IF_SLOW_NOT_ALLOWED();
> Nit: don't need to initialize this.
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/integration-tests/maintenance_mode-itest.cc@229
PS2, Line 229: st string& maintenance_uuid = maintenance_ts->uuid();
             :   ASSERT_OK(ChangeTServerState(
> Should we add a test that re-replication can still happen to other tservers
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/catalog_manager.cc@4190
PS2, Line 4190:       // health, we ignore reports from non-leaders in this 
case. Also, making
              :       // the changes recommended by Should{Add,Evict}Replica() 
assumes
> +1
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/ts_manager.h
File src/kudu/master/ts_manager.h:

http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/ts_manager.h@92
PS2, Line 92: ement(TSDescriptorVector* descs) const;
            :
> nit: can you elaborate a bit in which cases this should happen.
Done


http://gerrit.cloudera.org:8080/#/c/14222/2/src/kudu/master/ts_manager.h@94
PS2, Line 94:   // Return any tablet servers UUIDs that can be in a failed 
state without
> If O(num tservers in maintenance mode) is relatively high, GetWhitelistedUu
We chatted a bit offline about this and the locking argument goes away given 
the changes in part 1 of this patch set (now, it's a single lock acquisition to 
generate the ignored set).

I updated the name to "UuidsToIgnoreForUnderreplication", in an attempt to 
decouple consensus code from maintenance mode.



--
To view, visit http://gerrit.cloudera.org:8080/14222
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a63b55011d16900c0d27eac0eb75880074204db
Gerrit-Change-Number: 14222
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 25 Sep 2019 23:59:50 +0000
Gerrit-HasComments: Yes

Reply via email to