Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8619 )
Change subject: WIP KUDU-1097 (patch 2&4): added 3-4-3 behavior for catalog manager
......................................................................


Patch Set 6:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.h
File src/kudu/consensus/quorum_util.h:

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.h@91
PS6, Line 91: // Check if the given Raft configuration needs an extra non-voter replica,
            : // targeted for future promotion to a voter replica in anticipation to replace
            : // failed or missing voter replicas in accordance with the specified replication
            : // factor. Return 'true' iff an extra non-voter is needed.
I think this documentation can be reduced a lot because it's very specific about the implementation instead of the interface. How about something like:

  Returns true if the current cluster is under-replicated, given the Raft
  configuration and the included health status of its members.

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.h@95
PS6, Line 95: NeedNonVoterReplica
nit: how about renaming this to IsUnderReplicated()? The idea is to detect that we are under-replicated so that we can prepare a remedy once that condition is detected.

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.h@101
PS6, Line 101: ReadyToEvictReplica
nit: how about renaming this to CanEvictReplica()?

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.cc
File src/kudu/consensus/quorum_util.cc:

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.cc@411
PS6, Line 411: if ((peer.has_health_report() &&
             :     peer.health_report().has_overall_health() &&
             :     peer.health_report().overall_health()
We can safely check peer.health_report().overall_health() without doing anything else; if it's not set, it will default to UNKNOWN.
So this block can be much more concise:

  case RaftPeerPB::VOTER:
    ++num_voters_total;
    if (peer.health_report().overall_health() == HealthReportPB::FAILED ||
        peer.attrs().replace()) {
      ++num_voters_need_replacement;
    }
    break;

These same rules can apply throughout this file.

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.cc@414
PS6, Line 414: peer.has_attrs() &&
no need to check has_attrs()

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/consensus/quorum_util.cc@432
PS6, Line 432: return replication_factor > (num_voters_total - num_voters_need_replacement) +
Are you counting non-voters with promote = true as potential voters for the under-replication check? I don't see it. Per the implementation doc, this is the invariant we want the master to try to enforce:

  replication factor == # healthy voters + # healthy non-voters with PROMOTE=true

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@228
PS6, Line 228: evicts
nit: I'd suggest "evict" instead of "evicts" here, as if giving the master a command to evict extra replicas.

Also, a minor suggestion: how about "excess" instead of "extra", i.e. catalog_manager_evict_excess_replicas? "Excess" implies waste, which is how we're thinking about these replicas in this context.

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3483
PS6, Line 3483: FLAGS_raft_prepare_replacement_before_eviction
Shouldn't this be negated? In fact, I think this should be a big if / else based on that flag, since the new system is going in while the old one is still in place, like:

  // 7d(iv)
  if (!FLAGS_raft_prepare_replacement_before_eviction) {
    // We are using the traditional re-replication system.
    if (FLAGS_master_add_server_when_underreplicated &&
        CountVoters(cstate.committed_config()) <
            table->metadata().state().pb.num_replicas()) {
      rpcs.emplace_back(new AsyncAddServerTask(master_, tablet, cstate, &rng_));
    }
  } else {
    // We are using 3-4-3 re-replication.
    ...
  }

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3495
PS6, Line 3495: 7d(v)
Numbering this as 7d(v) implies it is a step after 7d(iv), which is misleading because AFAICT they should be mutually exclusive.

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3498
PS6, Line 3498: New mode
Avoid the use of "new": someday someone is going to read this code when it's 10 years old, and it will no longer be new. :)

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3501
PS6, Line 3501: evict already replaces failed replica
evict an already-replaced failed replica

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3502
PS6, Line 3502: as replacement
as a replacement

http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3503
PS6, Line 3503: // The leader replica will promote the newly added non-voters once
             : // they are in good enough shape for that.
Doesn't it seem like we are adding unnecessary detail about the overall design in this file? Perhaps we should tell readers to read the design doc for the details.
http://gerrit.cloudera.org:8080/#/c/8619/6/src/kudu/master/catalog_manager.cc@3512
PS6, Line 3512: ReadyToEvictReplica(config, replication_factor, &to_evict)) {
I think we want to evict first, if we can, since that means we are over-replicated.


--
To view, visit http://gerrit.cloudera.org:8080/8619
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f0469ac641bf7a03dbef01eaa3f1b58a5bf5d27
Gerrit-Change-Number: 8619
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Thu, 23 Nov 2017 03:14:28 +0000
Gerrit-HasComments: Yes
