Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/13868 )
Change subject: IMPALA-8339: Add local executor blacklist to coordinators
......................................................................


Patch Set 2:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator-backend-state.h@284
PS1, Line 284: ExecQu
> ExecQueryFInstance()
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator.cc@394
PS1, Line 394:   // The Exec() rpc failed, so blacklist the executor.
             :   LOG(INFO) << "Blacklisting "
             :             << TNetworkAddressToString(backend_state->impalad_address()
> One interesting case is what if the coordinator cannot RPC to itself. I sup
Good point. It probably makes sense to have a special case for not blacklisting the local backend.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@96
PS1, Line 96: Only
> typo
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@168
PS1, Line 168:   void UpdateMembership(const StatestoreSubscriber::TopicDeltaMap& incoming_topic_deltas,
> nit: Adds, Updates.
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@170
PS1, Line 170:
> Should this method accept an IP address instead?
Discussed in another comment.
http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@206
PS1, Line 206: cutorGroups& executor_gr
> Is there any chance this mutex may be held at the same time as various othe
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@206
PS1, Line 206:       const ExecutorGroups& executor_groups, const ExecutorBlacklist& executor_blacklist);
> Does this also need the mutable modifier like the other two locks ?
No, 'mutable' is used to allow a field to be modified in a function that's marked 'const'. 'update_membership_lock_' isn't taken in any const functions.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@119
PS1, Line 119:
             :   // Check if the local backend is up and needs updating.
> I suppose this relies on StateStore being alive and keeps sending periodic
That's a good point. This approach (maintaining the blacklist on the statestore update thread) is what we had put in the design doc, but since I've gone with a design that requires the lock 'update_membership_lock_' anyway, it might be nice to just put this in its own thread.

That would also make it easier to change the frequency at which we check the blacklist for updates. We receive statestore updates every 100ms by default (most of which, of course, are no-ops), which might be more frequent than we really need for this check. By default, the shortest time a node is blacklisted for is 12s anyway, so it's probably not a big deal if we only do this update once per second or so.

It would also remove having to reason about how to handle nodes that were deleted/updated/added in the same update as we decide to unblacklist them, which you had questions about below.
http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@168
PS1, Line 168:   if (recovering_
> How would this work if the deleted item is in the list unblacklisted ?
I think this and your other similar concerns below are addressed by taking Lars's suggestion of only constructing 'unblacklisted' after the statestore update has been fully processed, since we call 'Remove' on the blacklist here.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@175
PS1, Line 175: rship_);
> Otherwise, won't it eventually get into the
> probation state and get added back with the stale BE descriptor ?
NodeBlacklist::Remove() completely removes executors from the blacklist; it doesn't put them on probation.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@186
PS1, Line 186:   for (const TTopicItem& item : update.topic_entries) {
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@265
PS1, Line 265:     new_backend_map->insert(make_pair(item.key, be_desc))
> How would this work if the cluster membership in the past indicates that th
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@271
PS1, Line 271:   }
> I think the overall control flow could be simpler if we apply all SS update
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@330
PS1, Line 330: ClusterMembershipMgr::Bl
> Maybe spell out explicitly why it doesn't apply?
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13868/2/be/src/scheduling/cluster-membership-mgr.cc@468
PS2, Line 468:     const ExecutorGroups& executor_groups, const ExecutorBlacklist& executor_blacklist) {
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h
File be/src/scheduling/node-blacklist.h:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@30
PS1, Line 30:
> Can you please describe the thread safety of this class ? Is there any assu
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@38
PS1, Line 38:
> typo
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@48
PS1, Line 48:
> On the other hand the CMM is already pretty large
Yeah, I tend to agree with Lars that CMM would become a really big file then. I don't feel strongly about it, though.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@57
PS1, Line 57:
> Maybe FindAndRemove would make the usage more clear when calling this?
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@78
PS1, Line 78:
             :
             :
> IMHO, this may be clearer if we use a enum instead to represent the state i
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@92
PS1, Line 92:
> Have you considered blacklisting whole hosts?
Of course, in a typical deployment it's functionally equivalent, since there's a 1-to-1 mapping, but it would be a little more efficient. You could imagine users who want a setup with multiple impalads on each node and who would want blacklisting to be on a per-impalad basis, so that if one impalad on a node crashes, the others aren't also blacklisted. That's probably not an important or supported use case, though.
One big reason to do it this way is that it makes testing of blacklisting on the minicluster much easier. I renamed it ExecutorBlacklist to make it clearer (though I'm willing to explore ways to do testing with per-node blacklisting, e.g. with the dockerized minicluster, if people think it's better).

One thing we might consider is adding a concept of a backend_id, e.g. a TUniqueId, that gets generated on impalad startup. That gives us a couple of things:
- something nice to key off of here and elsewhere, e.g. ExecutorGroup
- a way to distinguish the case when an impalad goes down and a new one is started at the same host:port


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc
File be/src/scheduling/node-blacklist.cc:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@25
PS1, Line 25:
> What's the reasoning behind having this timeout padding ? May be worth docu
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@42
PS1, Line 42:
             :
> Is it possible for multiple queries to attempt to blacklist the same backen
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@83
PS1, Line 83:
> May be helpful to define this as a constant with meaningful name.
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@112
PS1, Line 112:
> std::move() or return values through pointer
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py
File tests/custom_cluster/test_blacklist.py:

http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@25
PS2, Line 25: class TestBlacklist(CustomClusterTestSuite):
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@36
PS2, Line 36:
> flake8: E203 whitespace before ','
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@36
PS2, Line 36: =
> flake8: E711 comparison to None should be 'if cond is None:'
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@52
PS2, Line 52:
> flake8: E203 whitespace before ','
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@61
PS2, Line 61:
> flake8: E203 whitespace before ','
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@61
PS2, Line 61: =
> flake8: E711 comparison to None should be 'if cond is None:'
Done


--
To view, visit http://gerrit.cloudera.org:8080/13868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb6e73b84042c33cd475b82470a975d04ee9b74
Gerrit-Change-Number: 13868
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Tue, 23 Jul 2019 05:20:10 +0000
Gerrit-HasComments: Yes
