Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/13868 )

Change subject: IMPALA-8339: Add local executor blacklist to coordinators
......................................................................


Patch Set 2:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator-backend-state.h@284
PS1, Line 284: ExecQu
> ExecQueryFInstance()
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/runtime/coordinator.cc@394
PS1, Line 394:       // The Exec() rpc failed, so blacklist the executor.
             :       LOG(INFO) << "Blacklisting "
             :                 << 
TNetworkAddressToString(backend_state->impalad_address()
> One interesting case is what if the coordinator cannot RPC to itself. I sup
Good point. Probably makes sense to have a special case for not blacklisting 
the local backend.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@96
PS1, Line 96: Only
> typo
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@168
PS1, Line 168:   void UpdateMembership(const 
StatestoreSubscriber::TopicDeltaMap& incoming_topic_deltas,
> nit: Adds, Updates.
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@170
PS1, Line 170:
> Should this method accept an IP address instead?

Discussed in another comment.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@206
PS1, Line 206: cutorGroups& executor_gr
> Is there any chance this mutex may be held at the same time as various othe
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.h@206
PS1, Line 206:       const ExecutorGroups& executor_groups, const 
ExecutorBlacklist& executor_blacklist);
> Does this also need the mutable modifier like the other two locks ?

No, 'mutable' is used to allow a field to be modified in a function that's 
marked 'const'. update_membership_lock_ isn't taken in any const functions.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@119
PS1, Line 119:
             :   // Check if the local backend is up and needs updating.
> I suppose this relies on StateStore being alive and keeps sending periodic
That's a good point. This approach (maintaining the blacklist on the statestore 
update thread) is what we had put in the design doc, but since I've gone with a 
design that requires the lock 'update_membership_lock_' anyways, it might be 
nice to just put this in its own thread.

That would also make it easier to change the frequency at which we check the 
blacklist for updates. We receive statestore updates every 100ms by default 
(most of which are of course no-ops), which might be more frequent than we 
really need for this check (by default, the shortest time a node is 
blacklisted for is 12s anyways, so it's probably not a big deal if we only do 
this update once per second or so).

It would also remove the need to reason about how to handle nodes that are 
deleted/updated/added in the same update in which we decide to unblacklist 
them, which you had questions about below.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@168
PS1, Line 168:     if (recovering_
> How would this work if the deleted item is in the list unblacklisted ?
I think this and your other similar concerns below are addressed by taking 
Lars's suggestion of only constructing 'unblacklisted' after the statestore 
update has been fully processed, since we call 'Remove' on the blacklist here.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@175
PS1, Line 175: rship_);
> Otherwise, won't it eventually get into the
 > probation state and get added back with the stale BE descriptor ?

NodeBlacklist::Remove() completely removes executors from the blacklist, it 
doesn't put them on probation.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@186
PS1, Line 186:   for (const TTopicItem& item : update.topic_entries) {
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@265
PS1, Line 265:     new_backend_map->insert(make_pair(item.key, be_desc))
> How would this work if the cluster membership in the past indicates that th
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@271
PS1, Line 271:       }
> I think the overall control flow could be simpler if we apply all SS update
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/cluster-membership-mgr.cc@330
PS1, Line 330: ClusterMembershipMgr::Bl
> Maybe spell out explicitly why it doesn't apply?
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13868/2/be/src/scheduling/cluster-membership-mgr.cc@468
PS2, Line 468:       const ExecutorGroups& executor_groups, const 
ExecutorBlacklist& executor_blacklist) {
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h
File be/src/scheduling/node-blacklist.h:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@30
PS1, Line 30:
> Can you please describe the thread safety of this class ? Is there any assu
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@38
PS1, Line 38:
> typo
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@48
PS1, Line 48:
> On the other hand the CMM is already pretty large
Yeah, I tend to agree with Lars that CMM would be a really big file, then. 
Don't feel strongly, though.


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@57
PS1, Line 57:
> Maybe FindAndRemove would make the usage more clear when calling this?
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@78
PS1, Line 78:
            :
            :
            :
> IMHO, this may be clearer if we use a enum instead to represent the state i
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.h@92
PS1, Line 92:
> Have you considered blacklisting whole hosts?
Of course, in a typical deployment it's functionally equivalent since there's 
a 1-to-1 mapping, but it would be a little more efficient.

You could imagine users with a setup that runs multiple impalads on each node 
who would want blacklisting to be on a per-impalad basis, so that if one 
impalad on a node crashes the others aren't also blacklisted. That's probably 
not an important or supported use case, though.

One big reason to do it this way is it makes testing of blacklisting on the 
minicluster much easier.

I renamed it ExecutorBlacklist to make it clearer (though I'm willing to 
explore ways to do testing with per-node blacklisting, e.g. with the 
dockerized minicluster, if people think it's better).

One thing we might consider is adding a concept of a backend_id, e.g. a 
TUniqueId that gets generated on impalad startup. That would give us a couple 
of things:
- something nice to key off of here and elsewhere, e.g. in ExecutorGroup
- a way to distinguish the case where an impalad goes down and a new one is 
started at the same host:port


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc
File be/src/scheduling/node-blacklist.cc:

http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@25
PS1, Line 25:
> What's the reasoning behind having this timeout padding ? May be worth docu
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@42
PS1, Line 42:
            :
> Is it possible for multiple queries to attempt to blacklist the same backen
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@83
PS1, Line 83:
> May be helpful to define this as a constant with meaningful name.
Done


http://gerrit.cloudera.org:8080/#/c/13868/1/be/src/scheduling/node-blacklist.cc@112
PS1, Line 112:
> std::move() or return values through pointer
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py
File tests/custom_cluster/test_blacklist.py:

http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@25
PS2, Line 25: class TestBlacklist(CustomClusterTestSuite):
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@36
PS2, Line 36:
> flake8: E203 whitespace before ','
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@36
PS2, Line 36: =
> flake8: E711 comparison to None should be 'if cond is None:'
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@52
PS2, Line 52:
> flake8: E203 whitespace before ','
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@61
PS2, Line 61:
> flake8: E203 whitespace before ','
Done


http://gerrit.cloudera.org:8080/#/c/13868/2/tests/custom_cluster/test_blacklist.py@61
PS2, Line 61: =
> flake8: E711 comparison to None should be 'if cond is None:'
Done



--
To view, visit http://gerrit.cloudera.org:8080/13868
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacb6e73b84042c33cd475b82470a975d04ee9b74
Gerrit-Change-Number: 13868
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Tue, 23 Jul 2019 05:20:10 +0000
Gerrit-HasComments: Yes
