[ https://issues.apache.org/jira/browse/IMPALA-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031305#comment-17031305 ]
ASF subversion and git services commented on IMPALA-9342:
---------------------------------------------------------
Commit 9800f95c7efea0c613d02a4603c60696f3d3bf00 in impala's branch
refs/heads/master from Sahil Takiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=9800f95 ]
IMPALA-9342: Membership updates should only remove quiescing nodes from the
blacklist

Currently, the ClusterMembershipMgr will remove a node from the
blacklist whenever there is an "update" for a backend from the statestore.
Updates are typically restricted to updates about the quiescing status
of a node. The ClusterMembershipMgr should un-blacklist quiescing nodes
since quiescing nodes are not part of any executor groups and will
eventually be removed from the cluster membership. Thus, there is no
reason they need to remain on the blacklist.

However, other updates to a backend (i.e. updates that are not related
to the quiescing status of a node) should not cause that node to be
un-blacklisted. Doing so could cause a node to be un-blacklisted, but
not added back to any executor groups, creating a state where a node is
part of the cluster membership, but not part of any executor groups (or
the blacklist).

This patch fixes the aforementioned issue by only un-blacklisting an
updated node in ClusterMembershipMgr::UpdateMembership when the node
starts quiescing. Added some DCHECKs to ensure the consistency of the
blacklist and the list of executor groups.
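The rule described above can be illustrated with a small Python model. This is
only a sketch under simplifying assumptions, not the Impala code (which is
C++): the names Backend, Membership, is_consistent, blacklist_backend,
update_membership, and the unblacklist_on_any_update flag are all hypothetical,
there is a single executor group, and quiescing is treated as one-way. The flag
exists only to reproduce the old behaviour for comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    address: str
    is_quiescing: bool = False


@dataclass
class Membership:
    """Toy stand-in for the state a membership manager tracks (hypothetical)."""
    backends: dict = field(default_factory=dict)      # address -> Backend
    executor_group: set = field(default_factory=set)  # schedulable addresses
    blacklist: set = field(default_factory=set)       # temporarily excluded


def is_consistent(state):
    """Rough analogue of the consistency DCHECKs mentioned in the patch."""
    if state.executor_group & state.blacklist:
        return False
    for address, backend in state.backends.items():
        in_group = address in state.executor_group
        in_blacklist = address in state.blacklist
        if backend.is_quiescing:
            # Quiescing backends belong to neither set.
            if in_group or in_blacklist:
                return False
        elif in_group == in_blacklist:
            # An active backend must be in exactly one of the two sets.
            return False
    return True


def blacklist_backend(state, address):
    """Move an active backend from the executor group to the blacklist."""
    if address in state.executor_group:
        state.executor_group.remove(address)
        state.blacklist.add(address)


def update_membership(state, update, unblacklist_on_any_update=False):
    """Apply one statestore-style update for a single backend."""
    known = state.backends.get(update.address)
    state.backends[update.address] = update
    if known is None:
        # First sighting: schedulable unless it is already quiescing.
        if not update.is_quiescing:
            state.executor_group.add(update.address)
        return
    started_quiescing = update.is_quiescing and not known.is_quiescing
    if started_quiescing:
        state.executor_group.discard(update.address)
    # Patched rule: only a transition into quiescing clears the blacklist
    # entry. The flag reproduces the old "any update" behaviour.
    if started_quiescing or unblacklist_on_any_update:
        state.blacklist.discard(update.address)
```

In this model, a blacklisted backend that receives an unrelated update stays on
the blacklist and is_consistent keeps returning True. With
unblacklist_on_any_update=True the same update leaves the backend in neither
set, which is the inconsistent state the commit message describes.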
Testing:
* Ran core tests
* Ran test_executor_groups.py, test_restart_services.py,
and test_blacklist.py with --exploration_strategy=exhaustive
locally
Change-Id: Id062e51df86315ac214d30db882736dbb7948a77
Reviewed-on: http://gerrit.cloudera.org:8080/15137
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Membership updates should only remove quiescing nodes from the blacklist
> ------------------------------------------------------------------------
>
> Key: IMPALA-9342
> URL: https://issues.apache.org/jira/browse/IMPALA-9342
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
> Priority: Major
>
> {{ClusterMembershipMgr::UpdateMembership}} will remove a node from the
> blacklist (if it is on the blacklist) if the method receives an update from
> the Statestore about the node. Currently, the Statestore *should* only send
> an update about the node if the node starts quiescing. If a node starts
> quiescing, it should be removed from the blacklist since quiescing nodes
> aren't part of any executor groups anyway (no queries should be scheduled on
> them).
> After running some experiments locally, it seems there are some other cases
> where the Statestore sends the {{ClusterMembershipMgr}} an update about a
> node even if its quiescing state has not changed. Unfortunately, I haven't
> been able to fully track down what is triggering this; so far it only happens
> on cluster start up.
> The {{ClusterMembershipMgr}} should only un-blacklist a node if that node is
> quiescing; currently it un-blacklists a node on any update to the node.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)