GitHub user LucaCanali commented on the issue:
https://github.com/apache/spark/pull/19039
Thanks @jiangxb1987 for the review. I have tried to address the comments in
a new commit, in particular adding the configuration entry to internal/config
and adding a private function to process the node list from
`spark.blacklist.alwaysBlacklistedNodes`. As for setting `_nodeBlacklist`, I
think it makes sense to use
`_nodeBlacklist.set(nodeIdToBlacklistExpiryTime.keySet.toSet)`, to keep it
consistent with the rest of the code in BlacklistTracker. Also,
`nodeIdToBlacklistExpiryTime` needs to be initialized with the
always-blacklisted nodes, roughly as sketched below.
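A simplified sketch of the shape of the change (the entry name
`ALWAYS_BLACKLISTED_NODES` and the helper name `addAlwaysBlacklistedNodes` are
illustrative here and may differ from the actual commit):

```scala
// In org.apache.spark.internal.config: a sequence-valued entry,
// defaulting to an empty list of nodes.
private[spark] val ALWAYS_BLACKLISTED_NODES =
  ConfigBuilder("spark.blacklist.alwaysBlacklistedNodes")
    .stringConf
    .toSequence
    .createWithDefault(Nil)

// In BlacklistTracker: seed the tracker state at construction time.
private def addAlwaysBlacklistedNodes(): Unit = {
  conf.get(config.ALWAYS_BLACKLISTED_NODES).foreach { node =>
    // Long.MaxValue as the expiry time means the entry never times out
    // (an assumption for this sketch; the commit may handle expiry differently).
    nodeIdToBlacklistExpiryTime.put(node, Long.MaxValue)
  }
  _nodeBlacklist.set(nodeIdToBlacklistExpiryTime.keySet.toSet)
}
```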
As for the usefulness of the feature, I understand your concern and I have
added more details in SPARK-21829. The need for this feature comes from a
production issue which, I realize, is not very common, but can happen again in
my environment and possibly in others'.
What we have is a shared YARN cluster with one workload that runs slowly on
a couple of nodes; the nodes are fine for other types of jobs, so we want to
keep them in the cluster. The actual problem comes from reading from an
external file system, and apparently it affects only this specific workload
(one of many workloads that run on that cluster). The workaround I have used
so far is simply killing the executors on the two "slow nodes": the job then
finishes faster because it avoids the painfully slow long tail of execution on
the affected nodes. The proposed patch/feature is an attempt to address this
case in a more structured way than logging on to the nodes and killing
executors by hand.
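With this patch, the same workaround reduces to a configuration setting along
these lines (hostnames are placeholders, and I am assuming here that the
general blacklist mechanism also has to be enabled):

```scala
import org.apache.spark.SparkConf

// Hypothetical usage: exclude the two slow nodes for this workload only,
// without affecting any other job on the shared cluster.
val conf = new SparkConf()
  .set("spark.blacklist.enabled", "true")
  .set("spark.blacklist.alwaysBlacklistedNodes",
    "node1.example.com,node2.example.com")
```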