Github user LucaCanali commented on the issue:

    https://github.com/apache/spark/pull/19039
  
    Thanks @jiangxb1987 for the review. I have tried to address the comments in a new commit, in particular adding the configuration to `internal/config` and adding a private function that processes the node list from `spark.blacklist.alwaysBlacklistedNodes`. As for setting `_nodeBlacklist`, I think it makes sense to use `_nodeBlacklist.set(nodeIdToBlacklistExpiryTime.keySet.toSet)` to keep it consistent with the rest of the code in BlacklistTracker. Also, `nodeIdToBlacklistExpiryTime` needs to be initialized with the blacklisted nodes.
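    To make the intended change concrete, here is a minimal, self-contained sketch of that logic (not the PR code itself): the helper name `parseBlacklistedNodes` and the use of `Long.MaxValue` as a "never expires" timestamp are assumptions made for illustration, while `nodeIdToBlacklistExpiryTime` and `_nodeBlacklist` are the names mentioned above.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.collection.mutable

// Minimal, self-contained sketch of the idea described above, not the actual
// BlacklistTracker code: the helper name and the use of Long.MaxValue as a
// "never expires" timestamp are illustrative assumptions.
object AlwaysBlacklistedNodesSketch {

  // Hypothetical parsing of the comma-separated value of
  // spark.blacklist.alwaysBlacklistedNodes into a set of node names.
  private def parseBlacklistedNodes(confValue: String): Set[String] =
    confValue.split(",").map(_.trim).filter(_.nonEmpty).toSet

  def main(args: Array[String]): Unit = {
    val alwaysBlacklisted = parseBlacklistedNodes("node1.example.com, node2.example.com")

    // nodeIdToBlacklistExpiryTime is initialized with the always-blacklisted
    // nodes; Long.MaxValue stands in for "never expires".
    val nodeIdToBlacklistExpiryTime: mutable.Map[String, Long] =
      mutable.Map(alwaysBlacklisted.map(_ -> Long.MaxValue).toSeq: _*)

    // _nodeBlacklist is kept consistent with the rest of the tracker by
    // deriving it from nodeIdToBlacklistExpiryTime, as discussed above.
    val _nodeBlacklist = new AtomicReference[Set[String]](Set.empty)
    _nodeBlacklist.set(nodeIdToBlacklistExpiryTime.keySet.toSet)

    println(_nodeBlacklist.get()) // prints the two always-blacklisted nodes
  }
}
```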
    
    As for the usefulness of the feature, I understand your comment and I have added some notes in SPARK-21829. The need for this feature comes from a production issue which I realize is not very common, but which I expect can happen again in my environment and possibly in others'.
    What we have is a shared YARN cluster with a workload that runs slowly on a couple of nodes; the nodes are fine for other types of jobs, so we want to keep them in the cluster. The actual problem comes from reading from an external file system, and apparently it affects only this specific workload (which is just one of many workloads that run on that cluster). So far, my workaround has been to kill the executors on the two "slow nodes", which lets the job finish faster by avoiding the painfully slow long tail of execution on the affected nodes. The proposed patch/feature is an attempt to address this case in a more structured way than logging on to the nodes and killing executors by hand.
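    For reference, a hypothetical usage sketch of how the affected workload could opt in, assuming the proposed setting is accepted and used together with the existing `spark.blacklist.enabled` switch (the node names are placeholders for the two "slow nodes"):

```scala
import org.apache.spark.SparkConf

// Hypothetical usage of the proposed setting, e.g. in spark-shell or in the
// application code; node names are placeholders.
val conf = new SparkConf()
  .setAppName("workload-affected-by-slow-nodes")
  .set("spark.blacklist.enabled", "true")
  .set("spark.blacklist.alwaysBlacklistedNodes", "node1.example.com,node2.example.com")
```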

