Github user shivaram commented on the pull request:

    https://github.com/apache/spark/pull/6652#issuecomment-109449637
  
    @sryza I'm not sure why the same 5 nodes should get chosen for all the
    reduce tasks? Are you concerned this will happen because we use a stable
    sort? If so, we could randomize among nodes with equal-size inputs. Also,
    the default locality wait is 3s, so for most non-trivial shuffles the
    scheduler should fall back to the default case anyway.
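
    Something like this would keep the ordering by input size but break ties
    randomly instead of by sort order (just a minimal sketch to illustrate the
    idea; `preferredHosts` and `bytesByHost` are made-up names, not from the
    patch):

    ```scala
    import scala.util.Random

    // Pick the top-k hosts by map-output size for one reduce partition.
    // Shuffling first means hosts with equal sizes are ordered randomly,
    // so the subsequent stable sort doesn't always favor the same hosts.
    def preferredHosts(bytesByHost: Map[String, Long], k: Int = 5): Seq[String] = {
      Random.shuffle(bytesByHost.toSeq)
        .sortBy(-_._2)
        .take(k)
        .map(_._1)
    }
    ```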
    
    FWIW I don't mind putting this behind a flag, especially if there are
    other workloads that people would want to test this with. I just want to
    avoid having too many flags and thresholds that make it complex to reason
    about what the code is doing.
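
    If we do go the flag route, I'd keep it to a single boolean conf rather
    than extra tunable thresholds, e.g. (illustrative only; the flag name is
    hypothetical and `conf` is assumed to be a `SparkConf`):

    ```scala
    // Gate the new locality preference behind one on/off switch so it can
    // be disabled for workloads it regresses. Flag name is a placeholder.
    val reduceLocalityEnabled =
      conf.getBoolean("spark.shuffle.reduceLocality.enabled", defaultValue = true)
    val locs = if (reduceLocalityEnabled) preferredHosts(bytesByHost) else Nil
    ```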

