Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19046
  
    This has been a long-time request from our YARN team; they've had users run 
into issues with YARN and traced them back to Spark making a large number of 
container requests. You can argue that it's an issue in YARN, and you'd be 
correct, but that doesn't mean Spark can't try to help.
    
    The MR AM does [something 
similar](https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/rm/RMContainerAllocator.java).
    
    >  looking at adding a config to limit # of tasks in parallel.
    
    That could help because it would indirectly limit the number of containers 
Spark requests, but it hardly feels like a substitute for this change, since I 
expect very few people to use that config.
    
    > I definitely want a config for it and prefer off by default
    
    Why? The only downside you mention is that if the cluster is completely 
full, another app making requests may get a container from YARN sooner than 
yours. That sounds like an edge case where you really should be isolating 
those apps into separate queues if such resource contention is a problem.
    
    One thing that would probably be OK is to use 
`spark.dynamicAllocation.maxExecutors` as the upper limit if it's explicitly 
set, instead of this. But the default for that config is `Int.MaxValue`, so by 
default I think it makes sense to have a way to automatically limit these 
requests without users having to guess what values they have to set.
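    For illustration, here's a minimal sketch of the kind of cap being 
discussed: honor `spark.dynamicAllocation.maxExecutors` as the hard upper 
limit when it's set, and additionally throttle how many new requests go out 
per allocation round. The names (`containersToRequest`, `requestBatch`) and 
the batch size are hypothetical, not Spark's actual API:

```java
// Hypothetical sketch (not the PR's actual code): cap the number of
// container requests sent to the YARN RM per allocation round.
public class AllocationCap {

    // maxExecutors mirrors spark.dynamicAllocation.maxExecutors
    // (default Int.MaxValue); requestBatch is an assumed per-round
    // throttle like the one this PR proposes.
    public static int containersToRequest(int pending, int running,
                                          int maxExecutors, int requestBatch) {
        // Never exceed the explicit executor cap...
        int room = Math.max(0, maxExecutors - running);
        // ...then throttle the remainder so the RM only sees at most
        // requestBatch new asks this round.
        return Math.min(Math.min(pending, room), requestBatch);
    }

    public static void main(String[] args) {
        // With maxExecutors unset (Int.MaxValue), only the batch cap applies.
        System.out.println(containersToRequest(50000, 0, Integer.MAX_VALUE, 500));
        // With an explicit maxExecutors, the remaining room becomes the limit.
        System.out.println(containersToRequest(50000, 450, 500, 500));
    }
}
```

    The point of the two-level cap is that the batch throttle protects the RM 
even when users never touch `maxExecutors`, which is the common case given its 
`Int.MaxValue` default.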
    
    > This change isn't going to help anything if your cluster is actually 
large enough to get say 50000 containers. 
    
    If all of those containers are actually allocated to other apps, or your 
application's queue limits the number of containers your app can get to a 
much smaller number, then yes, it can help.
    
    > I'm also not sure why we just don't ask for all the containers up front.
    
    Wouldn't that mean that YARN would actually allocate resources you don't 
need? That sounds to me like a middle ground between dynamic allocation and 
fixed allocation, where you allocate the containers but don't actually start 
executors, and I'm not sure what that would help with.
