Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046

Unfortunately, neither the cause nor the issue on the YARN side is clear to me. I'm not sure what he means by "AM is releasing and then acquiring the reservations again and again until it has enough to run all tasks that it needs."

Also, do you know whether "Due to the sustained backlog trigger doubling the request size each time it floods the scheduler with requests" is referring to the exponential increase in container requests that Spark does? That is, would it be better if we made all the requests at once, up front? I would also like to know what issue this causes on the YARN side.

His last statement is also not always true: the MR AM only takes headroom into account with slowstart, and it preempts reduces to run maps. It may very well be that you are using slow start, and if you have it configured very aggressively (meaning reduces start very early) the job could be spread out quite a bit, but you are also possibly wasting resources in the tradeoff of maybe finishing sooner, depending on the job.
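For context on the "exponential increase in container requests" mentioned above: under dynamic allocation, Spark's sustained-backlog trigger doubles the number of executors it asks for on each successive round until the target is reached. Here is a minimal sketch of that ramp-up behavior; the function name and structure are illustrative, not Spark's actual API.

```python
def ramp_up_requests(target_executors: int) -> list[int]:
    """Sketch of Spark's sustained-backlog ramp-up: request sizes per round.

    The first trigger asks for one executor; each subsequent trigger
    doubles the batch size (1, 2, 4, 8, ...) until the target is met.
    """
    requests = []
    granted = 0
    batch = 1  # first sustained-backlog trigger asks for a single executor
    while granted < target_executors:
        ask = min(batch, target_executors - granted)  # never overshoot target
        requests.append(ask)
        granted += ask
        batch *= 2  # doubling is what produces the burst of requests
    return requests

# e.g. reaching 10 executors takes rounds of sizes [1, 2, 4, 3]
```

The alternative being asked about (requesting everything at once up front) would collapse all of these rounds into a single large request to the ResourceManager.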