Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19046

Unfortunately, neither the cause nor the issue on the YARN side is clear to me. I'm not sure what he means by "AM is releasing and then acquiring the reservations again and again until it has enough to run all tasks that it needs."

Also, do you know whether "Due to the sustained backlog trigger doubling the request size each time it floods the scheduler with requests" is referring to the exponential increase in container requests that Spark does? That is, would it be better if we made all the requests at once, up front? I would also like to know what issue this causes on the YARN side.

His last statement is also not always true: the MR AM only takes headroom into account with slowstart, and it preempts reduces to run maps. It may very well be that you are using slow start, and if you have it configured very aggressively (meaning reduces start very early) the job could be spread out quite a bit, but you are also possibly wasting resources in the tradeoff of maybe finishing sooner, depending on the job.
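For context on the "exponential increase in container requests" mentioned above: under dynamic allocation, Spark's sustained-backlog trigger doubles the number of executors it asks for on each successive round until the target is reached. Here is a minimal sketch of that ramp-up behavior; the function name and structure are illustrative, not Spark's actual API.

```python
def ramp_up_requests(target_executors: int) -> list[int]:
    """Sketch of Spark's sustained-backlog ramp-up: request sizes per round.

    The first trigger asks for one executor; each subsequent trigger
    doubles the batch size (1, 2, 4, 8, ...) until the target is met.
    """
    requests = []
    granted = 0
    batch = 1  # first sustained-backlog trigger asks for a single executor
    while granted < target_executors:
        ask = min(batch, target_executors - granted)  # never overshoot target
        requests.append(ask)
        granted += ask
        batch *= 2  # doubling is what produces the burst of requests
    return requests

# e.g. reaching 10 executors takes rounds of sizes [1, 2, 4, 3]
```

The alternative being asked about (requesting everything at once up front) would collapse all of these rounds into a single large request to the ResourceManager.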