This is because YARN's AM client (AMRMClient) does not remove a fulfilled container request from its internal request table until the application's AM explicitly calls removeContainerRequest for it.
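To make the mechanics concrete, here is a minimal sketch against the Hadoop 2.x AMRMClient API (this is illustrative, not Spark's actual allocator code) showing how a request keeps getting re-sent until it is removed:

import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

object AskSketch {
  def main(args: Array[String]): Unit = {
    // init()/start()/registerApplicationMaster() omitted for brevity.
    val amClient = AMRMClient.createAMRMClient[ContainerRequest]()

    val capability = Resource.newInstance(4096, 2) // 4 GB, 2 vcores
    val request =
      new ContainerRequest(capability, null, null, Priority.newInstance(1))

    amClient.addContainerRequest(request) // ask[] now contains 1 container

    val response = amClient.allocate(0.1f)
    for (container <- response.getAllocatedContainers.asScala) {
      // Without this call, the fulfilled request stays in the client's
      // table and is re-sent in the ask[] of every later allocate() heartbeat.
      amClient.removeContainerRequest(request)
    }
  }
}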
Spark 1.2: Spark's YARN AM does not call removeContainerRequest for fulfilled container requests.

Spark 1.3: the YARN AM calls removeContainerRequest for the container requests it can match to allocated containers. I tried the same test case of killing one executor with Spark 1.3, and the ask[] in this case was for 1 container.

As long as the cluster is large enough to satisfy the bloated container requests, the containers are returned to Spark's YARN allocator in the allocate response; the allocator launches new executors for the number of containers it is actually missing and releases the extra allocated containers. The problem magnifies for long-running jobs with large executor memory requirements. In that case, whenever an executor gets killed, the next ask to the YARN ResourceManager (RM) is for n+1 containers (n being the count of already-fulfilled requests that were never removed). The RM may serve this ask if it still has enough resources; otherwise it starts reserving cluster resources for containers that Spark does not need in the first place. This causes a resource crunch for other applications and inefficient utilization of cluster resources.

I was not able to find a thread discussing this specifically. Is there any known use case because of which removeContainerRequest is not called in Spark 1.2? For reference, a sketch of the 1.3-style fix follows.
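This is a hedged sketch of the matching-and-removal step, assuming location-agnostic requests (ResourceRequest.ANY); it is not the actual YarnAllocator code:

import org.apache.hadoop.yarn.api.records.{Container, ResourceRequest}
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// For each container the RM handed back, find one outstanding request with
// the same priority and capability and remove it, so the next allocate()
// heartbeat no longer re-asks for it.
def removeFulfilledRequests(
    amClient: AMRMClient[ContainerRequest],
    allocated: Seq[Container]): Unit = {
  for (container <- allocated) {
    val matching = amClient.getMatchingRequests(
      container.getPriority, ResourceRequest.ANY, container.getResource)
    if (!matching.isEmpty && !matching.get(0).isEmpty) {
      amClient.removeContainerRequest(matching.get(0).iterator().next())
    }
  }
}

As I understand it, the real 1.3 implementation also tries host- and rack-level matching before falling back to ANY, but the effect on the ask[] is the same.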