prakhar jauhari created SPARK-11181:
---------------------------------------

Summary: Spark on YARN: Spark reduces the total executor count even when Dynamic Allocation is disabled.
Key: SPARK-11181
URL: https://issues.apache.org/jira/browse/SPARK-11181
Project: Spark
Issue Type: Bug
Components: Scheduler, Spark Core, YARN
Affects Versions: 1.3.1
Environment: Spark-1.3.1 on hadoop-yarn-2.4.0 cluster. All servers in cluster running Linux version 2.6.32. Job in yarn-client mode.
Reporter: prakhar jauhari
Fix For: 1.3.2, 1.5.2

The Spark driver reduces the total executor count even when Dynamic Allocation is not enabled.

To reproduce this:
1. Use a 2-node YARN setup: each DN has ~20GB of memory and 4 cores.
2. After the application launches and gets its required executors, one of the DNs loses connectivity and is timed out.
3. Spark issues a killExecutor for the executor on the DN that was timed out.
4. Even with dynamic allocation off, Spark's scheduler reduces "targetNumExecutors".
5. Thus the job runs with a reduced executor count.

Note: The severity of the issue increases when DNs fail intermittently. If some of the DNs that were running my job's executors lose connectivity intermittently, the Spark scheduler reduces "targetNumExecutors" each time and so does not ask for new executors on any other nodes, causing the job to hang.
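
The bookkeeping described in steps 3 to 5 can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Spark's code: `ToyAllocator`, `pending_requests`, `simulate`, and the `lower_target` flag are hypothetical names made up for illustration, and only the idea of a target executor count comes from the report. The sketch shows that if the target is lowered when a timed-out executor is killed, the allocator sees nothing left to request, whereas leaving the target alone produces one replacement request.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAllocator:
    """Toy stand-in for the driver's executor bookkeeping (not Spark code)."""

    target_num_executors: int
    running: set = field(default_factory=set)

    def pending_requests(self) -> int:
        # Executors the allocator would still ask the cluster manager for.
        return max(self.target_num_executors - len(self.running), 0)

    def kill_executor(self, executor_id: str, lower_target: bool) -> None:
        # Drop the executor; optionally lower the target as well
        # (hypothetical flag modelling the behaviour in the report).
        self.running.discard(executor_id)
        if lower_target:
            self.target_num_executors = max(self.target_num_executors - 1, 0)


def simulate(lower_target: bool) -> int:
    # Two executors, one per node; the node hosting "exec-2" times out.
    alloc = ToyAllocator(target_num_executors=2, running={"exec-1", "exec-2"})
    alloc.kill_executor("exec-2", lower_target=lower_target)
    return alloc.pending_requests()


if __name__ == "__main__":
    print("target lowered on kill, replacements requested:", simulate(True))
    print("target kept on kill, replacements requested:", simulate(False))
```

With the target lowered, the model requests no replacement, which matches the reduced executor count in step 5. With repeated intermittent failures the target would keep shrinking in the same way, which is consistent with the hang described in the note.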