Shixiong Zhu created SPARK-4951:
-----------------------------------

             Summary: A busy executor may be killed when dynamicAllocation is enabled
                 Key: SPARK-4951
                 URL: https://issues.apache.org/jira/browse/SPARK-4951
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Shixiong Zhu
If a task runs longer than `spark.dynamicAllocation.executorIdleTimeout`, the executor running that task may be killed. The following steps (yarn-client mode) reproduce the bug:

1. Start `spark-shell` with
{code}
./bin/spark-shell --conf "spark.shuffle.service.enabled=true" \
  --conf "spark.dynamicAllocation.minExecutors=1" \
  --conf "spark.dynamicAllocation.maxExecutors=4" \
  --conf "spark.dynamicAllocation.enabled=true" \
  --conf "spark.dynamicAllocation.executorIdleTimeout=30" \
  --master yarn-client \
  --driver-memory 512m \
  --executor-memory 512m \
  --executor-cores 1
{code}
2. Wait more than 30 seconds until only one executor remains.
3. Run the following code (each task needs at least 50 seconds to finish, since 1000 elements across 20 partitions means 50 one-second sleeps per task):
{code}
val r = sc.parallelize(1 to 1000, 20).map { t => Thread.sleep(1000); t }.groupBy(_ % 2).collect()
{code}
4. Executors are killed and re-allocated continuously, which makes the job fail.
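For reference, the sketch below models the suspected failure mode: an idle timer that is armed for an executor without checking whether a previously launched task is still running, so a task sleeping past the timeout gets its executor flagged for removal. This is illustrative only, not the actual ExecutorAllocationManager code; the names `onExecutorIdle`, `removeTimes`, and `runningTasks` are hypothetical.
{code}
import scala.collection.mutable

object IdleTimeoutSketch {
  // Stand-in for spark.dynamicAllocation.executorIdleTimeout (30s in the repro).
  val idleTimeoutMs = 30000L
  // executorId -> time at which the executor becomes eligible for removal.
  val removeTimes = mutable.Map[String, Long]()
  // executorId -> number of tasks currently running on it.
  val runningTasks = mutable.Map[String, Int]().withDefaultValue(0)

  // Suspected buggy behavior: the timer is armed as soon as no *new* task is
  // scheduled on the executor, without consulting runningTasks.
  def onExecutorIdle(executorId: String): Unit =
    removeTimes(executorId) = System.currentTimeMillis() + idleTimeoutMs

  // A fix would arm the timer only when the executor truly has no work:
  def onExecutorIdleFixed(executorId: String): Unit =
    if (runningTasks(executorId) == 0) {
      removeTimes(executorId) = System.currentTimeMillis() + idleTimeoutMs
    }

  def expiredExecutors(now: Long): Seq[String] =
    removeTimes.collect { case (id, t) if now >= t => id }.toSeq

  def main(args: Array[String]): Unit = {
    runningTasks("exec-1") = 1   // one task still running (sleeps > 30s total)
    onExecutorIdle("exec-1")     // buggy path arms the timer anyway
    val later = System.currentTimeMillis() + idleTimeoutMs + 1
    // contains "exec-1": a still-busy executor is flagged for removal
    println(expiredExecutors(later))
  }
}
{code}
With the fixed variant, `removeTimes` never gains an entry for an executor whose task count is non-zero, so a long-running task cannot trip the idle timeout.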