Shixiong Zhu created SPARK-4951:
-----------------------------------
Summary: A busy executor may be killed when dynamicAllocation is enabled
Key: SPARK-4951
URL: https://issues.apache.org/jira/browse/SPARK-4951
Project: Spark
Issue Type: Bug
Components: Spark Core
Reporter: Shixiong Zhu
If a task runs for more than `spark.dynamicAllocation.executorIdleTimeout` seconds, the executor running that task will be killed even though it is still busy.
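For context, here is a minimal sketch (hypothetical names, not the actual ExecutorAllocationManager code) of the kind of logic that would produce this behavior: a removal timer is armed per executor, and nothing checks for still-running tasks when the timer fires.
{code}
// Hypothetical sketch of the suspected flaw, not the real Spark source.
class AllocationManagerSketch(idleTimeoutMs: Long) {
  // executorId -> wall-clock time at which the executor becomes removable
  private val removeTimes = scala.collection.mutable.Map[String, Long]()

  // Arms the removal timer when the executor is reported idle.
  def onExecutorIdle(executorId: String): Unit = {
    removeTimes(executorId) = System.currentTimeMillis() + idleTimeoutMs
  }

  // Called periodically. The flaw: an executor whose timer has expired is
  // selected for removal without verifying that no task is running on it,
  // so a single long task can outlive the timeout and get its executor killed.
  def executorsToKill(): Seq[String] = {
    val now = System.currentTimeMillis()
    removeTimes.collect { case (id, t) if t <= now => id }.toSeq
  }
}
{code}
A correct version would also track the set of executors with running tasks and skip them in `executorsToKill()`.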
The following steps (yarn-client mode) can reproduce this bug:
1. Start `spark-shell` using
{code}
./bin/spark-shell --conf "spark.shuffle.service.enabled=true" \
  --conf "spark.dynamicAllocation.minExecutors=1" \
  --conf "spark.dynamicAllocation.maxExecutors=4" \
  --conf "spark.dynamicAllocation.enabled=true" \
  --conf "spark.dynamicAllocation.executorIdleTimeout=30" \
  --master yarn-client \
  --driver-memory 512m \
  --executor-memory 512m \
  --executor-cores 1
{code}
2. Wait more than 30 seconds until only one executor remains (idle executors above `spark.dynamicAllocation.minExecutors` are removed).
3. Run the following code. Each of the 20 tasks processes 50 elements and sleeps 1 second per element, so a task needs at least 50 seconds to finish, well over the 30-second idle timeout:
{code}
val r = sc.parallelize(1 to 1000, 20).map { t => Thread.sleep(1000); t }.groupBy(_ % 2).collect()
{code}
4. Executors are killed and re-allocated continuously, which makes the job fail.
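A possible stop-gap for this repro (an assumption on my part, not a verified fix) is to set the idle timeout above the longest expected task duration:
{code}
# Stop-gap only: a 120s idle timeout exceeds the ~50s task duration above,
# so the busy executor is not flagged for removal mid-task in this repro.
--conf "spark.dynamicAllocation.executorIdleTimeout=120"
{code}
This only masks the problem; the allocation manager should never treat an executor with running tasks as idle in the first place.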