[
https://issues.apache.org/jira/browse/SPARK-9097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629673#comment-14629673
]
KaiXinXIaoLei edited comment on SPARK-9097 at 7/16/15 12:57 PM:
----------------------------------------------------------------
I ran a big job. While tasks were running, five tasks failed and their
executors were killed, even though many tasks still remained to run. The log shows:
2015-07-08 15:03:30,583 | WARN |
[sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1568.0 in stage
167.0 (TID 25557, linux-174): ExecutorLostFailure (executor 52 lost)
2015-07-08 15:03:30,584 | WARN |
[sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1549.0 in stage
167.0 (TID 25538, linux-174): ExecutorLostFailure (executor 52 lost)
2015-07-08 15:03:30,584 | WARN |
[sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1552.0 in stage
167.0 (TID 25541, linux-174): ExecutorLostFailure (executor 52 lost)
2015-07-08 15:03:30,584 | WARN |
[sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1569.0 in stage
167.0 (TID 25558, linux-174): ExecutorLostFailure (executor 52 lost)
2015-07-08 15:03:30,584 | WARN |
[sparkDriver-akka.actor.default-dispatcher-43] | Lost task 1548.0 in stage
167.0 (TID 25537, linux-174): ExecutorLostFailure (executor 52 lost)
2015-07-08 15:03:30,584 | INFO | [dag-scheduler-event-loop] | Executor lost:
52 (epoch 29)
2015-07-08 15:03:30,584 | INFO | [kill-executor-thread] | Requesting to kill
executor(s) 52
2015-07-08 15:03:30,585 | INFO |
[sparkDriver-akka.actor.default-dispatcher-30] | Trying to remove executor 52
from BlockManagerMaster.
2015-07-08 15:03:30,585 | INFO |
[sparkDriver-akka.actor.default-dispatcher-30] | Removing block manager
BlockManagerId(52, 9.91.8.174, 23424)
2015-07-08 15:03:30,585 | INFO | [dag-scheduler-event-loop] | Removed 52
successfully in removeExecutor
2015-07-08 15:03:30,585 | INFO | [dag-scheduler-event-loop] | Host added was
in lost list earlier: hostname
After that, I cannot find any executors being added in the log, and the failed
tasks are never re-submitted. Thanks.
> Tasks are not completed but the number of executors is zero
> -----------------------------------------------------------
>
> Key: SPARK-9097
> URL: https://issues.apache.org/jira/browse/SPARK-9097
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.4.0
> Reporter: KaiXinXIaoLei
> Attachments: number of executor is zero.png, tasks are not
> completed.png
>
>
> I set "spark.dynamicAllocation.enabled" to true and submitted a job. The
> tasks are not completed, but the number of executors is zero.
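For context when reproducing this: dynamic allocation in Spark 1.4 also requires the external shuffle service to be enabled, and the executor bounds control how far the executor count can drop. A minimal spark-defaults.conf sketch (the values below are illustrative, not taken from the reporter's setup):

```
# Required pair for dynamic allocation in Spark 1.x
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true

# Illustrative bounds; minExecutors defaults to 0, which is
# why the executor count can reach zero at all
spark.dynamicAllocation.minExecutors         2
spark.dynamicAllocation.maxExecutors         50

# Idle executors are released after this timeout
spark.dynamicAllocation.executorIdleTimeout  60s
```

With the default minExecutors of 0, every executor may legitimately be released once idle; the bug reported here is that this happens while tasks are still pending, and no new executors are requested afterwards.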
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]