[ https://issues.apache.org/jira/browse/SPARK-9209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
KaiXinXIaoLei updated SPARK-9209:
---------------------------------
Description:
I set "spark.dynamicAllocation.enabled = true" and ran a big job. The driver
asked for an executor to be removed, the removal succeeded, and the process of
that executor no longer exists. However, the executor is still listed on the
ExecutorsPage of the web UI.
The driver log:
2015-07-17 11:48:14,543 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Removing block manager BlockManagerId(264, 172.1.1.8, 23811)
2015-07-17 11:48:14,543 | INFO | [dag-scheduler-event-loop] | Removed 264 successfully in removeExecutor
2015-07-17 11:48:21,226 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Registering block manager 172.1.1.8:23811 with 10.4 GB RAM, BlockManagerId(264, 172.1.1.8, 23811)
2015-07-17 11:48:21,228 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Added broadcast_781_piece0 in memory on 172.1.1.8:23811 (size: 38.6 KB, free: 10.4 GB)
2015-07-17 11:48:35,277 | ERROR | [sparkDriver-akka.actor.default-dispatcher-16] | Lost executor 264 on datasight-195: remote Rpc client disassociated
2015-07-17 11:48:35,277 | WARN | [sparkDriver-akka.actor.default-dispatcher-4] | Association with remote system [akka.tcp://sparkExecutor@datasight-195:23929] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-07-17 11:48:35,277 | INFO | [sparkDriver-akka.actor.default-dispatcher-16] | Re-queueing tasks for 264 from TaskSet 415.0
2015-07-17 11:48:35,804 | INFO | [SparkListenerBus] | Existing executor 264 has been removed (new total is 10)
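
As a reading aid for the driver log above, the lifecycle events of a single executor can be pulled out and listed in order. The following is a minimal sketch, assuming the pipe-delimited layout shown in this report (timestamp | level | thread | message). The function name `executor_events`, the event labels, and the regular expressions are illustrative assumptions and not part of Spark.

```python
import re

# Driver log entries from this report, with wrapped lines joined.
LOG = """\
2015-07-17 11:48:14,543 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Removing block manager BlockManagerId(264, 172.1.1.8, 23811)
2015-07-17 11:48:14,543 | INFO | [dag-scheduler-event-loop] | Removed 264 successfully in removeExecutor
2015-07-17 11:48:21,226 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Registering block manager 172.1.1.8:23811 with 10.4 GB RAM, BlockManagerId(264, 172.1.1.8, 23811)
2015-07-17 11:48:21,228 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Added broadcast_781_piece0 in memory on 172.1.1.8:23811 (size: 38.6 KB, free: 10.4 GB)
2015-07-17 11:48:35,277 | ERROR | [sparkDriver-akka.actor.default-dispatcher-16] | Lost executor 264 on datasight-195: remote Rpc client disassociated
2015-07-17 11:48:35,277 | WARN | [sparkDriver-akka.actor.default-dispatcher-4] | Association with remote system [akka.tcp://sparkExecutor@datasight-195:23929] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-07-17 11:48:35,277 | INFO | [sparkDriver-akka.actor.default-dispatcher-16] | Re-queueing tasks for 264 from TaskSet 415.0
2015-07-17 11:48:35,804 | INFO | [SparkListenerBus] | Existing executor 264 has been removed (new total is 10)
"""

# Hypothetical event labels, keyed by message patterns seen in this log only.
PATTERNS = [
    ("block_manager_removed",
     re.compile(r"Removing block manager BlockManagerId\((\d+),")),
    ("scheduler_removed",
     re.compile(r"Removed (\d+) successfully in removeExecutor")),
    ("block_manager_registered",
     re.compile(r"Registering block manager .*BlockManagerId\((\d+),")),
    ("executor_lost",
     re.compile(r"Lost executor (\d+) on")),
    ("executor_removed",
     re.compile(r"Existing executor (\d+) has been removed")),
]


def executor_events(log_text, executor_id):
    """Return (timestamp, label) pairs for one executor, in log order."""
    events = []
    for line in log_text.splitlines():
        # Split into timestamp, level, thread, message; skip other lines.
        fields = [field.strip() for field in line.split("|", 3)]
        if len(fields) != 4:
            continue
        timestamp, _level, _thread, message = fields
        for label, pattern in PATTERNS:
            match = pattern.search(message)
            if match and match.group(1) == executor_id:
                events.append((timestamp, label))
                break
    return events


for timestamp, label in executor_events(LOG, "264"):
    print(timestamp, label)
```

Listed this way, the block manager for executor 264 registers again at 11:48:21, about seven seconds after it was removed at 11:48:14. This ordering may be related to the stale entry in the web UI, but the log alone does not establish the cause.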
was:
I set "spark.dynamicAllocation.enabled = true”, and run a big job. In driver,
a executors are asked to remove, and it's remove successfully, and the process
of this executor is not exist. But it exists in ExecutorPage of the web ui.
The log in driver :
2015-07-17 11:48:14,543 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Removing block manager BlockManagerId(264, 172.1.1.8, 23811)
2015-07-17 11:48:14,543 | INFO | [dag-scheduler-event-loop] | Removed 264 successfully in removeExecutor
2015-07-17 11:48:21,226 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Registering block manager 172.1.1.8:23811 with 10.4 GB RAM, BlockManagerId(264, 172.1.1.8, 23811)
2015-07-17 11:48:21,228 | INFO | [sparkDriver-akka.actor.default-dispatcher-3] | Added broadcast_781_piece0 in memory on 172.1.1.8:23811 (size: 38.6 KB, free: 10.4 GB)
2015-07-17 11:48:35,277 | ERROR | [sparkDriver-akka.actor.default-dispatcher-16] | Lost executor 264 on datasight-195: remote Rpc client disassociated
2015-07-17 11:48:35,277 | WARN | [sparkDriver-akka.actor.default-dispatcher-4] | Association with remote system [akka.tcp://sparkExecutor@datasight-195:23929] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-07-17 11:48:35,277 | INFO | [sparkDriver-akka.actor.default-dispatcher-16] | Re-queueing tasks for 264 from TaskSet 415.0
2015-07-17 11:48:35,804 | INFO | [SparkListenerBus] | Existing executor 264 has been removed (new total is 10)
> Using executor allocation, an executor is removed but it still exists in
> ExecutorsPage of the web UI
> ----------------------------------------------------------------------------------------------
>
> Key: SPARK-9209
> URL: https://issues.apache.org/jira/browse/SPARK-9209
> Project: Spark
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 1.4.1
> Reporter: KaiXinXIaoLei
> Fix For: 1.5.0
>
> Attachments: A Executor exists in web.png, executor is removed.png