[ 
https://issues.apache.org/jira/browse/SPARK-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liujianhui updated SPARK-18976:
-------------------------------
    Description: 
h2. Scenario
When an executor is expired by HeartbeatReceiver in the driver, the driver marks 
that executor as not live and the task scheduler stops assigning tasks to it, 
but the executor's status stays running and it keeps holding its cores. In the 
screenshots, executor 18 was expired and has no running tasks; its task time is 
far less than that of the normal executor 142, yet the app page still shows it as running.
!screenshot-1.png!
!screenshot-2.png!
!screenshot-3.png!
h2. Process
# The executor is expired by HeartbeatReceiver because its last heartbeat exceeded the 
executor timeout.
# The executor is then removed via CoarseGrainedSchedulerBackend.killExecutors, so 
it is marked as dead and is no longer offered for scheduling, 
because it is in executorsPendingToRemove.
# The executor's status stays running because the CoarseGrainedExecutorBackend 
process still exists and re-registers its block manager with the driver every 10s, 
as the log shows:
{log}16/12/22 17:04:26 INFO Executor: Told to re-register on heartbeat
16/12/22 17:04:26 INFO BlockManager: BlockManager re-registering with master
16/12/22 17:04:26 INFO BlockManagerMaster: Trying to register BlockManager
16/12/22 17:04:26 INFO BlockManagerMaster: Registered BlockManager
16/12/22 17:04:26 INFO BlockManager: Reporting 0 blocks to the master.
16/12/22 17:04:36 INFO Executor: Told to re-register on heartbeat
16/12/22 17:04:36 INFO BlockManager: BlockManager re-registering with master
16/12/22 17:04:36 INFO BlockManagerMaster: Trying to register BlockManager
16/12/22 17:04:36 INFO BlockManagerMaster: Registered BlockManager
16/12/22 17:04:36 INFO BlockManager: Reporting 0 blocks to the master.
16/12/22 17:04:46 INFO Executor: Told to re-register on heartbeat
16/12/22 17:04:46 INFO BlockManager: BlockManager re-registering with master
16/12/22 17:04:46 INFO BlockManagerMaster: Trying to register BlockManager
16/12/22 17:04:46 INFO BlockManagerMaster: Registered BlockManager
16/12/22 17:04:46 INFO BlockManager: Reporting 0 blocks to the master.
16/12/22 17:04:56 INFO Executor: Told to re-register on heartbeat
16/12/22 17:04:56 INFO BlockManager: BlockManager re-registering with master
16/12/22 17:04:56 INFO BlockManagerMaster: Trying to register BlockManager
16/12/22 17:04:56 INFO BlockManagerMaster: Registered BlockManager
16/12/22 17:04:56 INFO BlockManager: Reporting 0 blocks to the master. {log}
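
The three steps above can be sketched as a toy model. This is not Spark code: the class ToyDriver, its methods, the 120s timeout, and the moment at which executor 18 resumes heartbeating are all assumptions made for illustration. Only the 10s heartbeat spacing, the executor ids 18 and 142, and the idea of a pending-to-remove set come from the report. The sketch shows how an expired executor can be excluded from scheduling while its process keeps running and is told to re-register on every heartbeat.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDriver:
    """Hypothetical stand-in for the driver-side bookkeeping in the report."""
    timeout_s: int                                   # assumed executor timeout
    last_seen: dict = field(default_factory=dict)    # executor id -> last heartbeat time
    pending_to_remove: set = field(default_factory=set)  # loosely mirrors executorsPendingToRemove

    def register(self, executor_id, now):
        self.last_seen[executor_id] = now

    def heartbeat(self, executor_id, now):
        """Return True when the executor is told to re-register its block manager."""
        if executor_id in self.last_seen:
            self.last_seen[executor_id] = now
            return False
        # Unknown (already expired) executor: it is only asked to re-register.
        return True

    def expire(self, now):
        """Step 1 and 2: expire silent executors and exclude them from scheduling."""
        for executor_id, seen in list(self.last_seen.items()):
            if now - seen > self.timeout_s:
                del self.last_seen[executor_id]
                self.pending_to_remove.add(executor_id)

    def schedulable(self, executor_ids):
        return [e for e in executor_ids if e not in self.pending_to_remove]


def simulate():
    driver = ToyDriver(timeout_s=120)
    # Executor processes; nothing in this toy model ever stops them (step 3).
    running = {"18", "142"}
    for e in sorted(running):
        driver.register(e, now=0)

    reregistrations = 0
    for now in range(10, 200, 10):          # one tick per 10s heartbeat interval
        driver.heartbeat("142", now)        # healthy executor
        if now >= 150:                      # assumed: executor 18 stalls until t=150
            if driver.heartbeat("18", now):
                reregistrations += 1
        driver.expire(now)

    return {
        "schedulable": driver.schedulable(sorted(running)),
        "still_running": sorted(running),
        "reregistrations": reregistrations,
    }


if __name__ == "__main__":
    print(simulate())
```

In this sketch executor 18 ends up outside the schedulable list while still counted as a running process, which is the mismatch the report describes. Whether the real driver behaves exactly this way should be checked against the Spark 1.6.1 source.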


> In standalone mode, an executor expired by HeartbeatReceiver still takes up 
> cores but has no tasks assigned to it
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18976
>                 URL: https://issues.apache.org/jira/browse/SPARK-18976
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy
>    Affects Versions: 1.6.1
>         Environment: jdk1.8.0_77 Red Hat 4.4.7-11
>            Reporter: liujianhui
>             Fix For: 1.6.1
>
>         Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>


