[ 
https://issues.apache.org/jira/browse/MESOS-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308926#comment-15308926
 ] 

Gilbert Song commented on MESOS-5395:
-------------------------------------

[~Mengkui], thanks for reporting this issue. Could you reproduce it and 
see whether restarting the slave process resolves it?

BTW, could you check whether https://issues.apache.org/jira/browse/MESOS-5482 
is the same issue as this one? Thanks. :)

> Task getting stuck in staging state if launch it on a rebooted slave.
> ---------------------------------------------------------------------
>
>                 Key: MESOS-5395
>                 URL: https://issues.apache.org/jira/browse/MESOS-5395
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.28.0
>         Environment: Mesos/Marathon cluster, 3 masters/4 slaves
> Mesos 0.28.0, Marathon 0.15.2
>            Reporter: Mengkui gong
>         Attachments: mesos-log.zip
>
>
> After rebooting a slave, a task launched through Marathon starts on the 
> other slaves without problems. But if it is launched on the rebooted 
> slave, the task gets stuck. The Mesos UI shows it in the staging state in the 
> active tasks list, and the Marathon UI shows it in the deploying state. It can 
> stay stuck for more than 2 hours. After that time, Marathon automatically 
> launches the task on the rebooted slave or on another slave as normal, so 
> the rebooted slave has also recovered by then.
> In the Mesos log, I can see "telling slave to kill task" all the time:
> I0517 15:25:27.207237 20568 master.cpp:3826] Telling slave 
> 282745ab-423a-4350-a449-3e8cdfccfb93-S1 at slave(1)@10.254.234.236:5050 
> (mesos-slave-3) to kill task 
> project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e of 
> framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 (marathon) at 
> [email protected]:56757.
> In the rebooted slave's log, I can see:
> May 17 15:28:37 euca-10-254-234-236 mesos-slave[829]: I0517 15:28:37.206831   
> 916 slave.cpp:1891] Asked to kill task 
> project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e of 
> framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000
> May 17 15:28:37 euca-10-254-234-236 mesos-slave[829]: W0517 15:28:37.206866   
> 916 slave.cpp:2018] Ignoring kill task 
> project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e because 
> the executor 
> 'project-hub_project-hub-frontend.b645f24b-1c1f-11e6-bb25-d00d2cce797e' of 
> framework 17cd3756-1d59-4dfc-984d-3fe09f6b5730-0000 is terminating/terminated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
