[ 
https://issues.apache.org/jira/browse/MESOS-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350648#comment-15350648
 ] 

chenqiang edited comment on MESOS-5718 at 6/27/16 8:45 AM:
-----------------------------------------------------------

Yes, it's still an issue. I have changed it to unassigned and will assign it 
back to myself once I have a solution.


was (Author: chenqiang):
yes, it's still an issue, I changed to unsigned. I will assign it back to me if 
I have the solution.

> Mesos UI shows "Task is in RUNNING status" but can't find it in the mesos 
> Agent.
> --------------------------------------------------------------------------------
>
>                 Key: MESOS-5718
>                 URL: https://issues.apache.org/jira/browse/MESOS-5718
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: chenqiang
>
> We have found an issue where a task launched by Marathon in a Docker container 
> is shown as "Task is in RUNNING status" in the Mesos UI but cannot be found on 
> the Mesos agent host. In other words, the Docker container no longer exists, 
> yet the task is still shown as RUNNING in the Mesos UI.
> Part of the agent log is attached below:
> ```
> I0627 14:31:30.239467  3913 slave.cpp:1912] Asked to kill task 
> tanmenggang.router-web.jylt-online02.532b8817-391f-11e6-93b3-56847afe9799 of 
> framework 20141201-145651-1900714250-5050-3484-0000
> W0627 14:31:30.239547  3913 slave.cpp:2025] Ignoring kill task 
> tanmenggang.router-web.jylt-online02.532b8817-391f-11e6-93b3-56847afe9799 
> because the executor 
> 'tanmenggang.router-web.jylt-online02.532b8817-391f-11e6-93b3-56847afe9799' 
> of framework 20141201-145651-1900714250-5050-3484-0000 at 
> executor(1)@10.153.96.22:14578 is terminating/terminated
> I0624 14:46:04.398646  3921 slave.cpp:4511] Sending reconnect request to 
> executor 
> 'tanmenggang.router-web.jylt-online02.532b8817-391f-11e6-93b3-56847afe9799' 
> of framework 20141201-145651-1900714250-5050-3484-0000 at 
> executor(1)@10.153.96.22:14578
> I0624 14:46:06.399073  3899 slave.cpp:2991] Killing un-reregistered executor 
> 'tanmenggang.router-web.jylt-online02.532b8817-391f-11e6-93b3-56847afe9799' 
> of framework 20141201-145651-1900714250-5050-3484-0000 at 
> executor(1)@10.153.96.22:14578
> I0624 14:46:06.399183  3899 slave.cpp:4571] Finished recovery
> I0624 14:46:06.399375  3902 docker.cpp:1724] Destroying container 
> 'fa37fc7c-7ef1-478a-81a2-cae38ab3e4cb'
> I0624 14:46:06.399431  3902 docker.cpp:1852] Running docker stop on container 
> 'fa37fc7c-7ef1-478a-81a2-cae38ab3e4cb'
> ``` 
> What is the root cause? It seems the executor of that task has terminated, but 
> the kill request for the task was ignored by the slave.
> Workaround: after restarting mesos-slave, the RUNNING task changes to FAILED 
> status, and it is then launched again on another agent, so the task returns 
> to normal.
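
To check an agent log for other tasks hit by the same "Ignoring kill task" 
message, a small script along these lines may help. It is only a sketch: the 
function name `ignored_kills` is hypothetical, the pattern is taken from the 
log excerpt quoted above, and it assumes the log is available as plain text 
(without the "> " quoting used in this message).

```python
import re

# Pattern based on the warning in the quoted log excerpt:
#   "Ignoring kill task <task-id> because the executor '<executor-id>' ..."
IGNORED_KILL = re.compile(r"Ignoring kill task\s+(\S+)\s+because the executor")


def ignored_kills(log_text):
    """Return the sorted, de-duplicated task IDs whose kill request was ignored."""
    # Collapse all whitespace so an entry wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return sorted(set(IGNORED_KILL.findall(flat)))


if __name__ == "__main__":
    import sys

    # Usage (the log path is whatever your agent is configured to write to):
    #   python ignored_kills.py < agent.log
    for task_id in ignored_kills(sys.stdin.read()):
        print(task_id)
```

Any task ID it prints is a candidate for a task that the master still reports 
as RUNNING although its executor is already gone; this needs to be confirmed 
against the Mesos UI and the containers actually running on the host.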



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
