[ https://issues.apache.org/jira/browse/MESOS-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Niemitz updated MESOS-2215:
---------------------------------
Description:
Once the slave restarts and recovers the tasks, I see this error in the log every second or so for every recovered task. Note that these were NOT Docker tasks:
W0113 16:01:00.790323 773142 monitor.cpp:213] Failed to get resource usage for container 7b729b89-dc7e-4d08-af97-8cd1af560a21 for executor thermos-1421085237813-slipstream-prod-agent-3-8f769514-1835-4151-90d0-3f55dcc940dd of framework 20150109-161713-715350282-5050-290797-0000: Failed to 'docker inspect mesos-7b729b89-dc7e-4d08-af97-8cd1af560a21': exit status = exited with status 1 stderr = Error: No such image or container: mesos-7b729b89-dc7e-4d08-af97-8cd1af560a21
However, the tasks themselves are still healthy and running.
The slave was launched with --containerizers=mesos,docker
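For reference, a minimal standalone C++ sketch of what the log message suggests is happening (this is not Mesos source; the helper and variable names are hypothetical). The log shows the slave running 'docker inspect mesos-<containerId>' for a container that was launched by the mesos containerizer, so Docker has no record of it and the command fails every poll:

#include <cstdio>
#include <iostream>
#include <string>

// Run a command, capture its combined stdout/stderr, and return the
// pclose() status (nonzero on failure). Illustrative helper only.
static int run(const std::string& cmd, std::string* out) {
  FILE* pipe = popen((cmd + " 2>&1").c_str(), "r");
  if (pipe == nullptr) return -1;
  char buffer[256];
  while (fgets(buffer, sizeof(buffer), pipe) != nullptr) {
    out->append(buffer);
  }
  return pclose(pipe);
}

int main() {
  // A container that was launched by the *mesos* containerizer, so it is
  // unknown to the Docker daemon.
  const std::string containerId = "7b729b89-dc7e-4d08-af97-8cd1af560a21";

  // What the resource-usage poll is effectively doing, per the warning:
  // inspecting a Docker container named "mesos-<containerId>".
  std::string output;
  int status = run("docker inspect mesos-" + containerId, &output);

  if (status != 0) {
    // Reproduces the repeating monitor.cpp warning: the inspect fails every
    // interval with "No such image or container", while the task itself
    // keeps running under the mesos containerizer.
    std::cerr << "Failed to 'docker inspect mesos-" << containerId << "': "
              << output;
  }
  return 0;
}

If that is indeed what the usage collection path does for these containers after recovery, it would explain why the warning repeats for every recovered non-Docker task while the tasks stay healthy: the monitoring request appears to be routed to the Docker containerizer instead of the mesos containerizer once the slave restarts.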
was:
Once the slave restarts and recovers the tasks, I see this error in the log every second or so for every recovered task. Note that these were NOT Docker tasks:
W0113 16:01:00.790323 773142 monitor.cpp:213] Failed to get resource usage for container 7b729b89-dc7e-4d08-af97-8cd1af560a21 for executor thermos-1421085237813-slipstream-prod-agent-3-8f769514-1835-4151-90d0-3f55dcc940dd of framework 20150109-161713-715350282-5050-290797-0000: Failed to 'docker inspect mesos-7b729b89-dc7e-4d08-af97-8cd1af560a21': exit status = exited with status 1 stderr = Error: No such image or container: mesos-7b729b89-dc7e-4d08-af97-8cd1af560a21
However, the tasks themselves are still healthy and running.
> If checkpointing is enabled on a framework, recovered tasks are no longer
> monitored once the slave restarts
> -----------------------------------------------------------------------------------------------------------
>
> Key: MESOS-2215
> URL: https://issues.apache.org/jira/browse/MESOS-2215
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.21.0
> Reporter: Steve Niemitz
>
> Once the slave restarts and recovers the tasks, I see this error in the log every second or so for every recovered task. Note that these were NOT Docker tasks:
> W0113 16:01:00.790323 773142 monitor.cpp:213] Failed to get resource usage for container 7b729b89-dc7e-4d08-af97-8cd1af560a21 for executor thermos-1421085237813-slipstream-prod-agent-3-8f769514-1835-4151-90d0-3f55dcc940dd of framework 20150109-161713-715350282-5050-290797-0000: Failed to 'docker inspect mesos-7b729b89-dc7e-4d08-af97-8cd1af560a21': exit status = exited with status 1 stderr = Error: No such image or container: mesos-7b729b89-dc7e-4d08-af97-8cd1af560a21
> However, the tasks themselves are still healthy and running.
> The slave was launched with --containerizers=mesos,docker
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)