[
https://issues.apache.org/jira/browse/MESOS-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brenden Matthews updated MESOS-2583:
------------------------------------
Description:
Tasks occasionally become stuck in the `TASK_STAGING` state after launching. It
appears that this affects both Docker and non-Docker tasks, especially those
which start up and fail immediately. Attached is a sample of the slave log as
well as screenshots from a testing cluster showing the tasks which are stuck in
staging, and then a number of failed tasks which occur after restarting the
slave process. Justin Bieber is provided for scale.
This may be related to MESOS-1837, and quite possibly the same issue, but it
remains unclear.
was:
Tasks occasionally become stuck in the `TASK_STAGING` state after launching. It
appears that this affects both Docker and non-Docker tasks, especially those
which start up and fail immediately. Attached is a sample of the slave log as
well as screenshots from a testing cluster showing the tasks which are stuck in
staging, and then a number of failed tasks which occur after restarting the
slave process.
This may be related to MESOS-1837, and quite possibly the same issue, but it
remains unclear.
> Tasks getting stuck in staging
> ------------------------------
>
> Key: MESOS-2583
> URL: https://issues.apache.org/jira/browse/MESOS-2583
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Affects Versions: 0.22.0
> Reporter: Brenden Matthews
> Attachments:
> Justin-Bieber_The-Beliebers-Want-to-Believe-2-650x406.jpg, Screen Shot
> 2015-03-26 at 11.59.33 AM.png, Screen Shot 2015-03-30 at 2.04.14 PM.png,
> log.txt
>
>
> Tasks occasionally become stuck in the `TASK_STAGING` state after launching.
> It appears that this affects both Docker and non-Docker tasks, especially
> those which start up and fail immediately. Attached is a sample of the slave
> log as well as screenshots from a testing cluster showing the tasks which are
> stuck in staging, and then a number of failed tasks which occur after
> restarting the slave process. Justin Bieber is provided for scale.
> This may be related to MESOS-1837, and quite possibly the same issue, but it
> remains unclear.