[
https://issues.apache.org/jira/browse/MESOS-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383515#comment-15383515
]
aiminlei commented on MESOS-5859:
---------------------------------
staged task: rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a
mesos-slave log:
[root@mesos-cluster-10 log]# cat mesos-slave.mesos-cluster-10.37.2.35.invalid-user.log.* | grep "rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a"
I0719 09:17:31.280827 130624 slave.cpp:1360] Got assigned task
rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a for framework
34b58730-aaa6-4660-8e38-1148bf738459-0000
I0719 09:17:31.292891 130624 slave.cpp:1479] Launching task
rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a for framework
34b58730-aaa6-4660-8e38-1148bf738459-0000
I0719 09:17:31.293154 130624 paths.cpp:472] Trying to chown
'/opt/sncc/data/mesos-slave/slaves/8aecba0e-3d6d-492d-8908-d31e2d019343-S4/frameworks/34b58730-aaa6-4660-8e38-1148bf738459-0000/executors/rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a/runs/fdbc2f1f-2240-4187-bf33-d2793dc86995'
to user 'root'
I0719 09:17:31.298758 130624 slave.cpp:5281] Launching executor
rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a of framework
34b58730-aaa6-4660-8e38-1148bf738459-0000 with resources cpus(*):0.1; mem(*):32
in work directory
'/opt/sncc/data/mesos-slave/slaves/8aecba0e-3d6d-492d-8908-d31e2d019343-S4/frameworks/34b58730-aaa6-4660-8e38-1148bf738459-0000/executors/rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a/runs/fdbc2f1f-2240-4187-bf33-d2793dc86995'
I0719 09:17:31.301000 130624 slave.cpp:1697] Queuing task
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' for executor
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' of framework
34b58730-aaa6-4660-8e38-1148bf738459-0000
I0719 09:17:31.305286 130601 docker.cpp:803] Starting container
'fdbc2f1f-2240-4187-bf33-d2793dc86995' for task
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' (and executor
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a') of framework
'34b58730-aaa6-4660-8e38-1148bf738459-0000'
I0719 09:17:32.140924 130632 docker.cpp:409] Checkpointing pid 86399 to
'/opt/sncc/data/mesos-slave/meta/slaves/8aecba0e-3d6d-492d-8908-d31e2d019343-S4/frameworks/34b58730-aaa6-4660-8e38-1148bf738459-0000/executors/rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a/runs/fdbc2f1f-2240-4187-bf33-d2793dc86995/pids/forked.pid'
I0719 09:17:32.172355 130626 slave.cpp:2642] Got registration for executor
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' of framework
34b58730-aaa6-4660-8e38-1148bf738459-0000 from executor(1)@10.37.2.35:36139
I0719 09:17:32.178768 130624 slave.cpp:1862] Sending queued task
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' to executor
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' of framework
34b58730-aaa6-4660-8e38-1148bf738459-0000 at executor(1)@10.37.2.35:36139
I0719 09:20:10.351904 130629 slave.cpp:1890] Asked to kill task
rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a of framework
34b58730-aaa6-4660-8e38-1148bf738459-0000
I0719 09:20:15.369649 130628 slave.cpp:1890] Asked to kill task
rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a of framework
34b58730-aaa6-4660-8e38-1148bf738459-0000
docker.log: no entries found for "fdbc2f1f-2240-4187-bf33-d2793dc86995"
[root@mesos-cluster-10 log]# cat docker.log | grep "fdbc2f1f-2240-4187-bf33-d2793dc86995"
[root@mesos-cluster-10 log]#
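To make the timing in the slave log easier to read, here is a minimal sketch that rejoins the wrapped log lines and measures the gap between "Sending queued task" and the first "Asked to kill task". It assumes glog-style prefixes (severity, MMDD, time, thread id, file:line) and that both entries fall on the same day. The helper names parse_entries and seconds_between are hypothetical, and the sample is abridged from the log above.

```python
import re
from datetime import datetime

# Assumed glog prefix: severity letter, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([\w.]+):(\d+)\] (.*)$"
)

def parse_entries(text):
    """Join wrapped continuation lines; return [time, source, message] lists."""
    entries = []
    for line in text.splitlines():
        m = GLOG.match(line)
        if m:
            # Only the time of day is kept; the log prefix carries no year.
            t = datetime.strptime(m.group(3), "%H:%M:%S.%f")
            entries.append([t, f"{m.group(5)}:{m.group(6)}", m.group(7)])
        elif entries:
            # A line without a glog prefix continues the previous entry.
            entries[-1][2] += " " + line.strip()
    return entries

def seconds_between(entries, first_marker, second_marker):
    """Seconds from the first entry containing first_marker to the first
    entry containing second_marker."""
    start = next(t for t, _, msg in entries if first_marker in msg)
    end = next(t for t, _, msg in entries if second_marker in msg)
    return (end - start).total_seconds()

# Abridged from the slave log above (continuation lines shortened).
sample = """\
I0719 09:17:32.178768 130624 slave.cpp:1862] Sending queued task
'rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a' to executor
I0719 09:20:10.351904 130629 slave.cpp:1890] Asked to kill task
rep-test49.90b9c489-4d4e-11e6-b8f7-02427baebb1a of framework
"""

entries = parse_entries(sample)
gap = seconds_between(entries, "Sending queued task", "Asked to kill task")
print(f"{gap:.3f} s between sending the task and the first kill request")
```

In the filtered lines shown above, no other entry for this task appears between those two events.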
mesos-docker-executor process stack:
#0 0x00007f4cd67316d5 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1 0x00007f4cd64cf9ec in
std::condition_variable::wait(std::unique_lock<std::mutex>&) () from
/lib64/libstdc++.so.6
#2 0x00007f4cd7b8f610 in process::ProcessManager::wait(process::UPID const&)
() from /lib/libmesos-0.27.2.so
#3 0x00007f4cd7b8fc97 in process::wait(process::UPID const&, Duration const&)
() from /lib/libmesos-0.27.2.so
#4 0x00007f4cd7b65491 in process::Latch::await(Duration const&) () from
/lib/libmesos-0.27.2.so
#5 0x00007f4cd733e5a7 in mesos::MesosExecutorDriver::join() () from
/lib/libmesos-0.27.2.so
#6 0x0000000000419562 in main ()
stderr:
#0 0x00007f4cd67316d5 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1 0x00007f4cd64cf9ec in
std::condition_variable::wait(std::unique_lock<std::mutex>&) () from
/lib64/libstdc++.so.6
#2 0x00007f4cd7b8f610 in process::ProcessManager::wait(process::UPID const&)
() from /lib/libmesos-0.27.2.so
#3 0x00007f4cd7b8fc97 in process::wait(process::UPID const&, Duration const&)
() from /lib/libmesos-0.27.2.so
#4 0x00007f4cd7b65491 in process::Latch::await(Duration const&) () from
/lib/libmesos-0.27.2.so
#5 0x00007f4cd733e5a7 in mesos::MesosExecutorDriver::join() () from
/lib/libmesos-0.27.2.so
#6 0x0000000000419562 in main ()
> some tasks always in staged
> ---------------------------
>
> Key: MESOS-5859
> URL: https://issues.apache.org/jira/browse/MESOS-5859
> Project: Mesos
> Issue Type: Bug
> Components: docker
> Affects Versions: 0.27.2
> Environment: mesos+marathon+docker
> Reporter: aiminlei
> Priority: Critical
>
> When I create 30*2 apps through the Marathon API on one mesos-slave node, most
> tasks are created successfully, but two tasks always stay in the staged state.
> On the mesos-slave, the mesos-docker-executor process is running, but the docker
> container was not created. According to docker.log, docker never received the
> request to create the container.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)