chenqiuhao created MESOS-2706:
---------------------------------
Summary: As the number of Docker tasks grows, the delay between "Queuing task"
and "Starting container" grows
Key: MESOS-2706
URL: https://issues.apache.org/jira/browse/MESOS-2706
Project: Mesos
Issue Type: Bug
Components: docker
Affects Versions: 0.22.0
Environment: Mesos 0.22.0 and Marathon 0.82-RC1, both running on a single host.
Each Docker task requires 0.02 CPUs and 128 MB of memory; the server has 8 CPUs
and 24 GB of memory, so in theory it should fit roughly 190 such tasks
(memory-bound: 24 GB / 128 MB = 192; CPU alone would allow 8 / 0.02 = 400).
Each Docker task is very lightweight: it only launches an sshd service.
Reporter: chenqiuhao
At first, Marathon launches Docker tasks very quickly, but once the number of
tasks on the single mesos-slave host reaches about 50, launches become
noticeably slower.
I checked the mesos-slave log and found that the delay between "Queuing task"
and "Starting container" keeps growing.
For example, launching the 1st Docker task takes about 0.008s:
[root@CNSH231434 mesos-slave]# tail -f slave.out | egrep 'Queuing task|Starting container'
I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task
'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor
dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework
'20150202-112355-2684495626-5050-26153-0000
I0508 15:54:00.196832 225781 docker.cpp:581] Starting container
'd0b0813a-6cb6-4dfd-bbce-f1b338744285' for task
'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor
'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework
'20150202-112355-2684495626-5050-26153-0000'
Launching the 50th Docker task takes about 4.9s:
I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task
'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor
dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework
'20150202-112355-2684495626-5050-26153-0000
I0508 16:12:15.801503 225778 docker.cpp:581] Starting container
'482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task
'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor
'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework
'20150202-112355-2684495626-5050-26153-0000'
Launching the 100th Docker task takes about 13s.
I ran the same test on a server with 24 CPUs and 256 GB of memory and got the
same result.
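The delays quoted above were read off the log by hand. A minimal sketch of how they could be computed automatically is below. It assumes the glog line layout seen in the excerpts (level letter, mmdd, hh:mm:ss.uuuuuu) and that each log entry sits on a single line in slave.out (the excerpts above are wrapped only by the mail client). The names `queue_to_start_delays` and `parse_stamp` are hypothetical helpers, not part of Mesos, and the year is an assumption because glog timestamps do not carry one.

```python
import re
from datetime import datetime

# glog prefix as seen in the excerpts:
# <level><mmdd> <hh:mm:ss.uuuuuu> <thread> <file>:<line>]
STAMP = r"^[IWEF](\d{4} \d{2}:\d{2}:\d{2}\.\d{6})\s"
QUEUING = re.compile(STAMP + r".*Queuing task\s+'([^']+)'")
STARTING = re.compile(
    STAMP + r".*Starting container\s+'[^']+'\s+for task\s+'([^']+)'"
)


def parse_stamp(stamp, year):
    # glog timestamps carry no year, so one has to be supplied (assumption).
    return datetime.strptime(f"{year} {stamp}", "%Y %m%d %H:%M:%S.%f")


def queue_to_start_delays(lines, year=2015):
    """Map task ID -> seconds between 'Queuing task' and 'Starting container'."""
    queued = {}   # task ID -> time the task was queued
    delays = {}   # task ID -> measured delay in seconds
    for line in lines:
        match = QUEUING.search(line)
        if match:
            queued[match.group(2)] = parse_stamp(match.group(1), year)
            continue
        match = STARTING.search(line)
        if match and match.group(2) in queued:
            started = parse_stamp(match.group(1), year)
            queued_at = queued.pop(match.group(2))
            delays[match.group(2)] = (started - queued_at).total_seconds()
    return delays


# The four log entries from this report, joined back into single lines.
SAMPLE_LINES = [
    "I0508 15:54:00.188350 225779 slave.cpp:1378] Queuing task "
    "'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' for executor "
    "dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b of framework "
    "'20150202-112355-2684495626-5050-26153-0000",
    "I0508 15:54:00.196832 225781 docker.cpp:581] Starting container "
    "'d0b0813a-6cb6-4dfd-bbce-f1b338744285' for task "
    "'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b' (and executor "
    "'dev-rhel-sf.631d454d-f557-11e4-b4f4-628e0a30542b') of framework "
    "'20150202-112355-2684495626-5050-26153-0000'",
    "I0508 16:12:10.908596 225781 slave.cpp:1378] Queuing task "
    "'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' for executor "
    "dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b of framework "
    "'20150202-112355-2684495626-5050-26153-0000",
    "I0508 16:12:15.801503 225778 docker.cpp:581] Starting container "
    "'482dd47f-b9ab-4b09-b89e-e361d6f004a4' for task "
    "'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b' (and executor "
    "'dev-rhel-sf.ed3a6922-f559-11e4-ae87-628e0a30542b') of framework "
    "'20150202-112355-2684495626-5050-26153-0000'",
]

if __name__ == "__main__":
    for task_id, seconds in queue_to_start_delays(SAMPLE_LINES).items():
        print(f"{task_id}: {seconds:.3f}s")
```

On the two excerpts above this gives delays of roughly 0.008s and 4.9s, matching the figures quoted by hand. To run it against a real log, pass the open file instead of the sample list, for example `queue_to_start_delays(open("slave.out"))`.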
Has anybody seen the same behavior, or can someone help by running the same
stress test?