[ 
https://issues.apache.org/jira/browse/MESOS-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442279#comment-16442279
 ] 

Andrei Budnik edited comment on MESOS-8732 at 4/24/18 12:37 PM:
----------------------------------------------------------------

After making the composing c'zer the default, some tests (e.g. 
`AgentAPITest.AttachContainerInputValidation`) started to hang because the 
clock is paused and the docker library uses clock-dependent methods such as 
`await()` and `delay()`.

It hangs in 
[`Docker::validateVersion()`|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L241],
 which is called from 
[`Docker::create()`|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L145].
 After I added `Clock::resume()` before calling 
`version.await(DOCKER_VERSION_WAIT_TIMEOUT)`, tests started to hang because 
docker recovery hangs: the docker c'zer launches a `docker ps -a` subprocess 
and [subscribes to its 
termination|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L1466-L1467].
 Because the reaper process [uses 
`delay()`|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/reap.cpp#L112],
 recovery of the docker c'zer hangs.
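
A minimal sketch of why a paused clock makes `delay()`-based code hang. This is a toy model, not libprocess code: `ToyClock` and `reaperNotified` are hypothetical names, and the only assumption is that a delayed callback fires once the clock reaches its deadline. With the clock paused, time moves only when a test advances it explicitly.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Toy stand-in for a pausable test clock. NOT libprocess code: the class and
// method names are hypothetical and only mimic the pause/advance idea.
class ToyClock
{
public:
  // Schedule `callback` to run once `ms` milliseconds of clock time elapse.
  void delay(long ms, std::function<void()> callback)
  {
    assert(ms >= 0);
    timers.emplace(now + ms, std::move(callback));
  }

  // While the clock is paused, time moves only when the test advances it.
  void advance(long ms)
  {
    assert(ms >= 0);
    now += ms;

    // Fire every timer whose deadline has been reached.
    while (!timers.empty() && timers.begin()->first <= now) {
      std::function<void()> callback = std::move(timers.begin()->second);
      timers.erase(timers.begin());
      callback();
    }
  }

private:
  long now = 0;
  std::multimap<long, std::function<void()>> timers;
};

// Models a reaper that learns about subprocess exit through `delay()`.
// Returns whether the "subprocess exited" notification was delivered.
bool reaperNotified(bool advanceClock)
{
  ToyClock clock;
  bool notified = false;

  clock.delay(100, [&notified]() { notified = true; });

  if (advanceClock) {
    clock.advance(100);
  }

  return notified;
}
```

In this model `reaperNotified(false)` returns `false`: nothing moves the clock, so whoever waits on the notification waits forever. `reaperNotified(true)` returns `true`, which corresponds to a test calling `Clock::advance()` or `Clock::resume()` at the right moment.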


was (Author: abudnik):
After making the composing c'zer the default, some tests (e.g. 
`AgentAPITest.AttachContainerInputValidation`) started to hang because the 
clock is paused and the docker library uses clock-dependent methods such as 
`await()` and `delay()`.

It hangs in 
[`Docker::validateVersion()`|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L24],
 which is called from 
[`Docker::create()`|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L145].
 After I added `Clock::resume()` before calling 
`version.await(DOCKER_VERSION_WAIT_TIMEOUT)`, tests started to hang because 
docker recovery hangs: the docker c'zer launches a `docker ps -a` subprocess 
and [subscribes to its 
termination|https://github.com/apache/mesos/blob/ca21ca82071f2c53d5817424569977728260da65/src/docker/docker.cpp#L1466-L1467].
 Because the reaper process [uses 
`delay()`|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/reap.cpp#L112],
 recovery of the docker c'zer hangs.

> Use composing containerizer by default in tests.
> ------------------------------------------------
>
>                 Key: MESOS-8732
>                 URL: https://issues.apache.org/jira/browse/MESOS-8732
>             Project: Mesos
>          Issue Type: Task
>          Components: containerization
>            Reporter: Andrei Budnik
>            Assignee: Andrei Budnik
>            Priority: Major
>              Labels: containerizer, mesosphere, tests
>
> If we assign "docker,mesos" to the `containerizers` flag for an agent, then 
> `ComposingContainerizer` will be used for the many tests that do not specify 
> the `containerizers` flag. That's the goal of this task.
> I tried to do that by adding [`flags.containerizers = 
> "docker,mesos";`|https://github.com/apache/mesos/blob/master/src/tests/mesos.cpp#L273],
>  but it turned out that some tests started to hang because the clock is 
> paused, while the docker c'zer and the docker library use libprocess clocks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
