[ https://issues.apache.org/jira/browse/MESOS-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neil Conway updated MESOS-3760:
-------------------------------
    Component/s: libprocess

> Remove fragile sleep() from ProcessManager::settle()
> ----------------------------------------------------
>
>                 Key: MESOS-3760
>                 URL: https://issues.apache.org/jira/browse/MESOS-3760
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>            Reporter: Neil Conway
>            Priority: Minor
>              Labels: mesosphere, tech-debt, testing
>
> From {{ProcessManager::settle()}}:
> {code}
>     // While refactoring in order to isolate libev behind abstractions
>     // it became evident that this os::sleep is vital for tests to
>     // pass. In particular, there are certain tests that assume too
>     // much before they attempt to do a settle. One such example is
>     // tests doing http::get followed by Clock::settle, where they
>     // expect the http::get will have properly enqueued a process on
>     // the run queue but http::get is just sending bytes on a
>     // socket. Without sleeping at the beginning of this function we
>     // can get unlucky and appear settled when in actuality the
>     // kernel just hasn't copied the bytes to a socket or we haven't
>     // yet read the bytes and enqueued an event on a process (and the
>     // process on the run queue).
>     os::sleep(Milliseconds(10));
> {code}
> Sleeping for 10 milliseconds doesn't guarantee that the kernel has done
> anything at all; any test cases that depend on this behavior should be fixed
> to actually perform the necessary synchronization.
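
To illustrate what "the necessary synchronization" could look like, here is a minimal sketch in standard C++ only. It does not use libprocess, and {{start_async_work}}, {{wait_then_observe}}, and {{queue_depth}} are hypothetical names invented for this example. The idea is that the asynchronous side signals explicitly once its event has been enqueued, and the test waits on that signal rather than on a fixed delay.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for an asynchronous operation whose effect becomes
// visible only after some unknown delay. This is NOT a libprocess API.
inline std::thread start_async_work(std::promise<void> enqueued,
                                    int& queue_depth)
{
  return std::thread([p = std::move(enqueued), &queue_depth]() mutable {
    // Simulate an arbitrary delay before the event arrives.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    queue_depth = 1;  // The "event" is now enqueued.
    p.set_value();    // Tell the waiter explicitly that it happened.
  });
}

// Waits on the explicit signal, then returns the queue depth it observes.
inline int wait_then_observe()
{
  int queue_depth = 0;
  std::promise<void> enqueued;
  std::future<void> ready = enqueued.get_future();
  std::thread worker = start_async_work(std::move(enqueued), queue_depth);

  ready.wait();                // Blocks until set_value(); no timing guess.
  int observed = queue_depth;  // The write is ordered before this read.
  worker.join();
  return observed;
}
```

In this sketch the simulated delay (50 ms) is longer than the 10 ms sleep in {{settle()}}, so a test that slept for 10 ms and then checked the queue would see it empty, while the version that waits on the future sees the enqueued event regardless of how long the delay is.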



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)