Neil Conway created MESOS-3760:
----------------------------------

             Summary: Remove fragile sleep() from ProcessManager::settle()
                 Key: MESOS-3760
                 URL: https://issues.apache.org/jira/browse/MESOS-3760
             Project: Mesos
          Issue Type: Bug
            Reporter: Neil Conway
            Priority: Minor


From {{ProcessManager::settle()}}:

{code}
    // While refactoring in order to isolate libev behind abstractions
    // it became evident that this os::sleep is vital for tests to
    // pass. In particular, there are certain tests that assume too
    // much before they attempt to do a settle. One such example is
    // tests doing http::get followed by Clock::settle, where they
    // expect the http::get will have properly enqueued a process on
    // the run queue but http::get is just sending bytes on a
    // socket. Without sleeping at the beginning of this function we
    // can get unlucky and appear settled when in actuality the
    // kernel just hasn't copied the bytes to a socket or we haven't
    // yet read the bytes and enqueued an event on a process (and the
    // process on the run queue).
    os::sleep(Milliseconds(10));
{code}

Sleeping for 10 milliseconds doesn't guarantee that the kernel has done
anything at all; any test cases that depend on this behavior should be fixed to
actually perform the necessary synchronization.
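
As a rough illustration of what "the necessary synchronization" could look like, here is a minimal sketch in standard C++ only. It does not use libprocess or any Mesos API; {{sendRequestAsync}} and {{awaitDelivered}} are hypothetical names standing in for an asynchronous call such as http::get and for the wait a test would perform. The point is only that the test blocks on a signal that the event has actually been delivered, rather than assuming a fixed sleep was long enough.

```cpp
#include <chrono>
#include <future>
#include <string>
#include <thread>

// Hypothetical stand-in for an asynchronous request such as http::get.
// The result becomes available on another thread after an unpredictable
// delay (simulated here by `latency`), much like bytes that the kernel
// has not yet copied to a socket.
std::future<std::string> sendRequestAsync(std::chrono::milliseconds latency)
{
  return std::async(std::launch::async, [latency]() {
    std::this_thread::sleep_for(latency);  // Simulated delivery delay.
    return std::string("response");
  });
}

// Hypothetical helper: wait until the event has really been delivered,
// bounded by a generous timeout so a broken test fails instead of hanging.
// Returns true if the future became ready within `timeout`.
bool awaitDelivered(
    std::future<std::string>& future,
    std::chrono::milliseconds timeout)
{
  return future.wait_for(timeout) == std::future_status::ready;
}
```

In this sketch, a request with a simulated 50ms delay is still observed reliably by {{awaitDelivered}}, whereas a fixed 10ms sleep would have returned before anything was delivered. The timeout only bounds the failure case; the success case returns as soon as the result is ready.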



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
