[ https://issues.apache.org/jira/browse/MESOS-8930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16699577#comment-16699577 ]
Vinod Kone commented on MESOS-8930:
-----------------------------------

Still seeing this in CI.

[~bmahler] Do we have any abstractions/techniques in place that allow us to ensure the HTTP request is enqueued in a more robust manner? It sounds like the 10ms is sometimes not enough in ASF CI.

A somewhat unrelated bug here is that the code does a "response->body" on a (possibly pending) future, causing it to hang forever. This will block the whole test suite!

{code}
AWAIT_EXPECT_RESPONSE_STATUS_EQ(OK().status, response);

// Parse the response.
Try<JSON::Object> responseJSON = JSON::parse<JSON::Object>(response->body);
ASSERT_SOME(responseJSON);
{code}

I think we should at least change the `AWAIT_EXPECT_*` above to `AWAIT_ASSERT_*` so that the rest of the test code is skipped.

cc [~greggomann] [~bmahler]

> THREADSAFE_SnapshotTimeout is flaky.
> ------------------------------------
>
>                 Key: MESOS-8930
>                 URL: https://issues.apache.org/jira/browse/MESOS-8930
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>         Environment: Ubuntu 16.04
>            Reporter: Alexander Rukletsov
>            Assignee: Benjamin Mahler
>            Priority: Major
>              Labels: flaky-test, mesosphere
>
> Observed on ASF CI, might be related to a recent test change
> https://reviews.apache.org/r/66831/
> {noformat}
> 18:23:31 2: [ RUN ] MetricsTest.THREADSAFE_SnapshotTimeout
> 18:23:31 2: I0516 18:23:31.747611 16246 process.cpp:3583] Handling HTTP event
> for process 'metrics' with path: '/metrics/snapshot'
> 18:23:31 2: I0516 18:23:31.796871 16251 process.cpp:3583] Handling HTTP event
> for process 'metrics' with path: '/metrics/snapshot'
> 18:23:46 2: /tmp/SRC/3rdparty/libprocess/src/tests/metrics_tests.cpp:425:
> Failure
> 18:23:46 2: Failed to wait 15secs for response
> 22:57:13 Build timed out (after 300 minutes). Marking the build as failed.
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
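
The hang described in the comment (reading the body of a future that may still be pending) can be illustrated with a rough analogy in standard C++. This is only a sketch: it uses `std::future` rather than libprocess's `Future`, and `Response` and `bodyIfReady` are hypothetical names introduced here, not part of Mesos or its test macros. The point is that returning early when the wait times out, which is roughly what a fatal ASSERT-style check would do in the test, keeps the code from ever touching the pending result.

```cpp
#include <chrono>
#include <future>
#include <optional>
#include <string>

// Hypothetical stand-in for an HTTP response; not the libprocess type.
struct Response {
  std::string body;
};

// Returns the body only if the future becomes ready within `timeout`.
// Returning early on timeout mirrors what a fatal (ASSERT-style) check
// does in a test: the code that reads the result is never reached.
std::optional<std::string> bodyIfReady(
    std::future<Response>& response,
    std::chrono::milliseconds timeout)
{
  if (response.wait_for(timeout) != std::future_status::ready) {
    // Still pending: bail out instead of blocking on get().
    return std::nullopt;
  }

  return response.get().body;
}
```

By contrast, a non-fatal check would record the timeout and then fall through to `response.get()`, which blocks until a value arrives; that is the analogue of the `response->body` hang reported above.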