Meng Zhu created MESOS-9358: ------------------------------- Summary: Test `SlaveRecoveryTest.AgentReconfigurationWithRunningTask` is flaky. Key: MESOS-9358 URL: https://issues.apache.org/jira/browse/MESOS-9358 Project: Mesos Issue Type: Bug Reporter: Meng Zhu Assignee: Benno Evers Attachments: AgentReconfigurationWithRunningTask
The fails with: {noformat} ../../src/tests/slave_recovery_tests.cpp:4797 Failed to wait 15secs for offers2 {noformat} It is flaky because after partially accept the first offer, the scheduler will filter the agent for 5 seconds, and depending on the system load that may last more than 15 seconds: {noformat} I1026 14:32:40.976006 78393344 hierarchical.cpp:2388] Filtered offer with cpus:2 on agent c8805e35-97ee-44d4-9ab9-e7ea8d703c08-S0 for role * of framework c8805e35-97ee-44d4-9ab9-e7ea8d703c08-0000 I1026 14:32:40.976065 78393344 hierarchical.cpp:1566] Performed allocation for 1 agents in 197914ns I1026 14:32:41.981853 79466496 hierarchical.cpp:2388] Filtered offer with cpus:2 on agent c8805e35-97ee-44d4-9ab9-e7ea8d703c08-S0 for role * of framework c8805e35-97ee-44d4-9ab9-e7ea8d703c08-0000 I1026 14:32:41.981920 79466496 hierarchical.cpp:1566] Performed allocation for 1 agents in 185853ns I1026 14:32:42.983603 78393344 hierarchical.cpp:2388] Filtered offer with cpus:2 on agent c8805e35-97ee-44d4-9ab9-e7ea8d703c08-S0 for role * of framework c8805e35-97ee-44d4-9ab9-e7ea8d703c08-0000 I1026 14:32:42.983659 78393344 hierarchical.cpp:1566] Performed allocation for 1 agents in 179397ns I1026 14:32:57.363723 78929920 hierarchical.cpp:2388] Filtered offer with cpus:2 on agent c8805e35-97ee-44d4-9ab9-e7ea8d703c08-S0 for role * of framework c8805e35-97ee-44d4-9ab9-e7ea8d703c08-0000 I1026 14:32:57.363788 78929920 hierarchical.cpp:1566] Performed allocation for 1 agents in 466500ns ../../src/tests/slave_recovery_tests.cpp:4797: Failure Failed to wait 15secs for offers2 {noformat} I think we can either remove the expectation of the second offer (which I think is not essential to the test) or manually control the clock. Full log attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)