> On March 18, 2016, 9:47 a.m., Adam B wrote:
> > src/tests/resource_offers_tests.cpp, line 63
> > <https://reviews.apache.org/r/44989/diff/2/?file=1304722#file1304722line63>
> >
> >     Why pause so soon? You could pause after the master has started, 
> > just before the loop that calls StartSlave().
> 
> Joerg Schad wrote:
>     I agree with you, but actually we follow this pattern in many other tests 
> as well.
>     E.g. 
>     // This test ensures that allocation is done per slave. This is done
>     // by having 2 slaves and 2 frameworks and making sure each framework
>     // gets only one slave's resources during an allocation.
>     TEST_F(HierarchicalAllocatorTest, CoarseGrained)
>     {
>       // Pausing the clock ensures that the batch allocation does not
>       // influence this test.
> 
> Joerg Schad wrote:
>     TEST_F(HierarchicalAllocatorTest, CoarseGrained)
>     {
>       // Pausing the clock ensures that the batch allocation does not
>       // influence this test.
>       Clock::pause();

Is there an advantage to delaying the pause? Running the whole test with the 
clock paused (and hence pausing the clock at the beginning of the test) seems 
fine to me.
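
For reference, the "settle between each `StartSlave`" idea from the review description quoted below can be modeled with a toy clock. This is only a rough sketch under stated assumptions: `ToyClock`, `startAgent`, and `startAgents` are hypothetical stand-ins invented for illustration, not libprocess's `Clock` or the real `StartSlave` helper, and the "disk write" is just a counter increment.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Toy stand-in for a paused test clock. NOT libprocess's process::Clock.
class ToyClock {
public:
  // Queue work that would otherwise complete asynchronously.
  void defer(std::function<void()> work) { pending_.push(std::move(work)); }

  // Run queued work until none is left (similar in spirit to settling).
  void settle() {
    while (!pending_.empty()) {
      std::function<void()> work = std::move(pending_.front());
      pending_.pop();
      work();
    }
  }

  // Number of queued operations that have not run yet.
  std::size_t pending() const { return pending_.size(); }

private:
  std::queue<std::function<void()>> pending_;
};

// Hypothetical stand-in for `StartSlave`: schedules one simulated disk write.
void startAgent(ToyClock& clock, int& completedWrites) {
  clock.defer([&completedWrites]() { ++completedWrites; });
}

// Starts `count` agents and returns the largest number of writes that were
// outstanding at the same time.
std::size_t startAgents(ToyClock& clock, int count, bool settleBetween,
                        int& completedWrites) {
  std::size_t maxPending = 0;
  for (int i = 0; i < count; ++i) {
    startAgent(clock, completedWrites);
    if (clock.pending() > maxPending) {
      maxPending = clock.pending();
    }
    if (settleBetween) {
      clock.settle();  // Let this agent's write finish before the next start.
    }
  }
  clock.settle();  // Drain anything still queued.
  return maxPending;
}
```

In this model, calling `startAgents(clock, 10, true, done)` never has more than one write outstanding, whereas passing `false` lets all ten queue up before any of them completes.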


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44989/#review124159
-----------------------------------------------------------


On March 18, 2016, 4:35 p.m., Greg Mann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44989/
> -----------------------------------------------------------
> 
> (Updated March 18, 2016, 4:35 p.m.)
> 
> 
> Review request for mesos, Adam B and Joerg Schad.
> 
> 
> Bugs: MESOS-4849
>     https://issues.apache.org/jira/browse/MESOS-4849
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Fixed a race in the resource offers tests.
> 
> Adding HTTP credentials to `StartSlave` in 'src/tests/mesos.cpp' has exposed 
> a race condition in ResourceOffersTest.ResourceOfferWithMultipleSlaves. The 
> test quickly runs `StartSlave` 10 times to create 10 agents. Under the 
> covers, `StartSlave` writes data to disk, and it seems that with the 
> additional data being written to disk for HTTP credentials, the filesystem 
> operations for one `StartSlave` call were not completing before the next call.
> 
> By settling the clock in between each invocation of `StartSlave`, this patch 
> fixes the race. The test is slowed considerably, but it is now reliable.
> 
> 
> Diffs
> -----
> 
>   src/tests/resource_offers_tests.cpp 
> 1cf292ee7931207596f8f06677386bef5965ef15 
> 
> Diff: https://reviews.apache.org/r/44989/diff/
> 
> 
> Testing
> -------
> 
> `GTEST_FILTER="ResourceOffersTest.ResourceOfferWithMultipleSlaves" 
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure=1` was used 
> to test on both OSX and Ubuntu 14.04.
> 
> 
> Thanks,
> 
> Greg Mann
> 
>
