> On April 24, 2016, 5:48 p.m., Bill Farner wrote:
> > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java, line 51
> > <https://reviews.apache.org/r/46603/diff/2/?file=1358596#file1358596line51>
> >
> >     Does this default value effect the same behavior as before the patch?

Using a default of `0` is indeed a behaviour change. I am happy to discuss 
whether we want this change.

With a timeout of `5` secs (this was the former hardcoded default):

* When launching a task, Mesos will only re-offer the unused resources in the 
offer after 5 seconds. 
* When declining offers in order to merge two offers into one, Mesos will only 
re-offer the resources of that slave after 5 seconds.

With a timeout of `0` secs:

* The resources can be re-offered right away, in the next offer cycle of the 
Mesos allocator.
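
To make the difference concrete, here is a minimal timing sketch. It is not 
Mesos or Aurora code: it assumes a simplified allocator that only sends offers 
at fixed multiples of an allocation interval, and the class name, the method 
`earliestReofferMillis`, and the 1 second interval are made up for illustration.

```java
public final class RefuseFilterSketch {
  private RefuseFilterSketch() {}

  /**
   * Earliest time (in ms) at which declined resources can be offered again,
   * under the simplifying assumption that the allocator only runs at
   * multiples of allocationIntervalMs (hypothetical model, not the Mesos API).
   */
  public static long earliestReofferMillis(
      long declinedAtMs, long refuseSeconds, long allocationIntervalMs) {
    // The filter blocks re-offers to this framework until it expires.
    long filterExpiresAtMs = declinedAtMs + refuseSeconds * 1000L;
    // The next allocation cycle strictly after the filter has expired.
    long nextCycle = filterExpiresAtMs / allocationIntervalMs + 1;
    return nextCycle * allocationIntervalMs;
  }

  public static void main(String[] args) {
    long intervalMs = 1000L; // assumed allocation interval
    // Resources declined 250 ms into the run:
    System.out.println(earliestReofferMillis(250L, 0L, intervalMs)); // 1000
    System.out.println(earliestReofferMillis(250L, 5L, intervalMs)); // 6000
  }
}
```

In this model a filter of `0` costs at most one allocation interval, while a 
filter of `5` adds the full five seconds on top of that for every decline.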

For us, a timeout of 5 seconds tends to break the maintenance feature. We 
regularly schedule jobs with #instances > #nodes in the cluster. In this case, 
all available offers are quickly depleted and Aurora begins to schedule onto 
nodes which were supposed to be put into maintenance mode. Only after the 
timeout of 5 seconds has passed will Mesos re-offer resources to Aurora. I 
believe we might not be the only ones with this problem and therefore think 0 
is a good default.


- Stephan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46603/#review130306
-----------------------------------------------------------


On April 23, 2016, 6:35 p.m., Stephan Erb wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/46603/
> -----------------------------------------------------------
> 
> (Updated April 23, 2016, 6:35 p.m.)
> 
> 
> Review request for Aurora, Maxim Khutornenko and Bill Farner.
> 
> 
> Bugs: AURORA-1658
>     https://issues.apache.org/jira/browse/AURORA-1658
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Aurora is declining Mesos offers implicitly when launching a task and 
> explicitly when compacting multiple offers of a slave into a single one.
> The filter duration instructs Mesos to return the declined resources to us 
> only after a timeout of X seconds, even if there is no other framework that 
> wants them. If no filter is supplied, the hardcoded default of 5 seconds 
> would be used.
> 
> By making this value configurable, Aurora can be tuned for either single or 
> multi-framework deployment.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md 4b810f2d808cbf0d91c753147d98d1e389106d22 
>   src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 1d725c03d16116257e1c4242ebf60f5931d4600f 
>   src/jmh/java/org/apache/aurora/benchmark/fakes/FakeDriver.java d1bb8f29c9bed42c27624204b9d34ab1893468f7 
>   src/main/java/org/apache/aurora/scheduler/mesos/Driver.java 013c50cf70fe45fc2a74c1ea5dccccfaba14225c 
>   src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverService.java 7ff3e3e5dc70187066b914f7feb65d99f2145303 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 452451f239a964c1b55ede3d6fbde0bd805e4b00 
>   src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 90f8abf830478ad48f9a8a62c1c42423ab0f8d57 
>   src/main/java/org/apache/aurora/scheduler/offers/RandomJitterReturnDelay.java a52fd4e8cd5c32d9560d4d72958a54bef820d81c 
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 76da6d80d91221336e50d596cc2f49e890451fd1 
> 
> Diff: https://reviews.apache.org/r/46603/diff/
> 
> 
> Testing
> -------
> 
> * ./gradlew -Pq build 
> * ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
>  
> I have also conducted an (unscientific) benchmark in Vagrant and started a 
> job with 5 instances and recorded the time from `PENDING` to `RUNNING` for 
> the slowest ones:
> 
> * 7s startup time for a filter duration of 0 seconds
> * 29s startup time for the hardcoded former default of 5 seconds
> 
> 
> Thanks,
> 
> Stephan Erb
> 
>
