-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63199/
-----------------------------------------------------------

(Updated Oct. 30, 2017, 10:50 p.m.)


Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan 
Erb, and Bill Farner.


Changes
-------

Create `CacheBuilder` in `OfferSettings` instead, pass to `OfferManager`


Repository: aurora


Description
-------

Using the new `hold_offers_forever` option, it is possible for the 
`staticallyBannedOffers` to grow very large in size as we never release offers.

As an alternative to https://reviews.apache.org/r/63121/, I propose changing 
`staticallyBannedOffers` into a LRU cache which expires entries after 
`min_offer_hold_time` + `offer_hold_jitter_window` (referred to as 
`maxOfferHoldTime`), while also taking an option for a maximum size for the 
cache. I believe that this approach has a couple of benefits:

1. The current behavior of `staticallyBannedOffers` is (kinda) preserved. 
Entries will no longer be removed when the offer is used, but they will be 
removed within `maxOfferHoldTime`. This means cluster operators will not have 
to think about the new `offer_static_ban_cache_max_size` if they aren't 
affected by the memory leak now.
2. Cluster operators that use Aurora as a single framework and hold offers 
indefinitely can cap the size of the cache to avoid the memory leak.
3. Using an LRU cache greatly benefits quickly recurring crons and job updates.


Diffs (updated)
-----

  src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 
e0ec793cad05674fb4b65246a6d153521b28b914 
  
src/main/java/org/apache/aurora/scheduler/config/validators/NotNegativeNumber.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java 
7011a4cc9eea827cdd54698aaed1a653774bce7f 
  src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java 
e060f2073dce4d2486d1ee2bfd873fe75167c6d0 
  src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java 
e6b2c55e4f33f9a603157236766425edcaff10e7 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
244422c6b3ac6a2f7b4690cdc0f3440170b2567f 
  src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 
3d38a5929a0255a980db30eeca0f059a2029f321 


Diff: https://reviews.apache.org/r/63199/diff/6/

Changes: https://reviews.apache.org/r/63199/diff/5-6/


Testing
-------

Unit tests pass.
Deployed on a scale test cluster and saw that a) `staticallyBannedOffers` 
memory leak fixed with correct options and b) lowered assignment time for 
quickly recurring crons and rescheduled jobs.


Thanks,

Jordan Ly

Reply via email to