----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/63199/#review189648 -----------------------------------------------------------
Ship it! LGTM, with the switch to `CacheBuilder` in `OfferSettings` as the finishing touch. Nice patch! - Bill Farner On Oct. 30, 2017, 11:14 a.m., Jordan Ly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/63199/ > ----------------------------------------------------------- > > (Updated Oct. 30, 2017, 11:14 a.m.) > > > Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, > Stephan Erb, and Bill Farner. > > > Repository: aurora > > > Description > ------- > > Using the new `hold_offers_forever` option, it is possible for the > `staticallyBannedOffers` to grow very large in size as we never release > offers. > > As an alternative to https://reviews.apache.org/r/63121/, I propose changing > `staticallyBannedOffers` into a LRU cache which expires entries after > `min_offer_hold_time` + `offer_hold_jitter_window` (referred to as > `maxOfferHoldTime`), while also taking an option for a maximum size for the > cache. I believe that this approach has a couple of benefits: > > 1. The current behavior of `staticallyBannedOffers` is (kinda) preserved. > Entries will no longer be removed when the offer is used, but they will be > removed within `maxOfferHoldTime`. This means cluster operators will not have > to think about the new `offer_static_ban_cache_max_size` if they aren't > affected by the memory leak now. > 2. Cluster operators that use Aurora as a single framework and hold offers > indefinitely can cap the size of the cache to avoid the memory leak. > 3. Using an LRU cache greatly benefits quickly recurring crons and job > updates. > > > Diffs > ----- > > src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java > e0ec793cad05674fb4b65246a6d153521b28b914 > > src/main/java/org/apache/aurora/scheduler/config/validators/NotNegativeNumber.java > PRE-CREATION > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java > 7011a4cc9eea827cdd54698aaed1a653774bce7f > src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java > e060f2073dce4d2486d1ee2bfd873fe75167c6d0 > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java > e6b2c55e4f33f9a603157236766425edcaff10e7 > src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java > 244422c6b3ac6a2f7b4690cdc0f3440170b2567f > src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java > 3d38a5929a0255a980db30eeca0f059a2029f321 > > > Diff: https://reviews.apache.org/r/63199/diff/5/ > > > Testing > ------- > > Unit tests pass. > Deployed on a scale test cluster and saw that a) `staticallyBannedOffers` > memory leak fixed with correct options and b) lowered assignment time for > quickly recurring crons and rescheduled jobs. > > > Thanks, > > Jordan Ly > >