> On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java > > Lines 321 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865170#file1865170line327> > > > > `expireAfterWrite` doesn't result in LRU. I think you mean > > `expireAfterAccess`.
LRU is maintained if the operator specifies a maximum size (https://github.com/google/guava/wiki/CachesExplained#size-based-eviction). I use `expireAfterWrite` because offers will not be valid after `minOfferHoldTime`. I believe both strategies can exist simultaneously. > On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java > > Lines 42-43 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865171#file1865171line42> > > > > How about a `CacheBuilderSpec` to bundle these? I think that using a `CacheBuilderSpec` would cause some odd string manipulation from our argument names to the ones `CacheBuilderSpec` uses. I could just build the `CacheBuilder` in `OfferModule`/`OfferSettings` and pass that into `OfferManager` instead. Which do you think would be cleaner? > On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java > > Lines 46-50 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865171#file1865171line46> > > > > We've developed a > > [stutter](https://michaelwhatcott.com/go-code-that-stutters/) over time in > > this class. Can you do some cleanup and s/offer// on parameters, fields, > > and methods? Done. > On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java > > Lines 104 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865172#file1865172line104> > > > > `0` is a valid size for guava's `CacheBuilder#maximumSize`: > > > When size is zero, elements will be evicted immediately after being > > loaded into the cache. This can be useful in testing, or to disable caching > > temporarily without a code change. > > > > Let's allow it, and introduce a `NotNegativeNumber` counterpart to > > `PositiveNumber`. Added. > On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java > > Lines 109 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865172#file1865172line109> > > > > s/Long/long/ Done. > On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java > > Line 178 (original), 188-189 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865172#file1865172line188> > > > > Is this needed only because of the default `offerStaticBanCacheMaxSize > > = Long.MAX_VALUE`? The double eviction strategy intuitively seems > > unnecessary. I suggest a finite default for `offerStaticBanCacheMaxSize` > > (say, 1k) and no time expiry. Spoke offline, but added a comment to explain the reasoning behind the dual eviction strategy. > On Oct. 27, 2017, 10:03 p.m., Bill Farner wrote: > > src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java > > Lines 247-249 (original), 253-255 (patched) > > <https://reviews.apache.org/r/63199/diff/4/?file=1865174#file1865174line253> > > > > IMHO this test case is no longer interesting now that we have to punch > > through with `cleanupStaticBans()`. The test now reads as "test that > > clearing a cache clears the cache". I suggest removing the test case. `cleanupWithStaticBans()` only ensures that the `size()` method returns the correct value by not counting evicted entries (i.e. the entry we just expired in the test will not be counted). I think this test is still useful as it ensures the eviction on expiration strategy works as intended. I added a comment to `cleanupWithStaticBans()` for clarity. - Jordan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/63199/#review189488 ----------------------------------------------------------- On Oct. 30, 2017, 6:14 p.m., Jordan Ly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/63199/ > ----------------------------------------------------------- > > (Updated Oct. 30, 2017, 6:14 p.m.) > > > Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, > Stephan Erb, and Bill Farner. > > > Repository: aurora > > > Description > ------- > > Using the new `hold_offers_forever` option, it is possible for the > `staticallyBannedOffers` to grow very large in size as we never release > offers. > > As an alternative to https://reviews.apache.org/r/63121/, I propose changing > `staticallyBannedOffers` into a LRU cache which expires entries after > `min_offer_hold_time` + `offer_hold_jitter_window` (referred to as > `maxOfferHoldTime`), while also taking an option for a maximum size for the > cache. I believe that this approach has a couple of benefits: > > 1. The current behavior of `staticallyBannedOffers` is (kinda) preserved. > Entries will no longer be removed when the offer is used, but they will be > removed within `maxOfferHoldTime`. This means cluster operators will not have > to think about the new `offer_static_ban_cache_max_size` if they aren't > affected by the memory leak now. > 2. Cluster operators that use Aurora as a single framework and hold offers > indefinitely can cap the size of the cache to avoid the memory leak. > 3. Using an LRU cache greatly benefits quickly recurring crons and job > updates. > > > Diffs > ----- > > src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java > e0ec793cad05674fb4b65246a6d153521b28b914 > > src/main/java/org/apache/aurora/scheduler/config/validators/NotNegativeNumber.java > PRE-CREATION > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java > 7011a4cc9eea827cdd54698aaed1a653774bce7f > src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java > e060f2073dce4d2486d1ee2bfd873fe75167c6d0 > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java > e6b2c55e4f33f9a603157236766425edcaff10e7 > src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java > 244422c6b3ac6a2f7b4690cdc0f3440170b2567f > src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java > 3d38a5929a0255a980db30eeca0f059a2029f321 > > > Diff: https://reviews.apache.org/r/63199/diff/5/ > > > Testing > ------- > > Unit tests pass. > Deployed on a scale test cluster and saw that a) `staticallyBannedOffers` > memory leak fixed with correct options and b) lowered assignment time for > quickly recurring crons and rescheduled jobs. > > > Thanks, > > Jordan Ly > >
