----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/63199/#review189488 -----------------------------------------------------------
src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java Lines 321 (patched) <https://reviews.apache.org/r/63199/#comment266579> `expireAfterWrite` doesn't result in LRU. I think you mean `expireAfterAccess`. src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java Lines 42-43 (patched) <https://reviews.apache.org/r/63199/#comment266577> How about a `CacheBuilderSpec` to bundle these? src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java Lines 46-50 (patched) <https://reviews.apache.org/r/63199/#comment266578> We've developed a [stutter](https://michaelwhatcott.com/go-code-that-stutters/) over time in this class. Can you do some cleanup and s/offer// on parameters, fields, and methods? src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java Lines 104 (patched) <https://reviews.apache.org/r/63199/#comment266582> `0` is a valid size for guava's `CacheBuilder#maximumSize`: > When size is zero, elements will be evicted immediately after being loaded into the cache. This can be useful in testing, or to disable caching temporarily without a code change. Let's allow it, and introduce a `NotNegativeNumber` counterpart to `PositiveNumber`. src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java Lines 109 (patched) <https://reviews.apache.org/r/63199/#comment266583> s/Long/long/ src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java Line 178 (original), 188-189 (patched) <https://reviews.apache.org/r/63199/#comment266584> Is this needed only because of the default `offerStaticBanCacheMaxSize = Long.MAX_VALUE`? The double eviction strategy intuitively seems unnecessary. I suggest a finite default for `offerStaticBanCacheMaxSize` (say, 1k) and no time expiry. src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java Lines 247-249 (original), 253-255 (patched) <https://reviews.apache.org/r/63199/#comment266585> IMHO this test case is no longer interesting now that we have to punch through with `cleanupStaticBans()`. The test now reads as "test that clearing a cache clears the cache". I suggest removing the test case. - Bill Farner On Oct. 20, 2017, 10:53 p.m., Jordan Ly wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/63199/ > ----------------------------------------------------------- > > (Updated Oct. 20, 2017, 10:53 p.m.) > > > Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, > Stephan Erb, and Bill Farner. > > > Repository: aurora > > > Description > ------- > > Using the new `hold_offers_forever` option, it is possible for the > `staticallyBannedOffers` to grow very large in size as we never release > offers. > > As an alternative to https://reviews.apache.org/r/63121/, I propose changing > `staticallyBannedOffers` into a LRU cache which expires entries after > `min_offer_hold_time` + `offer_hold_jitter_window` (referred to as > `maxOfferHoldTime`), while also taking an option for a maximum size for the > cache. I believe that this approach has a couple of benefits: > > 1. The current behavior of `staticallyBannedOffers` is (kinda) preserved. > Entries will no longer be removed when the offer is used, but they will be > removed within `maxOfferHoldTime`. This means cluster operators will not have > to think about the new `offer_static_ban_cache_max_size` if they aren't > affected by the memory leak now. > 2. Cluster operators that use Aurora as a single framework and hold offers > indefinitely can cap the size of the cache to avoid the memory leak. > 3. Using an LRU cache greatly benefits quickly recurring crons and job > updates. > > > Diffs > ----- > > src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java > 5a9099bf9dd292249d72bc3a7604fbb3394f30ea > src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java > 7011a4cc9eea827cdd54698aaed1a653774bce7f > src/main/java/org/apache/aurora/scheduler/offers/OfferSettings.java > e060f2073dce4d2486d1ee2bfd873fe75167c6d0 > src/main/java/org/apache/aurora/scheduler/offers/OffersModule.java > e6b2c55e4f33f9a603157236766425edcaff10e7 > src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java > 5b502442163581daa4d7954b09c00bdc3680a726 > src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java > 6c8434e9cfe46ef63ff10c6f059ecb99981f29a2 > > > Diff: https://reviews.apache.org/r/63199/diff/4/ > > > Testing > ------- > > Unit tests pass. > Deployed on a scale test cluster and saw that a) `staticallyBannedOffers` > memory leak fixed with correct options and b) lowered assignment time for > quickly recurring crons and rescheduled jobs. > > > Thanks, > > Jordan Ly > >
