> On Oct. 20, 2017, 3:16 p.m., Bill Farner wrote:
> > Capturing some offline analysis/discussion - under certain conditions this
> > patch might do more harm than good. In clusters with very high churn rates
> > (e.g. services being rescheduled frequently, high cron volume), static bans
> > that outlive scheduling rounds can prevent a significant amount of
> > redundant scheduling work. Jordan is experimenting with using an LRU cache
> > for static bans instead, which would allow us to mitigate the memory leak
> > while still avoiding redundant work.
> >
> > I suggest we hold on this patch until Jordan's analysis yields results.

I tested an LRU cache at scale and found that it provided a noticeable
reduction in assignment time compared with removing bans at the end of
scheduling rounds, for the reasons listed in Bill's comment (services being
rescheduled frequently, high cron volume).

I posted a review with my implementation: https://reviews.apache.org/r/63199/

However, it requires the addition of another option, which this patch does
not.


- Jordan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63121/#review188843
-----------------------------------------------------------


On Oct. 19, 2017, 12:04 a.m., Bill Farner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63121/
> -----------------------------------------------------------
>
> (Updated Oct. 19, 2017, 12:04 a.m.)
>
>
> Review request for Aurora and Jordan Ly.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> This alleviates a (slow) memory leak in static offer bans, as entries are
> only removed when an offer is removed. If a pending task group is depleted
> (either by fully scheduling the group, or terminating the job), the entry
> remains. This issue is exacerbated when offers are held for a longer
> duration, as is proposed in https://reviews.apache.org/r/62956/.
>
>
> Diffs
> -----
>
>   src/main/java/org/apache/aurora/scheduler/events/PubsubEvent.java 0637eb7f85125cf70b588d56fa7dc88130947837
>   src/main/java/org/apache/aurora/scheduler/offers/OfferManager.java e8334310a2a46a0ccb09ee6e4122c515892d3996
>   src/main/java/org/apache/aurora/scheduler/scheduling/TaskGroups.java 2d3492d05986ef65519fd7a8c71396d055b6881f
>   src/test/java/org/apache/aurora/scheduler/http/AbstractJettyTest.java 6e77857fcf209d3fe70fbd30cfd8484ea0414ee2
>   src/test/java/org/apache/aurora/scheduler/offers/OfferManagerImplTest.java 2cfdc090ff75a63111ae146c9fe7b3542e7ac83f
>   src/test/java/org/apache/aurora/scheduler/scheduling/TaskGroupsTest.java b88d5f13889b81ba4b0171efaf6c759d23976a39
>
>
> Diff: https://reviews.apache.org/r/63121/diff/2/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Bill Farner
>
>
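
For readers following the thread, here is a minimal sketch of the general idea
of keeping static bans in a bounded LRU structure, so that stale entries are
eventually evicted (bounding memory) while recent bans still short-circuit
redundant scheduling work. This is not the implementation posted in r/63199 or
the code in OfferManager; the class name LruStaticBans, its methods, and the
string keys are hypothetical and use only the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical bounded store of (offer, task group) bans with LRU eviction.
 * Illustration only; names and key format are assumptions, not Aurora APIs.
 */
final class LruStaticBans {
  private final Map<String, Boolean> bans;

  LruStaticBans(final int maxEntries) {
    // accessOrder=true keeps entries ordered from least to most recently used.
    this.bans = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
        // Evict the least recently used ban once the bound is exceeded.
        return size() > maxEntries;
      }
    };
  }

  // Assumed key format; real code would likely use a proper pair type.
  private static String key(String offerId, String groupKey) {
    return offerId + "|" + groupKey;
  }

  /** Records that the offer cannot satisfy the task group. */
  synchronized void ban(String offerId, String groupKey) {
    bans.put(key(offerId, groupKey), Boolean.TRUE);
  }

  /** Returns whether the pair is banned, refreshing its recency if so. */
  synchronized boolean isBanned(String offerId, String groupKey) {
    // get() counts as an access in access-ordered mode; containsKey() does not.
    return bans.get(key(offerId, groupKey)) != null;
  }

  synchronized int size() {
    return bans.size();
  }
}
```

With a bound of 2, banning three distinct pairs evicts whichever of the first
two was used least recently. The trade-off is that an evicted ban may have to
be rediscovered by a later scheduling attempt, and the bound itself is the
kind of extra option mentioned above.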
