> On Feb. 12, 2017, 12:14 a.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java, line 62
> > <https://reviews.apache.org/r/56575/diff/1/?file=1630791#file1630791line62>
> >
> > It is worthwhile to note that we are moving from a workload that was spread over a duration to a bursty, instantaneous workload (saw-tooth like), which can potentially make the situation worse by causing a thundering herd at regular intervals.
> 
> Mehrdad Nurolahzade wrote:
>     That's a valid concern; testing can clarify it further.
>     
>     I agree that the existing algorithm offers better best- and average-case behavior (due to its scheduled pruning strategy). However, I still think the worst-case behavior of this implementation is better, for two reasons: (1) every task/job is evaluated only once, and (2) the first prune after a restart is similar to other prunes and is not burstier. The burst can be further tamed by reducing the pruning interval (e.g., to 5 minutes).
>     
>     I believe the key to getting this bursty workload under control is extending the `org.apache.aurora.scheduler.base.Query` abstraction. If we add something like `.limit(int)`, then we can control the maximum volume of tasks retrieved == load to be processed == garbage to be collected.
> 
> Stephan Erb wrote:
>     Have you considered using a control flow of the form:
>     
>         for job j in all jobs:
>             retrieve terminal tasks of j
>             do pruning for retrieved tasks
>     
>     This would result in lower peak memory consumption, as only a small portion of terminal tasks is worked on simultaneously. If you are concerned about heap pressure, this may be a favorable setup.
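[Editor's note: the per-job control flow suggested above could look roughly like the following minimal sketch. It uses a hypothetical, simplified in-memory model (the `Task` record and `Map`-based store are stand-ins, not Aurora's actual storage or `Query` API); the point is only that each iteration materializes one job's terminal tasks at a time, bounding peak memory by the largest single job.]

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical simplified model -- not Aurora's real API.
public class PerJobPruner {
    record Task(String jobKey, boolean terminal, long terminalTimestampMs) {}

    // Iterate job by job: fetch only one job's terminal tasks at a time,
    // so peak memory is bounded by the largest single job rather than
    // by all terminal tasks in the cluster.
    static List<Task> prune(Map<String, List<Task>> store, long cutoffMs, int perJobRetained) {
        List<Task> pruned = new ArrayList<>();
        for (Map.Entry<String, List<Task>> job : store.entrySet()) {
            // Retrieve terminal tasks of this job only, newest first.
            List<Task> terminal = job.getValue().stream()
                .filter(Task::terminal)
                .sorted(Comparator.comparingLong(Task::terminalTimestampMs).reversed())
                .collect(Collectors.toList());
            // Keep the newest N per job; prune older ones past the age cutoff.
            for (int i = 0; i < terminal.size(); i++) {
                Task t = terminal.get(i);
                if (i >= perJobRetained && t.terminalTimestampMs() < cutoffMs) {
                    pruned.add(t);
                }
            }
        }
        return pruned;
    }

    public static void main(String[] args) {
        Map<String, List<Task>> store = Map.of(
            "jobA", List.of(new Task("jobA", true, 10), new Task("jobA", true, 20),
                            new Task("jobA", false, 0)),
            "jobB", List.of(new Task("jobB", true, 5)));
        // cutoff=15, retain 1 per job: only jobA's task at t=10 is pruned.
        System.out.println("pruned=" + prune(store, 15, 1).size()); // prints "pruned=1"
    }
}
```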
I originally did, but then dropped it fearing the overhead. In hindsight, I like this alternative better; in addition to reducing heap pressure, it also eliminates the concern regarding `MemTaskStore` full scans. I will submit a new patch following this design.

- Mehrdad


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56575/#review165260
-----------------------------------------------------------


On Feb. 11, 2017, 3:12 p.m., Mehrdad Nurolahzade wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56575/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2017, 3:12 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Bugs: AURORA-1837
>     https://issues.apache.org/jira/browse/AURORA-1837
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch addresses efficiency issues in the current implementation of `TaskHistoryPruner`. The new design is similar to that of `JobUpdateHistoryPruner`: (a) instead of registering a `DelayExecutor` run upon terminal task state transitions, it runs at preconfigured intervals, finds all terminal-state tasks that meet the pruning criteria, and deletes them; (b) it makes the initial task history pruning delay configurable so that pruning does not hamper the scheduler upon start.
> 
> The new design addresses the following two efficiency problems:
> 
> 1. Upon scheduler restart/failure, the in-memory state of task history pruning scheduled with `DelayExecutor` is lost. `TaskHistoryPruner` learns about these dead tasks upon restart when the log is replayed. These expired tasks are picked up by the second call to `executor.execute()`, which performs job-level pruning immediately (i.e., without delay).
> Hence, most task history pruning happens after scheduler restarts and can severely hamper scheduler performance (or cause consecutive fail-overs on test clusters when we run load tests against the scheduler).
> 
> 2. Expired tasks can be picked up for pruning multiple times. The asynchronous nature of `BatchWorker`, which is used to process task deletions, introduces some delay between delete enqueue and delete execution. As a result, tasks already queued for deletion in a previous evaluation round might get picked up, evaluated, and enqueued for deletion again. This is evident in the `tasks_pruned` metric, which reports numbers much higher than the actual number of expired tasks deleted.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/base/Query.java c76b365f43eb6a3b9b0b63a879b43eb04dcd8fac 
>   src/main/java/org/apache/aurora/scheduler/pruning/PruningModule.java 735199ac1ccccab343c24471890aa330d6635c26 
>   src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java f77849498ff23616f1d56d133eb218f837ac3413 
>   src/test/java/org/apache/aurora/scheduler/pruning/TaskHistoryPrunerTest.java 14e4040e0b94e96f77068b41454311fa3bf53573 
> 
> Diff: https://reviews.apache.org/r/56575/diff/
> 
> 
> Testing
> -------
> 
> Manual testing under Vagrant
> 
> 
> Thanks,
> 
> Mehrdad Nurolahzade
> 
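[Editor's note: the `.limit(int)` idea raised earlier in this thread -- capping how many expired tasks a single pruning round retrieves so that interval-based pruning cannot produce an unbounded burst -- can be illustrated with the following minimal sketch. It is plain Java over a list of timestamps, not Aurora's `Query` abstraction; the method name and parameters are hypothetical.]

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch: bound the volume of one pruning round.
public class LimitedPruneRound {
    // Return at most `limit` expired terminal-task timestamps, oldest first.
    // Bounding the batch caps load per round == garbage to be collected,
    // at the cost of spreading a large backlog over several rounds.
    static List<Long> pruneRound(List<Long> terminalTimestamps, long cutoffMs, int limit) {
        return terminalTimestamps.stream()
            .filter(ts -> ts < cutoffMs)   // expired: older than the cutoff
            .sorted()                      // prune oldest first
            .limit(limit)                  // cap the burst for this round
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> tasks = List.of(4L, 1L, 100L, 3L, 2L);
        // Four tasks are expired, but only two are pruned this round.
        System.out.println(pruneRound(tasks, 50, 2)); // prints "[1, 2]"
    }
}
```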
