-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review149279
-----------------------------------------------------------

Ship it!


Master (f1e09a9) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On Sept. 16, 2016, 9:04 p.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> -----------------------------------------------------------
> 
> (Updated Sept. 16, 2016, 9:04 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This is the second part of the `BatchWorker` conversion work that moves cron
> jobs to use non-blocking kill followups and reduces the number of trigger
> threads. See https://reviews.apache.org/r/51759 for more background on the
> `BatchWorker`.
> 
> #####Problem
> The current implementation of cron scheduling relies on a large number of
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and
> killing existing tasks according to the `KILL_EXISTING` collision policy.
> This creates large spikes of activity at synchronized intervals, as users
> tend to schedule their cron runs around similar schedules. Moreover, the
> current implementation re-acquires write locks multiple times to deliver on
> the `KILL_EXISTING` policy.
> 
> #####Remediation
> Trigger-level batching is still done in a blocking way, but multiple cron
> triggers may be bundled together to share the same write transaction. Any
> followups, however, are performed in a non-blocking way by relying on
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification.
> In order to still ensure non-concurrent execution of a given job key trigger,
> a token (job key) is saved within the trigger itself. A concurrent trigger
> will bail if a kill followup is still in progress (token is set AND no entry
> in the `killFollowups` set exists yet).
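
For readers following the Remediation paragraph, the token check it describes might be modeled roughly as below. This is only a sketch under assumptions: `CronTriggerGuard`, `Outcome`, `onTrigger` and `onKillCompleted` are hypothetical names made up for illustration, the token is kept in a plain set rather than inside the trigger, and none of this is taken from the actual `AuroraCronJob` diff.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the token check; NOT the actual AuroraCronJob code. */
final class CronTriggerGuard {
  enum Outcome { STARTED, KILL_PENDING, SKIPPED, RESUMED }

  // Job keys with a token set, i.e. a trigger waiting on a kill followup.
  // (Assumption: the patch stores the token in the trigger itself.)
  private final Set<String> tokens = new HashSet<>();
  // Job keys whose kill followup has completed (models `killFollowups`).
  private final Set<String> killFollowups = new HashSet<>();

  /** Called for every cron trigger of the given job key. */
  synchronized Outcome onTrigger(String jobKey, boolean mustKillExisting) {
    if (tokens.contains(jobKey)) {
      if (killFollowups.remove(jobKey)) {
        // Followup finished: clear the token and let this trigger proceed.
        tokens.remove(jobKey);
        return Outcome.RESUMED;
      }
      // Token is set AND no killFollowups entry yet: a kill is in progress.
      return Outcome.SKIPPED;
    }
    if (mustKillExisting) {
      // Save the token; the kill itself would run as a non-blocking followup.
      tokens.add(jobKey);
      return Outcome.KILL_PENDING;
    }
    return Outcome.STARTED;
  }

  /** Called when the non-blocking kill followup reports completion. */
  synchronized void onKillCompleted(String jobKey) {
    if (tokens.contains(jobKey)) {
      killFollowups.add(jobKey);
    }
  }
}
```

In this model a second trigger for the same job key is skipped until `onKillCompleted` has been called, after which the next trigger resumes and clears the token. The `synchronized` methods are a simplification chosen for the sketch, not a statement about how the patch synchronizes.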
> 
> #####Results
> The above approach allowed reducing the number of cron threads to 10, and
> the number can likely be reduced even further. See
> https://reviews.apache.org/r/51759 for the lock contention results.
> 
> 
> Diffs
> -----
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java bc30990d57f444f7d64805ed85c363f1302736d0
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java c07551e94f9221b5b21c5dc9715e82caa290c2e8
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 155d702d68367b247dd066f773c662407f0e3b5b
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 5c64ff2994e200b3453603ac5470e8e152cebc55
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> -------
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>