-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51763/#review149279
-----------------------------------------------------------


Ship it!




Master (f1e09a9) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 16, 2016, 9:04 p.m., Maxim Khutornenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51763/
> -----------------------------------------------------------
> 
> (Updated Sept. 16, 2016, 9:04 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This is the second part of the `BatchWorker` conversion work that moves cron 
> jobs to use non-blocking kill followups and reduces the number of trigger 
> threads. See https://reviews.apache.org/r/51759 for more background on the 
> `BatchWorker`.
> 
> #####Problem
> The current implementation of the cron scheduling relies on a large number of 
> threads (`cron_scheduler_num_threads`=100) to support cron triggering and 
> killing existing tasks according to `KILL_EXISTING` collision policy. This 
> creates large spikes of activities at synchronized intervals as users tend to 
> schedule their cron runs around similar schedules. Moreover, the current 
> implementation re-acquires write locks multiple times to deliver on 
> `KILL_EXISTING` policy. 
> 
> #####Remediation
> Trigger level batching is still done in a blocking way but multiple cron 
> triggers may be bundled together to share the same write transaction. Any 
> followups, however, are performed in a non-blocking way by relying on a 
> `BatchWorker.executeWithReplay()` and the `BatchWorkCompleted` notification. 
> In order to still ensure non-concurrent execution of a given job key trigger, 
> a token (job key) is saved within the trigger itself. A concurrent trigger 
> will bail if a kill followup is still in progress (token is set AND no entry 
> in `killFollowups` set exists yet).
> 
> #####Results
> The above approach allowed reducing the number of cron threads to 10 and 
> likely can be reduced even further. See https://reviews.apache.org/r/51759 
> for the lock contention results.
> 
> 
> Diffs
> -----
> 
>   commons/src/main/java/org/apache/aurora/common/util/BackoffHelper.java 
> 8e73dd9ebc43e06f696bbdac4d658e4b225e7df7 
>   commons/src/test/java/org/apache/aurora/common/util/BackoffHelperTest.java 
> bc30990d57f444f7d64805ed85c363f1302736d0 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java 
> c07551e94f9221b5b21c5dc9715e82caa290c2e8 
>   src/main/java/org/apache/aurora/scheduler/cron/quartz/CronModule.java 
> 155d702d68367b247dd066f773c662407f0e3b5b 
>   
> src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 
> 5c64ff2994e200b3453603ac5470e8e152cebc55 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 1c0a3fa84874d7bc185b78f13d2664cb4d8dd72f 
> 
> Diff: https://reviews.apache.org/r/51763/diff/
> 
> 
> Testing
> -------
> 
> All types of testing including deploying to test and production clusters.
> 
> 
> Thanks,
> 
> Maxim Khutornenko
> 
>

Reply via email to