[ 
http://opensource.atlassian.com/projects/roller/browse/ROL-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

linda skrocki closed ROL-1446.
------------------------------

       Resolution: Fixed
    Fix Version/s: 4.0

Allen fixed in 4.0.

> Task leasing causes scheduling inconsistencies
> ----------------------------------------------
>
>                 Key: ROL-1446
>                 URL: 
> http://opensource.atlassian.com/projects/roller/browse/ROL-1446
>             Project: Roller
>          Issue Type: Bug
>    Affects Versions: 3.1
>            Reporter: Allen Gilliland
>            Assignee: Roller Unassigned
>             Fix For: 4.0
>
>
> After a bit more poking around I have realized that some of the problems I've 
> seen with the task scheduling are actually being caused by the leasing process 
> we are using.  The root of the problem is that the task scheduling is not 
> properly synchronized with the leasing process and therefore scheduling drift 
> happens.
>
> An example.  Assume that a task is scheduled to run once per minute starting 
> 00:00:00.50.  This will mean that the subsequent run times for the task will 
> be 00:01:00.50, 00:02:00.50, etc, etc.  Now take into account the fact that 
> in the database the leasing time of a task is defined by the time the task 
> obtained a lease on db time, and that time is some amount of time after the 
> time the actual task was started.  So let's assume for a moment that it takes 
> 700ms to obtain a lease via the db.  This means that the time the db thinks a 
> task is run is different than the time the app thinks the task is run, and in 
> our particular example the db time is 700ms later, in the next second 
> (00:00:00.50 + 700ms = 00:00:01.20).  What this means is that when the 
> application runs the task the next time at 00:01:00.50 and tries to obtain a 
> new lease it will be refused because the db thinks the last run time for the 
> task was at 00:00:01.20 which is less than 60 seconds from 00:01:00.50.  So 
> this means that the additional time required to obtain a lease in the db can 
> actually cause the lease time to be off by 1 or more seconds and therefore 
> cause a subsequent run of the task to fail.
>
> I have seen this exact problem occur with jobs meant to run once daily where 
> the job runs at just after midnight, obtains a lease at 00:00:01.xxx seconds 
> and runs, and then the following day the task fails to run because the app 
> thinks that the interval time for the task has not yet elapsed.
>
> Sorting this out will require better alignment of the clocks and timestamps 
> stored in this process and this is the best option I can come up with right 
> now ...
>
> When a task successfully obtains a lease and runs it must keep track of the 
> exact time the task was first initiated, then when the task completes and 
> releases its lease it stores that time in db as the last time the lease was 
> acquired.  This would basically be a fairly simple attempt at properly 
> adjusting the lease time stored in the db so that it does not include the 
> additional amount of time required to process obtaining the lease.  So an 
> example would be that if a task is set to run hourly starting at 05:00:00 and 
> it obtains its lease at 05:00:01.20 then when the task completes we would 
> subtract the 1.20 seconds from the time stored in the db so that the db 
> properly reflects the time the task was run, not the time the lease was 
> obtained.
>
> I am sure there are other ways to better synchronize the multiple clocks 
> involved when doing clustered task scheduling, but at the end of the day it's 
> apparent that part of the solution is going to have to involve properly 
> accounting for the extra time that gets used up to obtain a lease so that 
> scheduling doesn't drift.
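
The drift and the proposed fix described in the issue can be sketched in Java
roughly as follows. This is only an illustration, not Roller's actual code:
the names LeaseDriftSketch, canLease and secondRunGranted are hypothetical,
the 700ms lease latency and one-minute interval come from the example in the
issue, and Instant.EPOCH is an arbitrary stand-in for midnight.

```java
import java.time.Duration;
import java.time.Instant;

public class LeaseDriftSketch {

    // Hypothetical stand-in for the db check: grant a lease only when a
    // full interval has elapsed since the stored last-lease time.
    static boolean canLease(Instant lastLeased, Instant now, Duration interval) {
        return lastLeased == null || !now.isBefore(lastLeased.plus(interval));
    }

    // Simulates two consecutive runs and reports whether the second run
    // is granted a lease.
    //   storeStartTime == false: the db keeps the time the lease was obtained.
    //   storeStartTime == true:  the db keeps the time the task was initiated
    //                            (the fix proposed in the issue).
    static boolean secondRunGranted(boolean storeStartTime) {
        Duration interval = Duration.ofMinutes(1);
        Duration leaseLatency = Duration.ofMillis(700);          // assumed cost of the db round trip

        Instant firstStart = Instant.EPOCH.plusMillis(500);      // stands in for 00:00:00.50
        Instant leaseObtained = firstStart.plus(leaseLatency);   // 00:00:01.20

        // What ends up in the db as the "last run" time.
        Instant stored = storeStartTime ? firstStart : leaseObtained;

        // Following the issue's description, the next attempt is judged
        // against the time the app starts the task again.
        Instant secondStart = firstStart.plus(interval);         // 00:01:00.50
        return canLease(stored, secondStart, interval);
    }

    public static void main(String[] args) {
        System.out.println("lease time stored: " + secondRunGranted(false)); // false
        System.out.println("start time stored: " + secondRunGranted(true));  // true
    }
}
```

With the lease time stored, only 59.3 seconds appear to have elapsed at the
second attempt, so the lease is refused. With the start time stored, exactly
60 seconds have elapsed and the lease is granted.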

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://opensource.atlassian.com/projects/roller/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

