Re: [boinc_dev] 6.6.20 and work scheduling

Richard Haselgrove Sun, 26 Apr 2009 03:08:25 -0700

TarotApprentice wrote:

> Sometimes tasks take longer than expected. That may then put other tasks 
> into deadline pressure. Sure I agree there is no reason to check every 60 
> seconds. I think 5 mins is enough to allow for this scenario assuming some 
> other event didn't already trigger the checking.


I've always been slightly perplexed by BOINC's design here. In complete 
contrast to the hyperactive checking and re-checking that we're discussing 
on this list, BOINC makes no allowance for tasks which are taking longer 
than expected. Instead, it only takes account of tasks which ***have 
taken*** - past tense - longer than expected: i.e. the calculation is only 
done at task completion, and stored as Duration Correction Factor.

I can hear the groan from Berkeley already: when else could it be done? But 
I suggest one incontrovertible case: if a task ***has already*** taken 
_longer_ than its due time, according to fpops_est, DCF and host/resource 
speed, then it sure as heck isn't going to ***complete*** in _under_ the 
expected time. If we're going to do incremental testing and checking, it 
makes sense to start nudging DCF upwards as soon as that "passed initial (or 
current) estimate" point is reached, rather than, as at present, jumping it 
up on task completion.
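
To make the suggestion concrete, here is a minimal sketch in Python. It is 
not BOINC client code: the function names, and the simplified estimate of 
fpops_est / flops * DCF, are assumptions made for illustration only.

```python
def estimated_runtime(fpops_est, flops, dcf):
    """Estimated run time in seconds: nominal time scaled by DCF (simplified)."""
    return fpops_est / flops * dcf


def nudged_dcf(fpops_est, flops, dcf, elapsed):
    """Hypothetical in-flight correction.

    If the task has already run longer than its current estimate, raise DCF
    just enough that the estimate equals the time already spent. DCF is never
    lowered here; that is left to the existing at-completion update.
    """
    nominal = fpops_est / flops      # run time if DCF were exactly 1.0
    floor = elapsed / nominal        # smallest DCF consistent with elapsed time
    return max(dcf, floor)


# Illustrative numbers only: a 10-hour estimate on a 1 GFLOPS host.
fpops_est, flops, dcf = 3.6e13, 1e9, 1.0
print(estimated_runtime(fpops_est, flops, dcf))
print(nudged_dcf(fpops_est, flops, dcf, elapsed=18000.0))  # still under estimate
print(nudged_dcf(fpops_est, flops, dcf, elapsed=54000.0))  # already past estimate
```

Called at each periodic check, something like this would leave DCF alone 
while a task is within its estimate and then raise it gradually once the 
estimate has been passed, instead of in one jump at completion.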

Tasks take longer than expected for a variety of reasons. The current 
SETI/CUDA/VLAR example has been rehearsed to death, but there are plenty of 
others: often with debug builds of new Beta apps, but also deliberate 
project decisions - was it the Einstein R3/R4 transition where they 
re-normalised fpops_est, which had become outdated through (project) code 
optimisation? Estimates can also be messed up by hardware problems, user 
micro-management, simple errors by projects - the list is endless.

There is also the current - hopefully short-term - situation where estimates 
for disparate apps share a common DCF, but should each converge to a 
separate, different, figure of their own. Roll on v6.8! But in the meantime, 
a longer runtime than expected gives the cache a big shake, often resulting 
in EDF, work-fetch inhibition, consequent debt imbalance etc. etc. But these 
only kick in ***on completion*** of what is almost by definition a 
long-running task. A lot of trouble can have started brewing in the 
meantime - and all this checking takes no account of it. 
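
As a rough illustration of the shared-DCF problem, the sketch below compares 
one shared figure with separate per-app figures. The update rule (jump up at 
once, drift down slowly) is a simplified assumption rather than the client's 
actual code, and the app names and ratios are invented.

```python
def update_dcf(dcf, ratio):
    """Assumed rule: jump up at once on an overrun, drift down slowly otherwise."""
    if ratio >= dcf:
        return ratio
    return 0.9 * dcf + 0.1 * ratio


# Hypothetical actual/estimated run-time ratios for two apps on one project.
true_ratio = {"cpu_app": 2.0, "cuda_app": 0.5}
completions = ["cpu_app", "cuda_app"] * 10

shared = 1.0
per_app = {"cpu_app": 1.0, "cuda_app": 1.0}
for app in completions:
    shared = update_dcf(shared, true_ratio[app])
    per_app[app] = update_dcf(per_app[app], true_ratio[app])

print("shared DCF:", round(shared, 3))
print("per-app DCF:", {k: round(v, 3) for k, v in per_app.items()})
```

Under these assumptions the shared figure stays pinned near the larger 
ratio, so the faster app's tasks are persistently over-estimated, while the 
separate figures each move towards their own app's ratio.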


