There is another point where a task can be killed.  If it keeps restarting from 
the same checkpoint over and over again, it is not making progress, and 
should be killed.  I would suggest that if a task restarts from the exact same 
checkpoint 10 times, it is never going to get past it.  Of course, if the next 
checkpoint is reached, then the task is making progress, and the counter needs 
to be reset back to 0.
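
To make the counting rule concrete, here is a rough sketch.  The class and 
method names are made up for illustration and are not part of the BOINC 
client; the threshold of 10 is just the number suggested above.

```python
from typing import Optional

# Threshold suggested above; a real client might make this configurable.
MAX_SAME_CHECKPOINT_RESTARTS = 10


class StallDetector:
    """Hypothetical helper: counts restarts from the same checkpoint."""

    def __init__(self) -> None:
        self.last_checkpoint: Optional[str] = None
        self.restarts_at_checkpoint = 0

    def on_checkpoint(self, checkpoint_id: str) -> None:
        """Call when the task writes a checkpoint."""
        if checkpoint_id != self.last_checkpoint:
            # A new checkpoint means progress, so the counter goes back to 0.
            self.last_checkpoint = checkpoint_id
            self.restarts_at_checkpoint = 0

    def on_restart(self) -> bool:
        """Call when the task restarts from its last checkpoint.

        Returns True once the task has restarted from the same
        checkpoint often enough that it should be killed.
        """
        self.restarts_at_checkpoint += 1
        return self.restarts_at_checkpoint >= MAX_SAME_CHECKPOINT_RESTARTS
```

A task that restarts 9 times and then reaches the next checkpoint starts 
over with a count of 0, so only a task that is truly stuck gets killed.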

It would be nice if the client contacted the server occasionally (perhaps at a 
server-specified interval with a default of once per day, and also on any other 
contact) whenever any tasks from that project on the client are past deadline, 
and asked for instructions (i.e. continue, abort, or set a new deadline).  The 
client would have to treat a non-response as an instruction to keep the current 
behavior (i.e. abort anything that has not started and leave anything that is 
running alone).  This would let the server know that the client is really 
working on the problem, so that it does not create and send a replacement task 
unnecessarily.  It would also allow the server to kill anything that is no 
longer needed.
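
The client-side decision could look roughly like the sketch below.  The 
names are invented for illustration; this is not an existing BOINC 
scheduler request or reply format, only the fallback rule described above.

```python
from enum import Enum
from typing import Optional


class Instruction(Enum):
    """Hypothetical server answers for a task that is past deadline."""
    CONTINUE = "continue"
    ABORT = "abort"
    NEW_DEADLINE = "new_deadline"


def resolve_overdue_task(reply: Optional[Instruction], started: bool) -> Instruction:
    """Decide what to do with one overdue task.

    reply is the server's instruction, or None if the server did not
    respond.  started says whether the task has begun running.
    """
    if reply is not None:
        # The server answered, so do what it says.
        return reply
    # No response: keep the current behavior, i.e. abort tasks that
    # have not started and let running tasks continue.
    return Instruction.CONTINUE if started else Instruction.ABORT
```

For example, resolve_overdue_task(None, started=False) gives 
Instruction.ABORT, while resolve_overdue_task(None, started=True) gives 
Instruction.CONTINUE.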

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Josef W. Segur
Sent: Saturday, December 22, 2012 4:10 PM
To: Raistmer; David Anderson; [email protected]
Subject: Re: [boinc_dev] Unrecoverable 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED...

On Sat, 22 Dec 2012 09:50:55 -0500, Raistmer <[email protected]> wrote:

>> If there were a way for a user to reset the host averages for an
>> individual app_version, that would be much better. Something like a
>> "reset" button by each app_version on a host's application details page,
>> with a confirm/cancel dialog explaining the action, might be suitable.
>> (The "reset" would of course have to be only available to the account
>> owner).
>
> It will not fix design flaw IMHO. BOINC main design (and BOINC devs always
> insisted on that) - to allow automatic operation. Such "way for user" would
> be workaround, but not proper fix. BOINC should not kill task that makes
> progress. Period. The single point when such task can be killed - when it
> is past its deadline already. Then kill and recompute estimate to avoid work
> allocation, if needed.
> If any BOINC estimates tell that task too slow - adapt estimates. If task
> completion progress ticks - DON'T KILL task. That's quite simple.
>
I do agree that monitoring progress would be a good idea for many project 
applications, which is why I submitted the "check_progress option" to this list 
in April 2011. Had it been accepted and enabled for setiathome_v7 applications, 
the rsc_fpops_bound could have been set much higher with almost no chance of a 
host ever reaching the exit time limit.

I also agree that the suggested reset mechanism is a workaround. But something 
like that is needed because the et average is based on an assumption of no 
significant hardware/software changes, and IMO it's far better to give users a 
method of correcting it rather than having users quitting in disgust. OTOH, I 
would not object to a complete redesign of the mechanism which achieves 
stability plus quick adaptation to changed conditions.
-- 
                                                          Joe
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.