Raistmer wrote

> Why BOINC should trash whole GPU cache if GPU was not found at current
> BOINC start ?
> Why tasks just can't be put in waiting or suspended (to not participate in
> scheduling decisions) state until GPU will be available again ?

Although this code was introduced in response to a trac ticket which 
Raistmer himself raised ;-))....

I think it was a really clumsy 'first fix' while developers were busy 
elsewhere.

In particular, it's very bad to silently discard tasks without reporting 
them back to the server for re-issue and general housekeeping. Doesn't 
matter how you flag them (abort/detach/error - anything will do for the time 
being), but let the server know they're not coming back.

Then, yes, I agree that some sort of temporary 'limbo' status is 
appropriate, in case it's a simple driver update going wrong. If that's the 
problem, normal GPU service is likely to be resumed in half-an-hour or so - 
we can afford to hang onto them for that long.

If it's a longer term outage - say a GPU failed and needed to be RMA'd, or 
the user has switched loyalty from NVIDIA to ATI - there are other 
possibilities:

- wait for user to abort tasks manually
- auto abort after a time interval (I suggest one or two days)
- keep indefinitely and allow to pass deadline (though I hate to think what 
this would do for work fetch)
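
To make the options concrete, here is a rough sketch of the decision I have in mind. It is not BOINC client code: the names (Action, decide, AUTO_ABORT_SECONDS) are made up for illustration, and the two-day limit is just my suggested figure from the list above. Passing auto_abort_after=None gives the "wait for the user" / "keep indefinitely" behaviour.

```python
from enum import Enum
from typing import Optional

# Assumed value: the upper end of the "one or two days" suggested above.
AUTO_ABORT_SECONDS = 2 * 24 * 3600


class Action(Enum):
    RUN = "run"                            # GPU is present: schedule as usual
    HOLD = "hold"                          # limbo: keep the task, don't schedule it
    ABORT_AND_REPORT = "abort_and_report"  # give up, and tell the server


def decide(gpu_present: bool,
           missing_for: float,
           auto_abort_after: Optional[float] = AUTO_ABORT_SECONDS) -> Action:
    """Choose what to do with a GPU task (hypothetical policy sketch).

    missing_for is how many seconds the GPU has been unavailable.
    auto_abort_after=None disables automatic aborts altogether.
    """
    if gpu_present:
        return Action.RUN
    if auto_abort_after is not None and missing_for >= auto_abort_after:
        # Never discard silently: the abort is reported so the server
        # can re-issue the work.
        return Action.ABORT_AND_REPORT
    return Action.HOLD
```

A half-hour driver glitch gives decide(False, 30 * 60), which stays in HOLD, so nothing is lost if the GPU comes back. Whichever limit is chosen, the key point is that the abort path reports to the server instead of trashing the tasks quietly.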

I saw a parallel with this when you implemented the global use/don't use
GPU switches recently: I think the solution chosen then for users switching
GPUs off was exactly right. 'No New Tasks' leaves it open for the user to
choose an immediate manual abort as an alternative; automatic aborts *don't*
allow 'run until dry' as an alternative. I'm all in favour of user choice
and user control.

