Although the quota system is not limiting those hosts enough, they
do get frequent invalidations. The consecutive valid count is
usually zero, though I've seen a few cases of single-digit
non-zero counts. So with s...@h gpu_multiplier set at 8, those hosts
are limited to not much more than 800 tasks per day per FERMI
GPU. My most recent check shows 19 hosts, though a few have two
Fermi cards, so the GPU count is about 23. That reduces the size of
the problem to under 2% of s...@h Enhanced tasks.

s...@h Enhanced tasks account for about 64% of the download
bandwidth; the much larger Astropulse v505 tasks take the
remainder. That puts the bandwidth impact on the order of 1%
currently.
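A quick check of that arithmetic (using the rough figures quoted above; these are estimates, not measured server data):

```python
# Back-of-envelope check of the figures above (assumed values).
fermi_gpus = 23                  # ~19 hosts, a few with two cards
tasks_per_gpu_per_day = 800      # rough ceiling with gpu_multiplier = 8
wasted = fermi_gpus * tasks_per_gpu_per_day   # ~18,400 tasks/day

enhanced_share = 0.64            # Enhanced's share of download bandwidth
task_share = 0.02                # "under 2%" of Enhanced tasks
bandwidth_impact = enhanced_share * task_share

print(wasted)                              # 18400
print(round(bandwidth_impact * 100, 1))    # 1.3 (% of total bandwidth)
```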

Perhaps something like keeping a consecutive invalid count and
not doubling on reported "Success" when that count exceeds the
consecutive valid count would be better. But that would make
recovery from a temporary problem much slower.
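Sketched very roughly (hypothetical field and function names, and an assumed quota ceiling; this is not the actual BOINC scheduler code), the rule would look something like:

```python
from dataclasses import dataclass

MAX_QUOTA = 100  # assumed per-app-version daily ceiling

@dataclass
class HostAppVersion:
    # Hypothetical per-host, per-app-version record; assumes the
    # server already tracks a consecutive valid count, plus the
    # suggested consecutive invalid count.
    max_jobs_per_day: int = 1
    consecutive_valid: int = 0
    consecutive_invalid: int = 0

def on_validation(hav: HostAppVersion, valid: bool) -> None:
    # Each count resets the other, mirroring how a consecutive
    # valid count is usually kept.
    if valid:
        hav.consecutive_valid += 1
        hav.consecutive_invalid = 0
    else:
        hav.consecutive_invalid += 1
        hav.consecutive_valid = 0

def on_success_reported(hav: HostAppVersion) -> None:
    # The suggested change: skip the usual doubling on "Success"
    # while the consecutive invalid count exceeds the consecutive
    # valid count.
    if hav.consecutive_invalid > hav.consecutive_valid:
        return
    hav.max_jobs_per_day = min(hav.max_jobs_per_day * 2, MAX_QUOTA)
```

A host trashing every task stays pinned at 1 task/day, while a healthy host still doubles up to the ceiling; the downside is as stated above, since a host recovering from a temporary problem must rebuild its consecutive valid count before its quota grows again.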
-- 
                                                              Joe


On Fri, 07 Jan 2011 06:41:07 -0500, Richard Haselgrove wrote:

> OK, back-of-envelope error - that 10% overstates the size of the problem.
>
> But, more realistically, 20 hosts @ 2,000 tasks wasted each per day is close 
> to 4% of all multibeam tasks issued.
>
>   ----- Original Message -----
>   From: Richard Haselgrove
>
>
>   There is an updated version of that list at 
> http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1062822
>
>   Given the limited number of hosts involved, in the short term the wastage 
> (something of the order of 10% of SETI's limited download bandwidth) could 
> perhaps be stemmed by invoking 
> http://boinc.berkeley.edu/trac/wiki/BlackList for the 19 hosts identified.
>
>   This approach has been used successfully by Milo at CPDN, although there is 
> a suggestion that something (perhaps the replacement of max_results_day with 
> per_app_version equivalents) broke the blacklist facility after he started 
> using it - quotas which had been set to -1 manually became positive again. 
> The SETI situation would be a useful test of this tool while a longer-term 
> automatic solution is sought.
>
>     ----- Original Message -----
>     From: Raistmer
>
>
>     Hello.
>
>     Looks like the current quota system implementation can't prevent project 
> resource waste in the case of a "partially" broken host.
>     For example, a host with an anonymous platform running a FERMI-incompatible 
> CUDA app on SETI.
>     It will produce incorrect overflows almost always, except for a few specific 
> ARs that will be processed correctly and receive validation.
>     This small number of validations plus the "GPU" status of the app (GPUs 
> have greatly relaxed limits) allows continuous task trashing. The current quota 
> system implementation can't prevent massive task trashing in this situation.
>
>     But now more historical info about host behavior is stored on the servers, 
> on a per-app-version basis.
>     Maybe something new can be implemented that takes into account not only the 
> most recent successful validation but the host's history too?
>     The testcases are known, SETI community has list of such bad-behaving 
> hosts already: 
> http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1061788
>
>     The aim should be to reduce their throughput to 1 task per day for the NV 
> GPU app until their owners reinstall the GPU app.
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.