The trusted computers would not have any quotas applied at all.
jm7
-----<[email protected]> wrote: -----
To: "Josef W. Segur" <[email protected]>, <[email protected]>
From: Raistmer <[email protected]>
Sent by: <[email protected]>
Date: 01/07/2011 04:22PM
cc: <[email protected]>, <[email protected]>
Subject: Re: [boinc_dev] Quota system inefficiency, something better?
This implies the same quota mechanism as used today, just with different
numbers.
For GPUs, the idle time will be long enough to make recovery from a
single episode of task trashing (it happens sometimes) very painful.
Maybe it would be better to change the quota system slightly to take the
history of host behavior into account: what percentage of invalid results
does this host have? The speed of quota recovery could then be based on
that percentage.
There was an idea of "trusted" hosts, in the sense of not doing task
replication when a task is sent to a "trusted" host.
Though I think that is a bad idea, the "trusted host" concept could still
be used for quota-related calculations, IMHO.
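The invalid-percentage idea could be sketched roughly as follows (a hypothetical Python sketch, not actual BOINC server code; the function name, the linear scaling, and the `base_increment` parameter are all assumptions):

```python
def recovery_increment(valid_count, invalid_count, base_increment=1.0):
    """Scale the per-result quota recovery by the host's validity history.

    A host with no invalid results recovers at the full base rate; a host
    whose history is mostly invalid results recovers much more slowly.
    """
    total = valid_count + invalid_count
    if total == 0:
        return base_increment  # no history yet: give the benefit of the doubt
    invalid_fraction = invalid_count / total
    return base_increment * (1.0 - invalid_fraction)
```

A host with 50% invalids would then recover quota at half the normal speed, so a chronically misbehaving host climbs back much more slowly than a healthy one.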
----- Original Message -----
From: [email protected]
To: Josef W. Segur
Cc: [email protected] ; [email protected]
Sent: Friday, January 07, 2011 10:22 PM
Subject: Re: [boinc_dev] Quota system inefficiency, something better?
I believe that doubling is way too fast. My thought is that when you
return a good result, you should get a replacement and your quota should
go up by one. Yes, recovery will be slower, but tomorrow, at the
beginning of the day, you will be able to fetch as many tasks as you
returned successfully today. If you are always on, you will still be
able to keep your CPUs busy. If you are recovering from a temporary
problem, you can either nursemaid your computer through a couple of days
or just accept a slightly idle CPU for a while if you are not always
attached to the internet.
jm7
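The difference between the current doubling scheme and the +1-per-valid-result proposal above can be sketched like this (a hypothetical Python sketch; the cap value and function names are assumptions, not BOINC code):

```python
MAX_QUOTA = 100  # assumed per-day cap for the sketch

def recover_doubling(quota):
    """Current behavior discussed in the thread: each reported success
    doubles the daily quota, up to the cap."""
    return min(MAX_QUOTA, quota * 2)

def recover_linear(quota):
    """Proposed behavior: each good result raises the quota by one, so
    tomorrow's quota roughly equals today's count of successful returns."""
    return min(MAX_QUOTA, quota + 1)
```

Starting from a quota of 1, three successes give 8 under doubling but only 4 under the linear rule, which is exactly the slower recovery the proposal accepts.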
"Josef W. Segur"
<jse...@westelcom
.com> To
Sent by: <[email protected]>
<boinc_dev-bounce cc
[email protected]
u> Subject
Re: [boinc_dev] Quota system
inefficiency, something better?
01/07/2011 02:04
PM
Although the quota system is not limiting those hosts enough,
they get frequent invalidations. The consecutive valid count is
usually zero, though I've seen a few cases of single digit
non-zero counts. So with s...@h gpu_multiplier set at 8, those hosts
are limited to not much more than 800 tasks per day per FERMI
GPU. My most recent check shows 19 hosts, though a few have two
Fermi cards so the GPU count is about 23. So the size of the
problem is reduced to under 2% of s...@h Enhanced tasks.
s...@h Enhanced tasks account for about 64% of the download
bandwidth, the much larger Astropulse v505 tasks get the
remainder. That makes the bandwidth impact on the order of 1%
currently.
Perhaps something like keeping a consecutive invalid count and
not doubling on reported "Success" when that count exceeds the
consecutive valid count would be better. But that would make
recovery from a temporary problem much slower.
--
Joe
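The consecutive-invalid gating suggested above could be sketched as below (a hypothetical Python sketch; the data layout, the slow decay of the invalid count on success, and the penalty on failure are all assumptions made for illustration):

```python
MAX_QUOTA = 1000  # assumed cap for the sketch

def on_result_reported(host, success):
    """host: dict with 'quota', 'consecutive_valid', 'consecutive_invalid'.

    Double the quota on success only while invalids do not outnumber
    valids; decaying (rather than resetting) the invalid count makes
    recovery from a bad streak deliberately slow, as noted above.
    """
    if success:
        if host["consecutive_invalid"] <= host["consecutive_valid"]:
            host["quota"] = min(MAX_QUOTA, host["quota"] * 2)
        host["consecutive_valid"] += 1
        host["consecutive_invalid"] = max(0, host["consecutive_invalid"] - 1)
    else:
        host["consecutive_invalid"] += 1
        host["consecutive_valid"] = 0
        host["quota"] = max(1, host["quota"] - 1)  # assumed penalty
```

A host coming off five straight invalids needs several good results before doubling resumes, which illustrates the slower-recovery trade-off mentioned above.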
On Fri, 07 Jan 2011 06:41:07 -0500, Richard Haselgrove wrote:
> OK, back-of-envelope error - that 10% overstates the size of the
problem.
>
> But, more realistically, 20 hosts @ 2,000 tasks wasted each per day is
close to 4% of all multibeam tasks issued.
>
> ----- Original Message -----
> From: Richard Haselgrove
>
>
> There is an updated version of that list at
[1]http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1062822
>
> Given the limited number of hosts involved, in the short term the
wastage (something of the order of 10% of SETI's limited download
bandwidth) maybe could/should be stemmed by invoking
[2]http://boinc.berkeley.edu/trac/wiki/BlackList for the 19 hosts
identified.
>
> This approach has been used successfully by Milo at CPDN, although
there is a suggestion that something (perhaps the replacement of
max_results_day with per_app_version equivalents) broke the blacklist
facility after he started using it - quotas which had been set to -1
manually became positive again. The SETI situation would be a useful test
of this tool while a longer-term automatic solution is sought.
>
> ----- Original Message -----
> From: Raistmer
>
>
> Hello.
>
> Looks like the current quota system implementation can't prevent
wasting project resources in the case of a "partially" broken host.
> For example, a host with an anonymous platform running a
FERMI-incompatible CUDA app on SETI.
> It will produce incorrect overflows almost always, except for a few
specific ARs that will be processed correctly and receive validation.
> This small number of validations + the "GPU" status of the app (GPUs
have greatly relaxed limits) allows continuous task trashing. The
current quota system implementation can't prevent massive task trashing
in this situation.
>
> But now more historical info about host behavior is stored on the
servers, on a per-app-version basis.
> Maybe something new can be implemented that takes into account not
only the last successful validation but the host's history too?
> The test cases are known; the SETI community already has a list of
such bad-behaving hosts:
[3]http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1061788
>
> The aim should be to reduce their throughput to 1 task per day for the
NV GPU app until their owners reinstall the GPU app.
_______________________________________________
boinc_dev mailing list
[email protected]
[4]http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
References
1. http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1062822
2. http://boinc.berkeley.edu/trac/wiki/BlackList
3. http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1061788
4. http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
5. http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
6. http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev