I did a task check on one of the double-Fermi hosts this morning, and the last
24 hours of tasks ran onto the offset=1920 page. So the problem can be a bit
bigger than 800 tasks/GPU/day.
----- Original Message -----
From: Josef W. Segur
Although the quota system is not limiting those hosts enough,
they get frequent invalidations. The consecutive valid count is
usually zero, though I've seen a few cases of single digit
non-zero counts. So with s...@h gpu_multiplier set at 8, those hosts
are limited to not much more than 800 tasks per day per FERMI
GPU. My most recent check shows 19 hosts, though a few have two
Fermi cards so the GPU count is about 23. So the size of the
problem is reduced to under 2% of s...@h Enhanced tasks.
s...@h Enhanced tasks account for about 64% of the download
bandwidth, the much larger Astropulse v505 tasks get the
remainder. That makes the bandwidth impact on the order of 1%
currently.
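Joe's back-of-envelope figures can be checked with a quick calculation. All the
input numbers (≈23 GPUs, the 800-task cap, the 2% upper bound, the 64%
bandwidth share) are taken from the text above; nothing else is assumed:

```python
# Figures quoted from the message above.
gpus = 23                   # ~19 hosts, a few with two Fermi cards
tasks_per_gpu_day = 800     # effective cap with gpu_multiplier = 8

wasted_per_day = gpus * tasks_per_gpu_day   # ~18,400 tasks/day

enhanced_share = 0.02       # wasted tasks as a fraction of Enhanced tasks (upper bound)
enhanced_bandwidth = 0.64   # Enhanced tasks' share of download bandwidth

# Bandwidth impact = share of Enhanced tasks wasted * Enhanced bandwidth share
bandwidth_impact = enhanced_share * enhanced_bandwidth

print(wasted_per_day)       # 18400
print(bandwidth_impact)     # 0.0128, i.e. "on the order of 1%"
```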
Perhaps something like keeping a consecutive invalid count and
not doubling on reported "Success" when that count exceeds the
consecutive valid count would be better. But that would make
recovery from a temporary problem much slower.
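A minimal sketch of that suggestion, under a simplified quota model. The class
and method names here are illustrative only, not the actual BOINC scheduler
code; the doubling-on-success and halving-on-invalid rules are assumptions made
to show the mechanism:

```python
class HostAppVersion:
    """Illustrative per-(host, app version) quota record.

    Field names are made up for this sketch; they are not taken from
    the BOINC source.
    """

    def __init__(self, max_jobs_per_day=800):
        self.quota = 1
        self.max_quota = max_jobs_per_day
        self.consecutive_valid = 0
        self.consecutive_invalid = 0

    def on_reported_success(self):
        # Assumed current behaviour: a reported "Success" doubles the
        # quota, so a host whose results all fail validation can still
        # climb back toward the cap.
        # Joe's tweak: skip the doubling while the invalid streak
        # exceeds the valid streak.
        if self.consecutive_invalid <= self.consecutive_valid:
            self.quota = min(self.quota * 2, self.max_quota)

    def on_validated(self):
        self.consecutive_valid += 1
        self.consecutive_invalid = 0

    def on_invalidated(self):
        self.consecutive_invalid += 1
        self.consecutive_valid = 0
        self.quota = max(self.quota // 2, 1)
```

With this rule, a host whose results are consistently invalidated stays pinned
near a quota of 1 even though it keeps reporting "Success"; the cost, as noted
above, is slower recovery after a transient problem.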
--
Joe
On Fri, 07 Jan 2011 06:41:07 -0500, Richard Haselgrove wrote:
> OK, back-of-envelope error - that 10% overstates the size of the problem.
>
> But, more realistically, 20 hosts @ 2,000 tasks wasted each per day is
close to 4% of all multibeam tasks issued.
>
> ----- Original Message -----
> From: Richard Haselgrove
>
>
> There is an updated version of that list at
http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1062822
>
> Given the limited number of hosts involved, in the short term the wastage
(something of the order of 10% of SETI's limited download bandwidth), maybe
could/should be stemmed by invoking
http://boinc.berkeley.edu/trac/wiki/BlackList for the 19 hosts identified.
>
> This approach has been used successfully by Milo at CPDN, although there
is a suggestion that something (perhaps the replacement of max_results_day with
per_app_version equivalents) broke the blacklist facility after he started
using it - quotas which had been set to -1 manually became positive again. The
SETI situation would be a useful test of this tool while a longer-term
automatic solution is sought.
>
> ----- Original Message -----
> From: Raistmer
>
>
> Hello.
>
> Looks like the current quota system implementation can't prevent the waste
> of project resources in the case of a "partially" broken host.
> For example, a host with an anonymous platform running a Fermi-incompatible
> CUDA app on SETI.
> It will produce incorrect overflows almost always, but a few specific ARs
> will be processed correctly and receive validation.
> This small number of validations, plus the "GPU" status of the app (GPU apps
> have greatly relaxed limits), allows continuous task trashing. The current
> quota system implementation can't prevent massive task trashing in this
> situation.
>
> But now more historical info about host behavior is stored on the servers,
> on a per-app-version basis.
> Maybe something new can be implemented that takes into account not only the
> last successful validation but the host's history too?
> The test cases are known; the SETI community already has a list of such
> badly behaving hosts:
http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1061788
>
> The aim should be to reduce their throughput to 1 task per day for the NV
> GPU app until their owners reinstall the GPU app.
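One way to act on that per-app-version history, sketched under the assumption
that the server keeps a recent window of validation outcomes per (host, app
version). The window size, threshold, and function name are arbitrary
illustrative choices, not part of any existing BOINC policy:

```python
def effective_quota(outcomes, base_quota=800, window=50, min_valid_rate=0.1):
    """Illustrative policy: if fewer than min_valid_rate of the last
    `window` validated results were valid, drop this host/app-version
    to 1 task per day until its owner fixes the app.

    outcomes: list of booleans, True for a valid result, oldest first.
    """
    recent = outcomes[-window:]
    if len(recent) < window:
        return base_quota               # not enough history to judge
    valid_rate = sum(recent) / len(recent)
    return base_quota if valid_rate >= min_valid_rate else 1
```

Unlike a consecutive-streak rule, a windowed rate is not reset by the
occasional lucky validation at a "good" AR, which is exactly the failure mode
described above.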
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.