I believe you're wrong. The max_jobs_per_day criterion is applied to all
host/app versions. The only relationship to quota is that trust requires at
least 10 consecutive valid results, which means a higher quota than a new host
or one which has been recently punished would have.
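To illustrate (just a sketch; the struct and the names here are mine, not the
actual scheduler code):

    struct HOST_APP_VERSION_SKETCH {
        int max_jobs_per_day;    // daily quota, enforced for every host/app version
        int consecutive_valid;   // valid results returned in a row
    };

    const int TRUST_MIN_CONSECUTIVE_VALID = 10;

    bool is_trusted(const HOST_APP_VERSION_SKETCH& hav) {
        // Trust requires a streak of valid results; a host with such a streak
        // will also have grown a larger quota, but the quota check itself is
        // applied to trusted and untrusted hosts alike.
        return hav.consecutive_valid >= TRUST_MIN_CONSECUTIVE_VALID;
    }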
--
Joe
On Wed, 12 Jan 2011 13:52:47 -0500, <[email protected]> wrote:
>
> The trusted computers would not have any quotas applied at all.
> jm7
> -----<[email protected]> wrote: -----
>
> To: "Josef W. Segur" <[email protected]>, <[email protected]>
> From: Raistmer <[email protected]>
> Sent by: <[email protected]>
> Date: 01/07/2011 04:22PM
> cc: <[email protected]>, <[email protected]>
> Subject: Re: [boinc_dev] Quota system inefficiency, something better?
> This implies the same quota mechanism as used today, just with different
> numbers.
> For a GPU, the idle time will be long enough to make recovering from a
> single act of task trashing (it happens sometimes) very painful.
> Maybe it would be better to slightly change the quota system to take the
> history of the host's behavior into account: what percentage of invalids
> does this host have? The speed of quota recovery could then be based on
> that percentage.
> There was an idea of "trusted" hosts, in the sense of not doing task
> replication when a task is sent to a "trusted" host.
> Though I think that is a bad idea, the "trusted host" concept can still be
> used for quota-related calculations, IMHO.
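> Something like this, maybe (only a sketch; the names and the formula are
> invented, not existing scheduler code):
>
>     // Scale quota recovery by the host's historical invalid fraction:
>     // a clean history recovers at full speed, a 50%-invalid history at
>     // half speed, and so on.
>     int recovery_increment(int current_quota, int n_valid, int n_invalid) {
>         int total = n_valid + n_invalid;
>         double invalid_frac = total ? (double)n_invalid / total : 0.0;
>         int inc = (int)(current_quota * (1.0 - invalid_frac));
>         return inc > 0 ? inc : 1;   // always allow some recovery
>     }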
> ----- Original Message -----
> From: [email protected]
> To: Josef W. Segur
> Cc: [email protected] ; [email protected]
> Sent: Friday, January 07, 2011 10:22 PM
> Subject: Re: [boinc_dev] Quota system inefficiency, something better?
> I believe that doubling is way too fast. My thought is that when you
> return a good one, you should get a replacement and your quota should go
> up by one. Yes, the recovery will be slower, but tomorrow, at the beginning
> of the day, you will be able to fetch as many tasks as the count of
> successful ones that you returned today. If you are always on, you will
> still be able to keep your CPUs busy. If you are recovering from a
> temporary problem, either you can nursemaid your computer through a couple
> of days or you can just accept a slightly idle CPU for a while if you are
> not always attached to the internet.
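> In rough code, the difference between the two policies (illustrative only,
> not the real scheduler code):
>
>     #include <algorithm>
>
>     // Current behavior as I understand it: double the quota on a valid result.
>     int recover_by_doubling(int quota, int cap) {
>         return std::min(quota * 2, cap);
>     }
>
>     // My suggestion: raise the quota by one per valid result, so a broken
>     // host that gets an occasional lucky validation recovers much more slowly.
>     int recover_by_one(int quota, int cap) {
>         return std::min(quota + 1, cap);
>     }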
> jm7
>
> "Josef W. Segur"
> <jsegur@westelcom
> .com> To
> Sent by: <[email protected]>
> <boinc_dev-bounce cc
> [email protected]
> u> Subject
> Re: [boinc_dev] Quota system
> inefficiency, something better?
> 01/07/2011 02:04
> PM
>
>
>
>
> Although the quota system is not limiting those hosts enough,
> they get frequent invalidations. The consecutive valid count is
> usually zero, though I've seen a few cases of single digit
> non-zero counts. So with S@H gpu_multiplier set at 8, those hosts
> are limited to not much more than 800 tasks per day per FERMI
> GPU. My most recent check shows 19 hosts, though a few have two
> Fermi cards so the GPU count is about 23. So the size of the
> problem is reduced to under 2% of S@H Enhanced tasks.
> S@H Enhanced tasks account for about 64% of the download
> bandwidth, the much larger Astropulse v505 tasks get the
> remainder. That makes the bandwidth impact on the order of 1%
> currently.
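> (Under 2% of the Enhanced tasks, times their roughly 64% share of the
> bandwidth, is at most about 0.02 * 0.64 ≈ 1.3%.)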
> Perhaps something like keeping a consecutive invalid count and
> not doubling on reported "Success" when that count exceeds the
> consecutive valid count would be better. But that would make
> recovery from a temporary problem much slower.
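> In rough code form (names invented, not the actual scheduler logic):
>
>     #include <algorithm>
>
>     struct HAV_SKETCH {
>         int max_jobs_per_day;
>         int consecutive_valid;
>         int consecutive_invalid;
>     };
>
>     // Suppress the usual quota doubling on a reported "Success" whenever
>     // the consecutive invalid count exceeds the consecutive valid count.
>     void on_success_reported(HAV_SKETCH& hav, int quota_cap) {
>         if (hav.consecutive_invalid > hav.consecutive_valid) {
>             return;   // probably a broken host: no doubling
>         }
>         hav.max_jobs_per_day = std::min(hav.max_jobs_per_day * 2, quota_cap);
>     }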
> --
> Joe
> On Fri, 07 Jan 2011 06:41:07 -0500, Richard Haselgrove wrote:
> > OK, back-of-envelope error - that 10% overstates the size of the
> > problem.
> >
> > But, more realistically, 20 hosts @ 2,000 tasks wasted each per day is
> > close to 4% of all multibeam tasks issued.
> >
> > ----- Original Message -----
> > From: Richard Haselgrove
> >
> >
> > There is an updated version of that list at
> > [1]http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1062822
> >
> > Given the limited number of hosts involved, in the short term the
> > wastage (something of the order of 10% of SETI's limited download
> > bandwidth) maybe could/should be stemmed by invoking
> > [2]http://boinc.berkeley.edu/trac/wiki/BlackList for the 19 hosts
> > identified.
> >
> > This approach has been used successfully by Milo at CPDN, although
> > there is a suggestion that something (perhaps the replacement of
> > max_results_day with per_app_version equivalents) broke the blacklist
> > facility after he started using it - quotas which had been set to -1
> > manually became positive again. The SETI situation would be a useful
> > test of this tool while a longer-term automatic solution is sought.
> >
> > ----- Original Message -----
> > From: Raistmer
> >
> >
> > Hello.
> >
> > Looks like the current quota system implementation can't prevent
> > project resource waste in the case of a "partially" broken host.
> > For example, a host with an anonymous platform running a
> > Fermi-incompatible CUDA app on SETI.
> > It will produce incorrect overflows almost always, except for a few
> > specific ARs that will be processed correctly and receive validation.
> > This small number of validations plus the "GPU" status of the app (GPUs
> > have greatly relaxed limits) allows continuous task trashing. The
> > current quota system implementation can't prevent massive task trashing
> > in this situation.
> >
> > But now more historical info about host behavior is stored on the
> > servers, on a per-app-version basis.
> > Maybe something new can be implemented that will take into account not
> > only the last successful validation but the host's history too?
> > The test cases are known; the SETI community already has a list of such
> > bad-behaving hosts:
>
> > [3]http://setiathome.berkeley.edu/forum_thread.php?id=62573&nowrap=true#1061788
> >
> > The aim should be to reduce their throughput to 1 task per day for the
> > NV GPU app until their owners reinstall the GPU app.
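> > For example, something like this (only a sketch; the threshold and the
> > names are made up):
> >
> >     // Clamp the per-app-version daily quota to 1 when the stored history
> >     // shows the host is producing mostly invalid results.
> >     int effective_daily_quota(int quota, int n_valid, int n_invalid) {
> >         int total = n_valid + n_invalid;
> >         if (total >= 20 && n_invalid * 10 > total * 9) {   // >90% invalid
> >             return 1;   // one task per day until the owner fixes the app
> >         }
> >         return quota;
> >     }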
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.