Addition to work fetch policy needed.

With more than 50 projects attached to my computers, there is no way that
all projects can have work on any single machine at the same time, not even
on my fastest machine.  There has to be some defined work fetch cutoff for
projects that have used too much resource time recently.  If all projects
have work on the host, some of it is guaranteed to be returned late (in
some cases very late), and resource usage will be driven not by the
specified resource shares but by how tightly the project administrators
set their deadlines.

This cutoff must be applied per device type, because on some machines GPU
projects will always get much more than their share of FLOPS, and we
really don't want to cut off GPU work fetch while the GPU's FLOPS usage
is recovering.  That in turn means the host needs to know which resource
types each project supports, which means the servers need to provide such
a list.
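
For illustration only, here is a minimal C++ sketch of such a
per-resource-type cutoff check.  The struct and field names
(supports_resource, recent_share, target_share) and the overuse factor are
assumptions made up for this example, not existing BOINC client structures.

    #include <string>

    // Hypothetical per-(project, resource) usage record; not actual BOINC types.
    struct ResourceUsage {
        std::string rsc_name;     // e.g. "CPU", "NVIDIA GPU"
        bool supports_resource;   // would come from a list the server provides
        double recent_share;      // fraction of this resource used recently
        double target_share;      // fraction implied by the resource shares
    };

    // Skip work fetch for this (project, resource) pair only if the project
    // supports the resource and has recently exceeded its target share by a
    // generous margin, so a recovering GPU is not cut off.
    bool should_skip_fetch(const ResourceUsage& u, double overuse_factor = 1.5) {
        if (!u.supports_resource) return true;   // nothing to ask for anyway
        return u.recent_share > overuse_factor * u.target_share;
    }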

Additional note about the decay rate:

I made a mistake in my previous calculation of the decay rate.  The decay
period has to be long in comparison to the maximum, over all projects P,
of (longest recorded task run time for P) / (resource fraction of P).  In
other words, if CPDN has a resource fraction of 0.1 and a maximum run time
of a year, the decay period needs to be long in comparison to 10 years.  A
decay rate of a month is going to give CPDN much too high a resource usage,
at least in this case.  For that machine, the decay period has to be at
least 80 years (or, in effect, no decay at all).  In the example given,
CPDN should get its next task downloaded 9 years after it finished the
first task, and it should be working on other projects in the meantime.  A
decay rate with a half-life of a month pretty much guarantees that CPDN
will get a new task downloaded within a couple of months, not years later
as it should.
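
To make the arithmetic concrete, here is a small C++ sketch of that lower
bound.  The names and the safety factor of 8 are my own assumptions for
the example (8 x 10 years would be consistent with the 80-year figure
above); nothing here is existing BOINC code.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    struct ProjectHistory {
        const char* name;
        double longest_task_days;   // longest recorded task run time
        double resource_fraction;   // project's share of this host, 0..1
    };

    // The decay period must be long compared to the worst-case
    // (run time / resource fraction) ratio across all attached projects.
    double min_decay_days(const std::vector<ProjectHistory>& projects,
                          double safety_factor) {
        double worst = 0.0;
        for (const auto& p : projects)
            worst = std::max(worst, p.longest_task_days / p.resource_fraction);
        return safety_factor * worst;
    }

    int main() {
        // CPDN example from above: 1-year tasks, resource fraction 0.1,
        // so the ratio is 3650 days (10 years).
        std::vector<ProjectHistory> projects = {
            {"CPDN", 365.0, 0.1},
            {"OtherProject", 1.0, 0.9},
        };
        // With a safety factor of 8 this prints 29200 days, roughly 80 years.
        std::printf("minimum decay period: %.0f days\n",
                    min_decay_days(projects, 8.0));
        return 0;
    }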

Question:  Why does the calculation have any decay at all?

jm7


                                                                           
From: David Anderson <[email protected]>
Sent by: [email protected]
To: BOINC Developers Mailing List <[email protected]>
Date: 10/26/2010 05:13 PM
Subject: [boinc_dev] proposed scheduling policy changes

Experiments with the client simulator using Richard's scenario
made it clear that the current scheduling framework
(based on STD and LTD for separate processor types) is fatally flawed:
it may divide resources among projects in a way that makes no sense
and doesn't respect resource shares.

In particular, resource shares, as some have already pointed out,
should apply to total work (as measured by credit)
rather than to individual processor types.
If two projects have equal resource shares,
they should ideally have equal RAC,
even if that means that one of them gets 100% of a particular processor
type.
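
For illustration only (this is not the ClientSchedOctTen design; the field
names and the priority formula below are assumptions of mine), a sketch of
what "shares apply to total credit" could look like when choosing the next
project to schedule:

    #include <string>
    #include <vector>

    struct Project {
        std::string name;
        double resource_share;       // user-specified share
        double recent_total_credit;  // recent credit across CPU + GPU combined
    };

    // Prefer the project furthest below its entitled share of total recent
    // credit, regardless of which processor types it happens to use.
    const Project* next_to_schedule(const std::vector<Project>& projects) {
        double total_share = 0, total_credit = 0;
        for (const auto& p : projects) {
            total_share += p.resource_share;
            total_credit += p.recent_total_credit;
        }
        if (projects.empty() || total_share <= 0) return nullptr;
        const Project* best = nullptr;
        double best_deficit = -1e300;
        for (const auto& p : projects) {
            double entitled = total_credit * p.resource_share / total_share;
            double deficit  = entitled - p.recent_total_credit;
            if (deficit > best_deficit) { best_deficit = deficit; best = &p; }
        }
        return best;
    }

Under a rule like this, two projects with equal shares converge toward
equal total credit, even if one of them ends up with 100% of a particular
processor type.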

I think it's possible to do this,
although there are difficulties due to delayed credit granting.
I wrote up a design for this:
http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen
Comments are welcome.

BTW, the new mechanisms would be significantly simpler than the old ones.
This is always a good sign.

-- David
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.


