I see a major flaw with using RAC. Suppose we have a project (say CPDN)
that takes several months on a particular computer, granting credit all the
way, and in constant high priority. (Yes, I know, a somewhat slow
computer, but they still exist). At the end of that time the RAC for CPDN
is well established, but then it starts to decay, and it will not be long
before a few tasks from the other attached project (say Spinhenge, with
tasks of less than half a day on the same computer) are completed and
validated. This will generate a spike in RAC for Spinhenge, and another
CPDN task will be downloaded. The immediate conclusion is that the
half-life of the RAC used for long-term scheduling has to be much longer
than the length of the longest task on a particular computer for it to make
any sense at all.
Let's say the CPDN RAC at the end of that task is 100 and the RAC for
Spinhenge is 0. At the end of a week of running Spinhenge only, the RAC
for Spinhenge should be approaching 100 while the RAC for CPDN has decayed
to about 50 (assuming the usual one-week half-life)...
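To make the decay concrete, here is a rough sketch of the exponential
averaging involved. This is not the actual BOINC code; the one-week
half-life and all names are just my assumptions for illustration:

    #include <cmath>
    #include <cstdio>

    // Assumed one-week half-life for the recent-average decay.
    const double HALF_LIFE = 7.0 * 86400.0;

    // Decay an average over an idle interval of dt seconds.
    double decay(double avg, double dt) {
        return avg * std::exp(-dt * std::log(2.0) / HALF_LIFE);
    }

    int main() {
        double cpdn_rac = 100.0;     // CPDN RAC when its long task finishes
        double week = 7.0 * 86400.0;
        std::printf("CPDN RAC after one idle week: %.1f\n",
                    decay(cpdn_rac, week));   // prints ~50.0
        return 0;
    }

After a single idle week, half of the months of accumulated history is
already gone, which is exactly the problem described above.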
Using server-side data requires an update to fetch the data.
Unfortunately, a project that has a high reported RAC at a client is
unlikely to be contacted for any reason. It is entirely possible that a
situation like having the validators offline for a week could permanently
turn off the project once they come back online. A computer reports a few
tasks and is told that the RAC right now is 100,000 because a week's worth
of work has just been validated in the last minute. This pushes the
project to the bottom of the list for contact on that particular host.
Since the RAC reported from the server never changes until the server is
contacted again to report work or fetch work, this host may never get
around to contacting that project again. The data must be calculated
locally from the best information available at the time.
Another major flaw is that RAC is much too slow for use as a scheduler. It
will run only one project for a long time, then only another project for a
long time. It will not switch on anything like an hourly basis.
What about machines that contact the servers only once a week or so? The
data on the host is going to be quite stale by the end of the week.
So a counter proposal:
1) Use an STD per device type for short-term scheduling. Not perfect,
maybe, but the short-term scheduler needs to be responsive to local data
only, as it cannot count on feedback from the servers. RAF does not work
well here because, once the work is downloaded, it is already set for a
specific device type.
2) Instead of Recent Average Credit, use Average Credit. Write some data
into the client_state.xml file recording initial conditions: at the first
install of the version that uses this scheduler, on attach of a new
project, or on a reset of a project, write the current time and the current
credit as reported by the server. At the time that work is fetched, use
(current credit - initial credit) / (now - initial time) + C * RAF as the
criterion for where to try to fetch work from (a sketch of this calculation
follows after item 3). Note that backoff will eventually allow projects
other than the top one to fetch work. Note that C will need to be negative:
if it is positive, projects that have just completed work will have a high
RAF and will be first in line to get more. The long-term credit average
needs to be a major component; I would propose that the two terms carry
roughly equal weight.
3) This will require a change to the policy of how much work to fetch from
any project, and overall. The current LTD method provides some fairly good
ways of determining a choke number. I am not certain that the proposed
scheme does so. The client should neither fetch all of the work from a
single project, nor should it allow work fetch from a project that
consistently runs high priority and has used more than its share of
resource time.
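Here is the sketch referred to in item 2. It only shows the arithmetic of
the criterion; the struct, the field names, and the scaling of C are
illustrative assumptions, not existing BOINC code, and the rule for
ordering projects against their resource shares is left out:

    #include <string>

    // Per-project snapshot as described in item 2; fields are hypothetical.
    struct ProjectSnapshot {
        std::string name;
        double initial_credit;   // host credit written at install/attach/reset
        double initial_time;     // time (seconds) when that snapshot was written
        double current_credit;   // host credit as reported by the server now
        double raf;              // the RAF value referred to above
    };

    // (current credit - initial credit) / (now - initial time) + C * RAF
    // C is negative, so a project that has just completed a burst of work
    // (high RAF) drops down the fetch order; C should be scaled so the two
    // terms carry roughly equal weight, as proposed above.
    double fetch_criterion(const ProjectSnapshot& p, double now, double C) {
        double long_term_rate =
            (p.current_credit - p.initial_credit) / (now - p.initial_time);
        return long_term_rate + C * p.raf;
    }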
One final note:
There will be no way at all to balance some resource share allocations
across a single platform. Suppose that there are 3 projects attached to a
computer all with equal resource shares. The GPU runs 10 * as fast as the
CPU, and one of the tasks will run CPU or GPU and the other two will run
CPU only. The GPU / CPU project will never run on the CPU (this is OK) and
it will have a much higher average credit and RAF than the two CPU
projects. Yet the project that can run on the GPU cannot be choked off
from GPU work fetch as that is the only project that can run on the GPU.
This would be made substantially easier if the client knew which device
types the project could supply work for. The proposal is that the project
provide a list of device types supported on every update. The client could
then incorporate this into the decision as to where to fetch work from.
When building a work fetch for the GPU in this case, it would scan the list
of projects and only compare those that it knew could support the GPU to
determine work fetch for the GPU. The single project in this case that
supported the GPU would then be eligible for a full work fetch of min_queue
+ extra_work, instead of just min_queue (even though it has used, and
always will use, more than its share of the computer's resources because of
the wide variation in the abilities of the devices).
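A rough sketch of what that could look like on the client side, assuming
the project reports its supported device types on every update. The enum,
struct, and function here are hypothetical, not existing BOINC structures,
and ordering by ascending criterion (most underserved first) is my
assumption:

    #include <algorithm>
    #include <string>
    #include <vector>

    enum class DeviceType { CPU, NVIDIA_GPU, ATI_GPU };

    struct Project {
        std::string name;
        std::vector<DeviceType> supported;  // sent by the project on each update
        double fetch_criterion;             // e.g. the value from item 2 above
    };

    // When building a work fetch for one device type, consider only the
    // projects known to supply work for it, ordered by the fetch criterion.
    std::vector<Project> candidates_for(const std::vector<Project>& projects,
                                        DeviceType dev) {
        std::vector<Project> out;
        for (const auto& p : projects) {
            if (std::find(p.supported.begin(), p.supported.end(), dev)
                    != p.supported.end()) {
                out.push_back(p);
            }
        }
        std::sort(out.begin(), out.end(),
                  [](const Project& a, const Project& b) {
                      return a.fetch_criterion < b.fetch_criterion;
                  });
        return out;
    }

If only one project comes back as a candidate for the GPU, it gets the full
min_queue + extra_work request as described above.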
Counter Proposal 2:
Give up on treating all the devices on the host as a single entity. Treat
each different type of device as a separate computer for the purposes of
work fetch. This may not be what the end users want though.
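A minimal sketch of what that separation might look like, with purely
hypothetical names and fields:

    #include <map>

    enum class Device { CPU, NVIDIA_GPU, ATI_GPU };

    // Independent work-fetch state per device type, as if each type
    // were its own computer.
    struct PerDeviceFetchState {
        double min_queue_sec  = 86400;  // seconds of work to keep queued
        double shortfall_sec  = 0;      // current shortfall for this device
        double long_term_debt = 0;      // settled separately per device
    };

    std::map<Device, PerDeviceFetchState> fetch_state = {
        {Device::CPU, {}},
        {Device::NVIDIA_GPU, {}},
    };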
jm7
David Anderson <[email protected]>
Sent by: BOINC Developers Mailing List <[email protected]>
Date: 10/26/2010 05:13 PM
Subject: [boinc_dev] proposed scheduling policy changes
Experiments with the client simulator using Richard's scenario
made it clear that the current scheduling framework
(based on STD and LTD for separate processor types) is fatally flawed:
it may divide resources among projects in a way that makes no sense
and doesn't respect resource shares.
In particular, resource shares, as some have already pointed out,
should apply to total work (as measured by credit)
rather than to individual processor types.
If two projects have equal resource shares,
they should ideally have equal RAC,
even if that means that one of them gets 100% of a particular processor
type.
I think it's possible to do this,
although there are difficulties due to delayed credit granting.
I wrote up a design for this:
http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen
Comments are welcome.
BTW, the new mechanisms would be significantly simpler than the old ones.
This is always a good sign.
-- David
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.