The server DCF is, I believe, averaged across all machines attached to the
server (500,000 or more on SETI).  If that is actually the case, I would not
worry too much about the speed of change, but more about the accuracy for
any given machine.  It changes the starting point, but does not solve the
problems of the CPU scheduler on the client.

BTW, the reason for the caution in reducing the DCF on the client when it is
very high is the very real problem of a batch of SETI -9 exit results.  Get
a few dozen of these in a row and, unless caution is used, far too much work
is fetched at the next work fetch.  Unfortunately, some of the faster
machines are already breaking through this caution and generating very low
DCF values after a string of -9 exits.  BTW, this is not just SETI; other
projects also have tasks that can exit early.
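As a rough sketch of the failure mode (assuming a simple 10%-of-difference
per-task update with the caution rule deliberately omitted - rates and
numbers here are illustrative, not taken from the client source):

```python
# Illustrative only: a simplified per-task DCF update that moves 10% of
# the way toward the observed/estimated runtime ratio, with the caution
# rule left out to show what it is meant to protect against.

def update(dcf, observed_ratio, rate=0.10):
    return dcf + rate * (observed_ratio - dcf)

dcf = 1.0
for _ in range(24):          # a couple of dozen -9 exits in a row
    dcf = update(dcf, 0.01)  # each task exits almost immediately

# DCF is now around 0.09, so runtime estimates are roughly 10x too
# small and the next work fetch asks for roughly 10x too much work.
print(round(dcf, 3))
```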

jm7


                                                                           
On 03/22/2010 03:21 PM, Richard Haselgrove <r.haselgr...@btinternet.com>
(sent by <[email protected]>) wrote:

To: "David Anderson" <[email protected]>, <[email protected]>
Cc: BOINC Developers Mailing List <[email protected]>
Subject: [boinc_dev] DCF at app_version level

Moving DCF down from a project scope to an app_version scope is clearly
necessary, but I think there are many unanswered questions about the
server-based approach, and potential pitfalls.

Speed of change / settling time
At the moment, the standard change is 10% of the difference, per task exit -
which is generally taken to mean a 'settling time' of twenty to thirty
tasks, however long that may take. There is, however, the proviso that if
the current DCF is too wrong on the high side (more than 10x), the client
adopts a much more cautious 1% per task rate of change, with a
commensurately longer settling time: if I've done my Excel modelling
correctly, for a 'bad' DCF of 100 and a target of 1 it takes 240 task
completions for DCF to fall from 100 to 10, and a further 40 tasks to get
down to 1.1. What decay rate will be used for the server calculations? Will
users be able to speed up these extreme changes, as they can at the moment
by editing the state file? What change will be applied by the server when a
large number of results is reported in a single RPC?
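The two-rate rule is straightforward to model in a few lines - this is a
sketch of my understanding only, with the 10%/1% rates and the 10x threshold
taken from the behaviour described above rather than from the client source:

```python
def tasks_to_settle(dcf, target, threshold=1.1):
    """Count task exits until dcf falls within threshold * target."""
    n = 0
    while dcf > threshold * target:
        # Cautious 1% rate while DCF is more than 10x too high,
        # otherwise the standard 10% of the difference per task.
        rate = 0.01 if dcf > 10 * target else 0.10
        dcf += rate * (target - dcf)
        n += 1
    return n

# 'Bad' DCF of 100, target 1: about 280 task completions in total,
# consistent with the 240 + 40 figure from the Excel model.
print(tasks_to_settle(100.0, 1.0))
```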

Transitional interactions
As John has noted below, there will still be a single project-wide DCF value
operating inside the client. This will be driven towards 1 separately by
each app_version in play, but at different speeds: think SETI/CUDA and
Astropulse. And if the rate of change is governed by task completions (as at
present), then work supply considerations come into play as well: SETI/CUDA
should settle within a day, but Astropulse - with limited work availability
- could take years, and will disrupt CUDA estimates at each intervening AP
task-end.
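A toy model of that interaction (the ratios and task mix are purely
illustrative - say CUDA tasks whose server-scaled estimates are already
accurate, and an occasional AP task whose estimate is still 3x off):

```python
# Hypothetical sketch: one shared client DCF updated by two app_versions
# whose per-task observed/estimated ratios differ. The values 1.0 (CUDA)
# and 3.0 (Astropulse) are invented for illustration.

def update(dcf, observed_ratio, rate=0.10):
    # Simplified per-task update: move 10% toward the observed ratio.
    return dcf + rate * (observed_ratio - dcf)

dcf = 1.0
trace = []
for task in range(30):
    # One Astropulse completion for every ten CUDA completions.
    ratio = 3.0 if task % 10 == 9 else 1.0
    dcf = update(dcf, ratio)
    trace.append(round(dcf, 3))

# DCF hovers near 1 but jumps by roughly 0.2 at every tenth task,
# disturbing the estimates for all the cached CUDA work each time.
print(trace)
```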

Caching
The proposal is to signal the variance back from the server to the client by
dynamically varying <rsc_fpops_est>. A user with a lengthy cached task list
will see gradually changing estimated run times in that list - times
appropriate to the current client DCF at the top (next to run), and times
appropriate to a DCF of 1 (but modified by the current value) at the bottom.
And everything will subtly interact as the cache is processed. I think I'm
beginning to feel slightly sea-sick.
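For clarity, the arithmetic I have in mind: the displayed estimate is
roughly rsc_fpops_est x DCF / benchmarked flops, so two cached copies of the
same work, issued before and after the server folds its correction into
rsc_fpops_est, show different times. The numbers below are invented:

```python
# Sketch of how the client derives the runtime estimate a user sees,
# under the assumption (from this thread) that the server rescales
# rsc_fpops_est while the client still applies its own DCF on top.
# All values here are illustrative, not taken from the BOINC source.

def estimated_runtime(rsc_fpops_est, client_dcf, flops_per_sec):
    return rsc_fpops_est * client_dcf / flops_per_sec

flops = 1e10           # benchmarked speed of the host (assumed)
client_dcf = 2.0       # client still carries an old project-wide DCF

# Older cached task: rsc_fpops_est as originally issued.
old_task = estimated_runtime(4e13, client_dcf, flops)
# Newer cached task: identical work, but the server has already halved
# rsc_fpops_est to fold its own correction in.
new_task = estimated_runtime(2e13, client_dcf, flops)

print(old_task, new_task)  # 8000.0 4000.0 seconds
```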

Task variance
You may have noticed a scope level of "Job batch/class" in my previous post.
Not all tasks are created equal: I'm thinking of SETI Angle Range
(continual, automatic variation depending on telescope movement during the
recording) or CPDN (the FAMOUS model currently reaching late Beta could be
issued as 10-year, 200-year, or 1400-year simulations - or anything in
between). AQUA regularly test their new runs with 1-bit or 2-bit
simulations, then ramp up through 32/48/72/96/128 - and equally regularly
forget to adjust rsc_fpops_est as they go:
http://img717.imageshack.us/img717/1143/postitu.png. These are variations
_within_ a single app_version - so the chore of 'seeding' the new
self-adjusting server code with a realistic opening bid is not lifted from
administrators' shoulders.


> While the average DCF will be defined as 1 (much better), there is still
> an order of magnitude difference between the most efficient and least
> efficient even with the stock applications.  With custom applications,
> the disparity gets wider, and in one project the stock application is so
> inefficient that a custom application that has every result verify is
> nearly 1000 times as fast as the stock application.
>
> Pair a stock Astropulse with a highly optimized SETI app.  Pair a stock
> SETI CPU app with a stock GPU SETI App.  There are some definite
> differences in the DCF that shows for these.
>
> jm7
>
>
> The server will scale workunit.rsc_fpops_est by its DCF estimate.
> The client's DCF will tend to 1.
> -- David


_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.


