I was reading the 'Job runtime estimates' from http://boinc.berkeley.edu/trac/wiki/CreditNew. That seems to imply that the average will be maintained per host, in the new host_app_version table. It needs to be, because the correction factor needed (which is also influenced by the relationship between benchmarks and real-life throughput) varies significantly between different processor designs. Not to mention the problem of anonymous platform and optimised apps.
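The averaging CreditNew describes can be pictured as an exponentially weighted mean of elapsed time per estimated FLOP, kept in each host_app_version row. A minimal sketch in Python - the field names, units, and smoothing weight here are illustrative assumptions, not the actual BOINC schema or code:

```python
class HostAppVersion:
    """Toy model of a per-(host, app_version) runtime average."""

    def __init__(self):
        self.et_avg = None  # mean elapsed seconds per estimated FLOP

    def update(self, elapsed_s, rsc_fpops_est, weight=0.1):
        """Fold one completed job into the running average."""
        sample = elapsed_s / rsc_fpops_est
        if self.et_avg is None:
            self.et_avg = sample  # first completed job seeds the average
        else:
            self.et_avg += weight * (sample - self.et_avg)

    def runtime_estimate(self, rsc_fpops_est):
        """Predicted runtime for a new job on this host/app_version."""
        return self.et_avg * rsc_fpops_est
```

Because the average is kept per host and per app_version, a slow stock CPU build and a fast optimised or GPU build of the same application accumulate separate statistics - exactly the property a single project-wide correction factor lacks.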
> The server DCF is, I believe, across all machines attached to the server (500,000 or more on SETI). If this is actually the case, I would not worry too much about the speed of change, but more about the accuracy for any given machine. It is a way of changing the starting point, but does not solve the problems of the CPU scheduler on the client.
>
> BTW, the reason for the caution on reducing the DCF on the client if it is very high is the very real problem with a batch of SETI -9 exit results. Get a few dozen of these in a row, and you will discover that there is too much work fetched at the next work fetch, unless caution is used. Unfortunately, some of the faster machines are already breaking through this caution and generating very low DCF values for a string of -9 exits. BTW, this is not just SETI; there are other projects that can have tasks that exit early.
>
> jm7
>
> From: Richard Haselgrove <r.haselgr...@btinternet.com>
> Sent by: <boinc_dev-bounces...@ssl.berkeley.edu>
> Date: 03/22/2010 03:21 PM
> To: "David Anderson" <da...@ssl.berkeley.edu>, <john.mcl...@sybase.com>
> cc: BOINC Developers Mailing List <boinc_dev@ssl.berkeley.edu>
> Subject: [boinc_dev] DCF at app_version level
>
> Moving DCF down from a project scope to an app_version scope is clearly necessary, but I think there are many unanswered questions about the server-based approach, and potential pitfalls.
>
> Speed of change / settling time
> At the moment, the standard change is 10% of difference, per task exit - which is generally taken to mean a 'settling time' of twenty to thirty tasks, however long that may take.
> There is, however, the proviso that if the current DCF is too wrong on the high side (more than 10x), the client adopts a much more cautious 1% per task rate of change, with a commensurately longer settling time: if I've done my Excel modelling correctly for a 'bad' DCF of 100 and a target of 1, it takes 240 task completions for DCF to fall from 100 to 10, and a further 40 tasks to get down to 1.1. What decay rate will be used for the server calculations? Will users be able to speed up these extreme changes, as they can at the moment by editing the state file? What change will be applied by the server when a large number of results is reported in a single RPC?
>
> Transitional interactions
> As John has noted below, there will still be a single project-wide DCF value operating inside the client. This will be driven towards 1 separately by each app_version in play, but at different speeds by each app_version: think SETI/CUDA and Astropulse. And if the rate of change is governed by task completions (as at present), then work supply considerations come into play as well: SETI/CUDA should settle within a day, but Astropulse - with limited work availability - could take years, and will disrupt CUDA estimates at each intervening AP task-end.
>
> Caching
> The proposal is to signal the variance back from the server to the client by dynamically varying <rsc_flops_est>. A user with a lengthy cached task list will see gradually changing estimated run times in that list - times appropriate to the current client DCF at the top (next to run), times appropriate to a DCF of 1 (but modified by the current value) at the bottom. And everything will subtly interact as the cache is processed. I think I'm beginning to feel slightly sea-sick.
>
> Task variance
> You may have noticed a scope level of "Job batch/class" in my previous post.
> Not all tasks are created equal: I'm thinking SETI Angle Range (continual, automatic variation depending on telescope movement during the recording) or CPDN (the FAMOUS model currently reaching late Beta could be issued as 10 year, 200 year, or 1400 year simulations - or anything in between). AQUA regularly test their new runs with 1-bit or 2-bit simulations, then ramp up through 32/48/72/96/128 - and equally regularly forget to adjust rsc_flops_est as they go. http://img717.imageshack.us/img717/1143/postitu.png. These are variations _within_ a single app_version - so the chore of 'seeding' the new self-adjusting server code with a realistic opening bid is not lifted from administrators' shoulders.
>
>> While the average DCF will be defined as 1 (much better), there is still an order of magnitude difference between the most efficient and least efficient even with the stock applications. With custom applications, the disparity gets wider, and for one project the stock application is so inefficient that a custom application that has every result verify is nearly 1000 times as fast as the stock application.
>>
>> Pair a stock Astropulse with a highly optimized SETI app. Pair a stock SETI CPU app with a stock GPU SETI app. There are some definite differences in the DCF that shows for these.
>>
>> jm7
>>
>> The server will scale workunit.rsc_fpops_est by its DCF estimate.
>> The client's DCF will tend to 1.
>> -- David
>
> _______________________________________________
> boinc_dev mailing list
> boinc_dev@ssl.berkeley.edu
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
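David's two-line summary in the quote above ("The server will scale workunit.rsc_fpops_est by its DCF estimate. The client's DCF will tend to 1.") amounts to moving the correction into the estimate itself. A hedged sketch of that division of labour - function names and parameters are illustrative, not actual BOINC server or client code:

```python
def server_scaled_estimate(rsc_fpops_est, server_dcf):
    # Server side: pre-multiply the raw FLOP estimate by the server's
    # measured correction factor before the job is sent out.
    return rsc_fpops_est * server_dcf

def client_runtime_estimate(scaled_fpops_est, host_flops, client_dcf):
    # Client side: the usual runtime estimate.  With the server already
    # correcting the estimate, the residual client DCF drifts toward 1.
    return scaled_fpops_est / host_flops * client_dcf
```

For example, a job estimated at 1e12 FLOPs with a server DCF of 1.5 would be sent as 1.5e12; on a 10 GFLOPS host with a client DCF of 1.0 that predicts 150 seconds of runtime.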
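Richard's Excel figures in the "Speed of change / settling time" section can be reproduced with a short simulation of the client rule he describes. The exact 10x switch condition and the "within 10% of target" stopping criterion are my assumptions:

```python
def settle(dcf, target=1.0):
    """Count task completions until DCF is within 10% of target.

    Models the client behaviour described in the thread: move 10% of
    the difference per task exit, but only 1% while DCF is more than
    10x the target (the cautious regime for very wrong values).
    """
    steps = 0
    while dcf > target * 1.1:
        rate = 0.01 if dcf > 10 * target else 0.10
        dcf += rate * (target - dcf)
        steps += 1
    return steps
```

Starting at 100 with a target of 1, this takes roughly 240 steps in the 1% regime to reach 10 and about 40 more at 10% to reach 1.1, close to the figures quoted; a merely doubled DCF settles within the usual "twenty to thirty tasks" range.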