I was reading the 'Job runtime estimates' from http://boinc.berkeley.edu/trac/wiki/CreditNew. That seems to imply that the average will be maintained per host, in the new host_app_version table. It needs to be, because the correction factor needed (which is also influenced by the relationship between benchmarks and real-life throughput) varies significantly between different processor designs. Not to mention the problem of anonymous platform and optimised apps.
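The averaging CreditNew describes can be pictured as an exponentially weighted mean of elapsed time per estimated FLOP, kept in each host_app_version row. A minimal sketch in Python - the field names, units, and smoothing weight here are illustrative assumptions, not the actual BOINC schema or code:

```python
class HostAppVersion:
    """Toy model of a per-(host, app_version) runtime average."""

    def __init__(self):
        self.et_avg = None  # mean elapsed seconds per estimated FLOP

    def update(self, elapsed_s, rsc_fpops_est, weight=0.1):
        """Fold one completed job into the running average."""
        sample = elapsed_s / rsc_fpops_est
        if self.et_avg is None:
            self.et_avg = sample  # first completed job seeds the average
        else:
            self.et_avg += weight * (sample - self.et_avg)

    def runtime_estimate(self, rsc_fpops_est):
        """Predicted runtime for a new job on this host/app_version."""
        return self.et_avg * rsc_fpops_est
```

Because the average is kept per host and per app_version, a slow stock CPU build and a fast optimised or GPU build of the same application accumulate separate statistics - exactly the property a single project-wide correction factor lacks.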
> The server DCF is, I believe, across all machines attached to the server (500,000 or more on SETI). If this is actually the case, I would not worry too much about the speed of change, but more about the accuracy for any given machine. It is a way of changing the starting point, but does not solve the problems of the CPU scheduler on the client.
>
> BTW, the reason for the caution on reducing the DCF on the client if it is very high is the very real problem with a batch of SETI -9 exit results. Get a few dozen of these in a row, and you will discover that there is too much work fetched at the next work fetch, unless caution is used. Unfortunately, some of the faster machines are already breaking through this caution and generating very low DCF values for a string of -9 exits. BTW, this is not just SETI; there are other projects that can have tasks that exit early.
>
> jm7
>
> From: Richard Haselgrove <r.haselgr...@btinternet.com>
> Sent by: <boinc_dev-bounces...@ssl.berkeley.edu>
> Date: 03/22/2010 03:21 PM
> To: "David Anderson" <da...@ssl.berkeley.edu>, <john.mcl...@sybase.com>
> cc: BOINC Developers Mailing List <boinc_dev@ssl.berkeley.edu>
> Subject: [boinc_dev] DCF at app_version level
>
> Moving DCF down from a project scope to an app_version scope is clearly necessary, but I think there are many unanswered questions about the server-based approach, and potential pitfalls.
>
> Speed of change / settling time
> At the moment, the standard change is 10% of difference, per task exit - which is generally taken to mean a 'settling time' of twenty to thirty tasks, however long that may take.
> There is, however, the proviso that if the current DCF is too wrong on the high side (more than 10x), the client adopts a much more cautious 1% per task rate of change, with a commensurately longer settling time: if I've done my Excel modelling correctly for a 'bad' DCF of 100 and a target of 1, it takes 240 task completions for DCF to fall from 100 to 10, and a further 40 tasks to get down to 1.1. What decay rate will be used for the server calculations? Will users be able to speed up these extreme changes, as they can at the moment by editing the state file? What change will be applied by the server when a large number of results is reported in a single RPC?
>
> Transitional interactions
> As John has noted below, there will still be a single project-wide DCF value operating inside the client. This will be driven towards 1 separately by each app_version in play, but at different speeds by each app_version: think SETI/CUDA and Astropulse. And if the rate of change is governed by task completions (as at present), then work supply considerations come into play as well: SETI/CUDA should settle within a day, but Astropulse - with limited work availability - could take years, and will disrupt CUDA estimates at each intervening AP task-end.
>
> Caching
> The proposal is to signal the variance back from the server to the client by dynamically varying <rsc_flops_est>. A user with a lengthy cached task list will see gradually changing estimated run times in that list - times appropriate to the current client DCF at the top (next to run), times appropriate to a DCF of 1 (but modified by the current value) at the bottom. And everything will subtly interact as the cache is processed. I think I'm beginning to feel slightly sea-sick.
>
> Task variance
> You may have noticed a scope level of "Job batch/class" in my previous post.
> Not all tasks are created equal: I'm thinking SETI Angle Range (continual, automatic variation depending on telescope movement during the recording) or CPDN (the FAMOUS model currently reaching late Beta could be issued as 10 year, 200 year, or 1400 year simulations - or anything in between). AQUA regularly test their new runs with 1-bit or 2-bit simulations, then ramp up through 32/48/72/96/128 - and equally regularly forget to adjust rsc_flops_est as they go. http://img717.imageshack.us/img717/1143/postitu.png. These are variations _within_ a single app_version - so the chore of 'seeding' the new self-adjusting server code with a realistic opening bid is not lifted from administrators' shoulders.
>
>> While the average DCF will be defined as 1 (much better), there is still an order of magnitude difference between the most efficient and least efficient even with the stock applications. With custom applications, the disparity gets wider, and for one project the stock application is so inefficient that a custom application that has every result verify is nearly 1000 times as fast as the stock application.
>>
>> Pair a stock Astropulse with a highly optimized SETI app. Pair a stock SETI CPU app with a stock GPU SETI app. There are some definite differences in the DCF that shows for these.
>>
>> jm7
>>
>> The server will scale workunit.rsc_fpops_est by its DCF estimate.
>> The client's DCF will tend to 1.
>> -- David
>
> _______________________________________________
> boinc_dev mailing list
> boinc_dev@ssl.berkeley.edu
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
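David's two-line summary in the quote above ("The server will scale workunit.rsc_fpops_est by its DCF estimate. The client's DCF will tend to 1.") amounts to moving the correction into the estimate itself. A hedged sketch of that division of labour - function names and parameters are illustrative, not actual BOINC server or client code:

```python
def server_scaled_estimate(rsc_fpops_est, server_dcf):
    # Server side: pre-multiply the raw FLOP estimate by the server's
    # measured correction factor before the job is sent out.
    return rsc_fpops_est * server_dcf

def client_runtime_estimate(scaled_fpops_est, host_flops, client_dcf):
    # Client side: the usual runtime estimate.  With the server already
    # correcting the estimate, the residual client DCF drifts toward 1.
    return scaled_fpops_est / host_flops * client_dcf
```

For example, a job estimated at 1e12 FLOPs with a server DCF of 1.5 would be sent as 1.5e12; on a 10 GFLOPS host with a client DCF of 1.0 that predicts 150 seconds of runtime.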
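Richard's Excel figures in the "Speed of change / settling time" section can be reproduced with a short simulation of the client rule he describes. The exact 10x switch condition and the "within 10% of target" stopping criterion are my assumptions:

```python
def settle(dcf, target=1.0):
    """Count task completions until DCF is within 10% of target.

    Models the client behaviour described in the thread: move 10% of
    the difference per task exit, but only 1% while DCF is more than
    10x the target (the cautious regime for very wrong values).
    """
    steps = 0
    while dcf > target * 1.1:
        rate = 0.01 if dcf > 10 * target else 0.10
        dcf += rate * (target - dcf)
        steps += 1
    return steps
```

Starting at 100 with a target of 1, this takes roughly 240 steps in the 1% regime to reach 10 and about 40 more at 10% to reach 1.1, close to the figures quoted; a merely doubled DCF settles within the usual "twenty to thirty tasks" range.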