Though it looks like the conversation has died down again, I think there are a couple of points yet to be made.
If I had one and only one objection to make, it is that this system seems to be based upon the benchmarking system without any attempt being made to correct for its deficiencies (as best I can tell). To my mind the worst feature of the benchmarks was not that they were inaccurate, but that they cannot be replicated. Repeated runs, even on systems that are quiescent, can report results that cover a spread of as much as 20%.

The concept of a "reference job" I am happy to see, as that was the cornerstone of the proposal I made for using calibration to quantify and test our systems in the BOINC universe. See: http://www.boinc-wiki.info/Improved_Benchmarking_System_Using_Calibration_Concepts

I still see SaH as one of the "best" sources, in that the source is public and it is probably the best understood. Most importantly, it should be relatively easy to make test tasks by hand that have known characteristics, and to a great extent these could perhaps even be run with instrumented code so that precise counts of FLOPS could be made.

An assumption is made that the GPU versions will be more efficient. I think AQUA found that the converse is true. I do not know this for sure; a post I read the other day, in a discussion of projects with GPU applications, said that they dropped the GPU version because it was worse than the (multi-threaded) CPU version.

It may be that I am too dense to get it, but I also do not see how this proposal would adequately address the quality metrics we might extract from those projects whose applications span the types and classes of computing resources. For example, the two "best" projects at this time are MilkyWay and Collatz, in that they have applications that span all three of the currently available computing resources: CPU, Nvidia CUDA, and ATI Stream.

And finally, the issue of optimized applications vs. "stock" applications ...
the hardware will report the same FLOPS, but it seems to me the faster execution time of the optimized application will cause problems.

Oops, two more "finally"s: you would require a change to all science applications to make this effective, and you would still require the projects to make an initial estimate regardless of its accuracy (predicted number of app units).

On Aug 28, 2009, at 12:45 PM, David Anderson wrote:

> I'm coming around to the viewpoint that projects shouldn't be expected
> to supply estimates of job duration or application performance.
> I think it's feasible to maintain these estimates dynamically,
> based on actual job runtimes.
> I've sketched a set of changes that would accomplish this:
> http://boinc.berkeley.edu/trac/wiki/AutoFlops
> Comments welcome.
>
> BTW, a bonus of the proposed design is that it provides
> a project-independent credit-granting policy.
>
> -- David
>
> Richard Haselgrove wrote:
>> ... if projects
>> are expected to fine-tune performance metrics down to the individual
>> plan_class level, then I'm sorry, but they just won't. I've had to shout
>> (loudly and repeatedly) at both AQUA and GPUGrid to get them to adjust
>> rsc_fpops_est to within an order of magnitude of reality (in AQUA's case,
>> two orders of magnitude).
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
