This will be my final post on the subject. To take the clock analogy one step further:
You're saying that your cesium clock is much better than the cheezy $1 Westclox wind-up I got at the local bargain shop... and I'm saying that the cobblestone standard explicitly calls for the use of a wind-up clock.

I agree with everything you've said. I agree that Whetstones and Dhrystones are not the best predictors of performance. But the standard says "you put the Dhrystone number for your machine here, you put the Whetstone number for your machine here, you put the number of CPU seconds here, and that's your credit, in cobblestones."

Martin wrote:
> Lynn W. Taylor wrote:
>> ... and I'm arguing that they should be.
>>
>> For every project, in the best possible manner.
>>
>> I don't have a problem with cobblestones = k * FLOPs.
>>
>> I have several problems with credit = k * cobblestones.
>>
>> ... starting with the fact that we aren't directly calculating
>> cobblestones.
>
> A little of the architectural influence can be read from a recent
> comparison article:
>
> http://www.tomshardware.com/reviews/athlon-l3-cache,2416.html
>
> ####
> ... some examples: The Core i5 and i7 work with 32KB of 8-way
> associative L1 data cache and 32KB of 4-way associative L1 instruction
> cache. Clearly, Intel wants instructions to be available quicker while
> also maximizing hits on the L1 data cache. Its L2 cache is also 8-way
> set-associative, while Intel's L3 cache is even smarter, implementing
> 16-way associativity to maximize cache hits.
>
> However, AMD follows another strategy on the Phenom II X4 with a 2-way
> set-associative L1 cache, which offers lower latencies. To compensate
> for possible misses, it features twice the memory capacity: 64KB data
> and 64KB instruction cache. The L2 cache is 8-way set-associative, like
> Intel's design, but AMD's L3 cache works at 48-way set associativity.
> None of this can be judged without looking at the entire CPU
> architecture.
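[As an aside, to make the associativity figures quoted above concrete: the number of sets in a cache is its total size divided by (ways * line size). A quick sketch, assuming the usual 64-byte cache line, which both of these CPUs use:]

```python
def cache_sets(size_bytes, ways, line_bytes=64):
    """Number of sets in a set-associative cache.

    Assumes a 64-byte line by default; size must divide evenly.
    """
    assert size_bytes % (ways * line_bytes) == 0
    return size_bytes // (ways * line_bytes)

# Core i5/i7 L1 data cache: 32KB, 8-way -> 64 sets
print(cache_sets(32 * 1024, 8))
# Phenom II X4 L1 data cache: 64KB, 2-way -> 512 sets
print(cache_sets(64 * 1024, 2))
```

[So AMD's lower associativity buys more, smaller sets; Intel's higher associativity buys fewer conflict misses per set. Neither number alone predicts benchmark results, which is rather the point.]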
> Naturally, only the benchmark results really count, but the whole
> purpose of this technical excursion is to provide a look into the
> complexity behind multi-level caching.
> ####
>
> Note in the benchmarks the identical Dhrystone scores, and yet some of
> the tests can show as much as a 20% real-world performance difference.
>
> So... Do we put +/- 20% error bars on the credited cobblestones
> "depending" on 'whatever'... And that's before allowing for any
> approximations made for the FLOPs count for the s...@h-wu AR.
>
>> So, the AUTOFLOPS script takes a set of machines, calculates what the
>> credit SHOULD HAVE BEEN based solely on the definition (just as 1
>> meter is defined as the distance travelled by light in free space in
>> 1/299,792,458 of a second, a cobblestone is defined by a formula, and
>> it is defined in terms of Whetstones and Dhrystones), and returns a
>> corrected scaling factor from FLOPs to cobblestones.
>
> That's fine. Except, to follow the analogy, you're using a broken
> wind-up spring stopwatch with no compensation for the spring tension,
> and your "seconds" are anyone's guess!
>
> Or, another analogy: you're trying to use the light-time distance
> between the earth and the moon as your standard measure, unaware that
> the moon follows an elliptical orbit...
>
> (And we then have to keep changing all the road signs because
> apparently all the cities must be moving backwards and forwards in
> strange ways...)
>
>> Martin wrote:
>>> Lynn W. Taylor wrote:
>>>> If a cobblestone is a cobblestone, then "k" MUST BE EXACTLY 1.
>>>>
>>>> If a cobblestone isn't a cobblestone, then we're talking about
>>>> something else.
>>>>
>>>> The Autoflops script takes something that isn't cobblestones, and
>>>> converts it to something that is.
>
> I suspect that you get a different answer if s...@h happens to have a
> run of one or other singular AR, or if one OS happens to update
> whatever drivers or other imponderables...
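[For reference, the "defined by a formula" part: as I read the classic BOINC scheme, a cobblestone is 1/100 of a day of CPU time on a host that benchmarks 1 GFLOPS on Whetstone and 1 GIPS on Dhrystone, averaged. A sketch of that definition, not the exact server code, with COBBLESTONE_FACTOR = 100 taken as an assumption from my reading of the source:]

```python
SECONDS_PER_DAY = 86400.0
COBBLESTONE_FACTOR = 100.0  # assumed: 100 cobblestones per CPU-day at 1 GFLOPS/GIPS

def claimed_credit(cpu_time_s, whetstone_flops, dhrystone_iops):
    """Benchmark-based claimed credit, per the classic cobblestone definition.

    whetstone_flops / dhrystone_iops are the host's raw benchmark results
    in FLOPS and integer ops/sec respectively.
    """
    avg_gops = (whetstone_flops + dhrystone_iops) / 2.0 / 1e9
    return (cpu_time_s / SECONDS_PER_DAY) * avg_gops * COBBLESTONE_FACTOR

# One CPU-day on a 1 GFLOPS / 1 GIPS host -> 100 cobblestones
print(claimed_credit(86400, 1e9, 1e9))
```

[Which is exactly why the whole argument matters: the definition has no free parameter, so any "k" other than 1 means we are no longer reporting cobblestones.]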
> At the moment, we're just waiting until we get silly numbers as soon
> as the GPUs move in.
>
>>> I'm arguing that the 'credits' that we actually see reported are not
>>> really 'cobblestones'.
>>>
>>> What we see is some "fudge-factor * s...@h-flops-count" that variably
>>> adds up to approximately the 'expected' cobblestones. The variable
>>> nature of the processing for s...@h, due to the AR dependence of the
>>> s...@h-wus, compounds the inaccuracies and the problem still further.
>
> The killer for trying to use unknown, untrusted hosts for benchmarking
> is that you never know what else they might be doing, or what other
> unmeasured variables might corrupt the measurement.
>
> Taking a "median" to massage the results is inherently very suspicious.
>
> Regards,
> Martin

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your
email address.
