Hi,

At 11:41 AM 10/14/2001 +0000, [EMAIL PROTECTED] wrote:
> > According to latest benchmarks (http://www.mersenne.org/bench.htm),
> > AthlonXP seems to be slower than the Thunderbird. Does anybody have a
> > technical explanation ?

The technical explanation is that I only have one XP benchmark. I'm sure
that as I get more benchmarks, a better-tuned system will report better
timings.

>The Athlon XP lies about its speed. Remember the old Cyrix trick?
>Well AMD have gone for the same - pick a benchmark that suits
>you, then claim your chip is "1800+" if it runs _that_ benchmark a
>wee bit faster than the opposition's chip running at 1800 MHz.

I'll try to post the real processor speed on my page, though it will be
a bit of a pain keeping it all straight. I understand why AMD did this;
at least they have been relatively honest (that is, conservative) in
choosing their "equivalent P4 cpu speed" number.

>That seems to have been the case for some months now; it's
>SSE2 which makes the difference. A P4 running Prime95 is well
>over twice as fast as a T'bird running at the same clock speed.

The P4 has two advantages. One is memory bandwidth. The other is SSE2.

Interestingly, the theoretical floating point throughput of the P4 is
identical to the Athlon's (one multiply and one add per clock), and the
Athlon even has lower latencies. So why is SSE2 an advantage? I theorize
that there are 3 reasons.

1) It relieves the register pressure. With SSE2 there are 16 floating
point values in registers instead of 8. Obviously it is much easier to
schedule operations that are not dependent on previous operations with
twice as many data values.

2) The directly addressable registers are easier for the chip to keep
track of than the standard x86 FPU register stack. I'm guessing there is
some limit as to how much internal register renaming can hide this
problem in the Athlon.

3) Once an SSE2 instruction is ready, it starts two different FPU
operations on two consecutive clock cycles. In other words, the CPU only
needs to find a new instruction to execute every other clock cycle to
keep the FPU units busy. This is of course heavily related to reason #1.

When AMD finally implements SSE2 it should be a screamer.
The lower latencies could be a big plus.

Regards,
George

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
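
P.S. For anyone who wants reasons 1 and 3 above as arithmetic, here is a rough sketch in Python. It is only a toy model, not a cycle-accurate simulation of either chip and not anything taken from Prime95. The function and constant names are made up for illustration, and it assumes the 8 XMM registers available in 32-bit mode, each holding two doubles.

```python
# Toy model of reasons 1 and 3. All names are hypothetical; this is an
# illustration of the arithmetic, not a simulation of the P4 or Athlon.

X87_REGISTERS = 8      # x87 FPU stack slots, one value each
SSE2_REGISTERS = 8     # XMM registers in 32-bit mode (assumption of this sketch)
DOUBLES_PER_XMM = 2    # a 128-bit XMM register holds two 64-bit doubles


def values_in_registers(registers, values_per_register):
    """Reason 1: how many floating point values fit in registers at once."""
    return registers * values_per_register


def issue_cycles(total_cycles, ops_per_instruction):
    """Reason 3: cycles on which a new instruction must be found so that
    one FPU operation starts on every clock.

    An instruction that starts `ops_per_instruction` operations on
    consecutive clocks keeps the FPU fed for that many cycles.
    """
    return [c for c in range(total_cycles) if c % ops_per_instruction == 0]


if __name__ == "__main__":
    print("x87 values in registers: ", values_in_registers(X87_REGISTERS, 1))
    print("SSE2 values in registers:",
          values_in_registers(SSE2_REGISTERS, DOUBLES_PER_XMM))
    print("x87 must issue on cycles: ", issue_cycles(8, 1))
    print("SSE2 must issue on cycles:", issue_cycles(8, 2))
```

In this model the x87 code has to find a ready instruction on every one of the 8 cycles, while the SSE2 code only has to find one on every other cycle, and it has twice as many values in registers to choose independent work from.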
