Hi,
At 06:50 PM 1/6/00 EST, [EMAIL PROTECTED] wrote:
>Spike Jones wrote:
>
>>Processor gurus, please: using the equivalence that is suggested
>>by the primenet status page [86.6 P90 CPU yr/day = 1042 GFlops]
>>I calculate that a floating point operation must be about 3 CPU cycles.
>
>Indeed, I calculate ~0.4 FLOP/cycle, which at first glance seems about a
>factor of 2 too slow, even on the humble P90 (which should, assuming no
>cache misses, be able to dispatch one FADD per cycle and (I believe - x86
>experts, please correct me if I'm wrong) one FMUL every other cycle, for
>a peak throughput of 1.5 FLOP/cycle).
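For anyone who wants to check the arithmetic, here is a quick sketch of how
the ~0.4 FLOP/cycle (roughly 3 cycles per FLOP) figure falls out of the
status page equivalence. It assumes a 90 MHz clock for the P90 and a 365-day
year; the function and constant names are made up for illustration.

```python
# Sanity check of the PrimeNet equivalence: 86.6 P90 CPU yr/day = 1042 GFlops.
# Assumptions (not stated on the status page): P90 clock = 90 MHz, 1 year = 365 days.

P90_HZ = 90e6          # assumed P90 clock rate, in cycles per second
DAYS_PER_YEAR = 365    # assumed; using 365.25 changes the result only slightly


def flop_per_cycle(cpu_years_per_day, gflops):
    """Return the implied floating point operations per P90 clock cycle."""
    # N CPU years of work per day means N * 365 P90s running around the clock.
    equivalent_cpus = cpu_years_per_day * DAYS_PER_YEAR
    cycles_per_second = equivalent_cpus * P90_HZ
    return gflops * 1e9 / cycles_per_second


rate = flop_per_cycle(86.6, 1042)
print(f"{rate:.2f} FLOP/cycle, {1 / rate:.1f} cycles/FLOP")
# prints: 0.37 FLOP/cycle, 2.7 cycles/FLOP
```
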
The humble P90 can only do one FADD *OR* one FMUL per cycle, so maximum
throughput is 1.0 FLOP/cycle, not 1.5. Worse yet, a floating point load
takes one clock and a store takes two clocks, and with only 8 registers
there are a lot of loads and stores. The PPro, P-II, P-III, and Celeron
have a better architecture that allows loads and stores to run in
parallel with the FADDs and FMULs, which makes it easier to approach the
1.0 FLOP/cycle theoretical maximum.
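
To put rough numbers on this, here is a toy model of the effect. It assumes,
as a simplification, that on the P90 every FADD/FMUL costs 1 clock, every
load 1 clock, and every store 2 clocks with no overlap at all, and that on
the newer chips the loads and stores are hidden completely behind the
arithmetic. Real code sits somewhere in between, and cache misses are
ignored. The function name and the operation counts in the example are
hypothetical.

```python
def modeled_flop_per_cycle(flops, loads, stores, overlap_memory=False):
    """Toy throughput model; not a cycle-accurate simulation.

    flops  -- number of FADD/FMUL operations (1 clock each)
    loads  -- number of floating point loads (1 clock each)
    stores -- number of floating point stores (2 clocks each)
    overlap_memory -- if True, assume loads and stores run fully in
                      parallel with the arithmetic (idealized PPro-style case)
    """
    if overlap_memory:
        # Idealized: memory traffic costs no extra cycles.
        cycles = flops
    else:
        # P90-style: every load and store adds to the cycle count.
        cycles = flops + loads + 2 * stores
    return flops / cycles


# Hypothetical inner loop: 8 arithmetic ops, 4 loads, 2 stores.
print(modeled_flop_per_cycle(8, 4, 2))                       # prints: 0.5
print(modeled_flop_per_cycle(8, 4, 2, overlap_memory=True))  # prints: 1.0
```

With that made-up mix the serialized case lands at half the theoretical
peak, which is in the same neighborhood as the ~0.4 FLOP/cycle observed.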
Regards,
George