On Thu, 6 Jan 2000 18:50:41 EST, [EMAIL PROTECTED] wrote:
>Indeed, I calculate ~0.4 FLOP/cycle, which at first glance seems about a
>factor of 2 too slow, even on the humble P90 (which should, assuming no
>cache misses, be able to dispatch one FADD per cycle and (I believe - x86
>experts, please correct me if I'm wrong) one FMUL every other cycle, for
>a peak throughput of 1.5 FLOP/cycle).

  Almost.  Although the FADD unit can start an instruction every cycle and
the FMUL unit every second cycle, they share some overhead (e.g., decode and
control logic), so only one of the two can be started in any given cycle.
That caps the peak at 1 FLOP/cycle rather than 1.5.  This is true of both
the P5 and P6 architectures.
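
  As a rough illustration of the arithmetic, here is a toy issue model in
Python.  It is only a sketch under simplifying assumptions: it ignores
latencies, data dependencies, and cache misses, and the function name and
the shared_issue flag are made up for this example.  It counts how many
FP operations can be started per cycle with and without the shared-issue
restriction.

```python
def peak_flops_per_cycle(cycles, shared_issue):
    """Toy model: FADD may start every cycle, FMUL every second cycle.

    If shared_issue is True, at most one FP instruction (FADD or FMUL)
    may start in any given cycle. Hypothetical simplification: no
    latencies, dependencies, or cache effects are modelled.
    """
    flops = 0
    next_fmul_ok = 0  # earliest cycle at which another FMUL may start
    for cycle in range(cycles):
        fmul_ready = cycle >= next_fmul_ok
        if shared_issue:
            # One issue slot per cycle: use it for FMUL when the
            # multiplier is free, otherwise for FADD.
            if fmul_ready:
                next_fmul_ok = cycle + 2
            flops += 1
        else:
            # Independent units: FADD always starts, FMUL when free.
            flops += 1
            if fmul_ready:
                flops += 1
                next_fmul_ok = cycle + 2
    return flops / cycles


if __name__ == "__main__":
    print("independent units:", peak_flops_per_cycle(1000, False))
    print("shared issue slot:", peak_flops_per_cycle(1000, True))
```

  Under these assumptions the independent-unit model gives 1.5 FLOP/cycle
and the shared-issue model gives 1.0, so the measured ~0.4 FLOP/cycle
should be compared against 1.0, not 1.5.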

Colin Percival

_________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
