On Jun 30, 2007, at 1:18 AM, Andrew Lentvorski wrote:
> Christopher Smith wrote:
>> Again, there was a time when what you said was true.
>> Superpipelining, which is found now (let alone by the time a current
>> fourth grader is likely to enter the real world) even in embedded
>> chips, has produced chips for which addition, subtraction,
>> multiplication and division all impose a single cycle of overhead.
>> So the constant factor *is* the same.
>
> I think you are confusing latency with throughput.
>
> If you are computing A*B followed by C*D, it is true that you can
> launch them right after one another. That's throughput.
>
> If you are computing A*B*C, you have to wait for A*B to complete
> before you can launch (A*B)*C. That's latency.
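The distinction can be sketched with a toy cycle-count model. The latency and throughput numbers below are illustrative only, not taken from any particular chip:

```python
# Toy model of a pipelined multiplier (hypothetical numbers:
# 3-cycle latency, one new multiply issued per cycle).
MUL_LATENCY = 3      # cycles before a product is available
MUL_ISSUE_RATE = 1   # new multiplies that can start each cycle

def independent_muls(n):
    """Cycles to finish n independent multiplies (A*B, C*D, ...).

    A new multiply can issue every cycle, so the pipeline overlaps
    them: total = issue time for n-1 of them + latency of the last.
    """
    return (n - 1) // MUL_ISSUE_RATE + MUL_LATENCY

def dependent_muls(n):
    """Cycles to finish a chain of n dependent multiplies (A*B*C*...).

    Each multiply must wait for the previous result, so the
    latencies add up and the pipeline never overlaps them.
    """
    return n * MUL_LATENCY

print(independent_muls(2))  # 4 cycles: A*B and C*D overlap
print(dependent_muls(2))    # 6 cycles: (A*B)*C waits for A*B
```

With these made-up numbers, two independent multiplies cost 4 cycles but the dependent pair A*B*C costs 6, and the gap grows linearly with chain length.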
<humor>
Of course I missed this, because I learned to do all my arithmetic
computation with pencil and paper so intuitively A*B*C takes just as
long to calculate as A*B followed by C*D. Nor do I see how one can
improve performance by doing multiple independent multiplies at once.
It is intuitively no better than doing them one after the other.
</humor>
I did, in fact, address this distinction in the paragraph immediately
following the one you quoted.
--Chris
--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg