Hi Timothy,

On Sunday 06 February 2005 10:39, Timothy Miller wrote:
> Well, say it took 4 cycles to compute one sum.  Then what you need is
> dZ, dZ*2, dZ*3, and dZ*4, all of which are either trivial or easy to
> compute.  You use dZ*4 to get to the next loop, and send Z,0; Z,dZ;
> Z,dZ*2; and Z,dZ*3 down the pipeline.

OK, I just want to tie this one off.  It's clear how this will work with 
floating point adders: say the adder has a latency of 4 clocks and 
delivers one result every clock.  For some interpolant T, four sums can 
be in flight in the pipeline at once, and because we compute two pixels 
per clock, we need two adders.  Each adder then advances 4 stages * 2 
pixels = 8 pixels per pass, so the increment will always be 8*dTdx.
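
To make the interleaving concrete, here is a rough software model of the 
scheme, not a hardware description.  It assumes a 4-clock adder and two 
pixel lanes as described above; the names interpolate_span, LATENCY and 
PIXELS_PER_CLOCK are made up for illustration, and each adder pipeline is 
modelled as a simple FIFO of in-flight sums.

```python
from collections import deque

LATENCY = 4           # assumed adder pipeline depth, in clocks
PIXELS_PER_CLOCK = 2  # assumed: one adder per pixel lane


def interpolate_span(t0, dtdx, n_pixels):
    """Model of one interpolant T stepped across a span (hypothetical helper)."""
    stride = LATENCY * PIXELS_PER_CLOCK  # 8 pixels between sums on one stream
    step = stride * dtdx                 # the constant increment, 8*dTdx

    # Seed each lane with its first LATENCY values: T0 + x*dTdx for the
    # pixels x that the lane owns (lane 0: 0,2,4,6; lane 1: 1,3,5,7).
    lanes = [
        deque(t0 + (lane + PIXELS_PER_CLOCK * s) * dtdx for s in range(LATENCY))
        for lane in range(PIXELS_PER_CLOCK)
    ]

    out = []
    while len(out) < n_pixels:           # one loop iteration == one clock
        for pipe in lanes:
            value = pipe.popleft()       # sum leaving the adder this clock
            out.append(value)
            pipe.append(value + step)    # re-issued; emerges LATENCY clocks later
    return out[:n_pixels]
```

Each value that leaves an adder is fed straight back in with 8*dTdx 
added, so the feedback path never has to wait on a result that is still 
in the pipeline.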

We've got something like 17 interpolants, so that's 34 simplified FP 
adders.  Interpolating vertically between scan lines will additionally 
use twice as many FP adders as interpolants, because two edges have to 
be interpolated.  The edge setup requires a multiply and add per 
interpolant.  Should we worry about this number of components, or is it 
no sweat?  The vertical rasterization doesn't necessarily have to 
deliver a span per clock, but it would be nice if it did, to keep up 
with one- and two-pixel-wide triangles.
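
For a rough feel of the totals, the counts above can be tallied as 
below.  This is only a back-of-envelope sketch: adder_budget is a 
made-up helper, the figure of 17 interpolants is the estimate from the 
text, and it assumes the edge setup costs exactly one multiplier and one 
adder per interpolant (not per edge), which is one possible reading.

```python
def adder_budget(interpolants=17, pixels_per_clock=2, edges=2):
    """Tally FP units under the stated assumptions (hypothetical helper)."""
    span_adders = interpolants * pixels_per_clock  # horizontal: two pixels per clock
    edge_adders = interpolants * edges             # vertical: two edges interpolated
    setup_adders = interpolants                    # assumed: one add per interpolant
    setup_multipliers = interpolants               # assumed: one multiply per interpolant
    return {
        "span_adders": span_adders,
        "edge_adders": edge_adders,
        "setup_adders": setup_adders,
        "setup_multipliers": setup_multipliers,
        "total_adders": span_adders + edge_adders + setup_adders,
    }
```

With the defaults this gives 34 span adders and 34 edge adders, matching 
the figures in the paragraph above.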

Finally, have we clawed our way back to 200 MHz yet?

Regards,

Daniel
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
