Hi Timothy,

On Sunday 06 February 2005 10:39, Timothy Miller wrote:
> Well, say it took 4 cycles to compute one sum.  Then what you need is
> dZ, dZ*2, dZ*3, and dZ*4, all of which are either trivial or easy to
> compute.  You use dZ*4 to get to the next loop, and send Z,0; Z,dZ;
> Z,dZ*2; and Z,dZ*3 down the pipeline.

OK, I just want to tie this one off.  It's clear how this will work with
floating point adders: say the adder has a latency of 4 clocks, and
delivers one result every clock.  For some interpolant T, four steps of
dTdx can be in flight in each adder's pipeline, and because we compute two
pixels on each step, we need two adders.  The increment will always be
8*dTdx.  We've got something like 17 interpolants, so that's 34 simplified
FP adders.

Interpolating vertically between scan lines will additionally use twice as
many FP adders as interpolants, because two edges have to be interpolated.
The edge setup requires a multiply and add per interpolant.  Should we
worry about this number of components, or is it no sweat?

The vertical rasterization doesn't necessarily have to deliver a span per
clock, but it would be nice if it did, to keep up with one- and
two-pixel-wide triangles.

Finally, have we clawed our way back to 200 MHz yet?

Regards,

Daniel
_______________________________________________
Open-graphics mailing list
http://lists.duskglow.com/mailman/listinfo/open-graphics
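
As a sanity check on the arithmetic above, here is a rough software model
of the interleaved scheme.  It is only a sketch under stated assumptions,
not the actual hardware design: the names PipelinedAdder, interpolate_span
and adder_count are made up for illustration, integers stand in for the FP
values so the results compare exactly, and the seeding of the lanes during
the first 4 clocks is assumed to come from the edge setup.

```python
from collections import deque


class PipelinedAdder:
    """Hypothetical model of an adder that accepts one add per clock and
    delivers each result `latency` clocks after it was issued."""

    def __init__(self, latency):
        self.stages = deque([None] * latency)

    def output(self):
        # Result of the add issued `latency` clocks ago (None while filling).
        return self.stages[0]

    def issue(self, a, b):
        # Advance the pipeline by one clock and start a new add.
        self.stages.popleft()
        self.stages.append(a + b)


def interpolate_span(t0, dTdx, n_pixels, latency=4, pixels_per_clock=2):
    """Return T for n_pixels consecutive pixels, pixels_per_clock per clock."""
    lanes = latency * pixels_per_clock   # 4 * 2 = 8 running sums in flight
    stride = lanes * dTdx                # the "8*dTdx" increment
    adders = [PipelinedAdder(latency) for _ in range(pixels_per_clock)]
    pixels = []
    clock = 0
    while len(pixels) < n_pixels:
        for a, adder in enumerate(adders):
            result = adder.output()
            if result is None:
                # Fill phase (assumed): seed this lane with T0 + lane*dTdx.
                lane = clock * pixels_per_clock + a
                adder.issue(t0 + lane * dTdx, 0)
            else:
                # Steady state: emit the pixel and feed the sum back.
                pixels.append(result)
                adder.issue(result, stride)
        clock += 1
    return pixels[:n_pixels]


def adder_count(interpolants=17, pixels_per_clock=2, edges=2):
    """Adders for horizontal stepping and for vertical edge stepping."""
    return interpolants * pixels_per_clock, interpolants * edges
```

With latency=4 and pixels_per_clock=1 this reduces to the quoted
four-lane case with a stride of dZ*4.  For 17 interpolants, adder_count()
gives 34 adders for the spans plus 34 for the two edges, so 68 in total,
not counting the multiply-add per interpolant for edge setup.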
