- Why is the integer add divided into four pipeline stages?
It's artificial. The whole idea is to give the integer side gobs
of positive slack in the unrouted netlist. That way, FP and INT
compete less for routing resources, so the worst-case delay depends
on the FP side alone instead of being a compromise that makes the
critical delay even longer. The int pipeline has so many dummy
stages that we really should stretch it out as much as possible.
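
To make that concrete, here is a rough sketch of a 32-bit add spread
over four stages, 8 bits per stage. The module name, port names, and
the 8-bit slicing are all made up for illustration; this is not the
actual shader code, just one way the split could look.

```verilog
// Hypothetical sketch: 32-bit add split into four 8-bit slices,
// one slice per pipeline stage. Names and slicing are assumptions.
module int_add_pipe4 (
    input  wire        clock,
    input  wire [31:0] a,
    input  wire [31:0] b,
    output reg  [31:0] sum
);
    // Operand bits not yet added, carried forward to later stages.
    reg [23:0] a1, b1;
    reg [15:0] a2, b2;
    reg [7:0]  a3, b3;
    // Partial sums accumulated so far, plus the carry out of each slice.
    reg [7:0]  s1;
    reg [15:0] s2;
    reg [23:0] s3;
    reg        c1, c2, c3;

    always @(posedge clock) begin
        // Stage 1: bits 7:0
        {c1, s1}        <= a[7:0] + b[7:0];
        a1              <= a[31:8];
        b1              <= b[31:8];
        // Stage 2: bits 15:8
        {c2, s2[15:8]}  <= a1[7:0] + b1[7:0] + c1;
        s2[7:0]         <= s1;
        a2              <= a1[23:8];
        b2              <= b1[23:8];
        // Stage 3: bits 23:16
        {c3, s3[23:16]} <= a2[7:0] + b2[7:0] + c2;
        s3[15:0]        <= s2;
        a3              <= a2[15:8];
        b3              <= b2[15:8];
        // Stage 4: bits 31:24 (final carry out is dropped)
        sum             <= {a3 + b3 + c3, s3};
    end
endmodule
```

Each stage only has an 8-bit carry chain to close timing on, which is
where the slack comes from. The result shows up four clocks after the
operands go in.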
- ... barrel shifter ...
I realize that the shifter is just stages of MUXes. I'm just breaking
it up over pipeline stages. Again, because we CAN and it doesn't hurt
anything, and it makes P&R easier on other parts of the design.
As a counter-argument, the dummy stages might be optimizable as some
kind of shift register that might save energy and area.
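
For illustration, a sketch of what breaking the MUX stages over
pipeline stages could look like for a 32-bit logical left shift. The
three-stage grouping (16/8, then 4/2, then 1) and all the names are
assumptions of mine, not a description of the real design.

```verilog
// Hypothetical sketch: 32-bit logical left shifter whose five MUX
// layers are spread over three pipeline stages.
module shl_pipe3 (
    input  wire        clock,
    input  wire [31:0] din,
    input  wire [4:0]  amt,
    output reg  [31:0] dout
);
    reg [31:0] d1, d2;  // partially shifted data after stages 1 and 2
    reg [2:0]  amt1;    // shift-amount bits still to apply after stage 1
    reg        amt2;    // shift-amount bit still to apply after stage 2

    // Stage 1 MUX layers: shift by 16, then by 8.
    wire [31:0] m16 = amt[4]  ? {din[15:0], 16'b0} : din;
    wire [31:0] m8  = amt[3]  ? {m16[23:0],  8'b0} : m16;
    // Stage 2 MUX layers: shift by 4, then by 2.
    wire [31:0] m4  = amt1[2] ? {d1[27:0],   4'b0} : d1;
    wire [31:0] m2  = amt1[1] ? {m4[29:0],   2'b0} : m4;

    always @(posedge clock) begin
        d1   <= m8;
        amt1 <= amt[2:0];
        d2   <= m2;
        amt2 <= amt1[0];
        // Stage 3 MUX layer: shift by 1.
        dout <= amt2 ? {d2[30:0], 1'b0} : d2;
    end
endmodule
```

The unused shift-amount bits have to ride along with the data, which
is part of the extra register cost mentioned above.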
...
One other thing: Obviously, we want to save area, because that means
we can fit in more shaders. But read about "dark silicon" and "the
power wall". Some would say that transistors are cheap, and you can't
turn them all on at once, so you should make designs that switch less
at the expense of more transistors. This GPU design will be heavily
laden with clock gating at whatever level we can get it.
In FPGAs, you get both reset and clock-enable lines on flip-flops.
With the clock-enables, the clock net keeps switching, but the
flip-flop outputs and the logic they drive don't.
always @(posedge clock) begin
    if (reset) begin
        ...
    end else if (enable) begin
        ....
    end
end
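
Filled in, the same pattern applied to a run of dummy pipeline stages
might look like the sketch below. The module name, parameters, and
ports are invented for the example. Whether a synthesis tool maps this
onto shift-register primitives depends on the tool and the device.

```verilog
// Hypothetical sketch: delay line with synchronous reset and
// clock-enable. While enable is low, no stage changes value, so
// nothing downstream toggles.
module delay_line #(
    parameter WIDTH = 32,
    parameter DEPTH = 4
) (
    input  wire             clock,
    input  wire             reset,
    input  wire             enable,
    input  wire [WIDTH-1:0] din,
    output wire [WIDTH-1:0] dout
);
    reg [WIDTH-1:0] stage [0:DEPTH-1];
    integer i;

    always @(posedge clock) begin
        if (reset) begin
            // Clear every stage.
            for (i = 0; i < DEPTH; i = i + 1)
                stage[i] <= {WIDTH{1'b0}};
        end else if (enable) begin
            // Advance the data by one stage.
            stage[0] <= din;
            for (i = 1; i < DEPTH; i = i + 1)
                stage[i] <= stage[i-1];
        end
    end

    assign dout = stage[DEPTH-1];
endmodule
```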
--
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project