On 8/15/07, Mark <[EMAIL PROTECTED]> wrote:
> Timothy Normand Miller wrote:
> > We're hoping to get the GPU to exceed 100MHz in the FPGA. Maybe we
>
> Here, you're referring to the Spartan, right?

That's right. We want the drawing engine to be that fast or better.

> And the DMA controller
> (aka nanocontroller?) is separately constrained by the PCI bandwidth?

Well, it is effectively constrained by whatever it's trying to control
(DMA or VGA). We'd like it to have a bit more speed than that to
account for other processing overhead.

> If you're targeting 66MHz 32-bit PCI (or equivalent) and the
> nanocontroller can do one 32-bit read or write per cycle, am I right in
> understanding that the nanocontroller would then need to run at well
> over 133MHz to keep the PCI bandwidth satisfied? I say well over
> because there will obviously be other instructions executed besides just
> reads and writes (though obviously you don't have to saturate the peak
> /theoretical/ bandwidth of the interface...).

If we have it move individual words, yes. However, we want to give it
some higher-level control. For instance, it could control a crossbar
that connected up different agents that may send or receive data. We
tell one agent to read and another agent to write, and with the data
channels bound together, the data movement happens in parallel. There
are other options too, like move instructions that read from one port
and write to another in one cycle. I think we actually have some of
those.

Note that since none of these things can stall the CPU pipeline, we
have to account for the overhead of querying to find out how many read
words are available or how much room is free in a write fifo before we
actually do it.

> > not sure if Lattice registers have Enable inputs, or they may be
> > mutually exclusive with async resets.
>
> At a glance, it looks like the asynchronous set/reset and the clock
> enable are independent (and there is a clock enable). I'm taking this
> from page 6 of http://www.latticesemi.com/documents/DS1001.pdf.

We can experiment with this.
In any case, it will add complexity, and we have to trade that off
against available logic cells, achievable clock speed, etc.

> I'm still getting to know the tools, but it looks like a 32x32->64
> multiplier with registers at the inputs and outputs requires 567 slices
> (12% of the device) and operates at 45MHz post-PAR on LFXP10C,5 (using
> ispLEVER 7 & Synplify 8.8 under wine). This is using the Project
> Navigator defaults (Synplify targets 200MHz and achieves 61MHz; PAR
> targets 56MHz and achieves 45MHz -- I have no idea where or how those
> constraints were specified).

Since you haven't specified constraints manually, it's making them up.
I don't know how it decides. Also, because of competition for
resources, in a design with other logic, you'll never achieve even
that 45MHz. We need to do something else.

--
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
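
To put rough numbers on the word-at-a-time case discussed above, here is a
back-of-the-envelope sketch in Python. It assumes the controller retires one
instruction per cycle and that the bus moves one 32-bit word per bus cycle at
peak. The function name, the 4-instruction FIFO check, and the 16-word burst
are made up for illustration; none of them come from the actual
nanocontroller design.

```python
def required_clock_mhz(bus_mhz, instrs_per_word, poll_instrs=0, words_per_poll=1):
    """Controller clock (MHz) needed to keep pace with a bus that moves one
    word per bus cycle, assuming one instruction per controller cycle."""
    if words_per_poll < 1:
        raise ValueError("words_per_poll must be at least 1")
    # Spread the cost of each FIFO status check over the words moved per check.
    cycles_per_word = instrs_per_word + poll_instrs / words_per_poll
    return bus_mhz * cycles_per_word


PCI_MHZ = 66  # "66MHz 32-bit PCI" from the thread; nominal clock is about 66.67MHz

# Word-at-a-time copy: one read instruction plus one write instruction per word.
print(required_clock_mhz(PCI_MHZ, instrs_per_word=2))

# A move instruction that reads one port and writes another in a single cycle.
print(required_clock_mhz(PCI_MHZ, instrs_per_word=1))

# Hypothetical: 4 instructions of FIFO-level checking for every 16 words moved.
print(required_clock_mhz(PCI_MHZ, instrs_per_word=2, poll_instrs=4, words_per_poll=16))
```

With 66MHz plugged in, the separate read and write case comes out at 132MHz
(about 133MHz with the nominal 66.67MHz clock), which matches the figure in
the question. The single-cycle move halves that, and any FIFO polling adds on
top in proportion to how often it has to happen. The crossbar approach is not
modeled here, since in that case the controller is not in the data path at
all.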
