On 8/15/07, Mark <[EMAIL PROTECTED]> wrote:
> Timothy Normand Miller wrote:
> > We're hoping to get the GPU to exceed 100MHz in the FPGA.  Maybe we
> Here, you're referring to the Spartan, right?

That's right.  We want the drawing engine to be that fast or better.

> And the DMA controller
> (aka nanocontroller?) is separately constrained by the PCI bandwidth?

Well, it is effectively constrained by whatever it's trying to control
(DMA or VGA).  We'd like it to have a bit more speed than that to
account for other processing overhead.

> If you're targeting 66MHz 32-bit PCI (or equivalent) and the
> nanocontroller can do one 32-bit read or write per cycle, am I right in
> understanding that the nanocontroller would then need to run at well
> over 133MHz to keep the PCI bandwidth satisfied?  I say well over
> because there will obviously be other instructions executed besides just
> reads and writes (though obviously you don't have to saturate the peak
> /theoretical/ bandwidth of the interface...).

If we have it move individual words, yes.  However, we want to give it
some higher-level control.  For instance, it could control a crossbar
that connected up different agents that may send or receive data.  We
tell one agent to read and another agent to write, and with the data
channels bound together, the data movement happens in parallel.

There are other options too, like move instructions that read from one
port and write to another in one cycle.  I think we actually have some
of those.
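To put rough numbers on the options above, here is a back-of-the-envelope
sketch.  It assumes the controller retires one instruction per cycle and
uses the nominal 66MHz figure for the bus; the function names and the
setup/burst numbers for the crossbar case are hypothetical, picked only
for illustration.

```python
PCI_CLOCK_MHZ = 66.0  # nominal 66MHz 32-bit PCI, one 32-bit word per bus cycle at peak


def required_clock_mhz(instr_per_word, bus_mhz=PCI_CLOCK_MHZ):
    """Controller clock needed to keep up with one word per bus cycle,
    assuming one instruction retired per controller cycle."""
    return instr_per_word * bus_mhz


def amortized_instr_per_word(setup_instr, burst_words):
    """Crossbar case: a few instructions bind two agents together, then
    the data moves without per-word instructions from the controller."""
    return setup_instr / burst_words


# Word-by-word copy: one read plus one write instruction per word.
separate = required_clock_mhz(2)

# Move instruction that reads one port and writes another in one cycle.
fused = required_clock_mhz(1)

# Crossbar: hypothetical 8 setup instructions amortized over a 64-word burst.
crossbar = required_clock_mhz(amortized_instr_per_word(8, 64))

print(separate, fused, crossbar)
```

None of these figures include the status queries or any other bookkeeping,
so the real requirement sits somewhat above each number.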

Note that since none of these things can stall the CPU pipeline, we
have to account for the overhead of querying to find out how many read
words are available or how much room is free in a write fifo before we
actually do it.
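As a rough software model of that query-then-move pattern (only a sketch;
the Fifo class and move_burst are made-up names, not anything in the
actual design):

```python
from collections import deque


class Fifo:
    """Toy FIFO with a fixed depth, standing in for a read or write fifo."""

    def __init__(self, depth, items=()):
        self.depth = depth
        self.q = deque(items)

    def available(self):
        # Number of words that can be read right now.
        return len(self.q)

    def free(self):
        # Number of words that can be written before the fifo is full.
        return self.depth - len(self.q)

    def pop(self):
        return self.q.popleft()

    def push(self, word):
        self.q.append(word)


def move_burst(src, dst):
    """Query both sides first, then move only as many words as are safe.

    Nothing can stall the pipeline, so the two status queries are the
    overhead paid up front for each burst.
    """
    n = min(src.available(), dst.free())
    for _ in range(n):
        dst.push(src.pop())
    return n


src = Fifo(8, [1, 2, 3, 4, 5])
dst = Fifo(4, [9])
moved = move_burst(src, dst)
print(moved, list(dst.q), src.available())
```

The larger the burst each pair of queries covers, the less that overhead
matters per word.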

> > not sure if Lattice registers have Enable inputs, or they may be
> > mutually exclusive with async resets.
> At a glance, it looks like the asynchronous set/reset and the clock
> enable are independent (and there is a clock enable).  I'm taking this
> from page 6 of http://www.latticesemi.com/documents/DS1001.pdf.

We can experiment with this.  In any case, it will add complexity, and
we have to trade that off against available logic cells, achievable
clock speed, etc.

> I'm still getting to know the tools, but it looks like a 32x32->64
> multiplier with registers at the inputs and outputs requires 567 slices
> (12% of the device) and operates at 45MHz post-PAR on LFXP10C,5 (using
> ispLEVER 7 & Synplify 8.8 under wine).  This is using the Project
> Navigator defaults (Synplify targets 200MHz and achieves 61MHz; PAR
> targets 56MHz and achieves 45MHz -- I have no idea where or how those
> constraints were specified).

Since you haven't specified constraints manually, it's making them up.
I don't know how it decides.

Also, because of competition for resources, in a design with other
logic, you'll never achieve even that 45MHz.  We need to do something
else.

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
