On 9/6/07, Mark <[EMAIL PROTECTED]> wrote:
> Is there any chance you could pound out a quick code sample for the DMA
> case?  Since the nanocontroller has to be able to support both this and
> the VGA translation, I'd find it helpful to have an example for both.
> If you have time, etc., of course.

Well, before I do that, let's start with some background.  This way,
others can participate in the process.

One thing that we would use DMA for is simple memory moves.  This
doesn't require much smarts.

To begin with, both the target and master can perform only
limited-length bursts, because they intentionally have small
address counters.  For instance, the target will get a 32-bit address
in, say 0x00012300, and process a burst, where it internally advances
the address by 4 for every word.  However, the counter only has 6
bits (address bits 7:2), so what it does is detect when it would roll
over from 0x000123FC to 0x00012400 and terminate the burst.  This
forces the host controller to restart the burst with the next address.
In other words, our PCI target never lets a burst cross a 256-byte
(64-word) boundary.  The counters are limited in this way for the sake of
keeping counters small and clock rate up.  When we have combined the
PCI controller with other logic and see how it performs, we may
consider increasing the counter size.
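
To make the boundary rule concrete, here is a rough Python model of how a
transfer gets chopped into bursts.  It is only a sketch: the name
split_bursts and the constants are made up for illustration, and it
assumes word-aligned addresses and the 6-bit word counter described
above.

```python
WORD_BYTES = 4                          # one 32-bit word
COUNTER_BITS = 6                        # assumed width of the burst counter
BOUNDARY = WORD_BYTES << COUNTER_BITS   # 256 bytes

def split_bursts(addr, nwords):
    """Split a transfer into (start_addr, word_count) bursts that
    never cross a 256-byte boundary."""
    if addr % WORD_BYTES:
        raise ValueError("address must be word-aligned")
    bursts = []
    while nwords > 0:
        # Words left before the counter would roll over.
        room = (BOUNDARY - addr % BOUNDARY) // WORD_BYTES
        n = min(room, nwords)
        bursts.append((addr, n))
        addr += n * WORD_BYTES
        nwords -= n
    return bursts
```

With this model, a 100-word transfer starting at 0x00012300 comes out as a
64-word burst followed by a 36-word burst at 0x00012400.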

Although the master design isn't quite finished, I intend to impose
the same sort of limit.  Now, the master really only cares about
starting addresses, so in theory, it could handle bursts of any
length.  You only need to know an address when you initiate a
transaction, after which you can burst streams of data all you like.
Problems arise, however, when something prematurely terminates the
transaction.  When that happens, we have two options.  One is to have
the nanocontroller baby-sit the DMA master, detect when a transaction
fails, figure out where to restart, and start it over again.
Unfortunately, this means that we can't queue up multiple requests and
then go do something else!  The alternative, which I like better, is
to give the master enough smarts to restart transactions that do not
cross a 256-byte boundary.  So it has a counter and can restart on its
own.  This way, we can queue up arbitrary numbers of requests.
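
Here is a toy Python model of the second option, just to show the idea.
Everything in it is hypothetical (the class name, the accept callback
that stands in for whatever is on the other end of the bus, the log).
The only things taken from the description above are that each queued
request stays inside one 256-byte block and that the master keeps its
own address and count so it can restart without help.

```python
from collections import deque

class DmaMaster:
    """Toy model of a master that restarts terminated transactions."""

    def __init__(self):
        self.queue = deque()   # pending (addr, nwords) requests
        self.log = []          # (start_addr, words_moved) per bus transaction

    def enqueue(self, addr, nwords):
        # Each request must fit inside one 256-byte block.
        if (addr % 256) + nwords * 4 > 256:
            raise ValueError("request crosses a 256-byte boundary")
        self.queue.append((addr, nwords))

    def run(self, accept):
        """accept(addr, nwords) returns how many words the other side
        takes before it terminates the transaction (hypothetical)."""
        while self.queue:
            addr, remaining = self.queue.popleft()
            while remaining > 0:
                moved = max(0, min(accept(addr, remaining), remaining))
                self.log.append((addr, moved))
                # The master's own counter tells it where to restart.
                addr += 4 * moved
                remaining -= moved
```

The nanocontroller's part is just the enqueue calls; the restart loop
lives in the master.  (In this model a target that never accepts any
data would make the master retry forever.)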

For a download (move from graphics memory to main memory), the
nanocontroller would begin by making requests from the graphics memory
system, in whatever size is appropriate for the queues and such.  It
would probably wait until the first word arrived from the memory
controller and then request a write transaction from the DMA master.
The write transactions would respect the 64-word boundaries.  This
would be followed by an interleaved succession of read requests to gfx
memory and write requests to the master.  Following the last write
request would be a request to end the transaction.  The actual data
movement could be baby-sat by the nanocontroller, or we could
implement a crossbar that it sets up to connect FIFOs
automatically.
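
As a sketch of the command sequence for a download (Python, purely
illustrative): the command names, the block_words parameter, and the
idea that graphics memory has the same 64-word limit as the PCI side
are all assumptions on my part, not settled design.

```python
def download_commands(gfx_addr, host_addr, nwords, block_words=16):
    """Return the interleaved command list for moving nwords from
    graphics memory to main memory."""
    cmds = []
    while nwords > 0:
        # Stay inside the current 64-word block on both sides.
        room_gfx = 64 - (gfx_addr // 4) % 64
        room_host = 64 - (host_addr // 4) % 64
        n = min(nwords, block_words, room_gfx, room_host)
        cmds.append(("gfx_read", gfx_addr, n))        # read request to gfx memory
        cmds.append(("master_write", host_addr, n))   # write request to DMA master
        gfx_addr += 4 * n
        host_addr += 4 * n
        nwords -= n
    cmds.append(("master_end",))                      # end the transaction
    return cmds
```

This model issues each write request right after its read request; the
waiting for the first word to arrive is not modeled.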

An upload would take an analogous approach, with the directions reversed.

Note that the memory system too would involve short counters, so we
can't make individual requests that cross those boundaries.  Writes
would typically be accompanied by an explicit address, but when you're
streaming writes, all you have are data words, and there needs to be a
counter that runs on its own for this.  The counter would be set for
each boundary-crossing.  Read requests would typically be made as an
address with a word count.  The address counter would be short, and
the word count would be small.

In order to manage multiple requests, we want to be able to queue
them.  So for a streaming write, we would queue up an address with a
count.  When that count has expired, another request would be dequeued
from the write command queue, with its starting address and count.
Read requests would be queued up the same way.
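
A small Python model of the streaming-write side, to show how the
command queue and the data stream fit together.  The class and method
names are invented for illustration, and a dict stands in for graphics
memory.

```python
from collections import deque

class StreamWriter:
    """Toy model of a streaming write port with a command queue."""

    def __init__(self):
        self.commands = deque()   # pending (addr, count) write commands
        self.addr = None          # current write address
        self.left = 0             # words left in the current command
        self.memory = {}          # addr -> word, stands in for gfx memory

    def queue_command(self, addr, count):
        if count <= 0:
            raise ValueError("count must be positive")
        self.commands.append((addr, count))

    def push_data(self, word):
        if self.left == 0:
            # Count expired: dequeue the next address/count pair.
            if not self.commands:
                raise RuntimeError("data word with no queued write command")
            self.addr, self.left = self.commands.popleft()
        self.memory[self.addr] = word
        self.addr += 4            # the counter advances on its own
        self.left -= 1
```

Queue (0x100, 2) and (0x400, 1), push three data words, and they land at
0x100, 0x104 and 0x400.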

So, off the top of my head, here are some of the I/O ports we would employ:

READ_REQUEST_ADDR (write these two to enqueue a read command)
READ_REQUEST_COUNT
READ_DATA (read this to dequeue)
READ_DATA_COUNT (how many are in the return queue that can be pulled)
WRITE_DATA (write these two to enqueue a single write)
WRITE_ADDR
WRITE_STREAM_ADDR (write these two to set up a streaming write)
WRITE_STREAM_COUNT
WRITE_STREAM_DATA (a write here enqueues data and advances addr)
WRITE_QUEUE_FREE (how many free words in the write data queue)
READ_COMMAND_FREE (how many free read command entries)
WRITE_COMMAND_FREE (same for writes)
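
To give a feel for how the read-side ports might be used, here is a
Python sketch.  The port names are the ones listed above; everything
else is assumed: the FakePorts class is a stand-in I made up so the
example runs, reads complete instantly in it, and I'm guessing that
writing the count after the address is what completes the command.

```python
from collections import deque

class FakePorts:
    """Stand-in for the port space; a dict plays the role of memory."""

    def __init__(self, memory, cmd_slots=4):
        self.memory = memory
        self.cmd_slots = cmd_slots
        self.pending_addr = None
        self.data = deque()       # return queue of read data

    def write(self, port, value):
        if port == "READ_REQUEST_ADDR":
            self.pending_addr = value
        elif port == "READ_REQUEST_COUNT":
            # Assumed: the count write completes the command.  The
            # model services it immediately.
            for i in range(value):
                self.data.append(self.memory.get(self.pending_addr + 4 * i, 0))
        else:
            raise KeyError(port)

    def read(self, port):
        if port == "READ_COMMAND_FREE":
            return self.cmd_slots
        if port == "READ_DATA_COUNT":
            return len(self.data)
        if port == "READ_DATA":
            return self.data.popleft()
        raise KeyError(port)

def read_words(ports, addr, count):
    """Enqueue one read command and pull back its data."""
    while ports.read("READ_COMMAND_FREE") == 0:
        pass                      # wait for a free command entry
    ports.write("READ_REQUEST_ADDR", addr)
    ports.write("READ_REQUEST_COUNT", count)
    words = []
    while len(words) < count:
        # Only pull as many words as the return queue says it holds.
        avail = min(ports.read("READ_DATA_COUNT"), count - len(words))
        for _ in range(avail):
            words.append(ports.read("READ_DATA"))
    return words
```

The write side would look much the same, checking WRITE_COMMAND_FREE and
WRITE_QUEUE_FREE before pushing commands and data.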

Then there'd be other ports for things like setting up the crossbar and whatnot.

Now, while we may use a crossbar for data moves, the nanocontroller
would be directly involved in processing rendering commands.
Rendering commands would start with a 32-bit word that describes the
rest of the packet.  For instance, there would be a packet type, a set
of flags to indicate what attribute values are included, and maybe a
number indicating how many of the particular rendering operation to
do.  For instance:

* Packet header that indicates we're going to draw a triangle.  It
indicates that the triangle is solid and that the vertex color is
included.  It also indicates that there are two triangles to be drawn.
* Since the vertex color is flagged, that's inserted here.
* Three vertexes
* Three more vertexes
* Next packet....
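
Here is what such a header could look like in Python.  None of the bit
assignments are decided; the 8-bit type / 16-bit flags / 8-bit count
split, the type code, the flag bits, and the two-words-per-vertex
figure are all placeholders I picked so the example has something to
work with.

```python
PKT_TRIANGLE = 0x01          # hypothetical packet type code
FLAG_SOLID = 1 << 0          # hypothetical flag bits
FLAG_VERTEX_COLOR = 1 << 1
WORDS_PER_VERTEX = 2         # assumed: x and y, one word each

def pack_header(ptype, flags, count):
    """Build a header: type in bits 31:24, flags in 23:8, count in 7:0."""
    if not (0 <= ptype < 256 and 0 <= flags < 65536 and 0 <= count < 256):
        raise ValueError("field out of range")
    return (ptype << 24) | (flags << 8) | count

def unpack_header(word):
    """Return (ptype, flags, count) from a 32-bit header word."""
    return (word >> 24) & 0xFF, (word >> 8) & 0xFFFF, word & 0xFF

def payload_words(word):
    """Number of words that follow a triangle header."""
    ptype, flags, count = unpack_header(word)
    if ptype != PKT_TRIANGLE:
        raise ValueError("only triangle packets are modeled here")
    color = 1 if flags & FLAG_VERTEX_COLOR else 0
    return color + count * 3 * WORDS_PER_VERTEX
```

For the two-triangle example above, pack_header(PKT_TRIANGLE,
FLAG_SOLID | FLAG_VERTEX_COLOR, 2) gives 0x01000302 under this made-up
layout, and payload_words says 13 words follow: one color plus six
vertexes at two words each.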

DMA traditionally involves both ring buffers and linear ones.  For
simplicity, let's consider only dealing with a linear buffer.  The
controller gets programmed (by PIO or via a command in the ring
buffer) to know whence to fetch and how many words to fetch.  It would
begin by requesting a few blocks of reads from the DMA master and then
set about processing the data.  It would interpret the commands and
turn those into what the 3D engine would see as ordinary PIO
register writes.  Between packets, it would request more blocks, at
least sufficient for the next command.
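
The fetch side of that loop could be sketched like this in Python.  It
is a model under heavy assumptions: a list stands in for the linear
buffer in host memory, block_words is an arbitrary fetch size, and the
packet format (low 8 bits of the header give the payload length) is
invented so the loop has something to parse.  It stops at splitting out
packets and does not model the register writes.

```python
def process_linear_buffer(host_words, block_words=4):
    """Walk a linear buffer, fetching blocks only as packets need them.
    Returns (packets, requests)."""
    fetched = 0      # words requested from the DMA master so far
    buf = []         # words that have arrived but are not yet consumed
    requests = []    # (start_index, nwords) read requests issued
    packets = []     # (header, payload) pairs

    def need(n):
        # Request more blocks until at least n words are buffered.
        nonlocal fetched
        while len(buf) < n:
            take = min(block_words, len(host_words) - fetched)
            if take == 0:
                raise ValueError("buffer ends in the middle of a packet")
            requests.append((fetched, take))
            buf.extend(host_words[fetched:fetched + take])
            fetched += take

    while fetched < len(host_words) or buf:
        need(1)
        header = buf.pop(0)
        n = header & 0xFF          # hypothetical payload length field
        need(n)                    # enough for this command
        packets.append((header, buf[:n]))
        del buf[:n]
    return packets, requests
```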

-- 
Timothy Normand Miller
http://www.cse.ohio-state.edu/~millerti
Open Graphics Project
