Re: [Open-graphics] Memory management architecture, security

Rodolphe Ortalo Tue, 01 Feb 2005 16:52:33 -0800

On Tuesday 01 February 2005 23:06, Timothy Miller wrote:
> To all:
>
> Ok, here are the kinds of DMA transactions that I want to support:
>
> (1) Direct command buffer.  Using a ring buffer with "read pointer"
> (controlled by GPU) and "write pointer" (controlled by CPU), the host
> can fill an empty portion of the buffer, starting at the write
> pointer, with command packets.  The GPU fetches words in blocks and
> feeds them into the GPU fifo.


For (1) and (2), are the "blocks" of fixed size? (If not, how do you define 
the size of each block?)
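
For reference, a minimal sketch of the read-pointer/write-pointer ring 
described in (1). All names and the ring size are assumptions made for 
illustration, not part of the proposal; the GPU side is only simulated, and 
the fetch block size is left as a parameter because the question above is 
still open.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size in 32-bit words; must be a power of two here. */
#define RING_WORDS 1024u

struct cmd_ring {
    uint32_t words[RING_WORDS];
    uint32_t rd;  /* read pointer: advanced only by the GPU */
    uint32_t wr;  /* write pointer: advanced only by the CPU */
};

/* Words the host may still write. One slot stays unused so that
 * rd == wr always means "empty" and never "full". */
static uint32_t ring_free(const struct cmd_ring *r)
{
    return (r->rd - r->wr - 1u) & (RING_WORDS - 1u);
}

/* Words written by the host and not yet fetched by the GPU. */
static uint32_t ring_used(const struct cmd_ring *r)
{
    return (r->wr - r->rd) & (RING_WORDS - 1u);
}

/* Host side: append a packet of n words, or return -1 if it does not fit. */
static int ring_put(struct cmd_ring *r, const uint32_t *pkt, uint32_t n)
{
    assert(pkt != 0);
    if (n > ring_free(r))
        return -1;
    for (uint32_t i = 0; i < n; i++)
        r->words[(r->wr + i) & (RING_WORDS - 1u)] = pkt[i];
    r->wr = (r->wr + n) & (RING_WORDS - 1u);
    return 0;
}

/* GPU side (simulated): fetch at most 'block' words into the FIFO. */
static uint32_t ring_fetch(struct cmd_ring *r, uint32_t *fifo, uint32_t block)
{
    uint32_t n = ring_used(r);
    if (n > block)
        n = block;
    for (uint32_t i = 0; i < n; i++)
        fifo[i] = r->words[(r->rd + i) & (RING_WORDS - 1u)];
    r->rd = (r->rd + n) & (RING_WORDS - 1u);
    return n;
}
```

With a variable block size, ring_fetch() simply returns fewer words when the 
ring holds less than one block; a fixed block size would instead require the 
host to pad packets up to a block boundary.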

> (2) Indirect command buffer.  Using PIO or a ring buffer, arbitrary
> host address ranges are specified.  The GPU fetches them in blocks, in
> order.  This is useful for multitasking, where there is a central
> server and individual packets from different processes are thrown into
> the one ring buffer.

For both (1) and (2), note that the most efficient way for the kernel driver 
to grab command buffers from userspace is to steal pages from userspace 
memory. In this case, you can expect that, in the common case, these pages 
are not full (a command list rarely ends exactly on a page boundary) and 
are not physically contiguous (even if they appear to be contiguous in the 
virtual address space of the userspace process).
I hope such a case does not interfere with the mode of operation you have in 
mind.
Note also that stealing pages from a process's memory is not exactly easy 
(unless they are write-only or read-only by design, as for a file or pipe). 
The usual way is to mark them unwritable and unreadable and to put the 
process to sleep if it tries to touch these pages, until they are available 
to it again.
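
To illustrate the page-boundary issue, here is a rough sketch of how a 
driver might cut a virtually contiguous user buffer into units that never 
cross a page boundary. The 4 KiB page size and every name below are 
assumptions; a real driver would also have to pin each page and translate 
it to a bus address, which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

struct dma_seg {
    uintptr_t addr;  /* start of the segment (a virtual address here) */
    size_t    len;   /* length in bytes; never crosses a page boundary */
};

/* Split [addr, addr + len) into segments that each stay within one page.
 * Returns the number of segments written, or -1 if 'max' is too small. */
static int split_by_page(uintptr_t addr, size_t len,
                         struct dma_seg *segs, int max)
{
    int n = 0;

    assert(segs != 0);
    while (len > 0) {
        /* Bytes left between addr and the end of its page. */
        size_t room = DMA_PAGE_SIZE - (size_t)(addr & (DMA_PAGE_SIZE - 1u));
        size_t chunk = len < room ? len : room;

        if (n == max)
            return -1;
        segs[n].addr = addr;
        segs[n].len = chunk;
        n++;
        addr += chunk;
        len -= chunk;
    }
    return n;
}
```

A 0x1200-byte buffer starting at 0x1F00 comes out as three segments: a 
partial first page, one full page, and a partial last page. Only the middle 
one is a "full" page, which is the common case described above.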

> (3) Direct data move.  Data to be written to graphic memory is loaded
> directly into the ring buffer.  In this case, the graphics memory,
> rather than the GPU engine is the target.  Moves from card to host are
> not possible this way.

It seems to me that, in this case, you will require AGP-like remapping 
capabilities to efficiently transfer areas longer than one page from host 
memory to graphics memory. (Unfortunately, areas that are contiguous in 
virtual memory cannot be assumed to be contiguous in physical memory. 
Enforcing such a situation usually requires specific kernel APIs and 
creates lots of secondary software problems...)

> (3) Indirect data move.  Using either PIO or a ring buffer, arbitrary
> host address ranges are specified.  The source/target is graphics
> memory, and either reads or writes can be specified.

If this is the way to do arbitrary-size data movements (by batching one 
page at a time) from host memory to graphics memory, I wonder whether the 
previous case, "(3) Direct data move", is really needed?

> For indirect DMA, an interrupt can be asserted for the completion of
> each unit.  For direct DMA, two interrupts are available:  ring empty;
> ring has reached low water-mark.

In the end, why not have only indirect data move and indirect command buffer 
move? As I see it, such indirect modes will be mandatory to handle 
multi-page data transfers cleanly, and the kernel will probably enforce unit 
host address ranges shorter than one page (unless AGP is fully used - not 
such a common case, but X could use it).

BTW, an interrupt for each unit (in the indirect case) is probably overkill 
(though it depends on the maximum interrupt rate). Would it be possible to 
mark the units that should raise an interrupt?
(Indirect mode is especially useful for 2D operations with multiple processes 
- like the case of an X server. But in this case, units are probably pretty 
short, and processes rarely request the sync offered by interrupts - they 
usually do not care to know that their drawings have been processed.)

While we are talking about interrupts: note that, IMHO, two kinds of 
interrupts are useful: one for the end of the data transfer (it allows the 
kernel to reclaim resources) but also one for the end of the associated 
drawing operation.
The latter is rarely provided by the hardware - usually the driver needs to 
wait for the engine to go idle to be sure that drawing operations are 
completed. However, this is the "interrupt" that drawing processes want (they 
want to know that the drawing is finished on screen, not that the engine has 
finished fetching ops).
Would it be possible to mark units transferred via DMA so that an interrupt is 
generated when the drawing operation is completed? (Given transistor budget 
constraints of course.)
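
As a sketch of what such marking could look like from the driver side, the 
following assumes a per-unit flags word with one bit for each kind of 
interrupt. The descriptor layout, the flag names and both helpers are purely 
hypothetical and are not taken from any existing hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-unit flags; not taken from any real hardware spec. */
enum {
    UNIT_IRQ_ON_FETCH = 1u << 0, /* raise when the DMA transfer is done   */
    UNIT_IRQ_ON_DRAW  = 1u << 1  /* raise when the drawing op is finished */
};

/* Hypothetical descriptor for one indirect DMA unit. */
struct dma_unit {
    uint64_t host_addr;  /* start of the host address range */
    uint32_t len;        /* length of the range in bytes */
    uint32_t flags;      /* UNIT_IRQ_* bits */
};

/* Mark a batch so that only its last unit raises interrupts: one to let
 * the kernel reclaim the pages, and optionally one when drawing is done. */
static void mark_batch(struct dma_unit *u, int n, int want_draw_done)
{
    assert(u != 0);
    for (int i = 0; i < n; i++)
        u[i].flags = 0;
    if (n > 0) {
        u[n - 1].flags |= UNIT_IRQ_ON_FETCH;
        if (want_draw_done)
            u[n - 1].flags |= UNIT_IRQ_ON_DRAW;
    }
}

/* Count the interrupts a batch would raise under this scheme. */
static int irq_count(const struct dma_unit *u, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (u[i].flags & UNIT_IRQ_ON_FETCH)
            count++;
        if (u[i].flags & UNIT_IRQ_ON_DRAW)
            count++;
    }
    return count;
}
```

Under these assumptions, a batch of eight units costs at most two interrupts 
instead of eight, and a process that does not care about completion costs 
only the one needed to reclaim resources.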

Rodolphe
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
