On Tuesday 01 February 2005 23:06, Timothy Miller wrote:
> To all:
>
> Ok, here are the kinds of DMA transactions that I want to support:
>
> (1) Direct command buffer. Using a ring buffer with "read pointer"
> (controlled by GPU) and "write pointer" (controlled by CPU), the host
> can fill an empty portion of the buffer, starting at the write
> pointer, with command packets. The GPU fetches words in blocks and
> feeds them into the GPU fifo.
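
To make sure we are talking about the same thing in (1), here is a minimal sketch in C of such a ring, as I understand it. Everything in it is an assumption for illustration only: the names (cmd_ring, ring_write, ring_fetch), the 16-word size, and the rule that one slot stays unused so that "empty" and "full" can be told apart. The GPU side is simulated by a plain function.

```c
#include <assert.h>
#include <stdint.h>

#define RING_WORDS 16u              /* hypothetical size, must be a power of two */
#define RING_MASK  (RING_WORDS - 1u)

struct cmd_ring {
    uint32_t words[RING_WORDS];
    uint32_t rd;                    /* read pointer: advanced only by the GPU side */
    uint32_t wr;                    /* write pointer: advanced only by the CPU side */
};

/* Words the CPU may still write. One slot is kept unused so that
 * rd == wr always means "empty" and never "full". */
static uint32_t ring_free(const struct cmd_ring *r)
{
    return (r->rd - r->wr - 1u) & RING_MASK;
}

/* Words written by the CPU and not yet fetched. */
static uint32_t ring_used(const struct cmd_ring *r)
{
    return (r->wr - r->rd) & RING_MASK;
}

/* CPU side: append a whole command packet starting at the write pointer,
 * or write nothing and return -1 if it does not fit. */
static int ring_write(struct cmd_ring *r, const uint32_t *pkt, uint32_t n)
{
    assert(r != 0 && pkt != 0);
    if (n > ring_free(r))
        return -1;
    for (uint32_t i = 0; i < n; i++)
        r->words[(r->wr + i) & RING_MASK] = pkt[i];
    r->wr = (r->wr + n) & RING_MASK;
    return 0;
}

/* Simulated GPU side: fetch one block of at most max_block words into
 * the fifo and advance the read pointer. Returns the words fetched. */
static uint32_t ring_fetch(struct cmd_ring *r, uint32_t *fifo, uint32_t max_block)
{
    assert(r != 0 && fifo != 0);
    uint32_t n = ring_used(r);
    if (n > max_block)
        n = max_block;
    for (uint32_t i = 0; i < n; i++)
        fifo[i] = r->words[(r->rd + i) & RING_MASK];
    r->rd = (r->rd + n) & RING_MASK;
    return n;
}
```

In this sketch the block size is simply an upper bound (max_block) chosen by the fetching side, and a block may be shorter when the ring holds fewer words.
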

For (1) and (2), are the "blocks" of fixed size? (If not, how do you define
the size of each block?)

> (2) Indirect command buffer. Using PIO or a ring buffer, arbitrary
> host address ranges are specified. The GPU fetches them in blocks, in
> order. This is useful for multitasking, where there is a central
> server and individual packets from different processes are thrown into
> the one ring buffer.

For both (1) and (2), note that the most efficient way for the kernel driver
to grab command buffers from userspace is to steal pages from userspace
memory. In this case, you can expect that, in the common case, these pages
are not full (a command list rarely ends exactly on a page boundary) and are
not physically contiguous (even if they appear to be contiguous in the
virtual address space of the userspace process). I hope such a case does not
interfere with the mode of operation you have in mind.

Note also that stealing pages from a process's memory is not exactly easy
(unless they are write-only or read-only by design, as for a file or pipe).
The usual way is to mark them unwritable and unreadable and to put the
process to sleep if it tries to touch these pages, until they are available
to it again.

> (3) Direct data move. Data to be written to graphic memory is loaded
> directly into the ring buffer. In this case, the graphics memory,
> rather than the GPU engine is the target. Moves from card to host are
> not possible this way.

It seems to me that, in this case, you will require AGP-like remapping
capabilities to efficiently transfer areas longer than one page from host
memory to graphics memory. (Unfortunately, areas that are contiguous in
virtual memory cannot be assumed to be contiguous in physical memory.
Enforcing such a situation usually requires specific APIs in kernels and
causes lots of secondary software problems...)

> (3) Indirect data move. Using either PIO or a ring buffer, arbitrary
> host address ranges are specified.
> The source/target is graphics
> memory, and either reads or writes can be specified.

If this is the way to do arbitrary-size data movements (by batching one page
at a time) from host memory to graphics memory, I wonder if the previous
case, "(3) Direct data move", is really needed?

> For indirect DMA, an interrupt can be asserted for the completion of
> each unit. For direct DMA, two interrupts are available: ring empty;
> ring has reached low water-mark.

In the end, why not have only indirect data move and indirect command buffer
move? As I see it, such indirect modes will be mandatory to cleanly handle
multi-page data transfers, and the kernel will probably enforce unit host
address ranges shorter than one page (unless AGP is fully used - not such a
common case, but X could use it).

BTW, an interrupt for each unit (in the indirect case) is probably overkill
(it depends on the maximal interrupt rate, however). Would it be possible to
mark the units that should raise an interrupt? (Indirect mode is especially
useful for 2D operations with multiple processes - as in the case of an X
server. But in this case, units are probably pretty short and the sync
offered by interrupts is rarely requested by processes - they usually do not
care to know that their drawings have been processed.)

While talking about interrupts: note that, IMHO, two kinds of interrupts are
useful: one for the end of the data transfer (it allows the kernel to reclaim
resources), and one for the end of the associated drawing operation. The
latter is rarely provided by the hardware - usually the driver needs to wait
for the engine to go idle to be sure that drawing operations are completed.
However, this is the "interrupt" that drawing processes want (they want to
know that the drawing is finished on screen, not that the engine has finished
fetching ops). Would it be possible to mark units transferred via DMA so that
an interrupt is generated when the drawing operation is completed?
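
To illustrate what I mean by "marking" units, here is a rough sketch in C of how a driver might cut a buffer into units that never cross a page boundary and flag only the last one. The descriptor layout (struct dma_unit), the two flag names, and the 4 KiB page size are all my assumptions, not anything from your proposal. The per-page physical address lookup is left out, so host_addr is just the address as given.

```c
#include <assert.h>
#include <stdint.h>

#define UNIT_PAGE_SIZE 4096u            /* assumption: 4 KiB pages */

enum {
    UNIT_IRQ_ON_FETCH = 1u << 0,        /* hypothetical: IRQ when the transfer is done */
    UNIT_IRQ_ON_DRAW  = 1u << 1         /* hypothetical: IRQ when the drawing is done */
};

/* Hypothetical descriptor for one indirect DMA unit. */
struct dma_unit {
    uint64_t host_addr;                 /* start of the host address range */
    uint32_t len;                       /* length in bytes, at most one page */
    uint32_t flags;                     /* UNIT_IRQ_* bits, 0 for no interrupt */
};

/* Split [addr, addr + len) into units that never cross a page boundary.
 * Only the last unit carries last_flags, so a whole batch raises at most
 * one interrupt of each kind. Returns the number of units written, or -1
 * if max_units is too small. */
static int split_into_units(uint64_t addr, uint32_t len, uint32_t last_flags,
                            struct dma_unit *out, int max_units)
{
    int n = 0;

    assert(out != 0);
    while (len > 0) {
        /* Bytes left before the next page boundary. */
        uint32_t room = UNIT_PAGE_SIZE - (uint32_t)(addr & (UNIT_PAGE_SIZE - 1u));
        uint32_t chunk = len < room ? len : room;

        if (n == max_units)
            return -1;
        out[n].host_addr = addr;
        out[n].len = chunk;
        out[n].flags = 0;
        addr += chunk;
        len -= chunk;
        n++;
    }
    if (n > 0)
        out[n - 1].flags = last_flags;
    return n;
}
```

The first flag would let the kernel reclaim the pages; the second is the one I am asking about, since it is what a drawing process actually waits for.
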
(Given transistor budget constraints of course.)

Rodolphe
