This is a proposal for a more flexible approach to the DRM's current generic DMA engine.
The current engine is described at http://dri.sourceforge.net/doc/drm_low_level.html, section 2, item 3, which I include here for your convenience:

1. The X server can specify multiple pools of different sized buffers which are allocated and locked down.

2. The direct-rendering client maps these buffers into its virtual address space, using the DRM API.

3. The direct-rendering client reserves some of these buffers from the DRM, fills the buffers with commands, and requests that the DRM send the buffers to the graphics hardware. Small buffers are used to ensure that the X server can get the lock between buffer dispatches, thereby providing X server interactivity. Typical 40MB/s PCI transfer rates may require 10000 4kB buffer dispatches per second.

4. The DRM manages a queue of DMA buffers for each OpenGL GLXContext, and detects when a GLXContext switch is necessary. Hooks are provided so that a device-specific driver can perform the GLXContext switch in kernel-space, and a callback to the X server is provided when a device-specific driver is not available (for the SI, the callback mechanism is used because it provides an example of the most generic method for GLXContext switching). The DRM also performs simple scheduling of DMA buffer requests to prevent GLXContext thrashing. When a GLXContext is swapped a significant amount of data must be read from and/or written to the graphics device (between 4kB and 64kB for typical hardware).

5. The DMA engine is generic in the sense that the X server provides information at run-time on how to perform DMA operations for the specific hardware installed on the machine. The X server does all of the hardware detection and setup. This allows easy bootstrapping for new graphics hardware under the DRI, while providing for later performance and capability enhancements through the use of a device-specific kernel driver.

There are some limitations with this approach. The most important is that it doesn't accommodate cards which need special-purpose DMA buffers that are not meant for client usage (things such as ring buffers in Mach64/Rage128/Radeon, primary DMA buffers in MGA, the page table buffer in Glint, etc.). In the AGP versions of these cards, these buffers are carved directly out of the allocated AGP memory. In the PCI versions (which exist only for some cards) this is done by abusing item 1 above, i.e., by allocating a pool of just one buffer (see the sketch below). What is more peculiar is that the different-sized buffer pools are actually only used for this purpose, and not to give clients more than one buffer size to choose from (as apparently was the original intention).

Unfortunately, this trick won't work for Mach64 since [due to security reasons] it will need two DMA buffer pools of the same size: one private and another for general client usage. NOTE: This is the only reason I'm addressing this issue, but since with a little effort it can yield better and cleaner code for all the other drivers too, I prefer to walk that extra length rather than make a dirty hack (which would probably only work for Mach64/Linux).

Another, IMHO unnecessary, complication in the above scheme is the fact that the AGP memory and the DMA buffers are allocated in the X server (via the DRM API), only to be handed back to the DRM again. This round-trip not only leads to code and structure redundancies, but also makes it more difficult to keep binary compatibility.
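To make the PCI trick concrete, here is roughly what it amounts to from the X server side. This is only a sketch: drmAddBufs() is the existing DRM call for registering a buffer pool (signature quoted from memory), while setup_pci_ring() and RING_SIZE are invented for the example.

    /* Sketch of the "pool of just one buffer" workaround described above.
     * setup_pci_ring() and RING_SIZE are hypothetical names; the driver
     * registers a pool containing exactly one buffer and then keeps it as
     * its private ring/primary buffer instead of handing it to clients. */
    #include "xf86drm.h"

    #define RING_SIZE (128 * 1024)   /* made-up size for the example */

    static int setup_pci_ring(int drm_fd)
    {
        /* count = 1: a "pool" with a single RING_SIZE buffer. */
        return drmAddBufs(drm_fd, 1, RING_SIZE, 0, 0);
    }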
I think DMA memory is something that falls under the kernel's responsibility, so it should be managed by the DRM alone; the X server would only tell it how much total DMA memory to allocate, based on the user's configuration files.

My suggestion is twofold:

1) Provide a set of DRM APIs for PCI DMA buffer allocation. No assumption is made about buffer management, which is left entirely to the DRM driver (which is free to put some buffers in a linked list, map them, or whatever). On Linux these would basically be just a very thin wrapper around the pci_*_consistent routines (a rough sketch is appended at the end of this mail).

2) Make the current AGP allocation and memory mapping APIs available for internal driver usage, instead of being only an ioctl interface for the X server's use.

Note that this can be done without breaking the existing source code, simply by adding new APIs; as all drivers move to this scheme, the old APIs can be removed or turned into no-ops for backward compatibility.

I would really appreciate some feedback, as I still don't have all the puzzle pieces fitting nicely in my head, and I would welcome the other developers' help in solving it.

Regards,

José Fonseca
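P.S. Below is a rough sketch of what the Linux side of (1) could look like. The struct drm_pci_buf, drm_pci_alloc() and drm_pci_free() names are invented for illustration; the only real kernel interfaces used are pci_alloc_consistent() and pci_free_consistent().

    #include <linux/pci.h>
    #include <linux/errno.h>

    /* Illustrative only: a minimal wrapper a DRM driver could use to
     * obtain a physically contiguous, bus-addressable DMA buffer. */
    struct drm_pci_buf {
            size_t     size;    /* buffer size in bytes */
            void      *vaddr;   /* kernel virtual address */
            dma_addr_t busaddr; /* bus address to program into the card */
    };

    static int drm_pci_alloc(struct pci_dev *pdev, size_t size,
                             struct drm_pci_buf *buf)
    {
            buf->vaddr = pci_alloc_consistent(pdev, size, &buf->busaddr);
            if (!buf->vaddr)
                    return -ENOMEM;
            buf->size = size;
            return 0;
    }

    static void drm_pci_free(struct pci_dev *pdev, struct drm_pci_buf *buf)
    {
            if (buf->vaddr)
                    pci_free_consistent(pdev, buf->size, buf->vaddr,
                                        buf->busaddr);
            buf->vaddr = NULL;
    }

How the driver then manages these buffers (linked lists, mapping them to user space for the client pool, keeping one private as a ring buffer, etc.) would be entirely up to the device-specific code, which is exactly the flexibility the Mach64 case needs.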