This is a proposal for a more flexible approach to the DRM's current generic DMA engine.
The current engine is described at http://dri.sourceforge.net/doc/drm_low_level.html, section 2, item 3, which I include here for your convenience:

1. The X server can specify multiple pools of different sized buffers which are allocated and locked down.

2. The direct-rendering client maps these buffers into its virtual address space, using the DRM API.

3. The direct-rendering client reserves some of these buffers from the DRM, fills the buffers with commands, and requests that the DRM send the buffers to the graphics hardware. Small buffers are used to ensure that the X server can get the lock between buffer dispatches, thereby providing X server interactivity. Typical 40MB/s PCI transfer rates may require 10000 4kB buffer dispatches per second.

4. The DRM manages a queue of DMA buffers for each OpenGL GLXContext, and detects when a GLXContext switch is necessary. Hooks are provided so that a device-specific driver can perform the GLXContext switch in kernel-space, and a callback to the X server is provided when a device-specific driver is not available (for the SI, the callback mechanism is used because it provides an example of the most generic method for GLXContext switching). The DRM also performs simple scheduling of DMA buffer requests to prevent GLXContext thrashing. When a GLXContext is swapped a significant amount of data must be read from and/or written to the graphics device (between 4kB and 64kB for typical hardware).

5. The DMA engine is generic in the sense that the X server provides information at run-time on how to perform DMA operations for the specific hardware installed on the machine. The X server does all of the hardware detection and setup. This allows easy bootstrapping for new graphics hardware under the DRI, while providing for later performance and capability enhancements through the use of a device-specific kernel driver.

There are some limitations with this approach. The most important is that it doesn't accommodate cards which need special-purpose DMA buffers that are not meant for client usage (things such as ring buffers in Mach64/Rage128/Radeon, primary DMA buffers in MGA, the page table buffer in Glint, etc.). In the AGP versions of these cards, these buffers are carved directly out of the allocated AGP memory. In the PCI versions (which exist only for some cards) this is done by abusing item 1 above, i.e., by allocating a pool of just one buffer (see the sketch below). What is more peculiar is that the different-sized buffer pools are actually only used for this purpose, and not to give clients more than one buffer size to choose from (as apparently was the original intention).

Unfortunately, this trick won't work for Mach64 since [due to security reasons] it will need two DMA buffer pools of the same size: one private and another for general client usage. NOTE: This is the only reason I'm addressing this issue, but since with a little effort it can yield better and cleaner code for all the other drivers too, I prefer to walk that extra length rather than make a dirty hack (which would probably only work for Mach64/Linux).

Another, IMHO unnecessary, complication in the above scheme is the fact that the AGP memory and the DMA buffers are allocated in the X server (via the DRM API), only to be handed back to the DRM again. This round-trip not only leads to code and structure redundancies, but also makes it more difficult to keep binary compatibility.
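To make the PCI trick concrete, here is roughly what it amounts to from the X server side. This is only a sketch: drmAddBufs() is the existing DRM call for registering a buffer pool (signature quoted from memory), while setup_pci_ring() and RING_SIZE are invented for the example.

    /* Sketch of the "pool of just one buffer" workaround described above.
     * setup_pci_ring() and RING_SIZE are hypothetical names; the driver
     * registers a pool containing exactly one buffer and then keeps it as
     * its private ring/primary buffer instead of handing it to clients. */
    #include "xf86drm.h"

    #define RING_SIZE (128 * 1024)   /* made-up size for the example */

    static int setup_pci_ring(int drm_fd)
    {
        /* count = 1: a "pool" with a single RING_SIZE buffer. */
        return drmAddBufs(drm_fd, 1, RING_SIZE, 0, 0);
    }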
I think DMA memory is something that falls under the kernel's responsibility, so it should be managed by the DRM alone; the X server would only tell it how much total DMA memory to allocate, based on the user's configuration files.

My suggestion is twofold:

1) Provide a set of DRM APIs for PCI DMA buffer allocation. No assumption is made about buffer management, which is left entirely to the DRM driver (which is free to put some buffers in a linked list, map them, or whatever). On Linux these would basically be just a very thin wrapper around the pci_*_consistent routines (a rough sketch is appended at the end of this mail).

2) Make the current AGP allocation and memory mapping APIs available for internal driver usage, instead of being only an ioctl interface for the X server's use.

Note that this can be done without breaking the existing source code, simply by adding new APIs; as all drivers move to this scheme, the old APIs can be removed or turned into no-ops for backward compatibility.

I would really appreciate some feedback, as I still don't have all the puzzle pieces fitting nicely in my head, and I would welcome the other developers' help in solving it.

Regards,

José Fonseca
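P.S. Below is a rough sketch of what the Linux side of (1) could look like. The struct drm_pci_buf, drm_pci_alloc() and drm_pci_free() names are invented for illustration; the only real kernel interfaces used are pci_alloc_consistent() and pci_free_consistent().

    #include <linux/pci.h>
    #include <linux/errno.h>

    /* Illustrative only: a minimal wrapper a DRM driver could use to
     * obtain a physically contiguous, bus-addressable DMA buffer. */
    struct drm_pci_buf {
            size_t     size;    /* buffer size in bytes */
            void      *vaddr;   /* kernel virtual address */
            dma_addr_t busaddr; /* bus address to program into the card */
    };

    static int drm_pci_alloc(struct pci_dev *pdev, size_t size,
                             struct drm_pci_buf *buf)
    {
            buf->vaddr = pci_alloc_consistent(pdev, size, &buf->busaddr);
            if (!buf->vaddr)
                    return -ENOMEM;
            buf->size = size;
            return 0;
    }

    static void drm_pci_free(struct pci_dev *pdev, struct drm_pci_buf *buf)
    {
            if (buf->vaddr)
                    pci_free_consistent(pdev, buf->size, buf->vaddr,
                                        buf->busaddr);
            buf->vaddr = NULL;
    }

How the driver then manages these buffers (linked lists, mapping them to user space for the client pool, keeping one private as a ring buffer, etc.) would be entirely up to the device-specific code, which is exactly the flexibility the Mach64 case needs.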