José Fonseca wrote:
This is a proposal for a more flexible approach to the current DRM's
generic DMA engine.

The current engine is described at
http://dri.sourceforge.net/doc/drm_low_level.html, section 2, item 3,
which I include here for your convenience:

  1. The X server can specify multiple pools of different sized buffers
  which are allocated and locked down.
No need for different sized buffers has ever been shown (except for the bogus abuse for the ringbuffer described below).

  2. The direct-rendering client maps these buffers into its virtual
  address space, using the DRM API.

  3. The direct-rendering client reserves some of these buffers from the
  DRM, fills the buffers with commands, and requests that the DRM send
  the buffers to the graphics hardware. Small buffers are used to ensure
  that the X server can get the lock between buffer dispatches, thereby
  providing X server interactivity. Typical 40MB/s PCI transfer rates
  may require 10000 4kB buffer dispatches per second.
The thinking behind this depends on the existence of multiple DMA queues maintained by software (the drm), as documented in (4). In reality such queues don't exist: real drm drivers just send commands directly to the hardware, which has queues of its own (e.g. ringbuffers).
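
(For illustration, feeding such a hardware ring typically looks roughly
like the sketch below. All of the names here are invented for the
example; drivers such as radeon wrap this pattern in their own
BEGIN_RING / OUT_RING / ADVANCE_RING macros.)

#include <linux/types.h>
#include <asm/io.h>

/* Hypothetical ring-buffer emit routine; the register offset and the
 * structure layout are made up for this sketch. */
#define MY_RING_TAIL_REG 0x0704	/* made-up MMIO offset of the tail pointer */

struct my_ring {
	u32 *start;		/* kernel mapping of the ring memory */
	u32 size;		/* ring size in dwords, power of two */
	u32 tail;		/* next dword the CPU will write */
	void __iomem *mmio;	/* mapped register space */
};

static void my_ring_emit(struct my_ring *ring, const u32 *cmds, u32 count)
{
	u32 i;

	/* A real driver would first make sure there is room by checking
	 * the hardware's read (head) pointer. */
	for (i = 0; i < count; i++) {
		ring->start[ring->tail] = cmds[i];
		ring->tail = (ring->tail + 1) & (ring->size - 1);
	}

	/* Bump the tail register; the card fetches and executes the new
	 * commands by DMA on its own, with no software queue involved. */
	writel(ring->tail, ring->mmio + MY_RING_TAIL_REG);
}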


  4. The DRM manages a queue of DMA buffers for each OpenGL GLXContext,
  and detects when a GLXContext switch is necessary.  Hooks are provided
  so that a device-specific driver can perform the GLXContext switch in
  kernel-space, and a callback to the X server is provided when a
  device-specific driver is not available (for the SI, the callback
  mechanism is used because it provides an example of the most generic
  method for GLXContext switching). The DRM also performs simple
  scheduling of DMA buffer requests to prevent GLXContext thrashing.
  When a GLXContext is swapped a significant amount of data must be read
  from and/or written to the graphics device (between 4kB and 64kB for
  typical hardware).
This simply doesn't happen.


  5. The DMA engine is generic in the sense that the X server provides
  information at run-time on how to perform DMA operations for the
  specific hardware installed on the machine. The X server does all of
  the hardware detection and setup. This allows easy bootstrapping for
  new graphics hardware under the DRI, while providing for later
  performance and capability enhancements through the use of a
  device-specific kernel driver.
This is total bs and refers to the version of the drm which had a Forth interpreter built into it.

There are some limitations with this approach.  The most important is
that it doesn't accommodate cards which need special-purpose DMA
buffers that are not meant for client usage (things such as ring
buffers in Mach64/Rage128/Radeon, primary DMA buffers in MGA, the page
table buffer in Glint, etc.). In the AGP versions of these cards, these
buffers are carved directly out of the allocated AGP memory. In the PCI
versions (which only exist for some cards) this is done by abusing item
1 above, i.e., by allocating a pool of just one buffer. What is more
peculiar is that the different sized buffer pools are actually only
used for this purpose, and not to give clients more than one buffer
size to choose from (as apparently was the original intention).
Correct. However the abuse of this mechanism is wrong and shouldn't have been done. Ringbuffers should be allocated one way or another, but the use of the overblown drm dma buffer mechanism for that is a grotesque abuse...

What's important to see is that buffers are three things:
1) a memory management system
2) a hardware synchronization device
3) a mechanism for submitting commands to hardware

These functions can be separated. We've already separated the third function more or less in the newer drivers.
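
(To make that split concrete, one way to picture it is three
independent interfaces along the lines of the sketch below. None of
these names exist in the drm; they are purely illustrative.)

/* Forward declarations for the sketch; both types are hypothetical. */
struct my_device;
struct my_dma_chunk;

/* 1) memory management: hand out and reclaim chunks of DMA-able memory */
struct my_dma_chunk *my_mem_alloc(struct my_device *dev, unsigned long size);
void my_mem_free(struct my_device *dev, struct my_dma_chunk *chunk);

/* 2) hardware synchronization: wait until the card is done with a chunk */
int my_sync_wait(struct my_device *dev, struct my_dma_chunk *chunk);

/* 3) command submission: hand a filled chunk (or a ring slot) to the card */
int my_submit(struct my_device *dev, struct my_dma_chunk *chunk,
	      unsigned long used);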


Unfortunately, this trick won't work for Mach64 since [due to security
reasons] it will need to have two DMA buffer pools of the same size:
one private and another for general client usage.
Are these both actual dma buffers in the strong sense - are they both accessed by the card's dma mechanisms?

NOTE: This is the only reason why I'm addressing this issue, but since
with a little effort it can yield better and cleaner code for all other
drivers too, I prefer to go that extra length rather than make a dirty
hack (which probably would only work for Mach64/Linux).
Agreed.

Another, IMHO unnecessary, complication in the above scheme is the fact
that the AGP memory and the DMA buffers are allocated by the X server
(via the DRM API), only to be handed straight back to the DRM. This
round-trip not only leads to code and structure redundancies, it also
makes it more difficult to keep binary compatibility. I think the DMA
memory is something that belongs to the kernel's responsibility, so it
should be managed by the DRM alone, while X would only say how much
total DMA memory should be allocated, based on the user configuration
files.
Yes this would be cleaner.

My suggestion is twofold:

1) Provide a set of DRM APIs for PCI DMA buffer allocation. No
assumption about buffer management is made; that is entirely left to
the DRM driver (which is free to put some buffers in a linked list, map
them, or whatever). On Linux they would basically be just a very thin
wrapper around the pci_*_consistent routines.
Is there an existing linux interface to these routines that could be used instead of adding code to the drm module?
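
(For reference, the kind of thin wrapper described in (1) could look
something like the sketch below. Only pci_alloc_consistent and
pci_free_consistent are real kernel interfaces; the drm_pci_buf
structure and function names are invented for the example.)

#include <linux/pci.h>
#include <linux/slab.h>

/* Hypothetical wrapper around the consistent-DMA allocation routines. */
struct drm_pci_buf {
	void *vaddr;		/* kernel virtual address */
	dma_addr_t busaddr;	/* bus address the card uses for DMA */
	size_t size;
};

static struct drm_pci_buf *drm_pci_buf_alloc(struct pci_dev *pdev, size_t size)
{
	struct drm_pci_buf *buf;

	buf = kmalloc(sizeof(*buf), GFP_KERNEL);
	if (!buf)
		return NULL;

	buf->size = size;
	buf->vaddr = pci_alloc_consistent(pdev, size, &buf->busaddr);
	if (!buf->vaddr) {
		kfree(buf);
		return NULL;
	}
	return buf;
}

static void drm_pci_buf_free(struct pci_dev *pdev, struct drm_pci_buf *buf)
{
	pci_free_consistent(pdev, buf->size, buf->vaddr, buf->busaddr);
	kfree(buf);
}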


2) Have the current AGP allocation and memory mapping APIs available for
internal driver usage, instead of just an IOCTL interface for X usage.
I don't understand what you're suggesting by this.

Note that this can be done without breaking the existing source code,
just by adding new APIs, and as all drivers move to this scheme, the
old APIs would be removed or turned into NOPs for backward
compatibility.
Backward compatibility means that old X servers and old 3d clients work with new versions of the kernel modules -- this means that you can't really remove APIs... But I guess you can turn them into no-ops as long as things like DRM_IOCTL_DMA still work as expected...
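
(As a rough sketch of what "turn them into no-ops" could mean in
practice -- the handler name here is made up, and simply reporting
success is only one possible choice:)

#include <linux/fs.h>

/* Hypothetical no-op replacement for an obsoleted ioctl: the entry
 * stays in the ioctl table so old X servers and clients don't suddenly
 * get an error, but the kernel simply ignores the request. */
static int my_obsolete_addbufs(struct inode *inode, struct file *filp,
			       unsigned int cmd, unsigned long arg)
{
	return 0;	/* report success; the drm manages these buffers itself now */
}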

I would really appreciate some feedback, as I still don't have all the
puzzle pieces fitting nicely in my head, and I would welcome other
developers' help in solving it.



Keith


