There are actually lots of ways to do DMA ring buffering. One thing you don't want to do is fill the buffer with NOPs. You want to prevent the GPU from fetching anything it doesn't have to fetch. Don't waste bus bandwidth.
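To make the point concrete, here is a minimal C sketch, not any real GPU's interface: a ring with a read pointer and a write pointer, where the simulated GPU fetches only the words between the two, so unused space never has to be padded with NOPs. The names (struct ring, ring_emit, ring_kick, gpu_fetch), the 8-word size, and the kicks counter standing in for a bus write are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_WORDS 8u /* hypothetical size; must be a power of two */

struct ring {
    uint32_t cmds[RING_WORDS]; /* command words, imagined in DMA-able memory */
    uint32_t rd;               /* read pointer, advanced by the simulated GPU */
    uint32_t wr;               /* write pointer as last published to the GPU */
    uint32_t wr_shadow;        /* CPU-side write pointer, not yet published */
    unsigned kicks;            /* counts simulated bus writes of the pointer */
};

/* Free slots; one slot stays empty so "full" and "empty" are distinguishable. */
static uint32_t ring_space(const struct ring *r)
{
    return (r->rd - r->wr_shadow - 1u) & (RING_WORDS - 1u);
}

/* Queue one command word in memory only; no register access happens here. */
static bool ring_emit(struct ring *r, uint32_t cmd)
{
    assert(r->wr_shadow < RING_WORDS);
    if (ring_space(r) == 0)
        return false;
    r->cmds[r->wr_shadow] = cmd;
    r->wr_shadow = (r->wr_shadow + 1u) & (RING_WORDS - 1u);
    return true;
}

/* Publish the write pointer; this stands in for the single bus write. */
static void ring_kick(struct ring *r)
{
    if (r->wr != r->wr_shadow) {
        r->wr = r->wr_shadow;
        r->kicks++;
    }
}

/* Simulated GPU: fetch only the words between rd and the published wr. */
static unsigned gpu_fetch(struct ring *r)
{
    unsigned fetched = 0;
    while (r->rd != r->wr) {
        (void)r->cmds[r->rd]; /* a real device would decode the word here */
        r->rd = (r->rd + 1u) & (RING_WORDS - 1u);
        fetched++;
    }
    return fetched;
}
```

In this sketch, queuing several commands and kicking once costs one simulated bus write, and the fetch loop stops at the published write pointer instead of reading filler.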
How about people look at how existing GPUs do DMA buffering and discuss the options? We can pick something that works well for us.

One problem I've dealt with: in X11, you implement atomic rendering functions such as bitblts. From inside one of those, you can't tell whether there are more commands on the wire. If there are, you don't want to tell the GPU about the new commands yet, because updating the write pointer costs a bus access. If there are no more commands, however, you do need to update the write pointer so that you don't leave commands unexecuted.

What would be nice is a hook into the poll() in X11 that wakes up after a millisecond delay if no commands have been received and lets us issue the DMA write pointer write.

On Fri, 28 Jan 2005 10:34:57 +0100, Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> On Friday 28 January 2005 06:16, you wrote:
> > It would make a whole lot of sense to work closely with DRI/Xorg crowd
> > and get it right on both sides. DRI is by no means immutable.
>
> I agree with you, and since I'm interested in driver development and have at
> least a little bit of experience with DRI, I believe this can be doable.
>
> > Here's an off-the-wall idea for command DMA. Caveat: I've never
> > actually looked at command DMA on a graphics card, so I may just be
> > rambling. But I have worked a fair bit with sound card DMA, and I
> > rather like the way that works.
> >
> > The idea is to have a power-of-two sized ring buffer that generates two
> > interrupts each time round, one at halfway and the other at the wrap
> > point. To start a frame, you fill the bottom half of the buffer plus a
> > bit with commands+data, then set up the DMA via command registers and
> > start it cycling. Each interrupt refills half the command buffer.
> > When the frame is finished, the last command in the buffer is "stop
> > DMA". The point is, there's no per-cycle setup overhead for this
> > scheme at all.
> > It is possible for the refill routine to underrun,
> > since it ultimately has to be driven from an foreground task which
> > might fail to deliver on time for one reason or another (e.g., disk
> > IO). In that case, the interrupt routine could fill the buffer with
> > no-ops, rather than stalling it and requiring fresh setup.
>
> This scheme makes sense for devices where the bytes/second bandwidth is
> fairly constant, which is the case for sound cards. However, with graphics
> cards the same number of bytes can take either very short (small triangles)
> or very long time to process (large triangles). So a straightforward ring
> buffer - which is what basically every graphics card out there uses - is
> just right.
>
> > The advantage of this scheme versus just setting up each block of DMA
> > when the completion interrupt for the one before it arrives is, there's
> > no latency between the interrupt and delivering the next batch of
> > commands. There's almost no IO register traffic, and obviously there's
> > no busy waiting.
>
> How would the amount of IO register traffic be any different? The
> busy-waiting and interrupt latencies should also be no problem with a
> normal DMA ring buffer.
>
> When it comes to command streams, there are some important considerations:
> 1. The kernel has to prevent user-level applications from issuing arbitrary
> commands (otherwise apps could program arbitrary DMA transfers)
>
> 2. There will be *a lot* of bandwidth used for what is essentially just
> geometry data.
>
> This is why we absolutely must have some kind of indirect buffer system.
> The way I believe it should work is roughly this:
> - the user-level OpenGL (or whatever) code creates a buffer in DMA-able
>   memory
> - the user-level code issues an ioctl telling the kernel "please execute
>   this buffer"
> - the kernel will put a "CALL indirect_buffer" command into the main ring
>   buffer
> - the hardware only allows a subset of commands in indirect buffers
>
> cu,
> Nicolai
>
> _______________________________________________
> Open-graphics mailing list
> [email protected]
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)
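
The indirect-buffer flow Nicolai outlines in the quoted message could be sketched roughly as below. This is only an illustration under loud assumptions: the opcode values, the one-opcode-per-word layout (top byte of each word), and the names kernel_submit and hw_run are all hypothetical, and the "hardware" is simulated in plain C. User code builds a buffer, the kernel queues a CALL to it in the main ring without copying the data, and the hardware refuses privileged commands while executing an indirect buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, stored in the top byte of each command word. */
enum {
    OP_DRAW_TRI  = 0x01, /* allowed in indirect buffers */
    OP_SET_STATE = 0x02, /* allowed in indirect buffers */
    OP_DMA_COPY  = 0x10  /* privileged: main ring only */
};

/* One "CALL indirect_buffer" entry: where the buffer is and how long. */
struct indirect_buf {
    const uint32_t *words;
    size_t count;
};

#define RING_SLOTS 16 /* hypothetical main ring capacity */

struct main_ring {
    struct indirect_buf calls[RING_SLOTS];
    size_t count;
};

static uint8_t opcode_of(uint32_t word)
{
    return (uint8_t)(word >> 24);
}

/* Kernel side: what the "please execute this buffer" ioctl might do.
 * Only a CALL entry goes into the main ring; the geometry is not copied. */
static bool kernel_submit(struct main_ring *ring,
                          const uint32_t *words, size_t count)
{
    if (ring->count == RING_SLOTS)
        return false;
    ring->calls[ring->count].words = words;
    ring->calls[ring->count].count = count;
    ring->count++;
    return true;
}

/* Hardware side: the subset of commands legal inside an indirect buffer. */
static bool hw_allowed_in_indirect(uint8_t op)
{
    return op == OP_DRAW_TRI || op == OP_SET_STATE;
}

/* Simulated hardware: run every queued CALL. A buffer is abandoned at its
 * first disallowed command. Returns the number of commands executed and
 * stores the number of abandoned buffers in *rejected. */
static size_t hw_run(const struct main_ring *ring, size_t *rejected)
{
    size_t executed = 0;
    *rejected = 0;
    for (size_t i = 0; i < ring->count; i++) {
        const struct indirect_buf *b = &ring->calls[i];
        for (size_t j = 0; j < b->count; j++) {
            if (!hw_allowed_in_indirect(opcode_of(b->words[j]))) {
                (*rejected)++;
                break;
            }
            executed++;
        }
    }
    return executed;
}
```

Because the restriction is enforced where the buffer is executed, the kernel never has to scan the bulk geometry data; it only writes one small entry per submission into the main ring.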
