On Wednesday 26 January 2005 12:19, Timothy Miller wrote:
> To be honest, I'm not even sure if we'll want to use DRI.  We need
> the X server to be able to sleep while waiting on interrupts.  We
> also need for the X server to be able to do most DMA without the
> ioctl overhead. If DRI doesn't give us exactly what we need for best
> performance, we'll need our own driver.
>
> If the DRI folks want to rip apart our driver and add anything
> "missing" into DRI, all the power to them.  But what we should
> produce for the prototypes is what works best for us, not necessarily
> what works in exactly the way everyone expects.

Hi Timothy,

It would make a whole lot of sense to work closely with the DRI/Xorg 
crowd and get it right on both sides.  DRI is by no means immutable.

Here's an off-the-wall idea for command DMA.  Caveat: I've never 
actually looked at command DMA on a graphics card, so I may just be 
rambling.  But I have worked a fair bit with sound card DMA, and I 
rather like the way that works.

The idea is to have a power-of-two sized ring buffer that generates two 
interrupts each time round, one at halfway and the other at the wrap 
point.  To start a frame, you fill the bottom half of the buffer plus a 
bit with commands+data, then set up the DMA via command registers and 
start it cycling.  Each interrupt refills half the command buffer.  
When the frame is finished, the last command in the buffer is "stop 
DMA".  The point is, there's no per-cycle setup overhead for this 
scheme at all.  It is possible for the refill routine to underrun, 
since it ultimately has to be driven from a foreground task which 
might fail to deliver on time for one reason or another (e.g., disk 
IO).  In that case, the interrupt routine could fill the buffer with 
no-ops, rather than stalling it and requiring fresh setup.
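
To make the refill logic concrete, here is a rough sketch in C of the 
host side only.  Everything named here is made up for illustration: the 
opcodes CMD_NOP and CMD_STOP, the ring size, the LEAD_WORDS margin (my 
reading of "plus a bit") and the struct layout are assumptions, not 
anything from a real card or from DRI.  Programming the DMA registers 
and any memory barriers are hardware specific and left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE  16u                 /* command words; must be a power of two */
#define RING_MASK  (RING_SIZE - 1u)
#define HALF_SIZE  (RING_SIZE / 2u)
#define LEAD_WORDS 2u                  /* the "plus a bit" margin (assumed) */
#define CMD_NOP    0x00000000u         /* hypothetical opcode */
#define CMD_STOP   0xFFFFFFFFu         /* hypothetical "stop DMA" opcode */

struct cmd_ring {
    uint32_t slot[RING_SIZE];  /* memory the device would read by DMA */
    size_t write_pos;          /* total words written; masked to index slot[] */
    const uint32_t *src;       /* commands queued by the foreground task */
    size_t src_len;            /* how many of them are available so far */
    size_t src_pos;            /* how many have been copied into the ring */
    int frame_done;            /* foreground has queued the whole frame */
    int stop_queued;           /* CMD_STOP has already been written */
};

static void ring_init(struct cmd_ring *r, const uint32_t *src, size_t len)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        r->slot[i] = CMD_NOP;
    r->write_pos = 0;
    r->src = src;
    r->src_len = len;
    r->src_pos = 0;
    r->frame_done = 0;
    r->stop_queued = 0;
}

/* Next word for the ring: a real command if one is ready, one CMD_STOP
 * once the frame is complete, otherwise a no-op (the underrun case). */
static uint32_t next_word(struct cmd_ring *r)
{
    if (r->src_pos < r->src_len)
        return r->src[r->src_pos++];
    if (r->frame_done && !r->stop_queued) {
        r->stop_queued = 1;
        return CMD_STOP;
    }
    return CMD_NOP;
}

/* Append count words; the power-of-two mask makes the wrap free. */
static void ring_fill(struct cmd_ring *r, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        r->slot[r->write_pos & RING_MASK] = next_word(r);
        r->write_pos++;
    }
}

/* Start of frame: fill the bottom half plus a bit.  The DMA setup via
 * command registers would follow here. */
static void ring_start(struct cmd_ring *r)
{
    assert((RING_SIZE & RING_MASK) == 0);
    assert(LEAD_WORDS <= HALF_SIZE);
    r->write_pos = 0;
    ring_fill(r, HALF_SIZE + LEAD_WORDS);
}

/* Called from both the halfway and the wrap interrupt: refill half a
 * buffer's worth, continuing from wherever the last fill stopped. */
static void ring_on_interrupt(struct cmd_ring *r)
{
    ring_fill(r, HALF_SIZE);
}
```

The writer always stays LEAD_WORDS ahead of the point where the device 
was when it raised the interrupt, which is the slack that covers 
interrupt latency.  Note that neither interrupt path touches an IO 
register; only ring_start() would.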

The advantage of this scheme over setting up each block of DMA when the 
completion interrupt for the previous one arrives is that there's no 
latency between the interrupt and delivery of the next batch of 
commands.  There's almost no IO register traffic, and obviously there's 
no busy waiting.

I dunno, maybe command blocks are so big it doesn't make any difference.  
Setting up each DMA transfer individually is certainly the simplest way 
to do it.

Regards,

Daniel
