On Thursday 24 August 2006 21:31, you wrote:
> On 8/24/06, Hamish <[EMAIL PROTECTED]> wrote:
> > I had a thought... Might be a random brain fart, or it might just use
> > far too much logic, or whatever...
> >
> > Most interfaces I've seen that use DMA, you allocate a block of memory,
> > add commands to it & then wait for the card to go idle & then tell the
> > card to go for it... Now with hard drives you do pretty much the same.
> > Except with modern ones (OK, anything since the early 90s perhaps, in
> > the SCSI world) you can queue commands.
> >
> > So why not with the graphics card as well?
>
> To an extent, we'll be doing that. A typical approach is to have a
> central circular buffer ("ring buffer") that is controlled by a
> central driver. Whenever an application wants to send commands, it
> writes to its own "indirect" buffer, and when it wants to submit those
> commands, a command is put into the ring buffer that points at the
> indirect buffer. [Indirect buffers are linear, not circular.]
>
> The ring buffer is the queue.
>
> > Instead of telling the card about a block of commands to execute straight
> > away, why not simply tell it to enqueue a block of commands for
> > execution? The card then saves the address of the block in an SRAM block
> > (circular buffer). The graphics engine itself then simply loops
> > through block after block until there are no more to execute.
>
> The problem with directly enqueueing into the GPU is that if DMA is
> already going on, we contend for the bus. The bus is already in use,
> so we end up wasting thousands of CPU cycles waiting on the bus to
> clear just to do a handful of PIO writes. It's better to use a DMA
> buffer in host memory where we have some shared variables between the
> CPU and GPU, and we just use those to indicate where the queue head
> and tail are.
>
> > But wait there's more... Prioritise blocks... Have n+1 priorities of
> > command block. One is reserved as the highest. This could be used to
> > execute something NOW! Without waiting. (Or perhaps execute as the next
> > block if the logic to return is too great). You could then optionally
> > have multiple levels of priority for command blocks...
>
> We could just have the driver do that. If two applications are trying
> to draw at the same time, the driver could give them virtual slices
> and priorities. It won't be perfect (numbers of commands rather than
> amounts of time), but there's only so much that is reasonable.

But that's why I'd thought about making this a hardware function: to keep
the driver from having to do it. Make the n+1 level privileged in such a
way that only the X server itself & the window manager get to play with
it (OK, you need to be able to make it flexible so the compositor gets
the same privileges, etc.). Then user programs are incapable of starving
the bits that you really need (e.g. windows moving, menus popping up) of
drawing time... It might be that I'm moving too far into programmable
functionality in the gfx card instead of fixed functionality.

I was thinking of blocks of commands up to a fixed maximum size. Almost
like a hardware scheduler, I guess, as Lourens said. But would that be
bad?

> > And more still... Tag the block for execution at certain times. Why
> > interrupt the system processor(s) to say hey, retrace is upon us, or
> > we're at line x of the display? Allow blocks to be enqueued for
> > execution at the start of the vblank. Or at line xxx... (Not to
> > interrupt, but to start execution of.)
>
> Now vblank-oriented things ARE something I might want hardware support
> for. If you can be sure you're staying just behind the vertical
> retrace, you can draw directly to the screen without tearing and have
> an entire frame to do it. However, many times we will be drawing to
> a back buffer and just swapping front and back buffers in a vertical
> blank interrupt, making that not so useful.

I'm thinking of Amiga-style effects here... Changing resolutions/modes
on the fly... (Although maybe not so useful on LCDs.)

> > The way I see it, any interrupt saved is a bonus. Having the card
> > able to do as much as possible for scheduling itself has got to save
> > system CPU.
> Aside from the video interrupt, I think we need two engine interrupts.
> One is "dma idle", and the other is "engine idle".
>
> What we do is this: In a page of DMA buffer, we have some shared
> variables between GPU and CPU. As the GPU consumes ring buffer
> entries, it'll periodically update a shared variable that indicates
> the "head" of the queue, where words are extracted from the circular
> buffer. Similarly, as the driver fills commands into the ring buffer,
> it'll update the "tail" pointer to indicate to the GPU where the end
> of the queue is. Whenever the GPU runs to the end of what it thinks
> locally is the tail pointer (no more DMA reads), it'll reread the
> pointer in the host. If what it reads is different from the old
> value, it keeps going. If what it reads is the same, it stops
> (nothing more to do) and raises an interrupt.
>
> As long as the updates to the tail pointer can be done atomically
> (which, as a 32-bit word, they would be on most architectures), then
> we can keep the GPU going continuously without ever issuing an
> expensive PIO. If the interrupt arrives, it means that the GPU has
> stopped DMA and won't be trying again automatically, so we'll have to
> issue a PIO to get it going again when there's more to do.
>
> Any other engine-related interrupts will be ones inserted as commands
> into the queue, as suggested by Jon Smirl.

You mean from the 'The State of Linux Graphics' thread? Saw that
afterwards (sorry, I've been too busy to keep up to date in real time).
Although Jon was talking about writing commands into VRAM... I was
anticipating DMA being used to read the command blocks from system
memory into VRAM... (So the app doesn't take the hit of communicating
across the bus, but can go back to doing what it needs to do.)
Which admittedly might be waiting for the card to complete its work on
the buffer. :)

> If not done right, there could be a race condition between writing the
> tail pointer and getting the interrupt, but this is surely a solved
> problem.

You might get a spurious interrupt... Would it be any worse than that?

Hmmm... Sleep time...

H
_______________________________________________
Open-graphics mailing list
[EMAIL PROTECTED]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
