On Sunday 20 March 2005 23:49, Daniel Phillips wrote:
> > 1. Run glxgears on a DRI-enabled system. Then run 'yes' in a
> > terminal emulator. Watch the system go crazy. A similar effect can
> > sometimes be observed while moving and resizing OpenGL windows.
> >
> > This suggests that access to the GPU needs a proper scheduler, just
> > like access to the CPU is arbitrated using a proper scheduler.
>
> That has always bothered me a lot. Is it just restricted to DRI
> programs? I think I've seen that with 2D SDL animations as well.
Well, on my system the frame rate of glxgears drops below 1 fps in
that little experiment, and I haven't seen this happening with pure
SDL (SDL/OpenGL is obviously another matter).

[snip]

> > So, why not combine these two thoughts into a bigger whole. Instead
> > of the current model where command buffers are submitted
> > synchronously via ioctl, the userspace driver will write the
> > command buffers somewhere in userspace and simply point the kernel
> > at them without taking the big hardware lock.
>
> That is _exactly_ what I had in mind. The main detail I've been
> fretting over is how to deliver notification of command buffer
> completion. I'm currently mulling over using a socket for that, in
> which case the indirect DMA submission might as well go over the
> socket too.

DRI drivers already open an fd (/dev/dri/*) to send ioctls. Reading
from and writing to this fd is currently not used, so it's a good
candidate IMO.

> > At some point (either in an ioctl when the engine is idle, or in a
> > bottom-half handler for the "hardware ring buffer empty" IRQ), the
> > kernel will look at all outstanding scheduled command buffers and
> > write to the hardware ring buffer.
>
> Exactly.
>
> > Note that this idea is completely independent of hardware
> > capabilities. It has the very big advantage that the big hardware
> > lock is taken almost never, which reduces the potential impact of
> > bugs,
>
> Unless I missed something major, you can drop the "almost".

With our hardware, probably yes, but I'd still like to watch out for
potential problems due to major (and rare) events like mode switching.

> > and it allows proper scheduling of access to the hardware,
> > eliminates ping-ponging of the lock, etc. All in all, this design
> > should behave *much* better in the face of multiple 3D apps.
> >
> > There are a number of problems, though:
> >
> > - Proper scheduling means that we also need proper context
> > switching, including preserving all the relevant hardware state,
> > i.e. texture, blending, etc. settings. This will be expensive
> > unless we figure out a way for the userspace driver to communicate
> > "reconfiguration points" in the command stream that contain the
> > necessary information to reload state.
>
> The kernel driver knows which task it got the command submission
> from, so it can switch to the correct context.

Yes, but it can be expensive. When the kernel switches contexts, it
must make sure that all the on-card registers (I know you don't like
"register writes", but that's what they are) that reflect OpenGL state
(blending, Z test, texturing, texture environments, ...) are correctly
preserved. It must also make sure that all the referenced textures,
including offscreen rendering targets, are in place and not swapped
out. This means that the kernel must keep track of which memory areas
each "GPU program" currently references and what all the registers
contain. You can't just wave that away.

Some hardware support for writing/reading register state to/from a
predefined area of video memory could help a lot here, but I don't
know whether that's feasible. And even if it is, the video memory
management issues still remain. They can be reduced by allowing each
GPU program to lock memory in place, so that it will not be moved by
the video memory manager until a certain part of the program has
executed.

> > - In the current DRI design, the kernel module does basically
> > everything in a process context. With this design, it'll have to do
> > almost everything in interrupt or bottom-half context. This alone
> > brings a number of problems, such as access to the calling
> > process's data.
>
> Fortunately, the virtual pages are resolved to physical when the
> process obtains the DMA buffer. I don't think we need anything else
> from process context.

There is some meta information about the GPU program that doesn't fit
into the DMA buffer itself.
For example, with an in-kernel memory manager, we must communicate
which textures and rendering targets the program currently needs and
when they become "unlocked".

Think of it like this: all userspace clients will need the kernel to
issue some direct DMA commands for them: memory moves, i.e. memory
management stuff, and "calls" into indirect DMA. The meta information
that is not in DMA memory reflects the direct DMA commands that
userspace needs the kernel to issue.

However, this probably isn't too big a problem: there shouldn't be too
much of this data, so we could just allocate a special page for it in
memory.

> > - This design has a lot of impact on video memory management.
> > Basically, I believe you'd have to use a completely new kind of
> > memory management. (Then again, "memory management" in DRI is
> > really bad right now, so maybe that's even a bonus ;))
>
> I've always labored under the assumption that our kernel driver would
> manage video memory, and nobody else.

Okay.

> > - I don't think it is possible to eliminate the big hardware lock
> > completely. I'm mostly thinking about some rare operations like
> > mode setting; when looking at the DRI in general, keep in mind that
> > some hardware has problems when the CPU accesses video memory while
> > the engine is busy. Such hardware would need the big hardware lock
> > more often.
>
> Great. That will make our hardware look good, and broken hardware can
> use the hardware lock.

Yes :)

cu,
Nicolai
_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
