On Monday 21 March 2005 19:51, Daniel Phillips wrote:
> > > That is _exactly_ what I had in mind.  The main detail I've been
> > > fretting over is how to deliver notification of command buffer
> > > completion.  I'm currently mulling over using a socket for that, in
> > > which case the indirect DMA submission might as well go over the
> > > socket too.
> >
> > DRI drivers already open an fd (/dev/dri/*) to send ioctls. Reading
> > and writing from this fd is currently not used, so this is a good
> > candidate IMO.
> 
> This fd is currently just a generic character device and not a socket, 
> so DRI would have to be patched along with adding our driver to the X 
> tree, which isn't necessarily a bad idea.  I can't think of any 
> compatibility problem with changing the DRI character device(s) to a 
> socket.  The quick test is to try it and see if anything breaks.
> 
> The DRI socket's job would just be to listen for connections, then 
> create the real socket and hand it to the client.
> 
> We can probably manage to set up a per-client socket connection within 
> the existing DRI framework, and so be able to offer a driver variant 
> that works without upgrading X/DRI, for what it's worth.  I haven't 
> tried this, and I still haven't looked at a lot of DRI code, so I can't 
> swear it will work.

Stop right there. You have just blown this whole thing up by an order of 
magnitude in complexity without a good reason.

What exactly do you want to achieve? I thought you just wanted a way for the 
kernel to notify userspace when some event happened in the GPU program. 
Most of this can be done using the classic ioctl model (think 
"wait_for_xyz" ioctls like all drivers already use) or shared memory or a 
combination of both.
The only situation where this *isn't* enough is if we ever find the need for 
a fully event-based model, because in that situation we need to poll() or 
select() on multiple event sources - where one of them is the DRI file. But 
we can easily extend the current DRI file to be a file that can be waited 
on by userspace. No need to go crazy with sockets here...
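To make that concrete, here is a minimal sketch of what the poll()-based wait would look like from userspace. This is an illustration only: the function names are hypothetical, and since the real DRI file doesn't support poll() yet, the demo uses a pipe to stand in for it (writing a byte simulates the kernel signalling an event).

```c
#include <poll.h>
#include <unistd.h>

/* Wait until the given fd becomes readable, i.e. an event (such as GPU
 * command completion) is pending.  Returns 1 on event, 0 on timeout,
 * -1 on error.  In the real driver this fd would be the DRI file,
 * extended to support poll(). */
static int wait_for_drm_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;
    return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Self-contained demo: a pipe stands in for the DRI fd.  Writing a
 * byte into the pipe simulates the kernel signalling an event. */
int demo_wait_for_event(void)
{
    int fds[2];
    char byte = 1;
    int got;

    if (pipe(fds) != 0)
        return -1;
    (void)write(fds[1], &byte, 1);   /* "event" arrives */
    got = wait_for_drm_event(fds[0], 100);
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

The nice property is that the same fd can then sit in the same poll() set as X sockets or anything else the client is waiting on.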

For the record, I don't think such a fully event-based model is even needed 
for an OpenGL implementation, unless we come up with some really fancy new 
extensions.

[snipping the register discussion here because I want to concentrate just on 
the normal path that will be taken later on; I'm responding to that below]
> > And even if it is feasible, the video memory management issues still
> > remain. They can obviously be reduced by allowing each GPU program to
> > lock memory in place so that it will not be moved by the video memory
> > manager until a certain part of the program has been executed.
> 
> Ideally, the DMA engine would advance its head pointer only after a 
> drawing operation has completely cleared the pipeline.  But perhaps 
> that is too hard to implement in hardware.  A reasonable alternative is 
> to flush the pipeline before recovering a resource that is known to be 
> in use.

If the head pointer could be advanced in that way, that would indeed be 
very helpful.

Alternatively, we could have a special "tag" command that can be inserted 
into the command stream that updates an "age" register and optionally 
writes that age value back to RAM. We could then have sequences like this 
in the command stream:

Setup texture base address
Render a number of polygons
Tag N
Setup different texture
...

Handling of this tag write would happen at the very end of the pipeline, so 
the tag gets written after the previous commands have been executed 
completely.
As soon as the written-back age value reaches N, the kernel knows that the 
first texture is no longer in use and can act accordingly.

If the head pointer can be read back yielding useful information, it could 
make the tag command redundant.
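The kernel-side check for this is tiny. A sketch, with hypothetical names (there is no such helper in any existing driver): the kernel remembers, per resource, the tag value N emitted after the last command touching that resource, and compares it against the age value written back to RAM. The signed subtraction keeps the comparison correct across 32-bit wraparound of the age counter.

```c
#include <stdint.h>

/* Returns nonzero once the age counter has passed the given tag, i.e.
 * all commands issued before "Tag N" have drained from the pipeline
 * and the resource guarded by that tag can be moved or freed.
 * Casting the difference to int32_t makes this wraparound-safe, as
 * long as outstanding tags stay within half the counter range. */
static int resource_retired(uint32_t current_age, uint32_t tag)
{
    return (int32_t)(current_age - tag) >= 0;
}
```

So after the "Tag N" command in the stream above, the kernel would call resource_retired(age_from_ram, N) before letting the memory manager touch the first texture.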

Now for the state register discussion:
> My assumption is, the kernel driver keeps all the necessary state on 
> behalf of each client.  The client _updates_ the state in process 
> context, the kernel driver accesses the state via kernel address.

And this is exactly the problem. How are state changes submitted by the 
client? The direct path would be to write all (non-safety-critical) state 
changes directly into an indirect DMA buffer.
Now let's say process 1 queues up commands like the following asynchronously 
before the kernel begins processing them:

A. Call first indirect DMA buffer, containing the following:
1. Set texture environments
2. Set blending mode and Z testing
3. Lots of trapezoid commands
B. Some memory management instructions
C. Call second indirect DMA buffer:
1. Change texture environments
2. Lots of trapezoid commands

Now in the DRM interrupt bottom half, the kernel decides to schedule process 
1 and submits the first indirect DMA buffer to the card. However following 
that, the higher-priority process 2 interrupts and renders its own set of 
primitives. Process 2 obviously uses different textures, and it also uses a 
different blend function. Process 2 finishes drawing, and the kernel 
decides to schedule process 1 again.

How does the kernel know that it needs to reload blending mode and Z testing 
before executing the second DMA buffer?

There are basically four solutions:
1. Parse the indirect DMA buffers in the kernel. This is obviously slow and 
therefore a bad idea.
2. Allow userspace to point to sections in indirect DMA buffers that need to 
be rerun in order to restore state information.
3. Force userspace to emit state-resetting commands at the beginning of all 
indirect DMA buffers. These reset commands are skipped (by providing an 
offset pointer) if the graphics context hasn't been changed between 
indirect buffers. (I actually did something comparable in the experimental 
R300 driver)
4. Don't emit state-setting commands in the indirect DMA buffer, but in the 
meta-ring buffer. The kernel parses the commands, stores them in an 
internal structure and forwards them to the card accordingly.
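Solution 3 is perhaps easiest to picture in code. A sketch under stated assumptions: the struct layout and field names are purely illustrative (nothing like this exists in the DRM headers); the point is only that userspace lays out every buffer as [state-reset preamble][drawing commands] and the kernel picks the submission start offset depending on whether the context was switched away in between.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout: every indirect buffer begins with commands that
 * re-emit the full client state, followed by the actual drawing
 * commands.  Userspace tells the kernel where the drawing commands
 * start. */
struct indirect_buffer {
    const uint32_t *cmds;     /* full command stream */
    size_t total_dwords;      /* length including the state preamble */
    size_t draw_offset;       /* first dword after the preamble */
};

/* The kernel submits either the whole buffer (another context ran in
 * between, so state must be restored) or only the tail (same context
 * is still current, so the preamble is skipped).  Returns the dword
 * offset at which submission starts. */
static size_t submit_start(const struct indirect_buffer *buf,
                           int context_switched)
{
    return context_switched ? 0 : buf->draw_offset;
}

/* Demo: with a 16-dword preamble, a context switch replays it,
 * otherwise submission jumps straight to the drawing commands. */
int demo_submission(void)
{
    struct indirect_buffer buf = {
        .cmds = NULL, .total_dwords = 128, .draw_offset = 16
    };
    return submit_start(&buf, 1) == 0 && submit_start(&buf, 0) == 16;
}
```

In the R300 scenario above, process 1's second buffer would carry such a preamble, and because process 2 ran in between, the kernel would submit it from offset 0 and the blending/Z state would be restored for free.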

It's not that there aren't any solutions to this problem. It's just that it 
is far from obvious to me what the right solution is.

cu,
Nicolai


_______________________________________________
Open-graphics mailing list
[email protected]
http://lists.duskglow.com/mailman/listinfo/open-graphics
List service provided by Duskglow Consulting, LLC (www.duskglow.com)
