On Sat, 23 Mar 2002, Rodolphe Ortalo wrote:
> This "FIFO" behavior seems to be the one that worries you (aside from the
> mmap() trick). It seems you would like to say, e.g., "execute
> area[1],area[3]" and then muck around into area[2] before saying "execute
> area[2]".

Correct.
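
Roughly the pattern I have in mind, as a fragment (kgi_execute() and
fill_commands() are made-up names for illustration here, not an existing
KGI API):

  kgi_execute(area[1]);     /* queue area 1 for the graphics engine  */
  kgi_execute(area[3]);     /* queue area 3, out of "FIFO" order     */
  fill_commands(area[2]);   /* meanwhile, keep building up area 2... */
  kgi_execute(area[2]);     /* ...and only then submit it            */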

> (The fact that the application would be blocked if area[2] is still in
> use when it tries to access it does not seem to annoy you too much: the
> area will become available again as soon as the graphics engine is done
> with it.)

OK, it doesn't annoy me IF the application has a(n efficient) way to check
whether the access will block before doing the access.  This would be optional
on the part of the application, depending on the paradigm the application
is using (e.g. if it expects FIFO behavior then waiting for the buffer to
become free will not annoy it.)
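
On the application side that check could be as simple as this fragment
(kgi_area_busy() is a hypothetical call, not an existing KGI entry point; a
poll()/select() on the device fd would serve the same purpose):

  if (kgi_area_busy(area[2])) {
          /* the engine still owns area 2: either wait for it (fine for
           * a FIFO-style app) or go build commands in another area */
          do_other_work();
  } else {
          fill_commands(area[2]);   /* guaranteed not to block */
  }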

I think there is also something to be said for doing c-o-w on these buffers,
e.g. if the application accesses area[2] while it is executing, it immediately
receives a new r/w copy of area[2] and the driver throws out the old copy
when the DMA is done.  This model would be incompatible with an app that
expects any return values from the hw accel engine to be left in the DMA
buffer, but IIRC not many of those return values are terribly useful even
in the cases where the chipset does modify the buffer.
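
In driver pseudo-C the c-o-w idea would look roughly like this (the names
are invented and all the real VM plumbing is glossed over):

  /* called when the app touches an area the engine still owns */
  static struct dma_area *area_write_fault(struct dma_area *old)
  {
          struct dma_area *fresh = alloc_dma_area(old->size);

          memcpy(fresh->vaddr, old->vaddr, old->size); /* new r/w copy   */
          remap_into_app(fresh);                       /* app sees it    */
          old->flags |= DISCARD_WHEN_IDLE;             /* free after DMA */
          return fresh;
  }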

> With the current scheme, the application only owns in its memory space the
> memory area that it is working with *now*; the kernel driver owns all the
> other ones (either in EXEC or IDLE state).

>  Am I right if I say that you would like to have the opposite: the kernel
> driver should only own the memory areas that are currently in execution by
> the graphics engine, and all the other areas should be mapped in the
> application memory space?

Yes.  Well s/currently in execution/marked for execution by the application/.

> > I suspect the chipset version compatibility may have many answers 
> > depending on the details.  In most cases, the answer would be that
> > userspace is responsible for putting the right material in the DMA
> > buffer in the first place, but there could be cases where that just
> > doesn't work, and if there's no elegant work-around, then worst case
> > we have to eat the overhead of a buffer memcpy in the driver.
> 
> There is no real memcpy issue here: the driver can walk the DMA stream to
> check it before execution (this is optional of course, but it does not
> seem to cost that much). It can either fault when the DMA stream is
> "illegal" (whatever illegal means), or "passivate" the faulty commands
> (with NOOPs for example), or even "correct" the commands (in some rare
> cases where the correction is totally unambiguous).
>  Note that even if this may sound totally awful to speed freaks, I found
> it *extremely* useful to have the driver protect me from sending random
> bits to the graphics engine. (A faulty application always occurs, and it is
> very convenient if it does not freeze the graphics engine.) I'd really
> spend the 5% CPU load needed to do that even in a production system.

Absolutely.  That's the whole point of KGI vs DRI.
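
For reference, the validation pass described above boils down to something
like this sketch (the command layout and names are invented; a real driver
has to know the per-chipset command format, including variable-length
commands):

  /* walk the queued command stream before handing it to the engine */
  static void validate_stream(u32 *cmd, unsigned int count)
  {
          unsigned int i;

          for (i = 0; i < count; i++) {
                  if (!command_is_legal(cmd[i]))
                          cmd[i] = CHIP_NOOP;  /* passivate, don't execute */
          }
  }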

> This is indeed an "unexplored" issue (assuming I understand you
> correctly): using in-main-memory textures. I anticipate some big question
> around there: should the in-kernel driver control *all* (main) memory
> allocations *related* to its graphics engine?

If the OS leaves no userspace API to get such a region, yes.  If the
OS does and the KGI driver can validate a DMA region created by the
application, then we can offload this functionality into LibKGI.
Currently Linux has mlock(), but I don't think it is secure on its own,
since the application can discard the area while the chipset is still using
it, and the pages would land on the free page list where other applications
could grab them before the GPU is done with them.  Of course I don't
understand the Linux VM well enough to know whether this really is the case.
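
For illustration, the sequence I'm worried about is no more than this
fragment (KGI_EXEC_USERBUF is an invented ioctl, and kgi_fd/len are assumed
to be set up already):

  void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  mlock(buf, len);                        /* pin the pages in RAM        */
  ioctl(kgi_fd, KGI_EXEC_USERBUF, buf);   /* engine starts DMA from them */
  munmap(buf, len);                       /* pages go back to the VM,    */
                                          /* possibly while the DMA is   */
                                          /* still in flight             */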

--
Brian
