On Tue, 13 May 2008 21:35:16 +0100 (IST)
Dave Airlie <[EMAIL PROTECTED]> wrote:

> 1) I feel there hasn't been enough open driver coverage to prove it. So
> far we have done an Intel IGD, and we have a lot of code that isn't
> required for these devices, so the question is how much code exists
> purely to support the poulsbo closed-source userspace, and why we need
> to live with it. Both radeon and nouveau developers have expressed
> frustration about the fencing internals being really hard to work with,
> which doesn't bode well for maintainability in the future.

Well, my TTM experiment brought me up to EXA with radeon, and I also did
several small 3D tests to see how I want to send commands. From these
experiments, here are the things that are becoming painful for me.

On some radeon hardware (most newer cards with large amounts of RAM) you
can't map VRAM beyond the aperture; well, you can, but you need to
reprogram the card's aperture, and that is not something you want to do.
TTM's assumption is that memory accesses are done through a mapping of
the buffer, so in this situation it becomes cumbersome. We already
discussed this, and the idea was to split VRAM, but I don't like that
solution. In the end I am more and more convinced that we should avoid
mapping objects into the client's VMA. I see two advantages to this: no
TLB flushes on the VMA, and no hard-to-solve page-mapping aliasing.
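
To make this concrete, here is a minimal sketch of the kind of
no-mapping interface I have in mind, assuming a hypothetical
pwrite-style ioctl (the DRM_IOCTL_FOO_* name, the struct layout and
bo_upload() are all made up for illustration, not an existing driver
interface):

/* Hypothetical sketch: access a buffer object through copy ioctls
 * instead of mapping it into the client's address space.  The kernel
 * copies through its own mapping, so there is no client vma to
 * tlb-flush and no cpu/vram mapping aliasing to solve. */
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

struct foo_bo_pwrite {
	uint32_t handle;    /* object handle from the kernel   */
	uint64_t offset;    /* byte offset inside the object   */
	uint64_t size;      /* number of bytes to copy         */
	uint64_t data_ptr;  /* user pointer to the source data */
};

#define DRM_IOCTL_FOO_BO_PWRITE _IOW('d', 0x40, struct foo_bo_pwrite)

static int bo_upload(int fd, uint32_t handle, uint64_t offset,
                     const void *src, uint64_t size)
{
	struct foo_bo_pwrite args;

	memset(&args, 0, sizeof(args));
	args.handle   = handle;
	args.offset   = offset;
	args.size     = size;
	args.data_ptr = (uint64_t)(uintptr_t)src;
	return ioctl(fd, DRM_IOCTL_FOO_BO_PWRITE, &args);
}

The point is that the kernel can reach VRAM however it wants (through
the aperture, a moving window, or DMA) without the client ever holding
a mapping.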

On the fence side, I hoped that I could have reasonable code using IRQs
working reliably, but after discussion with AMD, what I was doing was
clearly not recommended and prone to hard GPU lockups, which is a no-go
for me. The remaining solution I have in mind for synchronization (i.e.
knowing when the GPU is done with a buffer) cannot use IRQs, at least
not on all the hardware I am interested in (r3xx/r4xx). Of course I
don't want to busy-wait to find out when the GPU is done. Also, the
fence code bakes in too many assumptions about what we should provide;
while fencing might prove useful, I think it is better served by
driver-specific ioctls than by a common infrastructure that the
hardware, given its differences, obviously doesn't fit well.
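
To show what such a driver-specific ioctl could look like underneath,
here is a rough kernel-side sketch (struct my_dev, my_read_scratch()
and the timeout value are all assumptions, not real radeon code): the
CP writes a monotonically increasing sequence number to a scratch
register after each submission, and waiting is a sleeping poll loop,
so no IRQ and no busy wait:

/* Sketch only: poll a scratch register the CP writes after each
 * submission; sleep between polls instead of spinning or taking an
 * irq.  All names here are made up for illustration. */
#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/types.h>

struct my_dev;                           /* hypothetical device struct */
u32 my_read_scratch(struct my_dev *dev); /* last seq written by the CP */

static int my_bo_wait_seq(struct my_dev *dev, u32 seq)
{
	unsigned long timeout = jiffies + msecs_to_jiffies(1000);

	/* sequence numbers wrap, so compare with a signed difference */
	while ((s32)(my_read_scratch(dev) - seq) < 0) {
		if (time_after(jiffies, timeout))
			return -EBUSY;  /* probable lockup, reset path */
		msleep(1);              /* sleep between polls */
	}
	return 0;
}

Each buffer just records the sequence number of the last submission
that touched it; waiting on a buffer is then a single compare against
the scratch register.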

And like Stephane, I think the GPU virtual memory stuff can't be used
to its full potential in this scheme.

That said, I also share some concerns about GEM, like the high-memory
page issue, but I think that one is workable with the help of kernel
people. For VRAM, the solution discussed so far, and which I like, is
to have the driver choose, based on client requests, which objects to
put there, and to treat VRAM as a cache. So we will have every object
backed by a RAM copy (which can be swapped out), and then it's all a
matter of syncing the VRAM copy and the RAM copy when necessary.
Domains and pread/pwrite access let you easily do this sync only on the
necessary area. Suspend also becomes easier: just sync the objects
whose write domain is the GPU. So all in all, I agree that GEM might
ask each driver to redo some stuff, but I think a large set of helper
functions can mitigate this; more importantly, I see this as freedom
for each driver and the only way to cope with hardware differences.
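
A minimal sketch of that VRAM-as-a-cache idea (every name here is
hypothetical, just to illustrate the write-domain bookkeeping):

/* Sketch: every object keeps a swappable ram backing store, vram only
 * caches it, and the write domain says which copy is authoritative.
 * All names are made up for illustration. */
#include <stdint.h>

enum my_domain {
	MY_DOMAIN_CPU = 1,       /* ram copy is up to date  */
	MY_DOMAIN_GPU = 2,       /* vram copy is up to date */
};

struct my_bo {
	char    *ram;            /* swappable system ram backing */
	uint64_t vram_offset;    /* where the cached copy lives  */
	uint64_t size;
	enum my_domain write_domain;
};

/* assumed helper: dma or kernel-map copy out of vram */
void my_copy_from_vram(void *dst, uint64_t vram_src, uint64_t n);

/* Called from pread or from suspend: make the ram copy of the given
 * range valid.  Suspend just walks all objects and calls this on the
 * ones whose write domain is the gpu. */
static void my_bo_sync_to_ram(struct my_bo *bo,
                              uint64_t offset, uint64_t size)
{
	if (bo->write_domain != MY_DOMAIN_GPU)
		return;          /* ram copy is already valid */
	my_copy_from_vram(bo->ram + offset,
	                  bo->vram_offset + offset, size);
	bo->write_domain = MY_DOMAIN_CPU;
}

Because pread/pwrite carry an offset and a size, the sync touches only
the requested range, and suspend degenerates to the same sync applied
to whole objects.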

Cheers,
Jerome Glisse <[EMAIL PROTECTED]>
