On Tue, 13 May 2008 21:35:16 +0100 (IST) Dave Airlie <[EMAIL PROTECTED]> wrote:
> 1) I feel there hasn't been enough open driver coverage to prove it. So
> far we have done an Intel IGD, we have a lot of code that isn't required
> for these devices, so the question is how much code exists purely to
> support the poulsbo closed-source userspace, and why we need to live
> with it. Both radeon and nouveau developers have expressed frustration
> about the fencing internals being really hard to work with, which doesn't
> bode well for maintainability in the future.

Well, my TTM experiment brought me up to EXA with radeon, and I have also
done several small 3D tests to figure out how I want to submit commands.
From those experiments, here are the things that are becoming painful for
me.

On some radeon hardware (most newer cards with a large amount of VRAM) you
cannot map VRAM beyond the aperture. Well, you can, but you would need to
reprogram the card's aperture, and that is not something you want to do.
TTM assumes that memory accesses go through a mapping of the buffer, so in
this situation it becomes cumbersome. We already discussed this, and the
idea was to split VRAM, but I don't like that solution. So in the end I am
more and more convinced that we should avoid mapping objects into the
client's VMA. I see two advantages to this: no TLB flushes on the VMA, and
no hard-to-solve page-mapping aliasing (see the first sketch below).

On the fence side, I hoped I could get reasonable code using IRQs to work
reliably, but after discussion with AMD it turned out that what I was doing
is not recommended and is prone to hard GPU lockups, which is a no-go for
me. The remaining solution I have in mind for synchronization, i.e. knowing
when the GPU is done with a buffer, cannot use IRQs, at least not on all
the hardware I am interested in (r3xx/r4xx). Of course, I don't want to
busy-wait to find out when the GPU is done (see the second sketch below).
Also, the fence code bakes in too many assumptions about what we should
provide. While fencing might prove useful, I think it is better served by
driver-specific ioctls than by a common infrastructure that the hardware
obviously doesn't fit well, given how much it differs. And like Stephane,
I think GPU virtual memory can't be used to its full potential in this
scheme.

That said, I also share some concerns about GEM, such as the high-memory
page issue, but I think that one is workable with help from the kernel
people. For VRAM, the solution discussed so far, and the one I like, is to
have the driver choose, based on client requests, which objects to place
in VRAM, and to treat VRAM as a cache. Every object is then backed by a
RAM copy (which can be swapped out), and it all becomes a matter of
syncing the VRAM copy and the RAM copy when necessary. Domains and
pread/pwrite access let you do this sync only on the area that needs it.
Suspend also becomes easier: just sync the objects whose write domain is
the GPU (see the third sketch below).

So, all in all, I agree that GEM might ask each driver to redo some stuff,
but I think a large set of helper functions can take care of that. More
importantly, I see this as freedom for each driver and the only way to
cope with hardware differences.

Cheers,
Jerome Glisse <[EMAIL PROTECTED]>
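First sketch: how a pwrite-style path could copy user data into a buffer
object without ever mapping the object into the client's VMA. This is only
an illustration of the idea; all the names here (struct bo, bo_pwrite, and
so on) are made up, not taken from TTM or GEM:

#include <linux/errno.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/uaccess.h>

/* Invented object type: the driver keeps its own iomem window into the
 * object; the client never gets a mapping of it. */
struct bo {
	void __iomem *vram;	/* driver-side window into the object */
	size_t size;
};

static int bo_pwrite(struct bo *bo, size_t offset,
		     const void __user *src, size_t len)
{
	char tmp[256];		/* small kernel bounce buffer */
	size_t done = 0;

	if (offset > bo->size || len > bo->size - offset)
		return -EINVAL;

	while (done < len) {
		size_t chunk = min(len - done, sizeof(tmp));

		if (copy_from_user(tmp, src + done, chunk))
			return -EFAULT;
		/* Write through the driver's own window: no client-side
		 * TLB shootdowns, no aliased page mappings. */
		memcpy_toio(bo->vram + offset + done, tmp, chunk);
		done += chunk;
	}
	return 0;
}

The same scheme works in the other direction for pread, with
copy_to_user() and memcpy_fromio().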
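Second sketch: what a driver-specific wait ioctl could look like when IRQs
are off the table, assuming the command stream writes an increasing
sequence number to a scratch register after each batch that touches a
buffer. Sleeping between register reads keeps this from being a busy-wait.
Again, every name (MY_SCRATCH_REG, my_bo, and so on) and the register
offset are invented for illustration:

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/jiffies.h>
#include <linux/types.h>

#define MY_SCRATCH_REG	0x15e0	/* invented register offset */

struct my_dev {
	void __iomem *mmio;
};

struct my_bo {
	u32 last_seq;	/* sequence of the last batch that used this bo */
};

static int my_bo_wait_idle(struct my_dev *dev, struct my_bo *bo)
{
	unsigned long timeout = jiffies + 10 * HZ;	/* arbitrary */
	u32 seq;

	for (;;) {
		/* Sequence number the GPU most recently wrote back. */
		seq = readl(dev->mmio + MY_SCRATCH_REG);
		if ((s32)(seq - bo->last_seq) >= 0)
			return 0;
		if (time_after(jiffies, timeout))
			return -EBUSY;
		msleep(1);	/* sleep instead of spinning */
	}
}

The signed subtraction keeps the comparison correct across
sequence-number wraparound.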
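Third sketch: the VRAM-as-a-cache idea, with every object backed by a RAM
copy and a write domain recording who touched it last, so a sync only
copies the range that pread/pwrite actually asked for, and suspend only
has to walk the objects the GPU wrote. As before, the types and names are
invented:

#include <linux/io.h>
#include <linux/types.h>

enum my_domain { MY_DOMAIN_CPU, MY_DOMAIN_GPU };

struct my_obj {
	void __iomem *vram;	/* NULL while evicted from VRAM */
	void *ram;		/* backing copy, always present, swappable */
	size_t size;
	enum my_domain write_domain;
};

/* Pull a range back into the RAM copy before the CPU reads it.
 * (The GPU is assumed idle on this object, see the second sketch.) */
static void my_obj_sync_for_cpu(struct my_obj *obj,
				size_t offset, size_t len)
{
	if (obj->write_domain == MY_DOMAIN_GPU && obj->vram)
		memcpy_fromio(obj->ram + offset,
			      obj->vram + offset, len);
	obj->write_domain = MY_DOMAIN_CPU;
}

/* On suspend, only the objects whose write domain is the GPU need
 * their VRAM contents copied out to the RAM backing store. */
static void my_suspend_sync(struct my_obj **objs, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (objs[i]->write_domain == MY_DOMAIN_GPU)
			my_obj_sync_for_cpu(objs[i], 0, objs[i]->size);
}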