Benjamin Herrenschmidt wrote:
But this should be the same problem encountered by the agpgart driver?

Actually, the TTM memory manager already does this, but also changes the caching policy of the linear kernel map.

The latter is not portable unfortunately, and can have other serious performance impacts. Typically, the kernel linear map is mapped using larger page sizes, or in some cases even large TLB entries or separate translation registers (like BATs). Thus you cannot affect the caching policy of a single 4k page. Also, on some processors, you can't just break down a single large page into small pages either. For example, on desktop PowerPC, entire segments of 256M can have only one page size. Even x86 might have some interesting issues here...

x86 and x86-64 call change_page_attr() to take care of this. On powerpc it is simply a no-op (<asm/agp.h>).

Unfortunately this leads to rather costly cache and TLB flushes, particularly on SMP.

Yup. What about a futex-like approach: a shared area mapped by both kernel and user has locks for the buffers. When submitting a command involving a buffer, userland tries to lock it. This is a simple atomic operation in user space. If that fails (the lock for that buffer is held, possibly by the kernel, or the buffer is swapped out), then it does an ioctl to the DRM to get access (which involves sleeping until the buffer can be retrieved). Once the operation is complete, the app can release the locks on the buffers it holds. In fact, if there is a buffers <-> objects mapping for cards like nVidia with objects and notifiers, the kernel could auto-unlock objects when the completion interrupt for them occurs.

Ben.

Currently we take the following approach when the GPU needs access to a buffer:

0) Take the hardware lock.
1) The buffer is validated, and if not present in the GATT, it's flipped in. At this point, idle buffers may be flipped out.
2) The app submits a batch buffer (or in the general case a command sequence). All buffers that are referenced by this command sequence need to have been validated, and the command sequence should be updated with their new GATT offsets.
3) A "fence" is emitted, and associated with all unfenced buffers.
4) The hardware lock is released.
5) When the fence has expired (the GPU is finished with the command sequence), the buffers associated with it may optionally be thrown out.

One problem is that buffers that need to be pinned (_always_ available to the GPU) cannot be thrown out and will thus fragment the aperture or VRAM space.

Buffers also carry usage and mapping refcounts. They are not allowed to be validated when mapped, and (except under some circumstances) are not allowed to be mapped while validated. Buffer destruction occurs when the refcount goes to zero.

/Thomas
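For reference, the futex-like scheme Ben describes could look roughly like the sketch below. It is only an illustration under stated assumptions, not code from the DRM or TTM: the names (shared_area, buffer_lock, buffer_unlock, drm_wait_for_buffer), the lock-word values and the fixed buffer count are all made up here, and the ioctl slow path is replaced by a stub that simply hands the lock over instead of sleeping in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock-word values; not taken from any real DRM interface. */
#define NUM_BUFFERS  4
#define BUF_UNLOCKED 0u          /* nobody holds the buffer */
#define BUF_KERNEL   UINT32_MAX  /* held by the kernel, or buffer swapped out */

/* One lock word per buffer, in an area mapped by both kernel and user. */
struct shared_area {
    _Atomic uint32_t lock[NUM_BUFFERS];
};

/* Counts slow-path entries so the fast path can be observed. */
static int slow_path_calls;

/*
 * Stand-in for the ioctl to the DRM. A real implementation would sleep in
 * the kernel until the buffer is available; this stub just grants the lock.
 */
static bool drm_wait_for_buffer(struct shared_area *area, unsigned buf,
                                uint32_t owner)
{
    slow_path_calls++;
    atomic_store(&area->lock[buf], owner);
    return true;
}

/* Try the user-space atomic first; fall back to the "ioctl" on contention. */
static bool buffer_lock(struct shared_area *area, unsigned buf, uint32_t owner)
{
    uint32_t expected = BUF_UNLOCKED;

    if (atomic_compare_exchange_strong(&area->lock[buf], &expected, owner))
        return true;                      /* fast path: no system call */
    return drm_wait_for_buffer(area, buf, owner);
}

/* Release a buffer we hold; the owner id must match the lock word. */
static void buffer_unlock(struct shared_area *area, unsigned buf,
                          uint32_t owner)
{
    uint32_t expected = owner;
    bool ok = atomic_compare_exchange_strong(&area->lock[buf], &expected,
                                             BUF_UNLOCKED);
    assert(ok);
    (void)ok;
}
```

An application would call buffer_lock() on each buffer before submitting a command that references it and buffer_unlock() once the operation is complete. The auto-unlock on completion interrupt would be the kernel writing BUF_UNLOCKED into the same word; that part is not shown.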
-- _______________________________________________ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel