Dave Airlie wrote:
> apologies for top posting, but Thomas's email appears to be breaking
> alpine (html or something encoding)
>
> The big area where we win with CACHED_MAPPED is pixmaps for 2D operations.
>
> a) we can't know in advance if we should allocate pixmaps as cached or
> uncached.
> b) we can't know if we are going to be doing mostly hw or mostly sw
> rendering with the pixmap.
>
> In this case we end up hitting the migration a lot, I couldn't come up
> with a solution that worked that wasn't CACHED_MAPPED, unless we had
> coherent GART.. granted I may not have thought about it enough..

Hmm, yes, this is a tricky case. Doesn't Intel's coherent GART (DRM_BO_FLAG_CACHED) work here? I suspect it'd be a bit slow, though.
/Thomas

> There ends your reminder that all the world is not a 3D app :), consider
> my main use case for TTM is EXA and compiz...
>
> Dave.
>
> On Tue, 4 Mar 2008, Thomas Hellström wrote:
>
>> Keith Packard wrote:
>> On Mon, 2008-03-03 at 19:34 +0100, Thomas Hellström wrote:
>>
>>> 2) Copying buffers in drm. This is to avoid vma creation and pagefaults,
>>> right? Yes, that could be an improvement *if* kmap_atomic is used to
>>> provide the kernel mapping. Doing vmap on a whole buffer is probably
>>> almost as expensive as a user-space mapping, and will waste precious
>>> vmalloc space. User-space buffer mappings aren't really that expensive,
>>> and a second map on the same buffer is essentially a no-op, unless you
>>> are using DRM_BO_FLAG_CACHED_MAPPED.
>>
>> If you run a kernel with the ability to map all of physical memory,
>> there will always be a kernel mapping for every page of memory. DRM
>> already relies on this, allocating pages below the 900M limit on 32-bit
>> kernels and not using the mapping APIs in a way that would make memory
>> above that limit work.
>>
>> Encouraging people to use 64-bit kernels on larger-memory machines can
>> eliminate the kernel mapping cost on machines with > 1GB of memory.
>>
>>> 3) Copying buffers through the GATT. I assume you're referring to
>>> binding the buffer to a pre-mapped region of the GATT and then doing the
>>> copying without setting up a new CPU map? That's certainly possible and
>>> a good candidate for performing relocations if you can't do kmap_atomic().
>>
>> The kernel mapping is free; the goal here is to avoid the complexities
>> of non-temporal stores, and eliminate the chipset flush kludge. The
>> question here is whether writes through the GTT in WC mode are slower
>> than writes to regular memory in non-temporal WB mode, followed by a
>> chipset flush operation.
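For reference, the kmap_atomic approach discussed above amounts to a page-at-a-time copy loop: map one page, copy the slice of the range that falls inside it, unmap, advance. Below is a minimal userspace sketch of that chunking logic -- map_page()/unmap_page() are hypothetical stand-ins for kmap_atomic()/kunmap_atomic(), not actual DRM or kernel API:

```c
#include <string.h>

#define PAGE_SIZE 4096UL

/* Hypothetical stand-ins for kmap_atomic()/kunmap_atomic(): in this
 * userspace sketch the "pages" are just heap blocks that are always
 * mapped, so mapping is a lookup and unmapping is a no-op. */
static void *map_page(void **pages, unsigned long idx) { return pages[idx]; }
static void unmap_page(void *kaddr) { (void)kaddr; }

/* Copy len bytes from src into a discontiguous set of pages, starting
 * at byte offset `offset`, one page mapping at a time. */
static void copy_to_pages(void **pages, unsigned long offset,
                          const char *src, size_t len)
{
        while (len) {
                unsigned long pg_off = offset & (PAGE_SIZE - 1);
                size_t chunk = PAGE_SIZE - pg_off;  /* room left in page */
                char *kaddr;

                if (chunk > len)
                        chunk = len;
                kaddr = map_page(pages, offset / PAGE_SIZE);
                memcpy(kaddr + pg_off, src, chunk);
                unmap_page(kaddr);

                offset += chunk;
                src += chunk;
                len -= chunk;
        }
}
```

The point of the per-page loop is that each mapping is short-lived and per-CPU, so no vmalloc space is consumed and no global TLB flush is needed, unlike a vmap() of the whole buffer.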
>>> However, if you were to reuse buffers in user-space and just use plain
>>> old !DRM_BO_FLAG_CACHED, none of these would be real issues. Buffers
>>> will stay bound to the GTT unless they get evicted, and the user-space
>>> vmas would stay populated. You'd pay a performance price the first time
>>> a buffer is created and when it is destroyed.
>>
>> Right now, re-using buffers is hard on our caches -- mapping the buffer
>> to the GPU requires a flush, which cleans the cache lines. When we
>> re-use the buffer, the writes will re-load every line from memory.
>>
>> Re-using the same user-space buffer will hit live cache lines. Those
>> cache lines will then be copied to memory, and will never be pulled
>> into the cache. The number of writes to memory is the same, but we
>> eliminate the cache-line loads which would otherwise occur as the buffer
>> is filled.
>
> Yes, but it's important to know that these issues depend on whether you
> change the kernel mapping to be uncached when binding. You're currently
> not doing that, so you get caching issues and need the chipset flush. If
> you were to do that, user-space mappings would be write-combined and you
> wouldn't have any caching problems either. The big performance problem
> would be changing the kernel mappings when binding / unbinding, and
> you'd need to re-use buffers to avoid that problem. Basically, what I'm
> saying above is that *if* you want to reuse buffers, you can use plain
> old !DRM_BO_FLAG_CACHED to avoid the caching and flushing issues.
>
> /Thomas
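As an aside, the "non-temporal stores" Keith mentions can be illustrated with SSE2 intrinsics in userspace: streaming stores bypass the CPU cache, so filling a buffer this way does not dirty cache lines that would otherwise need flushing before GPU access. This is only a sketch of the store type under discussion, not the actual i915 code path; it assumes x86, a 16-byte-aligned destination, and a length that is a multiple of 16:

```c
#include <emmintrin.h>
#include <stddef.h>

/* Non-temporal (streaming) copy: the movntdq-class stores used here
 * write to memory without allocating cache lines for the destination. */
static void copy_nontemporal(void *dst, const void *src, size_t len)
{
        __m128i *d = (__m128i *)dst;
        const __m128i *s = (const __m128i *)src;
        size_t i, n = len / 16;

        for (i = 0; i < n; i++)
                _mm_stream_si128(&d[i], _mm_loadu_si128(&s[i]));
        _mm_sfence();   /* order the NT stores before the data is consumed */
}
```

The sfence at the end matters: non-temporal stores are weakly ordered, so a fence is required before telling another agent (here, the GPU) that the data is ready.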
-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel