Keith Packard wrote:
> On Mon, 2008-03-03 at 19:34 +0100, Thomas Hellström wrote:
>
>> 2) Copying buffers in drm. This is to avoid vma creation and
>> pagefaults, right? Yes, that could be an improvement *if* kmap_atomic
>> is used to provide the kernel mapping. Doing vmap on a whole buffer is
>> probably almost as expensive as a user-space mapping, and will waste
>> precious vmalloc space. User-space buffer mappings aren't really that
>> expensive, and a second map on the same buffer is essentially a no-op,
>> unless you are using DRM_BO_FLAG_CACHED_MAPPED.
>
> If you run a kernel with the ability to map all of physical memory,
> there will always be a kernel mapping for every page of memory. DRM
> already relies on this, allocating pages below the 900M limit on
> 32-bit kernels and not using the mapping APIs in a way that would make
> memory above that limit work.
>
> Encouraging people to use 64-bit kernels on larger-memory machines can
> eliminate the kernel mapping cost on machines with > 1GB of memory.
>
>> 3) Copying buffers through the GATT. I assume you're referring to
>> binding the buffer to a pre-mapped region of the GATT and then doing
>> the copying without setting up a new CPU map? That's certainly
>> possible and a good candidate for performing relocations if you can't
>> do kmap_atomic().
>
> The kernel mapping is free; the goal here is to avoid the complexities
> of non-temporal stores and eliminate the chipset flush kludge. The
> question is whether writes through the GTT in WC mode are slower than
> writes to regular memory in non-temporal WB mode followed by a chipset
> flush operation.
>
>> However, if you were to reuse buffers in user-space and just use
>> plain old !DRM_BO_FLAG_CACHED, none of these would be real issues.
>> Buffers will stay bound to the GTT unless they get evicted, and the
>> user-space vmas would stay populated. You'd pay a performance price
>> the first time a buffer is created and when it is destroyed.
>
> Right now, re-using buffers is hard on our caches -- mapping the
> buffer to the GPU requires a flush, which cleans the cache lines. When
> we re-use the buffer, the writes will re-load every line from memory.
>
> Re-using the same user-space buffer will instead hit live cache lines.
> Those cache lines will then be copied out to memory without ever being
> re-loaded into the cache. The number of writes to memory is the same,
> but we eliminate the cache-line loads which would otherwise occur as
> the buffer is filled.

Yes, but it's important to note that these issues depend on whether you
change the kernel mapping to be uncached when binding. You're currently
not doing that, so you get caching issues and need the chipset flush. If
you were to change it, user-space mappings would be write-combined and
you wouldn't have any caching problems either. The big performance cost
would then be changing the kernel mappings when binding / unbinding, and
you'd need to re-use buffers to avoid it.

Basically, what I'm saying above is that *if* you want to reuse buffers,
you can use plain old !DRM_BO_FLAG_CACHED to avoid both the caching and
the flushing issues.
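To make the kmap_atomic() point in 2) concrete, here is a rough sketch
of the copy path I have in mind -- hypothetical helper and parameter
names, not the actual drm_bo code, written against the current two-
argument kmap_atomic() API. It assumes the buffer object's backing
pages are available as a plain page array, and copies already-validated
kernel data through one short-lived per-page mapping at a time, so no
vma and no whole-buffer vmap is ever created:

#include <linux/mm.h>
#include <linux/highmem.h>

/*
 * Copy 'size' bytes of kernel data into a buffer object backed by
 * the page array 'pages', starting at byte 'offset'.  Each page gets
 * a short-lived atomic kernel mapping, which works even for highmem
 * pages on 32-bit kernels and costs far less than vmap().
 */
static void bo_write_pages(struct page **pages, unsigned long offset,
			   const void *src, size_t size)
{
	while (size) {
		unsigned long page_off = offset & ~PAGE_MASK;
		size_t bytes = min_t(size_t, PAGE_SIZE - page_off, size);
		void *dst = kmap_atomic(pages[offset >> PAGE_SHIFT],
					KM_USER0);

		memcpy(dst + page_off, src, bytes);
		kunmap_atomic(dst, KM_USER0);

		offset += bytes;
		src += bytes;
		size -= bytes;
	}
}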
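And for reference, the non-temporal stores Keith mentions are the movnt
family; a user-space copy along those lines would look roughly like the
sketch below (assumptions: SSE2, a 16-byte-aligned destination, and a
size that is a multiple of 16). The stores bypass the CPU cache
entirely, which is why this path is paired with the chipset flush to
make the data visible to the GPU:

#include <emmintrin.h>	/* SSE2 intrinsics */
#include <stddef.h>

/*
 * Streaming copy using non-temporal stores: the data goes to memory
 * through the write-combining buffers without dirtying the CPU cache.
 * Assumes dst is 16-byte aligned and n is a multiple of 16 bytes.
 */
static void copy_nontemporal(void *dst, const void *src, size_t n)
{
	__m128i *d = dst;
	const __m128i *s = src;

	while (n >= 16) {
		_mm_stream_si128(d++, _mm_loadu_si128(s++));
		n -= 16;
	}
	_mm_sfence();	/* order the streaming stores before later writes */
}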
/Thomas