Jesse Barnes wrote:
> On Friday, April 04, 2008 11:14 am Thomas Hellström wrote:
>> Dave Airlie wrote:
>>> I'm just wondering if rather than specify all the CACHED and MAPPABLE
>>> and SHAREABLE flags we make the BO interface in terms of CPU and GPU
>>> operations..
>>>
>>> So we define
>>> CPU_READ - cpu needs to read from this buffer
>>> CPU_WRITE - cpu needs to write to the buffer
>>> CPU_POOL - cpu wants to use the buffer for suballocs
>>>
>>> GPU_READ - gpu reads
>>> GPU_WRITE - gpu writes
>>> (GPU_EXEC??) - batchbuffers? (maybe buffers that need relocs.. not sure)
>>>
>>> We can then let the drivers internally decide what types of buffer to
>>> use and not expose the flags mess to userspace.
>>>
>>> Dave.
>>
>> This might be a good idea for most situations. However, there are
>> situations where the user-space drivers need to provide more info as to
>> what the buffers are used for.
>>
>> Cache-coherent buffers are an excellent way to transfer data from GPU to
>> CPU, but they are usually very slow to render from. How would you tell
>> DRM that you want a cache-coherent buffer for download-from-screen type
>> operations?
>
> They also can't be used in many cases, right? Which would mean something
> like a batchbuffer allocation would need CPU_READ|CPU_WRITE|GPU_READ|GPU_EXEC,
> which would have to be a WC mapping, but the driver wouldn't know just from
> the flags what type of mapping to create. So yeah, I think we need some
> notion of usage or at least a bit more granularity in the type passed down.

I think cache-coherent memory has gotten a bad reputation because of its
limitations with the Intel chipsets. The general rule is probably that it
works for most things, but GPU access is substantially slower (say, 60% of
the speed of WC memory).
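For comparison, the usage-only interface Dave sketches above would boil down
to something like the following. This is purely illustrative: the flag names
and the placement choices in the comment are made up here and are not
existing drm/libdrm definitions.

/* Illustrative only, not existing drm flags. */
#define BO_USE_CPU_READ   (1 << 0)  /* CPU needs to read from this buffer    */
#define BO_USE_CPU_WRITE  (1 << 1)  /* CPU needs to write to the buffer      */
#define BO_USE_CPU_POOL   (1 << 2)  /* CPU suballocates out of the buffer    */
#define BO_USE_GPU_READ   (1 << 3)  /* GPU reads                             */
#define BO_USE_GPU_WRITE  (1 << 4)  /* GPU writes                            */
#define BO_USE_GPU_EXEC   (1 << 5)  /* batchbuffers / buffers needing relocs */

/* The kernel driver would then pick placement and caching internally, e.g.
 * a batchbuffer (CPU_WRITE | GPU_READ | GPU_EXEC) might end up in WC system
 * pages, while a readback buffer (CPU_READ | GPU_WRITE) might get
 * cache-coherent pages. */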
I'll try to list some of the considerations that led to the current
interface; they tend to be forgotten because people are mostly dealing with
the Intel i915-type chipsets, which are quite straightforward and simple in
this area. If we can come up with a simpler interface for these, that'd be
really good.

/* GPU access mode. Can be used for protection and dirty considerations. */
GPU_READ
GPU_WRITE

/* Early release of vertex buffers and batch buffers in a scene that needs
   a final flush, or buffers with non-standard signaling of GPU completion
   (driver-dependent). */
INTEL_EXE
PSB_BINNER
PSB_RASTERIZER
PSB_QUERY
VIA_MPEG
VIA_VIDEO

/* Memory types. Due to different base registers and engine requirements,
   the user-space driver generally needs to be able to specify different
   memory types. This might not be needed with Intel chipsets, but other
   UMA chipsets have a number of restrictions on buffer placement for
   different parts of the GPU: textures, depth buffers, mpeg buffers,
   shader buffers etc., but it might be that these can be replaced with
   the above driver-dependent flags. */
TT
VRAM
LOCAL
PSB_SHADER
driver-dependent...

/* CPU access to the buffer. */
CPU_READ
CPU_WRITE
CPU_COHERENT
CPU_POOL
(Other GL usage hints?)

> Maybe it's instructive to take a look at the way Linux does DMA mapping
> for drivers? The basic concepts are coherent buffers, one time buffers,
> and device<->CPU ownership transfer. In the graphics case though, coherent
> mappings aren't *generally* possible (at least not yet), so we're reduced
> to doing non-coherent mappings and transferring ownership back & forth, or
> just keeping the mappings uncached on the CPU side in order to keep things
> consistent.
>
> Even that's not expressive enough for what we want though. For small
> objects, mapping into CPU space cached, then flushing out to the CPU may
> be much more expensive than just copying the data from a cacheable CPU
> buffer to a WC GTT page. But with large objects, taking an existing CPU
> mapping, switching it to uncached and mapping its pages directly into the
> GTT is probably a big win (better yet, never map it into the CPU address
> space as cached at all to avoid all the flushing overhead).

I agree completely.

>> Please take a look at i915tex (mesa i915tex_branch)
>> intel_buffer_objects.c, the function intel_bufferobj_select() that
>> translates the GL target + usage hints to a subset of the flags
>> available. My opinion is that we need to be able to keep this
>> functionality.
>
> It looks like that code is #if 0'd, but I like the idea that the various
> types are broken down into what type of memory will work best, and it
> definitely clarifies my understanding of the flags a bit. Of course, some
> man pages for the libdrm drmBO* calls would be even better. :)

I haven't really done thorough testing of it all, so it's ifdef'd out for
now, but it should show the general idea. And yes, I'd love to do some
documentation when I get some spare time.
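The shape of that translation is roughly as follows. This is only a sketch:
the flag names and the particular hint-to-flag mappings are guesses for
illustration, not the actual intel_bufferobj_select() code from the branch.

/* Sketch only: GL buffer target + usage hint select a subset of the
 * placement/caching flags discussed above. */
#include <GL/gl.h>

#define BO_MEM_LOCAL       (1 << 0)  /* cached system memory                 */
#define BO_MEM_TT          (1 << 1)  /* GART-bound, typically write-combined */
#define BO_CACHE_COHERENT  (1 << 2)  /* keep CPU-cache-coherent              */

static unsigned
select_bo_flags(GLenum target, GLenum usage)
{
    /* Index buffers may be walked by the CPU for software fallbacks,
     * so they could stay in cached local memory. */
    if (target == GL_ELEMENT_ARRAY_BUFFER)
        return BO_MEM_LOCAL;

    switch (usage) {
    case GL_STREAM_READ:
    case GL_DYNAMIC_READ:
        /* GPU writes, CPU reads back: coherent pages, slower GPU access. */
        return BO_MEM_TT | BO_CACHE_COHERENT;
    case GL_STATIC_DRAW:
    case GL_STREAM_DRAW:
        /* Written once, rendered from many times: WC GTT pages. */
        return BO_MEM_TT;
    default:
        return BO_MEM_LOCAL | BO_MEM_TT;  /* let the kernel pick later */
    }
}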
> I think part of what we're running into here is platform specific. There's
> already a big divide between what might be necessary for pure UMA
> architectures vs. ones with lots of fast VRAM, and there are also the
> highly platform-specific cacheability concerns for integrated devices on
> Intel. I just wonder if a general purpose memory manager is ever going to
> be "optimal" for a given platform... At SGI at least there tended to be
> new memory managers for each new architecture, without much sharing that
> I'm aware of...
>
> Anyway hopefully we can get this sorted out soon so we can push it all
> upstream along with the kernel mode setting work which depends on it. I
> think everyone's agreed that we want an API & architecture that's easy to
> understand for both users and developers; we must be getting close to
> that by now. :)

Yes. And if we're doing a final pass at this, we should probably try to
list most of the upcoming use-cases for the chipsets that we know something
about today, sort most of them into the driver-dependent area, and nail the
rest down as the "permanent" interface. But let's have an interface that,
perhaps by means of driver-dependent flags, allows dirty performance
optimizations when needed and motivated.

> Thanks,
> Jesse

Thomas