On Sat, Sep 6, 2008 at 7:56 PM, José Fonseca <[EMAIL PROTECTED]> wrote:
> On Fri, Sep 5, 2008 at 3:59 AM, Younes Manton <[EMAIL PROTECTED]> wrote:
>> Also, it would be nice if the mapping interface allowed for mapping a
>> subset of a buffer, and accepted a PIPE_BUFFER_USAGE_DISCARD flag. The
>> DISCARD flag would allow the driver to rename the buffer (create a new
>> one and point to it, free the old one at a later time) if it was still
>> in use when the map was requested, thereby not blocking on map.
>> Locking a subset would allow for minimal read-back from VRAM, and if
>> the client locked the entire buffer _without_
>> PIPE_BUFFER_USAGE_CPU_READ the driver might also elect to rename the
>> buffer, since the semantics would allow it. The driver might also map
>> immediately if it could be determined that the mapped region had
>> already been read by the GPU and so could be written to. Right now for
>> video we use triple or quadruple buffers just to keep parallelism
>> going; it would be nice to let the driver handle it and use the
>> minimum number of buffers at any given moment.
>
> Rather than the DISCARD paradigm you're suggesting, we are currently
> pushing a different paradigm, which is simply to destroy a buffer
> when it is no longer needed, allocate a new one as needed, and let a
> dynamic pool of buffers with a time-based cache do the trick. You get
> the same best-possible behavior, as the number of buffers dynamically
> grows/shrinks to match the application's needs, without increasing
> complexity in either the pipe driver or the winsys, since the buffer
> pool logic is a separate reusable piece. See
> gallium/src/gallium/winsys/drm/intel/common/ws_dri_*.c or
> gallium/src/gallium/auxiliary/pipebuffer/* for a DRM-specific and a
> DRM-agnostic implementation of this.
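[The destroy-and-reallocate pool with a time-based cache described above can be sketched roughly as follows. This is a toy standalone model, not the actual pipebuffer code; every name here (pool_buffer, pool_alloc, pool_release, pool_expire) is hypothetical.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy sketch of a dynamic buffer pool with a time-based cache, loosely
 * modeled on the idea behind gallium/auxiliary/pipebuffer.  All names
 * here are hypothetical stand-ins, not the real Gallium API. */

struct pool_buffer {
   size_t size;
   unsigned release_time;      /* tick at which the buffer was released */
   struct pool_buffer *next;   /* free-list link */
   void *data;
};

struct buffer_pool {
   struct pool_buffer *free_list;
   unsigned now;               /* monotonically increasing tick */
   unsigned expire_ticks;      /* evict cached buffers older than this */
   unsigned live, cached;      /* simple statistics */
};

static struct pool_buffer *
pool_alloc(struct buffer_pool *pool, size_t size)
{
   /* Reuse a cached buffer of the right size if one exists... */
   struct pool_buffer **link = &pool->free_list;
   while (*link) {
      if ((*link)->size == size) {
         struct pool_buffer *buf = *link;
         *link = buf->next;
         pool->cached--;
         pool->live++;
         return buf;
      }
      link = &(*link)->next;
   }
   /* ...otherwise the pool grows to match demand. */
   struct pool_buffer *buf = calloc(1, sizeof *buf);
   buf->size = size;
   buf->data = malloc(size);
   pool->live++;
   return buf;
}

static void
pool_release(struct buffer_pool *pool, struct pool_buffer *buf)
{
   /* "Destroying" a buffer just timestamps it and caches it. */
   buf->release_time = pool->now;
   buf->next = pool->free_list;
   pool->free_list = buf;
   pool->live--;
   pool->cached++;
}

static void
pool_expire(struct buffer_pool *pool)
{
   /* Shrink: free cached buffers that have sat unused too long. */
   struct pool_buffer **link = &pool->free_list;
   while (*link) {
      if (pool->now - (*link)->release_time >= pool->expire_ticks) {
         struct pool_buffer *dead = *link;
         *link = dead->next;
         free(dead->data);
         free(dead);
         pool->cached--;
      } else {
         link = &(*link)->next;
      }
   }
}
```

[The point of the sketch: a client that destroys a buffer each frame and allocates a fresh one gets, in effect, the renaming behavior of DISCARD for free, because a release/alloc pair of the same size is a cache hit.]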
Thanks, I didn't know about this. I'll try using the pipebuffer
implementation and see whether it works out.

>> Samplers could be allowed to hold texture format info, thereby
>> allowing on-the-fly format switching. On Nvidia the texture format is
>> a property of the sampler, so it's possible to read a texture as one
>> format in one instance and another format in another instance.
>> Likewise, a render target's format is emitted when it is set as a
>> target, so a format attached to pipe_framebuffer_state, or a new
>> state object analogous to a sampler (e.g. an emitter), would be very
>> handy. The format at creation time could be kept for hardware that
>> can't do this; then it's just a matter of checking/requiring that the
>> format at use time matches the format at creation time, and signaling
>> an error otherwise. This is to get around HW limitations on render
>> targets, so we can render to a texture in one format and read from it
>> in another format during the next pass.
>
> Note that presently a) a Gallium texture's format/layout/etc. can't be
> changed once created, and b) format is a property of the texture, not
> of the sampling/rendering operation. Changing a) seems impossible,
> especially considering we are moving to immutable state objects, which
> are much simpler and more effective to handle than mutable state
> objects. If I understood correctly, you're asking to change b) in
> order to get around hw limitations.
>
> My first impression is that HW limitations should not be exposed in
> this way to the state tracker -- it is OK for a driver which lacks
> complete hw support for an operation to support it by breaking it down
> into simpler supported operations, but that should be an
> implementation detail hidden from the state tracker. That is, the
> nvidia driver should have the ability to internally override texture
> formats when rendering/sampling.
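[The "keep the creation format and check it at use time" compromise quoted above amounts to a simple bind-time validation. A minimal sketch, assuming hypothetical types throughout -- these are not real Gallium structs or format enums:]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch of format-in-the-sampler validation.  The enum
 * values and structs below are stand-ins, not the Gallium API. */

enum fake_format { FMT_R8_UNORM, FMT_R8G8B8A8_UNORM, FMT_R16_SNORM };

struct fake_texture {
   enum fake_format creation_format;  /* fixed when the texture is made */
};

struct fake_sampler {
   enum fake_format view_format;      /* format to sample the texture as */
};

/* Hardware that emits the format with the sampler (as described for
 * Nvidia) accepts any view format; other hardware requires the view
 * format to match the creation format and signals an error otherwise. */
static bool
validate_sampler_bind(const struct fake_texture *tex,
                      const struct fake_sampler *samp,
                      bool hw_supports_retyping)
{
   if (hw_supports_retyping)
      return true;
   return samp->view_format == tex->creation_format;
}
```

[This keeps the state-tracker interface unchanged for existing hardware while letting capable drivers honor a mismatched view format.]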
> If the hardware limitation and the way to overcome it are common to
> many devices, then we usually make that code a library which is used
> *inside* the pipe driver, keeping the state-tracker <-> pipe driver
> interface lean.
>
> But I am imagining the 3D state trackers here; perhaps video state
> trackers need to be aware one step further down to be useful. Could
> you give a concrete example of where and how this would be useful?

The problem we have is that render target formats are very limited. The
input to the IDCT stage of the decoding pipeline is 12-bit signed
elements, and the output is 9-bit signed elements, which then become the
input to the MOCOMP stage. We have R16Snorm textures, so we can consume
the 12-bit and 9-bit signed inputs well, but we can't render to
R16Snorm, or even to R16Unorm. The closest thing we have is R8Unorm,
which would be acceptable since we can drop the LSB and bias the result
into the unsigned range, but not enough HW supports that. However, if
you think of R8G8B8A8 as being 4 packed elements, we can render to that
instead, and every card supports it just fine. But in order to consume
that in the MOCOMP pass we need to reinterpret it as an R8Unorm texture.

So, as you can see, we need a surface that behaves as an R8G8B8A8
(W/4)xH render target for pass A, then as an R8 WxH texture for pass B.
We could also consider R8G8B8A8 as two elements and output two full
9-bit elements. Either way, we need some sort of dynamic pixel format
typing. It would be very difficult to do this transparently behind the
scenes, since the fragment shader code needs to be aware of the
differences.

The Nvidia hardware seems to support this perfectly, since the pixel
format of a texture or render target is emitted when it is bound, along
with min/mag filter, wrap mode, etc.; a buffer is otherwise just a
buffer of generic memory. I don't know much about other hardware, but I
wouldn't be surprised if Nvidia weren't the only one that worked like
this.
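[The R8G8B8A8 (W/4)xH vs. R8 WxH reinterpretation above works because both views address the same bytes. A small sketch of the index math, assuming a simple linear layout with no tiling or row padding (real hardware layouts may differ):]

```c
#include <assert.h>
#include <stdint.h>

/* The same memory viewed two ways: as a (W/4) x H surface of 4-byte
 * RGBA texels for the render pass, and as a W x H surface of 1-byte
 * R8 texels for the sampling pass.  Both views have a row stride of
 * W bytes, so they alias exactly. */

#define W 8   /* width of the R8 view; the RGBA view is W/4 texels wide */
#define H 2

static uint8_t *
r8_texel(uint8_t *buf, unsigned x, unsigned y)
{
   return buf + y * W + x;                     /* stride = W bytes */
}

static uint8_t *
rgba_component(uint8_t *buf, unsigned x, unsigned y, unsigned comp)
{
   return buf + (y * (W / 4) + x) * 4 + comp;  /* stride = W bytes too */
}
```

[So component x%4 of RGBA texel (x/4, y) written in pass A is exactly the R8 texel (x, y) read in pass B; only the format tag, not the memory, changes between passes.]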
If this is the case, then one could argue that static pixel formats are
an artificial restriction, and that it would make more sense for a
low-level API to model more closely how the hardware works. But I think
keeping the format as part of the texture, as it is now, is a good way
to satisfy both sides of the equation: for hardware that doesn't
support this sort of thing, the driver can check that the format
specified in the sampler or render target state matches the format of
the texture at creation time. It would probably be better to experiment
with this privately and see how it works out if people are not
currently convinced, because for all I know there could be some
hardware quirk that makes this impossible or not worth using; I just
thought I'd mention it in case someone had already considered it.

Younes

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev