On 07.02.2014 23:25, Dave Airlie wrote:
>>> Doh, yes because GL has ARB_texture_gather then has stuff hidden away
>>> in ARB_gpu_shader5 I forgot to add the extra bits which I suppose we should
>>> do.
>>>
>>> So I've reposted with the component selection in src1 now.
>> Hmm seems a bit excessive to use an extra reg for that (gather4 but only
>> in d3d11 form uses a src_sel on the sampler reg, but that might not work).
>> I realize this is actually more messy than I thought, since the initial
>> ARB_texture_gather had the ability to query if multi-channel formats are
>> allowed, but had no way to select the channel (somewhat relying on
>> ARB_texture_swizzle to do it, though of course you can't issue multiple
>> gathers with the same texture to get different channels that way).
>> But glsl 4.00 version could select the channel.
>> Is the ARB_texture_gather version actually all that useful or could you
>> merge the two caps? That is, if you have the ability to fetch from
>> multi-channel textures, assume you can also select the channel. The sm4
>> version of gather4 also has the single-channel format restriction - I
>> guess though some hw really can do 4 channels without channel selection.
> Yeah I think I'll rethink this stuff, it looks like two caps, one for
> MAX_COMPONENTS for ARB_texture_gather4, and just one cap for
> TEXTURE_GATHER_SM5 support which would denote support for all the
> ARB_GPU_shader5 bits.
>
>> Other than that, what about shadow samplers? Gather4 of course can't do
>> it (because the d3d10-style opcodes have different opcodes for shadow
>> comparisons), but the GL style opcodes are usually the same if shadow
>> samplers or not are used. Maybe you don't want to handle that right now,
>> just saying that if you'd want to use the same opcode you'd be missing a
>> component in case of texture cube arrays... Since this can't be used for
>> fixed function though I'd guess nothing would stop you from using a
>> different opcode for shadow samplers.
>
> I've gotten shadow samplers to work with the current opcodes, though I
> have to see about cube arrays if we're running out of space to
> put everything.
>
> Also the GPU_shader5 spec has a few more oddities, so you have
> textureGatherOffset which can take a non-constant set of offset values
> to apply to all 4 texels, then you have textureGatherOffsets which
> only takes constants again, but 4 of them, one per texel. Looking at
> radeon hw it appears fglrx decomposes textureGatherOffsets into
> multiple gather instructions at the hw level but using the
> non-constant hw support to do this. So I'm not sure if the gallium
> interface should just support non-constant for all offsets and just
> restrict the GL.
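
To make the decomposition described above concrete, here is a rough sketch in Python rather than shader code. It is an illustration only, not what fglrx or any driver actually emits. The helper names (`fetch`, `gather`, `gather_offsets`) are made up for this example, and it assumes a single-channel texture stored as a list of rows, integer texel coordinates, and clamp-to-edge addressing. It also assumes the GLSL rule that each result component of textureGatherOffsets is the i0j0 texel of the footprint at its own offset, which is the last (w) component of a plain gather.

```python
def clamp(v, lo, hi):
    """Clamp v into the inclusive range [lo, hi]."""
    return max(lo, min(v, hi))


def fetch(tex, x, y):
    """Read one texel with clamp-to-edge addressing (an assumption here)."""
    h = len(tex)
    w = len(tex[0])
    return tex[clamp(y, 0, h - 1)][clamp(x, 0, w - 1)]


def gather(tex, x, y, offset=(0, 0)):
    """Single-offset gather: one offset shared by all four texels.

    Returns the 2x2 footprint in GL order:
    (i0,j1), (i1,j1), (i1,j0), (i0,j0) -> x, y, z, w.
    """
    ox, oy = offset
    x0, y0 = x + ox, y + oy
    return [
        fetch(tex, x0, y0 + 1),      # x: i0, j1
        fetch(tex, x0 + 1, y0 + 1),  # y: i1, j1
        fetch(tex, x0 + 1, y0),      # z: i1, j0
        fetch(tex, x0, y0),          # w: i0, j0
    ]


def gather_offsets(tex, x, y, offsets):
    """Per-texel offsets, decomposed into four single-offset gathers.

    Each gather uses one of the four offsets and only its w component
    (the i0,j0 texel of that footprint) is kept.
    """
    return [gather(tex, x, y, off)[3] for off in offsets]


if __name__ == "__main__":
    # Hypothetical 4x4 texture where tex[y][x] = 10 * y + x.
    tex = [[10 * y + x for x in range(4)] for y in range(4)]
    print(gather(tex, 1, 1))
    print(gather_offsets(tex, 1, 1, [(0, 0), (1, 0), (0, 1), (1, 1)]))
```

Nothing in `gather_offsets` requires the offsets to be constants, so hardware that takes the offset from a register can run the same four-gather sequence for both variants, with the constant-only rule enforced on the GL side.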

Fwiw Fermi+ support 4 different non-constant offsets, since they're
passed in a register anyway.

> I've reworked the state tracker code already,
>
> http://cgit.freedesktop.org/~airlied/mesa/commit/?h=r600g-texture-gather&id=444bc1c8118d51600a58af8a84088e94d0800b22
>
> but I suspect I've a bit further down the rabbit hole to go.
>
> Dave.
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev