This allows to reduce the number of dwords that are loaded with buffer_load_format_xyzw. For example, when the only used channel is 1, the driver will emit buffer_load_format_x instead.
Shader stats for DOW3 (with some local hacky scripts for SPIRV): 143 shaders in 143 tests Totals: SGPRS: 5344 -> 5352 (0.15 %) VGPRS: 3476 -> 3452 (-0.69 %) Spilled SGPRs: 30 -> 29 (-3.33 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 269860 -> 269808 (-0.02 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 1267 -> 1272 (0.39 %) Wait states: 0 -> 0 (0.00 %) The 'const' qualifier has to be removed to avoid a compilation warning with nir_ssa_def_components_read(). Signed-off-by: Samuel Pitoiset <samuel.pitoi...@gmail.com> --- src/amd/common/ac_nir_to_llvm.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 8f5df12e3d..bafbdbb250 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -2242,16 +2242,19 @@ static LLVMValueRef radv_lower_gather4_integer(struct ac_llvm_context *ctx, } static LLVMValueRef build_tex_intrinsic(struct ac_nir_context *ctx, - const nir_tex_instr *instr, + nir_tex_instr *instr, bool lod_is_zero, struct ac_image_args *args) { if (instr->sampler_dim == GLSL_SAMPLER_DIM_BUF) { + unsigned mask = nir_ssa_def_components_read(&instr->dest.ssa); + return ac_build_buffer_load_format(&ctx->ac, args->resource, args->addr, ctx->ac.i32_0, - 4, true); + util_last_bit(mask), + true); } args->opcode = ac_image_sample; -- 2.15.1 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev