Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:

> The code was setting an XYZW swizzle and writemask on all uniforms, whether
> they were vectors or scalars, which can lead to problems when loading them
> into the push constant buffer.
>
> Moreover, the 'shift' calculation was designed to compute the offset in
> DWORDs, but it didn't take DFs into account, so the calculated swizzle
> for the latter was wrong.
>
> The indirect case is not changed because MOV INDIRECT will write
> to all components. Added an assert to verify that these uniforms
> are aligned.
>
> v2:
> - Fix 'shift' calculation (Curro)
> - Set both swizzle and writemask.
> - Add assert(shift == 0) for the indirect case.
>
> Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
> Cc: "17.1" <mesa-sta...@lists.freedesktop.org>

Reviewed-by: Francisco Jerez <curroje...@riseup.net>
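
To illustrate the 'shift' fix described in the commit message, here is a
minimal standalone sketch. It is not Mesa code: component_shift() is a
hypothetical helper, base_bytes stands in for nir_intrinsic_base(instr),
and type_size stands in for type_sz(src.type). It assumes only 4-byte and
8-byte types need to be handled.

```cpp
#include <cassert>

// Hypothetical helper (not part of Mesa) modelling the patched 'shift'
// computation: the component index of a uniform inside its 16-byte vec4
// slot, measured in units of the uniform's own type size.
static unsigned component_shift(unsigned base_bytes, unsigned type_size)
{
   // Assumption for this sketch: only 32-bit and 64-bit types occur.
   assert(type_size == 4 || type_size == 8);
   return (base_bytes % 16) / type_size;
}
```

With a byte offset of 8 inside the slot, a 32-bit uniform gets shift 2,
while a DF gets shift 1. The old "/ 4" would have returned 2 for the DF
as well, which selects the wrong component.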

> ---
>  src/intel/compiler/brw_vec4_nir.cpp | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/src/intel/compiler/brw_vec4_nir.cpp b/src/intel/compiler/brw_vec4_nir.cpp
> index a82d52088a8..80115aca0f9 100644
> --- a/src/intel/compiler/brw_vec4_nir.cpp
> +++ b/src/intel/compiler/brw_vec4_nir.cpp
> @@ -852,7 +852,8 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>         * The swizzle also works in the indirect case as the generator adds
>         * the swizzle to the offset for us.
>         */
> -      unsigned shift = (nir_intrinsic_base(instr) % 16) / 4;
> +      const int type_size = type_sz(src.type);
> +      unsigned shift = (nir_intrinsic_base(instr) % 16) / type_size;
>        assert(shift + instr->num_components <= 4);
>  
>        nir_const_value *const_offset = nir_src_as_const_value(instr->src[0]);
> @@ -860,14 +861,20 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
>           /* Offsets are in bytes but they should always be multiples of 4 */
>           assert(const_offset->u32[0] % 4 == 0);
>  
> -         unsigned offset = const_offset->u32[0] + shift * 4;
> +         src.swizzle = brw_swizzle_for_size(instr->num_components);
> +         dest.writemask = brw_writemask_for_size(instr->num_components);
> +         unsigned offset = const_offset->u32[0] + shift * type_size;
>           src.offset = ROUND_DOWN_TO(offset, 16);
> -         shift = (offset % 16) / 4;
> +         shift = (offset % 16) / type_size;
> +         assert(shift + instr->num_components <= 4);
>           src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
>  
>           emit(MOV(dest, src));
>        } else {
> -         src.swizzle += BRW_SWIZZLE4(shift, shift, shift, shift);
> +         /* Uniform arrays are vec4 aligned, because of std140 alignment
> +          * rules.
> +          */
> +         assert(shift == 0);
>  
>          src_reg indirect = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_UD, 1);
>  
> -- 
> 2.11.0
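
The constant-offset path above sizes the swizzle and writemask to the
number of components and then adds 'shift' to every swizzle channel. The
sketch below models that with simplified stand-ins, not the real Mesa
helpers. It assumes a swizzle is four 2-bit channel selectors packed as in
BRW_SWIZZLE4, and that the sized swizzle repeats the last selected
component (XYYY for two components). All function names here are
hypothetical.

```cpp
#include <cassert>

// Pack four 2-bit channel selectors (0 = X, 1 = Y, 2 = Z, 3 = W).
constexpr unsigned swizzle4(unsigned x, unsigned y, unsigned z, unsigned w)
{
   return x | (y << 2) | (z << 4) | (w << 6);
}

// Stand-in for a sized swizzle: select components 0..n-1 and repeat the
// last one in the unused channels (assumption of this sketch).
static unsigned swizzle_for_size(unsigned n)
{
   assert(n >= 1 && n <= 4);
   const unsigned last = n - 1;
   auto comp = [last](unsigned i) { return i < last ? i : last; };
   return swizzle4(comp(0), comp(1), comp(2), comp(3));
}

// Stand-in for a sized writemask: enable the first n channels.
static unsigned writemask_for_size(unsigned n)
{
   assert(n >= 1 && n <= 4);
   return (1u << n) - 1;
}

// Add 'shift' to every channel. The assert mirrors the one in the patch:
// it guarantees no 2-bit selector overflows into its neighbour.
static unsigned shifted_swizzle(unsigned n, unsigned shift)
{
   assert(shift + n <= 4);
   return swizzle_for_size(n) + swizzle4(shift, shift, shift, shift);
}
```

For example, a two-component uniform with shift 2 goes from XYYY to ZWWW,
and its writemask stays at XY (0x3).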

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev