Kenneth Graunke <kenn...@whitecape.org> writes:

> opt_register_coalesce() was optimizing sequences such as:
>
>    mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>    mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
>    mov(8) m4.zw:F, vgrf5.xxxy:F
>
> into:
>
>    mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>    mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D
>
> This doesn't work - if we're going to reswizzle MACH, we'd need to
> reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
> components with attr18.yy * attr19.yy.  But the MACH instruction expects
> .z to contain attr18.x * attr19.x.  Bogus results ensue.
>
> No change in shader-db on Haswell.  Prevents regressions in Timothy's
> patches to use enhanced layouts for varying packing (which rearrange
> code just enough to trigger this pre-existing bug, but were fine
> themselves).
>
> Cc: Timothy Arceri <tarc...@itsqueeze.com>

It might be a good idea to CC mesa-stable.

> ---
>  src/intel/compiler/brw_vec4.cpp | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index 0b92ba704e5..4bb774bf10e 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct 
> gen_device_info *devinfo,
>     if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
>        return false;
>  
> +   /* Don't touch MACH - it uses the accumulator results from an earlier
> +    * MUL - so we'd need to reswizzle both.  We don't do that, so just
> +    * avoid it entirely.
> +    */
> +   if (opcode == BRW_OPCODE_MACH)
> +      return false;
> +

I think the ultimate problem here is that MACH (among other
instructions) has an implicit accumulator source, which has no swizzle
controls, so it won't be able to represent the composition of swizzles
(for non-identity swizzles that is).  I think this should check for
reads_accumulator_implicitly() instead in order to avoid the exact same
problem with other instructions with implicit accumulator access.

>     if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
>        return false;
>  
> -- 
> 2.12.2
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to