subject:"Re\: \[Mesa\-dev\] \[PATCH\] i965\/vec4\: Avoid reswizzling MACH instructions in opt_register

Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

2017-04-22 Thread Francisco Jerez

Kenneth Graunke  writes:

> opt_register_coalesce() was optimizing sequences such as:
>
>mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
>mov(8) m4.zw:F, vgrf5.xxxy:F
>
> into:
>
>mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
>mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D
>
> This doesn't work - if we're going to reswizzle MACH, we'd need to
> reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
> components with attr18.yy * attr19.yy.  But the MACH instruction expects
> .z to contain attr18.x * attr19.x.  Bogus results ensue.
>
> No change in shader-db on Haswell.  Prevents regressions in Timothy's
> patches to use enhanced layouts for varying packing (which rearrange
> code just enough to trigger this pre-existing bug, but were fine
> themselves).
>
> Cc: Timothy Arceri 

It might be a good idea to CC mesa-stable.

> ---
>  src/intel/compiler/brw_vec4.cpp | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index 0b92ba704e5..4bb774bf10e 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct 
> gen_device_info *devinfo,
> if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
>return false;
>  
> +   /* Don't touch MACH - it uses the accumulator results from an earlier
> +* MUL - so we'd need to reswizzle both.  We don't do that, so just
> +* avoid it entirely.
> +*/
> +   if (opcode == BRW_OPCODE_MACH)
> +  return false;
> +

I think the ultimate problem here is that MACH (among other
instructions) has an implicit accumulator source, which has no swizzle
controls, so it won't be able to represent the composition of swizzles
(for non-identity swizzles that is).  I think this should check for
reads_accumulator_implicitly() instead in order to avoid the exact same
problem with other instructions with implicit accumulator access.

> if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
>return false;
>  
> -- 
> 2.12.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

2017-04-21 Thread Jason Ekstrand


LGTM too


On April 21, 2017 5:29:33 PM Timothy Arceri  wrote:


I don't entirely understand how this works, but it seems reasonable to me.

Acked-by: Timothy Arceri 

Thanks for fixing this :)

On 21/04/17 19:13, Kenneth Graunke wrote:

opt_register_coalesce() was optimizing sequences such as:

mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
mov(8) m4.zw:F, vgrf5.xxxy:F

into:

mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D

This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy.  But the MACH instruction expects
.z to contain attr18.x * attr19.x.  Bogus results ensue.

No change in shader-db on Haswell.  Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).

Cc: Timothy Arceri 
---
  src/intel/compiler/brw_vec4.cpp | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 0b92ba704e5..4bb774bf10e 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct 
gen_device_info *devinfo,

 if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
return false;

+   /* Don't touch MACH - it uses the accumulator results from an earlier
+* MUL - so we'd need to reswizzle both.  We don't do that, so just
+* avoid it entirely.
+*/
+   if (opcode == BRW_OPCODE_MACH)
+  return false;
+
 if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
return false;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

2017-04-21 Thread Timothy Arceri


I don't entirely understand how this works, but it seems reasonable to me.

Acked-by: Timothy Arceri 

Thanks for fixing this :)

On 21/04/17 19:13, Kenneth Graunke wrote:

opt_register_coalesce() was optimizing sequences such as:

mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
mov(8) m4.zw:F, vgrf5.xxxy:F

into:

mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D

This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy.  But the MACH instruction expects
.z to contain attr18.x * attr19.x.  Bogus results ensue.

No change in shader-db on Haswell.  Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).

Cc: Timothy Arceri 
---
  src/intel/compiler/brw_vec4.cpp | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 0b92ba704e5..4bb774bf10e 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct 
gen_device_info *devinfo,
 if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW)
return false;
  
+   /* Don't touch MACH - it uses the accumulator results from an earlier

+* MUL - so we'd need to reswizzle both.  We don't do that, so just
+* avoid it entirely.
+*/
+   if (opcode == BRW_OPCODE_MACH)
+  return false;
+
 if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW)
return false;
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().

3 matches

Site Navigation

Mail list logo

Footer information