Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
Kenneth Graunke writes: > opt_register_coalesce() was optimizing sequences such as: > >mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D >mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D >mov(8) m4.zw:F, vgrf5.xxxy:F > > into: > >mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D >mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D > > This doesn't work - if we're going to reswizzle MACH, we'd need to > reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw > components with attr18.yy * attr19.yy. But the MACH instruction expects > .z to contain attr18.x * attr19.x. Bogus results ensue. > > No change in shader-db on Haswell. Prevents regressions in Timothy's > patches to use enhanced layouts for varying packing (which rearrange > code just enough to trigger this pre-existing bug, but were fine > themselves). > > Cc: Timothy Arceri It might be a good idea to CC mesa-stable. > --- > src/intel/compiler/brw_vec4.cpp | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp > index 0b92ba704e5..4bb774bf10e 100644 > --- a/src/intel/compiler/brw_vec4.cpp > +++ b/src/intel/compiler/brw_vec4.cpp > @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct > gen_device_info *devinfo, > if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW) >return false; > > + /* Don't touch MACH - it uses the accumulator results from an earlier > +* MUL - so we'd need to reswizzle both. We don't do that, so just > +* avoid it entirely. > +*/ > + if (opcode == BRW_OPCODE_MACH) > + return false; > + I think the ultimate problem here is that MACH (among other instructions) has an implicit accumulator source, which has no swizzle controls, so it won't be able to represent the composition of swizzles (for non-identity swizzles that is). I think this should check for reads_accumulator_implicitly() instead in order to avoid the exact same problem with other instructions with implicit accumulator access. > if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW) >return false; > > -- > 2.12.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
LGTM too On April 21, 2017 5:29:33 PM Timothy Arceri wrote: I don't entirely understand how this works, but it seems reasonable to me. Acked-by: Timothy Arceri Thanks for fixing this :) On 21/04/17 19:13, Kenneth Graunke wrote: opt_register_coalesce() was optimizing sequences such as: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D mov(8) m4.zw:F, vgrf5.xxxy:F into: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D This doesn't work - if we're going to reswizzle MACH, we'd need to reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw components with attr18.yy * attr19.yy. But the MACH instruction expects .z to contain attr18.x * attr19.x. Bogus results ensue. No change in shader-db on Haswell. Prevents regressions in Timothy's patches to use enhanced layouts for varying packing (which rearrange code just enough to trigger this pre-existing bug, but were fine themselves). Cc: Timothy Arceri --- src/intel/compiler/brw_vec4.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 0b92ba704e5..4bb774bf10e 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct gen_device_info *devinfo, if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW) return false; + /* Don't touch MACH - it uses the accumulator results from an earlier +* MUL - so we'd need to reswizzle both. We don't do that, so just +* avoid it entirely. +*/ + if (opcode == BRW_OPCODE_MACH) + return false; + if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW) return false; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
I don't entirely understand how this works, but it seems reasonable to me. Acked-by: Timothy Arceri Thanks for fixing this :) On 21/04/17 19:13, Kenneth Graunke wrote: opt_register_coalesce() was optimizing sequences such as: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D mov(8) m4.zw:F, vgrf5.xxxy:F into: mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D This doesn't work - if we're going to reswizzle MACH, we'd need to reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw components with attr18.yy * attr19.yy. But the MACH instruction expects .z to contain attr18.x * attr19.x. Bogus results ensue. No change in shader-db on Haswell. Prevents regressions in Timothy's patches to use enhanced layouts for varying packing (which rearrange code just enough to trigger this pre-existing bug, but were fine themselves). Cc: Timothy Arceri --- src/intel/compiler/brw_vec4.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 0b92ba704e5..4bb774bf10e 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -1071,6 +1071,13 @@ vec4_instruction::can_reswizzle(const struct gen_device_info *devinfo, if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW) return false; + /* Don't touch MACH - it uses the accumulator results from an earlier +* MUL - so we'd need to reswizzle both. We don't do that, so just +* avoid it entirely. +*/ + if (opcode == BRW_OPCODE_MACH) + return false; + if (!can_do_writemask(devinfo) && dst_writemask != WRITEMASK_XYZW) return false; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev