On Wed, Jan 16, 2019 at 8:40 PM Jason Ekstrand <ja...@jlekstrand.net> wrote: > > The pass was discovered to cause problems with the MUL+MACH combination > we emit for nir_[iu]mul_high. In an experimental branch of mine, I ran > into issues where the MUL+MACH ended up using a strided source due to > working on half of a uint64_t and the new lowering pass helpfully tried > to fix the multiply which wrote to an unstriated accumulator. Not only
unstrided > did the multiply not need to be fixed but the "fix" ended up breaking it > because a MOV to the accumulator is not the same as using it as a > multiply destination due to the magic way the 33/64 bits of the > accumulator are handled for different instruction types. > > Fixes: efa4e4bc5fc "intel/fs: Introduce regioning lowering pass" > Cc: Francisco Jerez <curroje...@riseup.net> > --- > src/intel/compiler/brw_fs_lower_regioning.cpp | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp > b/src/intel/compiler/brw_fs_lower_regioning.cpp > index cc4163b4c2c..b8a89e82272 100644 > --- a/src/intel/compiler/brw_fs_lower_regioning.cpp > +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp > @@ -53,7 +53,13 @@ namespace { > unsigned > required_dst_byte_stride(const fs_inst *inst) > { > - if (type_sz(inst->dst.type) < get_exec_type_size(inst) && > + if (inst->dst.is_accumulator()) { > + /* Even though it's not explicitly documented in the PRMs or the > + * BSpec, writes to the accumulator appear to not need any special > + * treatment with respect too their destination stride alignment. to I read the backlog on IRC and I still don't understand, but that's okay :) _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev