On Friday, January 1, 2016 2:31:56 PM PST Matt Turner wrote: > On Fri, Jan 1, 2016 at 4:46 AM, Kenneth Graunke <kenn...@whitecape.org> wrote: > > While most align16 instructions only support a SubRegNum of 0 or 4 > > (using swizzling to control the other channels), 3-src instructions > > actually support arbitrary SubRegNums. When the RepCtrl bit is set, > > we believe it ignores the swizzle and uses the equivalent of a <0,1,0> > > region from the subnr. > > > > In the past, we adopted a vec4-centric approach of specifying subnr of > > 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper > > SubRegNum. This isn't a great fit for the scalar backend, where we > > don't set swizzles at all, and happily set subnrs in the range [0, 7]. > > > > This patch changes brw_eu_emit.c to use subnr and swizzle directly, > > relying on the higher levels to set them sensibly. > > > > This should fix problems where scalar sources get copy propagated into > > 3-src instructions in the FS backend. I've only observed this with > > TES push model inputs, but I suppose it could happen in other cases. > > > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > Cc: Matt Turner <matts...@gmail.com> > > --- > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 11 +++++------ > > src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 +++++++++++++ > > 2 files changed, 18 insertions(+), 6 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/ dri/i965/brw_eu_emit.c > > index 5fb9662..35d8039 100644 > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > > @@ -847,12 +847,11 @@ brw_alu2(struct brw_codegen *p, unsigned opcode, > > static int > > get_3src_subreg_nr(struct brw_reg reg) > > { > > - if (reg.vstride == BRW_VERTICAL_STRIDE_0) { > > - assert(brw_is_single_value_swizzle(reg.swizzle)); > > - return reg.subnr / 4 + BRW_GET_SWZ(reg.swizzle, 0); > > - } else { > > - return reg.subnr / 4; > > - } > > + /* Normally, SubRegNum is in bytes (0..31). However, 3-src instructions > > + * use 32-bit units (components 0..7). Since they only support F/D/UD > > + * types, this doesn't lose any flexibility, but uses fewer bits. > > + */ > > + return reg.subnr / 4; > > } > > > > static brw_inst * > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/ dri/i965/brw_vec4.cpp > > index dd22398..c6a52c5 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > @@ -1784,9 +1784,22 @@ vec4_visitor::convert_to_hw_regs() > > case ATTR: > > unreachable("not reached"); > > } > > + > > src = reg; > > } > > > > + if (inst->is_3src()) { > > + /* 3-src instructions with scalar sources support arbitrary subnr, > > + * but don't actually use swizzles. Convert swizzle into subnr. > > + */ > > + for (int i = 0; i < 3; i++) { > > + if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0) { > > + assert(brw_is_single_value_swizzle(inst->src[i].swizzle)); > > + inst->src[i].subnr += 4 * BRW_GET_SWZ(inst- >src[i].swizzle, 0); > > + } > > + } > > + } > > + > > > I believe this is correct. The only way scalar sources are propagated > into 3-src instructions is if file == UNIFORM && > brw_is_single_value_swizzle(). > > The series is > > Reviewed-by: Matt Turner <matts...@gmail.com>
Huh. It seems like we could hit that with ATTR as well. But, no matter, as I put the code in a place where it will run regardless of the original file. --Ken
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev