On Fri, Sep 25, 2015 at 11:10 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> On Friday, September 25, 2015 11:03:46 AM Matt Turner wrote:
>> On Fri, Sep 25, 2015 at 10:47 AM, Kenneth Graunke <kenn...@whitecape.org>
>> wrote:
>> > GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special
>> > strides. We can easily make the generator handle constant src[0]
>> > arguments by instead generating a MOV with the product of both operands.
>> >
>> > This isn't necessarily a win in and of itself - instead of a MUL, we
>> > generate a MOV, which should be basically the same cost. However, we
>> > can probably avoid the earlier MOV to put src[0] into a register.
>> >
>> > shader-db statistics for geometry shaders only:
>> >
>> > total instructions in shared programs: 3207 -> 3173 (-1.06%)
>> > instructions in affected programs: 3207 -> 3173 (-1.06%)
>> > helped: 11
>> >
>> > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++++++
>> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 9 +++++++--
>> >  2 files changed, 14 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > index 5b6444e..610caef 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> > @@ -202,6 +202,13 @@ try_constant_propagate(const struct brw_device_info
>> > *devinfo,
>> >           return true;
>> >        }
>> >        break;
>> > +   case GS_OPCODE_SET_WRITE_OFFSET:
>> > +      /* This is just a multiply by a constant with special strides.
>> > +       * The generator will handle immediates in both arguments
>> > (generating
>> > +       * a single MOV of the product). So feel free to propagate in src0.
>> > +       */
>> > +      inst->src[arg] = value;
>> > +      return true;
>> >
>> >     case BRW_OPCODE_CMP:
>> >        if (arg == 1) {
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > index e69c067..620167d 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> > @@ -541,8 +541,13 @@ vec4_generator::generate_gs_set_write_offset(struct
>> > brw_reg dst,
>> >            src1.file == BRW_IMMEDIATE_VALUE &&
>> >            src1.type == BRW_REGISTER_TYPE_UD &&
>> >            src1.dw1.ud <= USHRT_MAX);
>> > -   brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4),
>> > -           retype(src1, BRW_REGISTER_TYPE_UW));
>> > +   if (src0.file == IMM) {
>> > +      brw_MOV(p, suboffset(stride(dst, 2, 2, 1), 3),
>> > +              brw_imm_ud(src0.dw1.ud * src1.dw1.ud));
>>
>> Alternatively, we could make opt_algebraic() constant-evaluate this at
>> a higher level. I'm not sure if that would help generate better code,
>> but it seems a little cleaner.
>
> I guess that's possible, but I'm not sure it would be cleaner. We still
> need the crazy stride/suboffset on the resulting MOV. Which I don't
> think we can represent in src_reg in general. But since the destination
> is an MRF, we could do it at the brw_reg level. But that seems ugly
> as well.

Ah, I missed that this was an align1 mul(2)/mov(2). Yeah, that's not
possible to handle in a higher level without adding another opcode.

Reviewed-by: Matt Turner <matts...@gmail.com>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
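
For readers skimming the thread, the folding idea in the patch can be sketched outside of Mesa roughly as follows. This is only an illustrative toy, not the driver code: Operand, Emitted and emit_set_write_offset are hypothetical names invented here, and the special strides and suboffset that the discussion turns on are deliberately left out. It only shows the decision "if src0 is also an immediate, emit one MOV of the product, otherwise emit the MUL".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins, not the Mesa types: a toy operand is either an
// immediate value or a register number.
struct Operand {
    bool is_immediate;
    uint32_t value;  // immediate value, or register number if not immediate
};

// What the toy "generator" decided to emit.
struct Emitted {
    std::string opcode;  // "MOV" or "MUL"
    uint32_t immediate;  // folded product for MOV, the src1 constant for MUL
};

// src1 is assumed to be an unsigned immediate that fits in 16 bits, echoing
// the assertion in the quoted patch. If src0 is an immediate too, the
// multiply is folded at generation time into a single MOV of the product.
Emitted emit_set_write_offset(const Operand &src0, const Operand &src1)
{
    assert(src1.is_immediate && src1.value <= UINT16_MAX);
    if (src0.is_immediate) {
        // 32-bit unsigned multiply; wraps on overflow like ud * ud would.
        return {"MOV", src0.value * src1.value};
    }
    // src0 lives in a register, so a real multiply is still needed.
    return {"MUL", src1.value};
}
```

With an immediate src0 of 3 and an immediate src1 of 4, the sketch returns a MOV carrying 12; with a register src0 it falls back to the MUL.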