On Fri, Oct 28, 2016 at 4:07 PM, Matt Turner <[email protected]> wrote:
> On Fri, Oct 28, 2016 at 2:53 PM, Jason Ekstrand <[email protected]> > wrote: > > The address immediate field is only 9 bits and, since the value is in > > bytes, the highest GRF we can point to with it is g15. This makes it > > pretty close to useless for MOV_INDIRECT. There were already piles of > > restrictions preventing us from using it prior to Broadwell, so let's get > > rid of the gen8+ code path entirely. > > > > Signed-off-by: Jason Ekstrand <[email protected]> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97779 > > Cc: "12.0 13.0" <[email protected]> > > --- > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 55 > +++++++++++++------------- > > 1 file changed, 28 insertions(+), 27 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > index d25d26a..7130bf5 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp > > @@ -386,33 +386,34 @@ fs_generator::generate_mov_indirect(fs_inst *inst, > > retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW); > > > > struct brw_reg ind_src; > > - if (devinfo->gen < 8) { > > - /* From the Haswell PRM section "Register Region Restrictions": > > - * > > - * "The lower bits of the AddressImmediate must not > overflow to > > - * change the register address. The lower 5 bits of Address > > - * Immediate when added to lower 5 bits of address register > gives > > - * the sub-register offset. The upper bits of Address > Immediate > > - * when added to upper bits of address register gives the > register > > - * address. Any overflow from sub-register offset is > dropped." > > - * > > - * This restriction is only listed in the Haswell PRM but > emperical > > - * testing indicates that it applies on all older generations > and is > > - * lifted on Broadwell. > > - * > > - * Since the indirect may cause us to cross a register > boundary, this > > - * makes the base offset almost useless. We could try and do > > - * something clever where we use a actual base offset if > > - * base_offset % 32 == 0 but that would mean we were generating > > - * different code depending on the base offset. Instead, for > the > > - * sake of consistency, we'll just do the add ourselves. > > - */ > > - brw_ADD(p, addr, indirect_byte_offset, > brw_imm_uw(imm_byte_offset)); > > - ind_src = brw_VxH_indirect(0, 0); > > - } else { > > - brw_MOV(p, addr, indirect_byte_offset); > > - ind_src = brw_VxH_indirect(0, imm_byte_offset); > > - } > > + > > + /* There are a number of reasons why we don't use the base offset > here. > > + * One reason is that the field is only 9 bits which means we can > only > > + * use it on the first 16 GRFs. Also, from the Haswell PRM > section > > s/on/to access/ > Yeah, that's better > > + * "Register Region Restrictions": > > + * > > + * "The lower bits of the AddressImmediate must not overflow to > > + * change the register address. The lower 5 bits of Address > > + * Immediate when added to lower 5 bits of address register > gives > > + * the sub-register offset. The upper bits of Address Immediate > > + * when added to upper bits of address register gives the > register > > + * address. Any overflow from sub-register offset is dropped." > > + * > > + * Since the indirect may cause us to cross a register boundary, > this > > + * makes the base offset almost useless. We could try and do > something > > + * clever where we use a actual base offset if base_offset % 32 > == 0 but > > + * that would mean we were generating different code depending on > the > > + * base offset. Instead, for the sake of consistency, we'll just > do the > > + * add ourselves. This restriction is only listed in the Haswell > PRM > > + * but emperical testing indicates that it applies on all older > > empirical > Thanks > > + * generations and is lifted on Broadwell. > > + * > > + * In the end, while base_offset is nice to look at in the > generated > > + * code, using it saves us 0 instructions and would require quite > a bit > > + * of case-by-case work. It's just not worth it. > > + */ > > + brw_ADD(p, addr, indirect_byte_offset, > brw_imm_uw(imm_byte_offset)); > > + ind_src = brw_VxH_indirect(0, 0); > > I think you can move the declaration of ind_src here as well. > Done. > Reviewed-by: Matt Turner <[email protected]> > Thanks!
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
