On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > The automatic exec size inference can accidentally mess things up if > we're not careful. For instance, if we have > > add(4) g38.2<4>D g38.1<8,2,4>D g38.2<8,2,4>D > > then the destination register will end up having a width of 2 with a > horizontal stride of 4 and a vertical stride of 8. The EU emit code > sees the width of 2 and decides that we really wanted an exec size of > 2 > which doesn't do what we wanted.
Right :-/ I have to say that this change makes me a little nervous, mostly because it doesn't look easy to identify all the cases where we were relying on automatic execsizes to fix things up for us... since this is not as easy as to look for places where we use 'vec1' or something like that. How did you get the list of things that needed explicit sizes? Also, both commits before this address cases of exec_size = 1, but we rely on automatic exec sizes for exec_size = 2 as well, I guess we have none of these? Anyway, I guess Jenkins would have caught at least most omissions so maybe I am being too paranoid. > --- > src/intel/compiler/brw_fs_generator.cpp | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/src/intel/compiler/brw_fs_generator.cpp > b/src/intel/compiler/brw_fs_generator.cpp > index 8322be1..46f9a33 100644 > --- a/src/intel/compiler/brw_fs_generator.cpp > +++ b/src/intel/compiler/brw_fs_generator.cpp > @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct > brw_compiler *compiler, void *log_data, > { > p = rzalloc(mem_ctx, struct brw_codegen); > brw_init_codegen(devinfo, p, mem_ctx); > + > + /* In the FS code generator, we are very careful to ensure that > we always > + * set the right execution size so we don't need the EU code to > "help" us > + * by trying to infer it. Sometimes, it infers the wrong thing. > + */ > + p->automatic_exec_sizes = false; > } > > fs_generator::~fs_generator() > @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst, > struct brw_reg payload) > struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), > BRW_REGISTER_TYPE_UD)); > > /* Check runtime bit to detect if we have to send AA data or > not */ > - brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > brw_push_insn_state(p); > - brw_inst_set_exec_size(p->devinfo, brw_last_inst, > BRW_EXECUTE_1); > + brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > + brw_set_default_exec_size(p, BRW_EXECUTE_1); > brw_AND(p, > v1_null_ud, > retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD), > brw_imm_ud(1<<26)); > brw_inst_set_cond_modifier(p->devinfo, brw_last_inst, > BRW_CONDITIONAL_NZ); > - brw_pop_insn_state(p); > > int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - > p->store; > + brw_pop_insn_state(p); > { > /* Don't send AA data */ > fire_fb_write(inst, offset(payload, 1), implied_header, > inst->mlen-1); _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev