Re: [Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size
On Fri, Oct 27, 2017 at 4:47 AM, Iago Toralwrote: > On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > > The automatic exec size inference can accidentally mess things up if > > we're not careful. For instance, if we have > > > > add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D > > > > then the destination register will end up having a width of 2 with a > > horizontal stride of 4 and a vertical stride of 8. The EU emit code > > sees the width of 2 and decides that we really wanted an exec size of > > 2 > > which doesn't do what we wanted. > > Right :-/ > > I have to say that this change makes me a little nervous, mostly > because it doesn't look easy to identify all the cases where we were > relying on automatic execsizes to fix things up for us... since this is > not as easy as to look for places where we use 'vec1' or something like > that. How did you get the list of things that needed explicit sizes? > Lots of grep :) If I recall correctly, I searched for EXECUTE_1, vec1, EXECUTE_2, and vec2. > Also, both commits before this address cases of exec_size = 1, but we > rely on automatic exec sizes for exec_size = 2 as well, I guess we have > none of these? > > Anyway, I guess Jenkins would have caught at least most omissions so > maybe I am being too paranoid. That's my hope as well. :-) Durring the debugging of this chunk of the series, Jenkins found quite a few errors. Once Jenkins was happy, I did one more pass of grep to be sure. --Jason > > --- > > src/intel/compiler/brw_fs_generator.cpp | 12 +--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs_generator.cpp > > b/src/intel/compiler/brw_fs_generator.cpp > > index 8322be1..46f9a33 100644 > > --- a/src/intel/compiler/brw_fs_generator.cpp > > +++ b/src/intel/compiler/brw_fs_generator.cpp > > @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct > > brw_compiler *compiler, void *log_data, > > { > > p = rzalloc(mem_ctx, struct brw_codegen); > > brw_init_codegen(devinfo, p, mem_ctx); > > + > > + /* In the FS code generator, we are very careful to ensure that > > we always > > +* set the right execution size so we don't need the EU code to > > "help" us > > +* by trying to infer it. Sometimes, it infers the wrong thing. > > +*/ > > + p->automatic_exec_sizes = false; > > } > > > > fs_generator::~fs_generator() > > @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst, > > struct brw_reg payload) > >struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), > > BRW_REGISTER_TYPE_UD)); > > > >/* Check runtime bit to detect if we have to send AA data or > > not */ > > - brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > >brw_push_insn_state(p); > > - brw_inst_set_exec_size(p->devinfo, brw_last_inst, > > BRW_EXECUTE_1); > > + brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > > + brw_set_default_exec_size(p, BRW_EXECUTE_1); > >brw_AND(p, > >v1_null_ud, > >retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD), > >brw_imm_ud(1<<26)); > >brw_inst_set_cond_modifier(p->devinfo, brw_last_inst, > > BRW_CONDITIONAL_NZ); > > - brw_pop_insn_state(p); > > > >int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - > > p->store; > > + brw_pop_insn_state(p); > >{ > > /* Don't send AA data */ > > fire_fb_write(inst, offset(payload, 1), implied_header, > > inst->mlen-1); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size
On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > The automatic exec size inference can accidentally mess things up if > we're not careful. For instance, if we have > > add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D > > then the destination register will end up having a width of 2 with a > horizontal stride of 4 and a vertical stride of 8. The EU emit code > sees the width of 2 and decides that we really wanted an exec size of > 2 > which doesn't do what we wanted. Right :-/ I have to say that this change makes me a little nervous, mostly because it doesn't look easy to identify all the cases where we were relying on automatic execsizes to fix things up for us... since this is not as easy as to look for places where we use 'vec1' or something like that. How did you get the list of things that needed explicit sizes? Also, both commits before this address cases of exec_size = 1, but we rely on automatic exec sizes for exec_size = 2 as well, I guess we have none of these? Anyway, I guess Jenkins would have caught at least most omissions so maybe I am being too paranoid. > --- > src/intel/compiler/brw_fs_generator.cpp | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/src/intel/compiler/brw_fs_generator.cpp > b/src/intel/compiler/brw_fs_generator.cpp > index 8322be1..46f9a33 100644 > --- a/src/intel/compiler/brw_fs_generator.cpp > +++ b/src/intel/compiler/brw_fs_generator.cpp > @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct > brw_compiler *compiler, void *log_data, > { > p = rzalloc(mem_ctx, struct brw_codegen); > brw_init_codegen(devinfo, p, mem_ctx); > + > + /* In the FS code generator, we are very careful to ensure that > we always > +* set the right execution size so we don't need the EU code to > "help" us > +* by trying to infer it. Sometimes, it infers the wrong thing. > +*/ > + p->automatic_exec_sizes = false; > } > > fs_generator::~fs_generator() > @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst, > struct brw_reg payload) > struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), > BRW_REGISTER_TYPE_UD)); > > /* Check runtime bit to detect if we have to send AA data or > not */ > - brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > brw_push_insn_state(p); > - brw_inst_set_exec_size(p->devinfo, brw_last_inst, > BRW_EXECUTE_1); > + brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > + brw_set_default_exec_size(p, BRW_EXECUTE_1); > brw_AND(p, > v1_null_ud, > retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD), > brw_imm_ud(1<<26)); > brw_inst_set_cond_modifier(p->devinfo, brw_last_inst, > BRW_CONDITIONAL_NZ); > - brw_pop_insn_state(p); > > int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - > p->store; > + brw_pop_insn_state(p); > { > /* Don't send AA data */ > fire_fb_write(inst, offset(payload, 1), implied_header, > inst->mlen-1); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size inference
The automatic exec size inference can accidentally mess things up if we're not careful. For instance, if we have add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D then the destination register will end up having a width of 2 with a horizontal stride of 4 and a vertical stride of 8. The EU emit code sees the width of 2 and decides that we really wanted an exec size of 2 which doesn't do what we wanted. --- src/intel/compiler/brw_fs_generator.cpp | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp index 8322be1..46f9a33 100644 --- a/src/intel/compiler/brw_fs_generator.cpp +++ b/src/intel/compiler/brw_fs_generator.cpp @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct brw_compiler *compiler, void *log_data, { p = rzalloc(mem_ctx, struct brw_codegen); brw_init_codegen(devinfo, p, mem_ctx); + + /* In the FS code generator, we are very careful to ensure that we always +* set the right execution size so we don't need the EU code to "help" us +* by trying to infer it. Sometimes, it infers the wrong thing. +*/ + p->automatic_exec_sizes = false; } fs_generator::~fs_generator() @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst, struct brw_reg payload) struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_UD)); /* Check runtime bit to detect if we have to send AA data or not */ - brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); brw_push_insn_state(p); - brw_inst_set_exec_size(p->devinfo, brw_last_inst, BRW_EXECUTE_1); + brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); + brw_set_default_exec_size(p, BRW_EXECUTE_1); brw_AND(p, v1_null_ud, retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD), brw_imm_ud(1<<26)); brw_inst_set_cond_modifier(p->devinfo, brw_last_inst, BRW_CONDITIONAL_NZ); - brw_pop_insn_state(p); int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - p->store; + brw_pop_insn_state(p); { /* Don't send AA data */ fire_fb_write(inst, offset(payload, 1), implied_header, inst->mlen-1); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev