Re: [Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size

2017-10-27 Thread Jason Ekstrand
On Fri, Oct 27, 2017 at 4:47 AM, Iago Toral  wrote:

> On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote:
> > The automatic exec size inference can accidentally mess things up if
> > we're not careful.  For instance, if we have
> >
> > add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D
> >
> > then the destination register will end up having a width of 2 with a
> > horizontal stride of 4 and a vertical stride of 8.  The EU emit code
> > sees the width of 2 and decides that we really wanted an exec size of
> > 2
> > which doesn't do what we wanted.
>
> Right :-/
>
> I have to say that this change makes me a little nervous, mostly
> because it doesn't look easy to identify all the cases where we were
> relying on automatic execsizes to fix things up for us... since this is
> not as easy as to look for places where we use 'vec1' or something like
> that. How did you get the list of things that needed explicit sizes?
>

Lots of grep :)  If I recall correctly, I searched for EXECUTE_1, vec1,
EXECUTE_2, and vec2.


> Also, both commits before this address cases of exec_size = 1, but we
> rely on automatic exec sizes for exec_size = 2 as well, I guess we have
> none of these?
>
> Anyway, I guess Jenkins would have caught at least most omissions so
> maybe I am being too paranoid.


That's my hope as well. :-)  Durring the debugging of this chunk of the
series, Jenkins found quite a few errors.  Once Jenkins was happy, I did
one more pass of grep to be sure.

--Jason


> > ---
> >  src/intel/compiler/brw_fs_generator.cpp | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/intel/compiler/brw_fs_generator.cpp
> > b/src/intel/compiler/brw_fs_generator.cpp
> > index 8322be1..46f9a33 100644
> > --- a/src/intel/compiler/brw_fs_generator.cpp
> > +++ b/src/intel/compiler/brw_fs_generator.cpp
> > @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct
> > brw_compiler *compiler, void *log_data,
> >  {
> > p = rzalloc(mem_ctx, struct brw_codegen);
> > brw_init_codegen(devinfo, p, mem_ctx);
> > +
> > +   /* In the FS code generator, we are very careful to ensure that
> > we always
> > +* set the right execution size so we don't need the EU code to
> > "help" us
> > +* by trying to infer it.  Sometimes, it infers the wrong thing.
> > +*/
> > +   p->automatic_exec_sizes = false;
> >  }
> >
> >  fs_generator::~fs_generator()
> > @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst,
> > struct brw_reg payload)
> >struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(),
> > BRW_REGISTER_TYPE_UD));
> >
> >/* Check runtime bit to detect if we have to send AA data or
> > not */
> > -  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> >brw_push_insn_state(p);
> > -  brw_inst_set_exec_size(p->devinfo, brw_last_inst,
> > BRW_EXECUTE_1);
> > +  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> > +  brw_set_default_exec_size(p, BRW_EXECUTE_1);
> >brw_AND(p,
> >v1_null_ud,
> >retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
> >brw_imm_ud(1<<26));
> >brw_inst_set_cond_modifier(p->devinfo, brw_last_inst,
> > BRW_CONDITIONAL_NZ);
> > -  brw_pop_insn_state(p);
> >
> >int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) -
> > p->store;
> > +  brw_pop_insn_state(p);
> >{
> >   /* Don't send AA data */
> >   fire_fb_write(inst, offset(payload, 1), implied_header,
> > inst->mlen-1);
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size

2017-10-27 Thread Iago Toral
On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote:
> The automatic exec size inference can accidentally mess things up if
> we're not careful.  For instance, if we have
> 
> add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D
> 
> then the destination register will end up having a width of 2 with a
> horizontal stride of 4 and a vertical stride of 8.  The EU emit code
> sees the width of 2 and decides that we really wanted an exec size of
> 2
> which doesn't do what we wanted.

Right :-/

I have to say that this change makes me a little nervous, mostly
because it doesn't look easy to identify all the cases where we were
relying on automatic execsizes to fix things up for us... since this is
not as easy as to look for places where we use 'vec1' or something like
that. How did you get the list of things that needed explicit sizes?

Also, both commits before this address cases of exec_size = 1, but we
rely on automatic exec sizes for exec_size = 2 as well, I guess we have
none of these?

Anyway, I guess Jenkins would have caught at least most omissions so
maybe I am being too paranoid.

> ---
>  src/intel/compiler/brw_fs_generator.cpp | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs_generator.cpp
> b/src/intel/compiler/brw_fs_generator.cpp
> index 8322be1..46f9a33 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct
> brw_compiler *compiler, void *log_data,
>  {
> p = rzalloc(mem_ctx, struct brw_codegen);
> brw_init_codegen(devinfo, p, mem_ctx);
> +
> +   /* In the FS code generator, we are very careful to ensure that
> we always
> +* set the right execution size so we don't need the EU code to
> "help" us
> +* by trying to infer it.  Sometimes, it infers the wrong thing.
> +*/
> +   p->automatic_exec_sizes = false;
>  }
>  
>  fs_generator::~fs_generator()
> @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst,
> struct brw_reg payload)
>    struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(),
> BRW_REGISTER_TYPE_UD));
>  
>    /* Check runtime bit to detect if we have to send AA data or
> not */
> -  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
>    brw_push_insn_state(p);
> -  brw_inst_set_exec_size(p->devinfo, brw_last_inst,
> BRW_EXECUTE_1);
> +  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> +  brw_set_default_exec_size(p, BRW_EXECUTE_1);
>    brw_AND(p,
>    v1_null_ud,
>    retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
>    brw_imm_ud(1<<26));
>    brw_inst_set_cond_modifier(p->devinfo, brw_last_inst,
> BRW_CONDITIONAL_NZ);
> -  brw_pop_insn_state(p);
>  
>    int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) -
> p->store;
> +  brw_pop_insn_state(p);
>    {
>   /* Don't send AA data */
>   fire_fb_write(inst, offset(payload, 1), implied_header,
> inst->mlen-1);
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size inference

2017-10-25 Thread Jason Ekstrand
The automatic exec size inference can accidentally mess things up if
we're not careful.  For instance, if we have

add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D

then the destination register will end up having a width of 2 with a
horizontal stride of 4 and a vertical stride of 8.  The EU emit code
sees the width of 2 and decides that we really wanted an exec size of 2
which doesn't do what we wanted.
---
 src/intel/compiler/brw_fs_generator.cpp | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp 
b/src/intel/compiler/brw_fs_generator.cpp
index 8322be1..46f9a33 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct brw_compiler 
*compiler, void *log_data,
 {
p = rzalloc(mem_ctx, struct brw_codegen);
brw_init_codegen(devinfo, p, mem_ctx);
+
+   /* In the FS code generator, we are very careful to ensure that we always
+* set the right execution size so we don't need the EU code to "help" us
+* by trying to infer it.  Sometimes, it infers the wrong thing.
+*/
+   p->automatic_exec_sizes = false;
 }
 
 fs_generator::~fs_generator()
@@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst, struct 
brw_reg payload)
   struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), 
BRW_REGISTER_TYPE_UD));
 
   /* Check runtime bit to detect if we have to send AA data or not */
-  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
   brw_push_insn_state(p);
-  brw_inst_set_exec_size(p->devinfo, brw_last_inst, BRW_EXECUTE_1);
+  brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
+  brw_set_default_exec_size(p, BRW_EXECUTE_1);
   brw_AND(p,
   v1_null_ud,
   retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD),
   brw_imm_ud(1<<26));
   brw_inst_set_cond_modifier(p->devinfo, brw_last_inst, 
BRW_CONDITIONAL_NZ);
-  brw_pop_insn_state(p);
 
   int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - p->store;
+  brw_pop_insn_state(p);
   {
  /* Don't send AA data */
  fire_fb_write(inst, offset(payload, 1), implied_header, inst->mlen-1);
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev