Re: [Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

2017-10-30 Thread Iago Toral
I posted comments in patches 39 and 43, but otherwise patches 39-48
are:

Reviewed-by: Iago Toral Quiroga 

Iago

On Fri, 2017-10-27 at 13:57 +0200, Iago Toral wrote:
> I dropped a few more comments on patches 18-20, 23, 25, 29-31, 34 and
> 38 but nothing major in general.
> 
> I have doubts that patch 20 isn't working around some other bug and
> 38
> scares me a little bit but I guess if Jenkins is happy I shouldn't
> worry too much.
> 
> Otherwise, patches 16-38 are:
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> (patch 15 already has your Rb)
> 
> Iago
> 
> On Thu, 2017-10-26 at 14:15 +0200, Iago Toral wrote:
> > I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
> > patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:
> > 
> > Reviewed-by: Iago Toral Quiroga 
> > 
> > I feel like patches 10, 11 could maybe use another extra review if
> > there is someone who wants to do it, since I am not very familiar
> > with
> > how all the indirect addressing stuff works and the restrictions of
> > the
> > hardware that affect this.
> > 
> > Patch 14 looks good, although the part where you locate the DO
> > block
> > for a matching WHILE could probably use the review of someone else
> > more
> > familiar with the CFG code than me.
> > 
> > Iago
> > 
> > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > > This series is a third respin of my subgroups prerequisites
> > > series
> > > that
> > > that I sent out a few weeks ago.  Not a whole lot has changed but
> > > there are
> > > some new patches.  Primarily,
> > > 
> > >  1) Some patches which were reviewed by Matt and Lionel were
> > > pushed
> > > and are
> > > no longer in the series.  Thanks guys!
> > > 
> > >  2) I've applied R-B tags from various people for patches which
> > > are
> > > reviewed but depend on still unreviewed patches.
> > > 
> > >  3) A few patches to fix little-core.  In particular, the extra
> > > little-core
> > > EU restrictions cause problems for BROADCAST, MOV_INDIRECT,
> > > and
> > > integer
> > > MUL.
> > > 
> > > This series can be found on fd.o nere:
> > > 
> > > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgro
> > > up
> > > -p
> > > rereqs-v3
> > > 
> > > Happy reviewing!
> > > 
> > > 
> > > Cc: Matt Turner 
> > > Cc: Francisco Jerez 
> > > Cc: Connor Abbott 
> > > 
> > > Francisco Jerez (1):
> > >   intel/fs: Restrict live intervals to the subset possibly
> > > reachable
> > > from any definition.
> > > 
> > > Jason Ekstrand (47):
> > >   intel/fs: Pass builders instead of blocks into emit_[un]zip
> > >   intel/fs: Be more explicit about our placement of [un]zip
> > >   intel/fs: Use ANY/ALL32 predicates in SIMD32
> > >   intel/fs: Don't stomp f0.1 in SIMD16 ballot
> > >   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
> > >   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
> > >   intel/compiler: Add some restrictions to MOV_INDIRECT and
> > > BROADCAST
> > >   intel/eu: Just modify the offset in brw_broadcast
> > >   intel/eu/reg: Add a subscript() helper
> > >   intel/eu: Fix broadcast instruction for 64-bit values on
> > > little-
> > > core
> > >   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
> > >   intel/fs: Fix integer multiplication lowering for src/dst
> > > hazards
> > >   intel/fs: Use the original destination region for int MUL
> > > lowering
> > >   i965/fs: Extend the live ranges of VGRFs which leave loops
> > >   i965/fs/nir: Simplify 64-bit store_output
> > >   i965/fs: Return a fs_reg from
> > > shuffle_64bit_data_for_32bit_write
> > >   i965/fs/nir: Minor refactor of store_output
> > >   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
> > >   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
> > >   intel/fs: Uniformize the index in readInvocation
> > >   intel/fs: Retype dest to match value in read[First]Invocation
> > >   intel/fs: Assign constant locations if they haven't been
> > > assigned
> > >   intel/fs: Remove min_dispatch_width from fs_visitor
> > >   intel/cs: Drop max_dispatch_width checks from compile_cs
> > >   intel/cs: Stop setting dispatch_grf_start_reg
> > >   intel/cs: Ignore runtime_check_aads_emit for CS
> > >   intel/fs: Mark 64-bit values as being contiguous
> > >   intel/cs: Rework the way thread local ID is handled
> > >   intel/cs: Re-run final NIR optimizations for each SIMD size
> > >   intel/cs: Re-run final NIR optimizations for each SIMD size
> > >   intel/cs: Push subgroup ID instead of base thread ID
> > >   intel/compiler/fs: Set up subgroup invocation as a system value
> > >   intel/fs: Rework zero-length URB write handling
> > >   intel/eu: Make automatic exec sizes a configurable option
> > >   intel/eu: Explicitly set EXECUTE_1 where needed
> > >   intel/fs: Explicitly set EXECUTE_1 where needed
> > >   intel/fs: Don't use 

Re: [Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

2017-10-27 Thread Iago Toral
I dropped a few more comments on patches 18-20, 23, 25, 29-31, 34 and
38 but nothing major in general.

I have doubts that patch 20 isn't working around some other bug and 38
scares me a little bit but I guess if Jenkins is happy I shouldn't
worry too much.

Otherwise, patches 16-38 are:

Reviewed-by: Iago Toral Quiroga 

(patch 15 already has your Rb)

Iago

On Thu, 2017-10-26 at 14:15 +0200, Iago Toral wrote:
> I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
> patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> I feel like patches 10, 11 could maybe use another extra review if
> there is someone who wants to do it, since I am not very familiar
> with
> how all the indirect addressing stuff works and the restrictions of
> the
> hardware that affect this.
> 
> Patch 14 looks good, although the part where you locate the DO block
> for a matching WHILE could probably use the review of someone else
> more
> familiar with the CFG code than me.
> 
> Iago
> 
> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > This series is a third respin of my subgroups prerequisites series
> > that
> > that I sent out a few weeks ago.  Not a whole lot has changed but
> > there are
> > some new patches.  Primarily,
> > 
> >  1) Some patches which were reviewed by Matt and Lionel were pushed
> > and are
> > no longer in the series.  Thanks guys!
> > 
> >  2) I've applied R-B tags from various people for patches which are
> > reviewed but depend on still unreviewed patches.
> > 
> >  3) A few patches to fix little-core.  In particular, the extra
> > little-core
> > EU restrictions cause problems for BROADCAST, MOV_INDIRECT, and
> > integer
> > MUL.
> > 
> > This series can be found on fd.o nere:
> > 
> > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup
> > -p
> > rereqs-v3
> > 
> > Happy reviewing!
> > 
> > 
> > Cc: Matt Turner 
> > Cc: Francisco Jerez 
> > Cc: Connor Abbott 
> > 
> > Francisco Jerez (1):
> >   intel/fs: Restrict live intervals to the subset possibly
> > reachable
> > from any definition.
> > 
> > Jason Ekstrand (47):
> >   intel/fs: Pass builders instead of blocks into emit_[un]zip
> >   intel/fs: Be more explicit about our placement of [un]zip
> >   intel/fs: Use ANY/ALL32 predicates in SIMD32
> >   intel/fs: Don't stomp f0.1 in SIMD16 ballot
> >   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
> >   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
> >   intel/compiler: Add some restrictions to MOV_INDIRECT and
> > BROADCAST
> >   intel/eu: Just modify the offset in brw_broadcast
> >   intel/eu/reg: Add a subscript() helper
> >   intel/eu: Fix broadcast instruction for 64-bit values on little-
> > core
> >   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
> >   intel/fs: Fix integer multiplication lowering for src/dst hazards
> >   intel/fs: Use the original destination region for int MUL
> > lowering
> >   i965/fs: Extend the live ranges of VGRFs which leave loops
> >   i965/fs/nir: Simplify 64-bit store_output
> >   i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
> >   i965/fs/nir: Minor refactor of store_output
> >   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
> >   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
> >   intel/fs: Uniformize the index in readInvocation
> >   intel/fs: Retype dest to match value in read[First]Invocation
> >   intel/fs: Assign constant locations if they haven't been assigned
> >   intel/fs: Remove min_dispatch_width from fs_visitor
> >   intel/cs: Drop max_dispatch_width checks from compile_cs
> >   intel/cs: Stop setting dispatch_grf_start_reg
> >   intel/cs: Ignore runtime_check_aads_emit for CS
> >   intel/fs: Mark 64-bit values as being contiguous
> >   intel/cs: Rework the way thread local ID is handled
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Push subgroup ID instead of base thread ID
> >   intel/compiler/fs: Set up subgroup invocation as a system value
> >   intel/fs: Rework zero-length URB write handling
> >   intel/eu: Make automatic exec sizes a configurable option
> >   intel/eu: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Don't use automatic exec size inference
> >   nir: Add a new subgroups lowering pass
> >   nir: Add a ssa_dest_init_for_type helper
> >   nir: Make ballot intrinsics variable-size
> >   nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
> >   nir/lower_subgroups: Lower ballot intrinsics to the specified bit
> > size
> >   nir,intel/compiler: Use a fixed subgroup size
> >   spirv: Add a vtn_constant_value helper
> >   spirv: Rework barriers
> >   nir: Validate base types on 

Re: [Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

2017-10-26 Thread Jason Ekstrand
On Thu, Oct 26, 2017 at 5:15 AM, Iago Toral  wrote:

> I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
> patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:
>
> Reviewed-by: Iago Toral Quiroga 
>
> I feel like patches 10, 11 could maybe use another extra review if
> there is someone who wants to do it, since I am not very familiar with
> how all the indirect addressing stuff works and the restrictions of the
> hardware that affect this.
>
> Patch 14 looks good, although the part where you locate the DO block
> for a matching WHILE could probably use the review of someone else more
> familiar with the CFG code than me.
>

Curro has been working on trying to come up with a better solution to the
problem so that patch will hopefully get replaced.


> Iago
>
> On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> > This series is a third respin of my subgroups prerequisites series
> > that
> > that I sent out a few weeks ago.  Not a whole lot has changed but
> > there are
> > some new patches.  Primarily,
> >
> >  1) Some patches which were reviewed by Matt and Lionel were pushed
> > and are
> > no longer in the series.  Thanks guys!
> >
> >  2) I've applied R-B tags from various people for patches which are
> > reviewed but depend on still unreviewed patches.
> >
> >  3) A few patches to fix little-core.  In particular, the extra
> > little-core
> > EU restrictions cause problems for BROADCAST, MOV_INDIRECT, and
> > integer
> > MUL.
> >
> > This series can be found on fd.o nere:
> >
> > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup-p
> > rereqs-v3
> >
> > Happy reviewing!
> >
> >
> > Cc: Matt Turner 
> > Cc: Francisco Jerez 
> > Cc: Connor Abbott 
> >
> > Francisco Jerez (1):
> >   intel/fs: Restrict live intervals to the subset possibly reachable
> > from any definition.
> >
> > Jason Ekstrand (47):
> >   intel/fs: Pass builders instead of blocks into emit_[un]zip
> >   intel/fs: Be more explicit about our placement of [un]zip
> >   intel/fs: Use ANY/ALL32 predicates in SIMD32
> >   intel/fs: Don't stomp f0.1 in SIMD16 ballot
> >   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
> >   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
> >   intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST
> >   intel/eu: Just modify the offset in brw_broadcast
> >   intel/eu/reg: Add a subscript() helper
> >   intel/eu: Fix broadcast instruction for 64-bit values on little-
> > core
> >   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
> >   intel/fs: Fix integer multiplication lowering for src/dst hazards
> >   intel/fs: Use the original destination region for int MUL lowering
> >   i965/fs: Extend the live ranges of VGRFs which leave loops
> >   i965/fs/nir: Simplify 64-bit store_output
> >   i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
> >   i965/fs/nir: Minor refactor of store_output
> >   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
> >   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
> >   intel/fs: Uniformize the index in readInvocation
> >   intel/fs: Retype dest to match value in read[First]Invocation
> >   intel/fs: Assign constant locations if they haven't been assigned
> >   intel/fs: Remove min_dispatch_width from fs_visitor
> >   intel/cs: Drop max_dispatch_width checks from compile_cs
> >   intel/cs: Stop setting dispatch_grf_start_reg
> >   intel/cs: Ignore runtime_check_aads_emit for CS
> >   intel/fs: Mark 64-bit values as being contiguous
> >   intel/cs: Rework the way thread local ID is handled
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Re-run final NIR optimizations for each SIMD size
> >   intel/cs: Push subgroup ID instead of base thread ID
> >   intel/compiler/fs: Set up subgroup invocation as a system value
> >   intel/fs: Rework zero-length URB write handling
> >   intel/eu: Make automatic exec sizes a configurable option
> >   intel/eu: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Explicitly set EXECUTE_1 where needed
> >   intel/fs: Don't use automatic exec size inference
> >   nir: Add a new subgroups lowering pass
> >   nir: Add a ssa_dest_init_for_type helper
> >   nir: Make ballot intrinsics variable-size
> >   nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
> >   nir/lower_subgroups: Lower ballot intrinsics to the specified bit
> > size
> >   nir,intel/compiler: Use a fixed subgroup size
> >   spirv: Add a vtn_constant_value helper
> >   spirv: Rework barriers
> >   nir: Validate base types on array dereferences
> >   compiler/nir_types: Handle vectors in glsl_get_array_element
> >
> >  src/compiler/Makefile.sources  |   2 +-
> >  src/compiler/glsl/glsl_to_nir.cpp  |   1 +
> >  src/compiler/nir/nir.h   

Re: [Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups

2017-10-26 Thread Iago Toral
I left a few minor comments in patches 1, 2, 8 and 14. Otherwise
patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are:

Reviewed-by: Iago Toral Quiroga 

I feel like patches 10, 11 could maybe use another extra review if
there is someone who wants to do it, since I am not very familiar with
how all the indirect addressing stuff works and the restrictions of the
hardware that affect this.

Patch 14 looks good, although the part where you locate the DO block
for a matching WHILE could probably use the review of someone else more
familiar with the CFG code than me.

Iago

On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote:
> This series is a third respin of my subgroups prerequisites series
> that
> that I sent out a few weeks ago.  Not a whole lot has changed but
> there are
> some new patches.  Primarily,
> 
>  1) Some patches which were reviewed by Matt and Lionel were pushed
> and are
> no longer in the series.  Thanks guys!
> 
>  2) I've applied R-B tags from various people for patches which are
> reviewed but depend on still unreviewed patches.
> 
>  3) A few patches to fix little-core.  In particular, the extra
> little-core
> EU restrictions cause problems for BROADCAST, MOV_INDIRECT, and
> integer
> MUL.
> 
> This series can be found on fd.o nere:
> 
> https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup-p
> rereqs-v3
> 
> Happy reviewing!
> 
> 
> Cc: Matt Turner 
> Cc: Francisco Jerez 
> Cc: Connor Abbott 
> 
> Francisco Jerez (1):
>   intel/fs: Restrict live intervals to the subset possibly reachable
> from any definition.
> 
> Jason Ekstrand (47):
>   intel/fs: Pass builders instead of blocks into emit_[un]zip
>   intel/fs: Be more explicit about our placement of [un]zip
>   intel/fs: Use ANY/ALL32 predicates in SIMD32
>   intel/fs: Don't stomp f0.1 in SIMD16 ballot
>   intel/fs: Use an explicit D type for vote any/all/eq intrinsics
>   intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all
>   intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCAST
>   intel/eu: Just modify the offset in brw_broadcast
>   intel/eu/reg: Add a subscript() helper
>   intel/eu: Fix broadcast instruction for 64-bit values on little-
> core
>   intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core
>   intel/fs: Fix integer multiplication lowering for src/dst hazards
>   intel/fs: Use the original destination region for int MUL lowering
>   i965/fs: Extend the live ranges of VGRFs which leave loops
>   i965/fs/nir: Simplify 64-bit store_output
>   i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write
>   i965/fs/nir: Minor refactor of store_output
>   i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
>   intel/fs: Protect opt_algebraic from OOB BROADCAST indices
>   intel/fs: Uniformize the index in readInvocation
>   intel/fs: Retype dest to match value in read[First]Invocation
>   intel/fs: Assign constant locations if they haven't been assigned
>   intel/fs: Remove min_dispatch_width from fs_visitor
>   intel/cs: Drop max_dispatch_width checks from compile_cs
>   intel/cs: Stop setting dispatch_grf_start_reg
>   intel/cs: Ignore runtime_check_aads_emit for CS
>   intel/fs: Mark 64-bit values as being contiguous
>   intel/cs: Rework the way thread local ID is handled
>   intel/cs: Re-run final NIR optimizations for each SIMD size
>   intel/cs: Re-run final NIR optimizations for each SIMD size
>   intel/cs: Push subgroup ID instead of base thread ID
>   intel/compiler/fs: Set up subgroup invocation as a system value
>   intel/fs: Rework zero-length URB write handling
>   intel/eu: Make automatic exec sizes a configurable option
>   intel/eu: Explicitly set EXECUTE_1 where needed
>   intel/fs: Explicitly set EXECUTE_1 where needed
>   intel/fs: Don't use automatic exec size inference
>   nir: Add a new subgroups lowering pass
>   nir: Add a ssa_dest_init_for_type helper
>   nir: Make ballot intrinsics variable-size
>   nir/lower_system_values: Lower SUBGROUP_*_MASK based on type
>   nir/lower_subgroups: Lower ballot intrinsics to the specified bit
> size
>   nir,intel/compiler: Use a fixed subgroup size
>   spirv: Add a vtn_constant_value helper
>   spirv: Rework barriers
>   nir: Validate base types on array dereferences
>   compiler/nir_types: Handle vectors in glsl_get_array_element
> 
>  src/compiler/Makefile.sources  |   2 +-
>  src/compiler/glsl/glsl_to_nir.cpp  |   1 +
>  src/compiler/nir/nir.h |  25 +-
>  src/compiler/nir/nir_intrinsics.h  |  13 +-
>  .../nir/nir_lower_read_invocation_to_scalar.c  | 112 -
>  src/compiler/nir/nir_lower_subgroups.c | 257
> 
>  src/compiler/nir/nir_lower_system_values.c |   4 +-
>  src/compiler/nir/nir_opt_intrinsics.c  |  69 +-
>  src/compiler/nir/nir_validate.c