Re: [Mesa-dev] clover's interface to clang

2020-11-11 Thread Francisco Jerez
I don't remember the specifics of why we ended up interfacing with Clang this way. What is technically wrong with it, specifically? I don't have any objection to switching to the Driver and Compilation interface, nor to translating the "-cl-denorms-are-zero" option to whatever the current option

Re: [Mesa-dev] Rust drivers in Mesa

2020-10-01 Thread Francisco Jerez
Alyssa Rosenzweig writes: > Hi all, > > Recently I've been thinking about the potential for the Rust programming > language in Mesa. Rust bills itself as a safe system programming language > with comparable performance to C [0], which is a natural fit for > graphics driver development. > > Mesa

[Mesa-dev] [PATCHv2 4/4] anv/gen9: Optimize slice and subslice load balancing behavior.

2019-08-12 Thread Francisco Jerez
See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. According to Jason, improves Aztec Ruins performance by 2.7%. Reviewed-by: Kenneth Graunke (v1) v2: Undo CPU performance micro-optimization done in i965 and iris due to lack of data justifying it on

Re: [Mesa-dev] [PATCH 4/4] OPTIONAL: anv/gen9: Optimize slice and subslice load balancing behavior.

2019-08-10 Thread Francisco Jerez
Jason Ekstrand writes: > On Sat, Aug 10, 2019 at 2:22 PM Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > On Fri, Aug 9, 2019 at 7:22 PM Francisco Jerez >> > wrote: >> > >> >> See "i965/gen9: Optimize slice and su

Re: [Mesa-dev] [PATCH 4/4] OPTIONAL: anv/gen9: Optimize slice and subslice load balancing behavior.

2019-08-10 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Aug 9, 2019 at 7:22 PM Francisco Jerez > wrote: > >> See "i965/gen9: Optimize slice and subslice load balancing behavior." >> for the rationale. Marked optional because no performance evaluation >> has been done on

[Mesa-dev] [PATCH 3/4] iris/gen9: Optimize slice and subslice load balancing behavior.

2019-08-09 Thread Francisco Jerez
See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. Reviewed-by: Kenneth Graunke --- src/gallium/drivers/iris/iris_blorp.c | 6 ++ src/gallium/drivers/iris/iris_context.c | 1 + src/gallium/drivers/iris/iris_context.h | 3 +

[Mesa-dev] [PATCH 4/4] OPTIONAL: anv/gen9: Optimize slice and subslice load balancing behavior.

2019-08-09 Thread Francisco Jerez
See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. Marked optional because no performance evaluation has been done on this commit; it is provided to match the hashing settings of the Iris driver. Test reports welcome. --- src/intel/vulkan/anv_genX.h

[Mesa-dev] [PATCH 1/4] i965/gen9: Optimize slice and subslice load balancing behavior.

2019-08-09 Thread Francisco Jerez
The default pixel hashing mode settings used for slice and subslice load balancing are far from optimal under certain conditions (see the comments below for the gory details). The top-of-the-line GT4 parts suffer from a particularly severe performance problem currently due to a subslice load

[Mesa-dev] [PATCH 2/4] intel/genxml: Add GT_MODE hashing defs for Gen9.

2019-08-09 Thread Francisco Jerez
Reviewed-by: Kenneth Graunke --- src/intel/genxml/gen9.xml | 17 + 1 file changed, 17 insertions(+) diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml index 9df7cd82738..0d037489df9 100644 --- a/src/intel/genxml/gen9.xml +++ b/src/intel/genxml/gen9.xml @@ -6477,6

[Mesa-dev] [PATCH] intel/ir: Fix CFG corruption in opt_predicated_break().

2019-07-23 Thread Francisco Jerez
Specifically the optimization of a conditional BREAK + WHILE sequence into a conditional WHILE seems pretty broken. The list of successors of "earlier_block" (where the conditional BREAK was found) is emptied and then re-created with the same edges for no apparent reason. On top of that the list

Re: [Mesa-dev] [PATCH 07/15] clover/spirv: Add functions for parsing arguments, linking programs, etc.

2019-05-11 Thread Francisco Jerez
Karol Herbst writes: > From: Pierre Moreau > > v2 (Karol Herbst): > silence warnings about unhandled enum values > --- > .../clover/spirv/invocation.cpp | 598 ++ > .../clover/spirv/invocation.hpp | 12 + > 2 files changed, 610 insertions(+) > >

Re: [Mesa-dev] [PATCH 06/15] clover/spirv: Add functions for validating SPIR-V binaries

2019-05-11 Thread Francisco Jerez
> + // SPIR-V header. > + if (length < 20u) > + return false; > + > + const uint32_t first_word = binary[0u]; > + return (first_word == SpvMagicNumber) || > + (util_bswap32(first_word) == SpvMagicNumber); > +} > + This function seems like dead code.
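For readers without the patch at hand, the header check quoted above can be sketched as a standalone function. This is only an illustration, not the clover code: the names `looks_like_spirv`, `kSpirvMagic` and `bswap32` are hypothetical stand-ins (for the quoted function, `SpvMagicNumber` and `util_bswap32()` respectively), and it assumes `length` is measured in bytes, which matches the five 32-bit words of a SPIR-V header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Magic number defined by the SPIR-V specification (value of SpvMagicNumber).
constexpr uint32_t kSpirvMagic = 0x07230203u;

// Hypothetical stand-in for Mesa's util_bswap32(): reverse the byte order.
constexpr uint32_t bswap32(uint32_t v) {
   return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
          ((v << 8) & 0x00ff0000u) | (v << 24);
}

// Returns true if the buffer is long enough to hold the five-word SPIR-V
// header (assumption: length is in bytes) and its first word is the magic
// number in either byte order.
bool looks_like_spirv(const void *data, size_t length_in_bytes) {
   assert(data != nullptr);
   if (length_in_bytes < 20u)
      return false;

   uint32_t first_word;
   std::memcpy(&first_word, data, sizeof(first_word));
   return first_word == kSpirvMagic || bswap32(first_word) == kSpirvMagic;
}
```

Checking both byte orders means a module produced on a machine of the opposite endianness is still recognised; the caller would then have to swap the remaining words before consuming them.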

Re: [Mesa-dev] [PATCH 05/15] meson: Check for SPIRV-Tools and llvm-spirv

2019-05-11 Thread Francisco Jerez
nt saying where to find llvm-spirv (Karol Herbst). > * v3: > - make SPIRV-Tools and llvm-spirv optional (Francisco Jerez); > - bump requirement for llvm-spirv to version 0.2 > * v2: > - Bump the required version of SPIRV-Tools to the latest release; > - Add a dependency on

Re: [Mesa-dev] [PATCH 11/15] rename pipe_llvm_program_header to pipe_binary_program_header

2019-05-11 Thread Francisco Jerez
Karol Herbst writes: > We want to use it for other formats as well, so give it a more generic name > > Signed-off-by: Karol Herbst > Reviewed-by: Francisco Jerez > --- > src/gallium/drivers/r600/evergreen_compute.c | 2 +- > src/gallium/drivers/

Re: [Mesa-dev] [PATCH 2/5] intel/fs: Lower integer multiply correctly when destination stride equals 4.

2019-04-29 Thread Francisco Jerez
Francisco Jerez writes: > Because the "low" temporary needs to be accessed with word type and > twice the original stride, attempting to preserve the alignment of the > original destination can potentially lead to instructions with illegal > destination stride greater than

Re: [Mesa-dev] [PATCH v7] intel/compiler: validate region restrictions for mixed float mode

2019-04-17 Thread Francisco Jerez
mption that all operands are assumed to be packed, so we > + * understand that this might be hinting that there may be an exception > + * for f32 operands with a vstride of 0, so we don't validate this for > + * them while we don't have empirical evidence t

Re: [Mesa-dev] [PATCH v4 00/40] intel: VK_KHR_shader_float16_int8 implementation

2019-04-13 Thread Francisco Jerez
Jason Ekstrand writes: > Quick status check. Mesa 19.1 is supposed to branch in two weeks. Are we > about ready to land this? > Seems pretty close to ready to me... > On Mon, Mar 25, 2019 at 11:13 AM Juan A. Suarez Romero > wrote: > >> On Fri, 2019-03-22 at 17:53 +0100, Iago Toral wrote:

Re: [Mesa-dev] [PATCH v6 32/35] intel/compiler: validate region restrictions for mixed float mode

2019-04-13 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > On Wed, 2019-04-10 at 17:13 -0700, Francisco Jerez wrote: >> "Juan A. Suarez Romero" writes: >> >> > From: Iago Toral Quiroga >> > >> > v2: >> > - Adapted unit tests to make them c

Re: [Mesa-dev] [PATCH v6 32/35] intel/compiler: validate region restrictions for mixed float mode

2019-04-10 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > From: Iago Toral Quiroga > > v2: > - Adapted unit tests to make them consistent with the changes done >to the validation of half-float conversions. > > v3 (Curro): > - Check all the accummulators > - Constify declarations > - Do not check src1 type in

Re: [Mesa-dev] [PATCH v6 30/35] intel/compiler: validate region restrictions for half-float conversions

2019-04-10 Thread Francisco Jerez
ions. > - Check restriction on src1. > - Remove invalid test. Reviewed-by: Francisco Jerez > --- > src/intel/compiler/brw_eu_validate.c| 155 +++- > src/intel/compiler/test_eu_validate.cpp | 116 ++ > 2 files changed, 270 insertions(+),

Re: [Mesa-dev] [PATCH v7 28/35] intel/compiler: implement SIMD16 restrictions for mixed-float instructions

2019-04-10 Thread Francisco Jerez
Iago Toral writes: > On Mon, 2019-04-08 at 12:00 -0700, Francisco Jerez wrote: >> "Juan A. Suarez Romero" writes: >> >> > From: Iago Toral Quiroga >> > >> > v2: f32to16/f16to32 can use a :W destination (Curr

Re: [Mesa-dev] [PATCH v7 28/35] intel/compiler: implement SIMD16 restrictions for mixed-float instructions

2019-04-08 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > From: Iago Toral Quiroga > > v2: f32to16/f16to32 can use a :W destination (Curro) > --- > src/intel/compiler/brw_fs.cpp | 71 +++ > 1 file changed, 71 insertions(+) > > diff --git a/src/intel/compiler/brw_fs.cpp

Re: [Mesa-dev] [PATCH v5 35/38] intel/compiler: validate region restrictions for mixed float mode

2019-04-01 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > On Wed, 2019-03-27 at 19:37 -0700, Francisco Jerez wrote: >> "Juan A. Suarez Romero" writes: >> >> > From: Iago Toral Quiroga >> > >> > v2: >> > - Adapted unit tests to make them c

Re: [Mesa-dev] [PATCH v5 35/38] intel/compiler: validate region restrictions for mixed float mode

2019-03-27 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > From: Iago Toral Quiroga > > v2: > - Adapted unit tests to make them consistent with the changes done >to the validation of half-float conversions. > --- > src/intel/compiler/brw_eu_validate.c| 256 ++ > src/intel/compiler/test_eu_validate.cpp

Re: [Mesa-dev] [PATCH v5 33/38] intel/compiler: validate region restrictions for half-float conversions

2019-03-27 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > From: Iago Toral Quiroga > > v2: > - Consider implicit conversions in 2-src instructions too (Curro) > - For restrictions that involve destination stride requirements >only validate them for Align1, since Align16 always requires >packed data. > -

Re: [Mesa-dev] [PATCH v6 31/38] intel/compiler: implement SIMD16 restrictions for mixed-float instructions

2019-03-27 Thread Francisco Jerez
"Juan A. Suarez Romero" writes: > From: Iago Toral Quiroga > > --- > src/intel/compiler/brw_fs.cpp | 65 +++ > 1 file changed, 65 insertions(+) > > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp > index 2fc7793709b..3616a7afc31 100644

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-13 Thread Francisco Jerez
Iago Toral writes: > On Tue, 2019-03-12 at 15:44 -0700, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote: >> > > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: >> > > > Iago

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Francisco Jerez writes: > Iago Toral writes: > >> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote: >>> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: >>> > Iago Toral writes: >>> > >>> > > On Fri, 2019-03-01 at 19:0

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Iago Toral writes: > On Wed, 2019-03-06 at 09:21 +0100, Iago Toral wrote: >> On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote: >> > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: >> > > Iago Toral writes: >> > > >> > > >

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Iago Toral writes: > On Tue, 2019-03-05 at 07:35 +0100, Iago Toral wrote: >> On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: >> > Iago Toral writes: >> > >> > > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote: >> > > > Ia

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-12 Thread Francisco Jerez
Iago Toral writes: > On Mon, 2019-03-04 at 15:36 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote: >> > > Iago Toral writes: >> > > >> > > > On Thu, 2019-02-

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-04 Thread Francisco Jerez
Iago Toral writes: > On Fri, 2019-03-01 at 19:04 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote: >> > > Iago Toral writes: >> > > >> > > > On Wed, 2019-02-

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-01 Thread Francisco Jerez
Iago Toral writes: > On Fri, 2019-03-01 at 09:39 +0100, Iago Toral wrote: >> On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote: >> > Iago Toral writes: >> > >> > > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote: >> > > > Ia

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-03-01 Thread Francisco Jerez
Iago Toral writes: > On Thu, 2019-02-28 at 09:54 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote: >> > > Iago Toral writes: >> > > >> > > > On Tue, 2019-02-

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-02-28 Thread Francisco Jerez
Iago Toral writes: > On Wed, 2019-02-27 at 13:47 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez wrote: >> > > Iago Toral Quiroga writes: >> > > >> > >

Re: [Mesa-dev] [PATCH v5 33/40] intel/compiler: also set F execution type for mixed float mode in BDW

2019-02-28 Thread Francisco Jerez
Iago Toral writes: > On Wed, 2019-02-27 at 15:44 -0800, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > The section 'Execution Data Types' of 3D Media GPGPU volume, which >> > describes execution types, is exactly the same in BDW and SKL+. >

Re: [Mesa-dev] [PATCH v5 33/40] intel/compiler: also set F execution type for mixed float mode in BDW

2019-02-27 Thread Francisco Jerez
W_REGISTER_TYPE_F; > > - assert(src0_exec_type == BRW_REGISTER_TYPE_F); > - return BRW_REGISTER_TYPE_F; > + assert(src0_exec_type == BRW_REGISTER_TYPE_HF); > + return BRW_REGISTER_TYPE_HF; Not really convinced the function is fully correct, but it should be strictly better with this patch: Acked-by: Francisco Jerez > } > > /** > -- > 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-02-27 Thread Francisco Jerez
Iago Toral writes: > On Tue, 2019-02-26 at 14:54 -0800, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > --- >> > src/intel/compiler/brw_eu_validate.c| 64 - >> > src/intel/compiler/test_eu_validate.cpp | 122 >>

Re: [Mesa-dev] [PATCH v4 35/40] intel/compiler: validate conversions between 64-bit and 8-bit types

2019-02-26 Thread Francisco Jerez
Iago Toral Quiroga writes: > --- > src/intel/compiler/brw_eu_validate.c| 10 +- > src/intel/compiler/test_eu_validate.cpp | 46 + > 2 files changed, 55 insertions(+), 1 deletion(-) > > diff --git a/src/intel/compiler/brw_eu_validate.c >

Re: [Mesa-dev] [PATCH v4 34/40] intel/compiler: validate region restrictions for half-float conversions

2019-02-26 Thread Francisco Jerez
Iago Toral Quiroga writes: > --- > src/intel/compiler/brw_eu_validate.c| 64 - > src/intel/compiler/test_eu_validate.cpp | 122 > 2 files changed, 185 insertions(+), 1 deletion(-) > > diff --git a/src/intel/compiler/brw_eu_validate.c >

Re: [Mesa-dev] [PATCH v4 33/40] intel/compiler: also set F execution type for mixed float mode in BDW

2019-02-26 Thread Francisco Jerez
Iago Toral Quiroga writes: > The section 'Execution Data Types' of 3D Media GPGPU volume, which > describes execution types, is exactly the same in BDW and SKL+. > > Also, this section states that there is a single execution type, so it > makes sense that this is the wider of the two floating

Re: [Mesa-dev] [PATCH 1/5] intel/fs: Exclude control sources from execution type and region alignment calculations.

2019-02-15 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez > wrote: > >> Currently the execution type calculation will return a bogus value in >> cases like: >> >> mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u >> >> Which will be

Re: [Mesa-dev] [PATCH 2/5] intel/fs: Lower integer multiply correctly when destination stride equals 4.

2019-02-15 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez > wrote: > >> Because the "low" temporary needs to be accessed with word type and >> twice the original stride, attempting to preserve the alignment of the >> original destination

Re: [Mesa-dev] [PATCH 3/5] intel/fs: Cap dst-aligned region stride to maximum representable hstride value.

2019-02-15 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez > wrote: > >> This is required in combination with the following commit, because >> otherwise if a source region with an extended 8+ stride is present in >> the instruction (which we're about to

Re: [Mesa-dev] [PATCH 4/5] intel/fs: Implement extended strides greater than 4 for IR source regions.

2019-02-14 Thread Francisco Jerez
Jason Ekstrand writes: > On Fri, Jan 18, 2019 at 6:09 PM Francisco Jerez > wrote: > >> Strides up to 32B can be implemented for the source regions of most >> instructions by leveraging either the vertical or the horizontal >> stride of the hardware Align1 r

[Mesa-dev] [PATCH] intel/dump_gpu: Disambiguate between BOs from different GEM handle spaces.

2019-02-07 Thread Francisco Jerez
This fixes a rather astonishing problem that came up while debugging an issue in the Vulkan CTS. Apparently the Vulkan CTS framework has the tendency to create multiple VkDevices, each one with a separate DRM device FD and therefore a disjoint GEM buffer object handle space. Because the
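The snippet is cut off before it describes the fix, so the following is only a sketch of the general idea the subject line suggests: since GEM handles are unique only within one DRM device FD, a tracking table has to be keyed on the FD as well as the handle. The `BoTable` and `BoInfo` names are hypothetical and are not taken from intel/dump_gpu.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical per-buffer bookkeeping; a real tool would track more state.
struct BoInfo {
   uint64_t size;
};

// Tracks buffer objects by (DRM fd, GEM handle) rather than by handle alone,
// so that equal handle values from different devices do not collide.
class BoTable {
public:
   void add(int fd, uint32_t handle, uint64_t size) {
      assert(fd >= 0);
      bos_[std::make_pair(fd, handle)] = BoInfo{size};
   }

   // Returns nullptr when the (fd, handle) pair is unknown.
   const BoInfo *find(int fd, uint32_t handle) const {
      auto it = bos_.find(std::make_pair(fd, handle));
      return it == bos_.end() ? nullptr : &it->second;
   }

private:
   std::map<std::pair<int, uint32_t>, BoInfo> bos_;
};
```

With a table keyed on the handle alone, handle 1 on one FD would silently overwrite handle 1 on another, which is the kind of aliasing the patch subject refers to.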

Re: [Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

2019-02-04 Thread Francisco Jerez
Iago Toral writes: > On Mon, 2019-02-04 at 08:50 +0100, Iago Toral wrote: >> On Fri, 2019-02-01 at 11:23 -0800, Francisco Jerez wrote: >> > Iago Toral writes: >> > >> > > On Fri, 2019-01-25 at 12:54 -0800, Francisco Jerez wrote: >> > > > Ia

Re: [Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

2019-02-01 Thread Francisco Jerez
Iago Toral writes: > On Fri, 2019-01-25 at 12:54 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Thu, 2019-01-24 at 11:45 -0800, Francisco Jerez wrote: >> > > Iago Toral writes: >> > > >> > > > On Wed, 2019-01-

Re: [Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

2019-01-25 Thread Francisco Jerez
Iago Toral writes: > On Thu, 2019-01-24 at 11:45 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Wed, 2019-01-23 at 06:03 -0800, Francisco Jerez wrote: >> > > Iago Toral Quiroga writes: >> > > >> > > >

Re: [Mesa-dev] [PATCH] intel/compiler: Add a file-level description of brw_eu_validate.c

2019-01-24 Thread Francisco Jerez
te.cpp) will be rejected. Strictly by that rule this patch should be rejected ;). Maybe say "functional changes" instead of "patches"? Other than that: Reviewed-by: Francisco Jerez > */ > > #include "brw_eu.h" > -- > 2.19.2

Re: [Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

2019-01-24 Thread Francisco Jerez
Iago Toral writes: > On Wed, 2019-01-23 at 06:03 -0800, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > Commit c84ec70b3a72 implemented execution type promotion to 32-bit >> > for >> > conversions involving half-float registe

Re: [Mesa-dev] [PATCH] intel/compiler: update validator to account for half-float exec type promotion

2019-01-23 Thread Francisco Jerez
Iago Toral Quiroga writes: > Commit c84ec70b3a72 implemented execution type promotion to 32-bit for > conversions involving half-float registers, which empirical testing suggested > was required, but it did not incorporate this change into the assembly > validator > logic. This commit adds

Re: [Mesa-dev] [PATCH] intel/compiler: Reset default flag register in brw_find_live_channel()

2019-01-22 Thread Francisco Jerez
h_insn_state(p); > > + /* The flag register is only used on Gen7 in align1 mode, so avoid setting > +* unnecessary bits in the instruction words, get the information we need > +* and reset the default flag register. Maybe mention here that this also allows more instructio

Re: [Mesa-dev] [PATCH] intel/compiler: Reset default flag register in brw_find_live_channel()

2019-01-22 Thread Francisco Jerez
Matt Turner writes: > emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its > flag_subreg set, so that the IR knows which flag is accessed. However > the flag is only used on Gen7 in Align1 mode. > > To avoid setting unnecessary bits in the instruction words, get the > information we

Re: [Mesa-dev] [PATCH] intel/compiler: Reset default flag register in brw_find_live_channel()

2019-01-22 Thread Francisco Jerez
Matt Turner writes: > emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its > flag_subreg set, so that the IR knows which flag is accessed. However > the flag is only used on Gen7 in Align1 mode, and it is used as an > explicit source and destination. > > To avoid setting unnecessary

[Mesa-dev] [PATCH 1/5] intel/fs: Exclude control sources from execution type and region alignment calculations.

2019-01-18 Thread Francisco Jerez
Currently the execution type calculation will return a bogus value in cases like: mov_indirect(8) vgrf0:w, vgrf1:w, vgrf2:ud, 32u Which will be considered to have a 32-bit integer execution type even though the actual indirect move operation will be carried out with 16-bit precision.
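The effect described above can be illustrated with a deliberately simplified model, which is not the i965 implementation: treat the execution type as the widest type among the sources and skip any source flagged as a control operand. The `Source` struct, its `is_control` flag and `exec_type_bytes()` are hypothetical, and the real calculation also distinguishes integer from floating-point types.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified view of an instruction source.
struct Source {
   unsigned type_bytes;  // size of the source's data type in bytes
   bool is_control;      // true for offsets, ranges and similar operands
};

// Returns the width in bytes of the widest non-control source, or 0 if the
// instruction has no data sources at all.
unsigned exec_type_bytes(const std::vector<Source> &srcs) {
   unsigned widest = 0;
   for (const Source &s : srcs) {
      assert(s.type_bytes > 0);
      if (!s.is_control)
         widest = std::max(widest, s.type_bytes);
   }
   return widest;
}
```

For the `mov_indirect` example, the data source `vgrf1:w` is 2 bytes wide while the `vgrf2:ud` offset and the `32u` immediate are 4-byte control operands; excluding them yields a 16-bit execution type, whereas including them gives the bogus 32-bit answer.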

[Mesa-dev] [PATCH 5/5] intel/fs: Rely on undocumented unrestricted regioning for 32x16-bit integer multiply.

2019-01-18 Thread Francisco Jerez
Even though the hardware spec claims that any "integer DWord multiply" operation is affected by the regioning restrictions of CHV/BXT/GLK, this is inconsistent with the behavior of the simulator and with empirical evidence -- Return false from has_dst_aligned_region_restriction() for such

[Mesa-dev] [PATCH 2/5] intel/fs: Lower integer multiply correctly when destination stride equals 4.

2019-01-18 Thread Francisco Jerez
Because the "low" temporary needs to be accessed with word type and twice the original stride, attempting to preserve the alignment of the original destination can potentially lead to instructions with illegal destination stride greater than four. Because the CHV/BXT alignment restrictions are

[Mesa-dev] [PATCH 3/5] intel/fs: Cap dst-aligned region stride to maximum representable hstride value.

2019-01-18 Thread Francisco Jerez
This is required in combination with the following commit, because otherwise if a source region with an extended 8+ stride is present in the instruction (which we're about to declare legal) we'll end up emitting code that attempts to write to such a region, even though strides greater than four

[Mesa-dev] [PATCH 4/5] intel/fs: Implement extended strides greater than 4 for IR source regions.

2019-01-18 Thread Francisco Jerez
Strides up to 32B can be implemented for the source regions of most instructions by leveraging either the vertical or the horizontal stride of the hardware Align1 region. The main motivation for this is that currently the lower_integer_multiplication() pass will happily double the stride of one

Re: [Mesa-dev] [PATCH v10 09/20] clover: Track flags per module section

2019-01-18 Thread Francisco Jerez
Pierre Moreau writes: > One flag that needs to be tracked is whether a library is allowed to > receive mathematics optimisations or not, as the authorisation is given > when creating the library while the optimisations are specified when > creating the executable. > > Reviewed-by: Aaron Watry

Re: [Mesa-dev] [PATCH v10 06/20] clover/api: Rework the validation of devices for building

2019-01-18 Thread Francisco Jerez
Pierre Moreau writes: > Reviewed-by: Francisco Jerez > > Changes since: > * v5: > - Drop the `valid_devs` argument to `validate_build_common()` > (Francisco Jerez) > - Change `clLinkProgram()` to initialise `prog`’s devices prior to > calling `validate

Re: [Mesa-dev] [PATCH] intel/fs: Don't apply the des stride alignment rule to accumulators

2019-01-17 Thread Francisco Jerez
Jason Ekstrand writes: > On Thu, Jan 17, 2019 at 3:34 PM Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > Bah... previous e-mail unfinished. Please ignore. >> > >> > On Thu, Jan 17, 2019 at 4:15 AM Francisco Jere

Re: [Mesa-dev] [PATCH] intel/fs: Don't apply the des stride alignment rule to accumulators

2019-01-17 Thread Francisco Jerez
> > mov(8)g9<1>UD g5<8,4,2>UD { align1 1Q }; > mul(8)acc0<1>UD g9<8,8,1>UD 0x0004UW { align1 1Q }; > mach(8) g6<1>UD g5<8,4,2>UD 0x0004UD { align1 1Q AccWrEnable }; > > Fixes: efa4e4bc5fc "int

Re: [Mesa-dev] [PATCH] intel/fs: Don't apply the des stride alignment rule to accumulators

2019-01-17 Thread Francisco Jerez
Jason Ekstrand writes: > Bah... previous e-mail unfinished. Please ignore. > > On Thu, Jan 17, 2019 at 4:15 AM Francisco Jerez > wrote: > >> Jason Ekstrand writes: >> >> > The pass was discovered to cause problems with the MUL+MACH combinati

Re: [Mesa-dev] [PATCH] intel/fs: Don't apply the des stride alignment rule to accumulators

2019-01-17 Thread Francisco Jerez
component. > accumulator are handled for different instruction types. > > Fixes: efa4e4bc5fc "intel/fs: Introduce regioning lowering pass" > Cc: Francisco Jerez > --- > src/intel/compiler/brw_fs_lower_regioning.cpp | 16 +++- > 1 file changed,

[Mesa-dev] [PATCH] intel/fs: Promote execution type to 32-bit when any half-float conversion is needed.

2019-01-15 Thread Francisco Jerez
The docs are fairly incomplete and inconsistent about it, but this seems to be the reason why half-float destinations are required to be DWORD-aligned on BDW+ projects. This way the regioning lowering pass will make sure that the destination components of W to HF and HF to W conversions are

Re: [Mesa-dev] [PATCH v3 01/42] intel/compiler: handle conversions between int and half-float on atom

2019-01-15 Thread Francisco Jerez
Iago Toral Quiroga writes: > v2: adapted to work with the new regioning lowering pass > > Reviewed-by: Topi Pohjolainen (v1) > --- > src/intel/compiler/brw_ir_fs.h | 33 ++--- > 1 file changed, 26 insertions(+), 7 deletions(-) > > diff --git

Re: [Mesa-dev] [PATCH 03/10] intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.

2019-01-07 Thread Francisco Jerez
Iago Toral writes: > On Mon, 2019-01-07 at 11:58 -0800, Francisco Jerez wrote: >> Iago Toral writes: >> >> > On Sat, 2018-12-29 at 12:38 -0800, Francisco Jerez wrote: >> > > This seems to be a problem in combination with the >> > > lower_reg

Re: [Mesa-dev] [PATCH 08/10] intel/fs: Remove existing lower_conversions pass.

2019-01-07 Thread Francisco Jerez
Iago Toral writes: > On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote: >> It's redundant with the functionality provided by lower_regioning >> now. >> --- >> src/intel/Makefile.sources| 1 - >> src/intel/compiler/brw_fs.cpp

Re: [Mesa-dev] [PATCHv2 07/10] intel/fs: Introduce regioning lowering pass.

2019-01-07 Thread Francisco Jerez
Iago Toral writes: > On Sat, 2019-01-05 at 14:03 -0800, Francisco Jerez wrote: >> This legalization pass is meant to handle situations where the source >> or destination regioning controls of an instruction are unsupported >> by >> the hardware and need to be

Re: [Mesa-dev] [PATCH 05/10] intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.

2019-01-07 Thread Francisco Jerez
Iago Toral writes: > On Sat, 2018-12-29 at 12:38 -0800, Francisco Jerez wrote: >> Currently the visitor attempts to enforce the regioning restrictions >> that apply to double-precision instructions on CHV/BXT at NIR-to-i965 >> translation time. It is possible though for

Re: [Mesa-dev] [PATCH 03/10] intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.

2019-01-07 Thread Francisco Jerez
Iago Toral writes: > On Sat, 2018-12-29 at 12:38 -0800, Francisco Jerez wrote: >> This seems to be a problem in combination with the lower_regioning >> pass introduced by a future commit, which can modify a SIMD-split >> instruction causing its execution size to

Re: [Mesa-dev] [Mesa-stable] [PATCH 02/10] intel/fs: Implement quad swizzles on ICL+.

2019-01-07 Thread Francisco Jerez
Iago Toral writes: > On Sat, 2018-12-29 at 12:38 -0800, Francisco Jerez wrote: >> Align16 is no longer a thing, so a new implementation is provided >> using Align1 instead. Not all possible swizzles can be represented >> as >> a single Align1 region, but s

[Mesa-dev] [PATCHv2 07/10] intel/fs: Introduce regioning lowering pass.

2019-01-05 Thread Francisco Jerez
This legalization pass is meant to handle situations where the source or destination regioning controls of an instruction are unsupported by the hardware and need to be lowered away into separate instructions. This should be more reliable and future-proof than the current approach of handling

Re: [Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.

2019-01-05 Thread Francisco Jerez
Francisco Jerez writes: > Iago Toral writes: > >> On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote: >>> This legalization pass is meant to handle situations where the source >>> or destination regioning controls of an instruction are unsupported

Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float

2019-01-04 Thread Francisco Jerez
Iago Toral writes: > On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > There are hardware restrictions to consider that seem to affect >> > atom platforms >> > only. >> >> Same comment h

Re: [Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.

2019-01-04 Thread Francisco Jerez
Iago Toral writes: > On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote: >> This legalization pass is meant to handle situations where the source >> or destination regioning controls of an instruction are unsupported >> by >> the hardware and need to be

Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float

2019-01-02 Thread Francisco Jerez
Iago Toral Quiroga writes: > There are hardware restrictions to consider that seem to affect atom platforms > only. Same comment here as for PATCH 13 of this series. This and PATCH 40 shouldn't be necessary anymore with [1] in place. Please drop them. [1]

Re: [Mesa-dev] [PATCH v2 13/53] intel/compiler: add a helper to handle conversions to 64-bit in atom

2019-01-02 Thread Francisco Jerez
This patch is redundant with the regioning lowering pass I sent a few days ago [1]. The problem with this approach is that on the one hand it's easy for the back-end compiler to cause code which was legalized at NIR translation time to become illegal again accidentally, on the other hand there's

[Mesa-dev] [PATCH 02/10] intel/fs: Implement quad swizzles on ICL+.

2018-12-29 Thread Francisco Jerez
Align16 is no longer a thing, so a new implementation is provided using Align1 instead. Not all possible swizzles can be represented as a single Align1 region, but some fast paths are provided for frequently used swizzles that can be represented efficiently in Align1 mode. Fixes ~90 subgroup

[Mesa-dev] Assorted bug fixes and improvements back-ported from an internal branch.

2018-12-29 Thread Francisco Jerez
These are a number of fixes and clean-ups we've been carrying around for a while in an internal branch. Most of the fixes are required for conformance of a future platform, but due to their nature some of them are likely to affect shipping platforms as well -- Especially the issues addressed by

[Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.

2018-12-29 Thread Francisco Jerez
This legalization pass is meant to handle situations where the source or destination regioning controls of an instruction are unsupported by the hardware and need to be lowered away into separate instructions. This should be more reliable and future-proof than the current approach of handling

[Mesa-dev] [PATCH 03/10] intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.

2018-12-29 Thread Francisco Jerez
This seems to be a problem in combination with the lower_regioning pass introduced by a future commit, which can modify a SIMD-split instruction causing its execution size to become illegal again. A subsequent call to lower_simd_width() would hit this bug on a future platform. Cc:

[Mesa-dev] [PATCH 04/10] intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.

2018-12-29 Thread Francisco Jerez
I triggered this bug while prototyping code for a future platform on IVB. Could be a problem today though if a strided move is copy-propagated into a type-converting move with DF destination. Cc: mesa-sta...@lists.freedesktop.org --- src/intel/compiler/brw_eu_emit.c | 11 --- 1 file

[Mesa-dev] [PATCH 05/10] intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.

2018-12-29 Thread Francisco Jerez
Currently the visitor attempts to enforce the regioning restrictions that apply to double-precision instructions on CHV/BXT at NIR-to-i965 translation time. It is possible though for the copy propagation pass to violate this restriction if a strided move is propagated into one of the affected

[Mesa-dev] [PATCH 09/10] intel/fs: Remove nasty open-coded CHV/BXT 64-bit workarounds.

2018-12-29 Thread Francisco Jerez
--- src/intel/compiler/brw_fs_builder.h | 68 +- src/intel/compiler/brw_fs_nir.cpp | 89 +++-- 2 files changed, 12 insertions(+), 145 deletions(-) diff --git a/src/intel/compiler/brw_fs_builder.h b/src/intel/compiler/brw_fs_builder.h index

[Mesa-dev] [PATCH 01/10] intel/fs: Handle source modifiers in lower_integer_multiplication().

2018-12-29 Thread Francisco Jerez
lower_integer_multiplication() implements 32x32-bit multiplication on some platforms by bit-casting one of the 32-bit sources into two 16-bit unsigned integer portions. This can give incorrect results if the original instruction specified a source modifier. Fix it by emitting an additional MOV
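As an illustration of the problem described in this patch, here is a minimal sketch in plain C. It is a simplified model, not the actual Mesa pass or the hardware semantics. The function names are hypothetical, and the "per-half negate" variant is only an assumption about how a source modifier could go wrong once a 32-bit source is reinterpreted as two 16-bit unsigned halves.

```c
#include <stdint.h>

/* Low 32 bits of a * b, built from two 32x16 partial products,
 * with b split into unsigned 16-bit halves. */
static uint32_t mul32_via_16(uint32_t a, uint32_t b)
{
    uint32_t lo = b & 0xffffu;
    uint32_t hi = b >> 16;
    return a * lo + ((a * hi) << 16);
}

/* Negation resolved by a full-width move into a temporary before the
 * split; this mirrors the "additional MOV" idea from the patch text. */
static uint32_t mul32_neg_resolved(uint32_t a, uint32_t b)
{
    uint32_t tmp = 0u - b;          /* two's-complement negate of all 32 bits */
    return mul32_via_16(a, tmp);
}

/* Hypothetical faulty model: the negate is applied to each 16-bit half
 * independently, which is not the same as negating the 32-bit value. */
static uint32_t mul32_neg_per_half(uint32_t a, uint32_t b)
{
    uint32_t lo = (uint16_t)(0u - (b & 0xffffu));
    uint32_t hi = (uint16_t)(0u - (b >> 16));
    return a * lo + ((a * hi) << 16);
}
```

With a = 3 and b = 1, the resolved variant gives the two's-complement encoding of -3, while the per-half variant does not, because negating the halves separately loses the borrow into the upper half.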

[Mesa-dev] [PATCH 10/10] intel/fs: Remove FS_OPCODE_UNPACK_HALF_2x16_SPLIT opcodes.

2018-12-29 Thread Francisco Jerez
These are broken on a future platform, but it turns out we don't need to fix them, since they're just type-converting moves with strided source. Kill them. --- src/intel/compiler/brw_eu_defines.h | 2 -- src/intel/compiler/brw_fs.cpp | 2 -- src/intel/compiler/brw_fs.h

[Mesa-dev] [PATCH 06/10] intel/fs: Constify fs_inst::can_do_source_mods().

2018-12-29 Thread Francisco Jerez
--- src/intel/compiler/brw_fs.cpp | 2 +- src/intel/compiler/brw_ir_fs.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 4aacc72a1b7..889509badab 100644 --- a/src/intel/compiler/brw_fs.cpp +++

[Mesa-dev] [PATCH 08/10] intel/fs: Remove existing lower_conversions pass.

2018-12-29 Thread Francisco Jerez
It's redundant with the functionality provided by lower_regioning now. --- src/intel/Makefile.sources| 1 - src/intel/compiler/brw_fs.cpp | 1 - src/intel/compiler/brw_fs.h | 1 - .../compiler/brw_fs_lower_conversions.cpp | 132

Re: [Mesa-dev] [PATCH] intel/compiler: Flag surface reads as having side-effects

2018-11-26 Thread Francisco Jerez
Jason Ekstrand writes: > --- > src/intel/compiler/brw_shader.cpp | 12 > 1 file changed, 12 insertions(+) > > diff --git a/src/intel/compiler/brw_shader.cpp > b/src/intel/compiler/brw_shader.cpp > index 34b8f3acf93..5cb91e0dce9 100644 > --- a/src/intel/compiler/brw_shader.cpp >

Re: [Mesa-dev] [PATCH 5/5] anv/icl: Set use full ways in L3CNTLREG

2018-11-26 Thread Francisco Jerez
Anuj Phogat writes: > L3 allocation table in h/w specification recommends using 4 KB > granularity for programming allocation fields in L3CNTLREG. > > Signed-off-by: Anuj Phogat > Cc: Kenneth Graunke > Cc: Francisco Jerez > Cc: Lionel Landwerlin Reviewed-by: Francisco

Re: [Mesa-dev] [PATCH 3/5] intel/icl: Set way_size_per_bank to 4

2018-11-26 Thread Francisco Jerez
Anuj Phogat writes: > Signed-off-by: Anuj Phogat > Cc: Kenneth Graunke > Cc: Francisco Jerez > Cc: Lionel Landwerlin Reviewed-by: Francisco Jerez > --- > src/intel/common/gen_l3_config.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > di

Re: [Mesa-dev] [PATCH 2/5] i965/icl: Set use full ways in L3CNTLREG

2018-11-26 Thread Francisco Jerez
Anuj Phogat writes: > L3 allocation table in h/w specification recommends using 4 KB > granularity for programming allocation fields in L3CNTLREG. > > Signed-off-by: Anuj Phogat > Cc: Kenneth Graunke > Cc: Francisco Jerez > Cc: Lionel Landwerlin > --- >

Re: [Mesa-dev] [PATCH V2 1/4] i965/icl: Fix L3 configurations

2018-11-26 Thread Francisco Jerez
Anuj Phogat writes: > Use L3 configuration specified in h/w specification. > > V2: Drop configs which do under allocation of l3 cache. > Bump up the comment above table. > > Signed-off-by: Anuj Phogat > Cc: Kenneth Graunke > Cc: Francisco Jerez Reviewed-by: Franci

Re: [Mesa-dev] [PATCH 1/5] i965/icl: Fix L3 configurations

2018-11-16 Thread Francisco Jerez
Anuj Phogat writes: > On Fri, Nov 16, 2018 at 6:21 AM Eero Tamminen > wrote: >> >> Hi, >> >> On 16.11.2018 10.33, Francisco Jerez wrote: >> > Kenneth Graunke writes: >> [...] >> >> Perhaps we'll get both configs working, and then will

Re: [Mesa-dev] [PATCH 1/5] i965/icl: Fix L3 configurations

2018-11-16 Thread Francisco Jerez
Kenneth Graunke writes: > On Thursday, November 15, 2018 11:16:09 PM PST Francisco Jerez wrote: >> Kenneth Graunke writes: >> >> > On Thursday, November 15, 2018 5:51:18 PM PST Francisco Jerez wrote: >> >> Anuj Phogat writes: >> >> >

Re: [Mesa-dev] [PATCH 1/5] i965/icl: Fix L3 configurations

2018-11-15 Thread Francisco Jerez
Kenneth Graunke writes: > On Thursday, November 15, 2018 5:51:18 PM PST Francisco Jerez wrote: >> Anuj Phogat writes: >> >> > Use L3 configuration table specified in h/w specification. >> > >> > Signed-off-by: Anuj Phogat >> > Cc: Kenneth

Re: [Mesa-dev] [PATCH 1/5] i965/icl: Fix L3 configurations

2018-11-15 Thread Francisco Jerez
Anuj Phogat writes: > Use L3 configuration table specified in h/w specification. > > Signed-off-by: Anuj Phogat > Cc: Kenneth Graunke > Cc: Francisco Jerez > Cc: Lionel Landwerlin > --- > src/intel/common/gen_l3_config.c | 16 ++-- > 1 file changed, 10
