NIR already has these so they are redundant. A run of shader-db confirms
that the only cases where these backend optimizations are activated
are some Tomb Raider shaders where the affected variables are qualified
as "precise", which is why NIR won't apply them and why the backend
shouldn't either
The original SrcType is a 3-bit field that takes a subset of the types
supported by the hardware for 3-source instructions. Since gen8,
when the half-float type was added, 3-source floating point operations
can use mixed precision mode, where not all the operands have the
same floating-point
v2 (Topi):
- Make bit-size handling order be 16-bit, 32-bit, 64-bit
- Clamp lower exponent range at -28 instead of -30.
Reviewed-by: Topi Pohjolainen
Reviewed-by: Jason Ekstrand
---
src/compiler/nir/nir_opt_algebraic.py | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff
v2:
- Merge Float16 and Int8 capabilities into a single patch
Reviewed-by: Jason Ekstrand (v1)
---
src/compiler/shader_info.h| 2 ++
src/compiler/spirv/spirv_to_nir.c | 8 ++--
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/compiler/shader_info.h
---
src/intel/compiler/brw_fs_cmod_propagation.cpp | 8 +++-
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp
b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index 7bb5c9afbc9..dfef9d720a2 100644
---
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_fs_nir.cpp | 25 +
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index 57bc8a01a91..2f3ad554147 100644
---
v2:
- Merge Float16 and Int8 into a single patch.
- Merge extension enable.
Reviewed-by: Jason Ekstrand (v1)
---
src/intel/vulkan/anv_device.c | 9 +
src/intel/vulkan/anv_extensions.py | 1 +
2 files changed, 10 insertions(+)
diff --git a/src/intel/vulkan/anv_device.c
---
.../compiler/brw_fs_combine_constants.cpp | 60 +++
1 file changed, 49 insertions(+), 11 deletions(-)
diff --git a/src/intel/compiler/brw_fs_combine_constants.cpp
b/src/intel/compiler/brw_fs_combine_constants.cpp
index e0c95d379b8..24307e365ab 100644
---
---
src/intel/compiler/brw_fs_nir.cpp | 15 +++
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index a9fd98bab68..57bc8a01a91 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_reg_type.h | 18 ++
1 file changed, 18 insertions(+)
diff --git a/src/intel/compiler/brw_reg_type.h
b/src/intel/compiler/brw_reg_type.h
index ffbec90d3fe..a3365b7e34c 100644
--- a/src/intel/compiler/brw_reg_type.h
+++
Reviewed-by: Topi Pohjolainen
---
src/intel/compiler/brw_eu_emit.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index 97e0dda5ef1..0f6498614e8 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++
Source0 and Destination extract the floating-point precision automatically
from the SrcType and DstType instruction fields respectively when they are
set to types :F or :HF. For Source1 and Source2 operands, we use the new
1-bit fields Src1Type and Src2Type, where 0 means normal precision and 1
Even if we don't do 3-src algebraic optimizations for MAD and LRP in
the backend any more, the combine constants pass can still do a fine
job of grouping these constants into single registers to reduce
register pressure.
v2:
- updated comment to reference register pressure benefits rather
The hardware only allows a stride of 1 on a Byte destination for raw
byte MOV instructions. This is required even when the destination
is the NULL register.
Rather than making sure that we emit a proper NULL:B destination
every time we need one, just fix it at emission time.
Reviewed-by: Jason
We are now using these bits, so don't assert that they are not set, just
avoid compaction in that case.
Reviewed-by: Topi Pohjolainen
---
src/intel/compiler/brw_eu_compact.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/intel/compiler/brw_eu_compact.c
There are hardware restrictions to consider that seem to affect Atom platforms
only.
---
src/intel/compiler/brw_fs_nir.cpp | 32 +++
1 file changed, 32 insertions(+)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index
v2:
- Merge Float16 and Int8 in a single patch
Reviewed-by: Jason Ekstrand (v1)
---
src/intel/vulkan/anv_pipeline.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 6db9945e0d4..c303ba321c3 100644
---
This function is used in two different scenarios that for 32-bit
instructions are the same, but for 16-bit instructions are not.
One scenario is that in which we are working at a SIMD8 register
level and we need to know if a register is fully defined or written.
This is useful, for example, in
There is a hardware restriction where <0,1,0>:HF in Align16 doesn't replicate
a single 16-bit channel, but instead it replicates a full 32-bit channel.
---
.../compiler/brw_fs_combine_constants.cpp | 24 +--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git
In particular, we need the same lowerings we use for 16-bit
integers.
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_nir.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 0641b659979..2c265dd2394
The hardware doesn't support half-float for these.
Reviewed-by: Topi Pohjolainen
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_nir.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index a1ab8290271..8801f7f77b0
3-src instructions don't support immediates, but since 36bc5f06dd22,
we allow them on MAD and LRP relying on the combine constants pass to
fix it up later. However, that pass is specialized for 32-bit float
immediates and can't handle HF constants at present, so this patch
ensures that
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_compiler.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/intel/compiler/brw_compiler.c
b/src/intel/compiler/brw_compiler.c
index fe632c5badc..f885e79c3e6 100644
--- a/src/intel/compiler/brw_compiler.c
+++
There are no 8-bit immediates, so assert in that case.
16-bit immediates are replicated in each word of a 32-bit immediate, so
we only need to check the lower 16-bits.
v2:
- Fix is_zero with half-float to consider -0 as well (Jason).
- Fix is_negative_one for word type.
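
A minimal sketch of the immediate checks described above, in C. The helper
names (imm_is_zero_float, imm_is_negative_one_w) and the bare uint32_t
immediate are assumptions made for illustration and are not the functions the
patch touches; the sketch relies only on the stated facts that there are no
8-bit immediates and that a 16-bit immediate is replicated in both words of
the 32-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a float immediate encodes +0 or -0.
 * For 16-bit types only the low word is inspected, since the high word
 * is just a replica of it. */
static bool
imm_is_zero_float(uint32_t imm, unsigned bit_size)
{
   /* There are no 8-bit immediates, so only 16 and 32 bits are valid. */
   assert(bit_size == 16 || bit_size == 32);

   /* Mask off the sign bit so that -0 compares equal to +0. */
   uint32_t mask = bit_size == 16 ? 0x7fffu : 0x7fffffffu;
   return (imm & mask) == 0;
}

/* Hypothetical helper: true if a signed word immediate is -1,
 * i.e. its low 16 bits are all ones. */
static bool
imm_is_negative_one_w(uint32_t imm)
{
   return (imm & 0xffffu) == 0xffffu;
}
```

Masking the sign bit is what makes the half-float zero check accept -0
(0x8000) as well as +0.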
---
Reviewed-by: Jason Ekstrand
---
src/compiler/nir/nir.h| 1 +
src/compiler/nir/nir_opt_algebraic.py | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index d99cc6b2d38..96a18d9c9bd 100644
--- a/src/compiler/nir/nir.h
+++
---
src/intel/compiler/brw_fs_nir.cpp | 55 ++-
1 file changed, 33 insertions(+), 22 deletions(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index 92ec85a27cc..15715651aa6 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++
v2:
- use nir_fmul_imm and nir_fadd_imm helpers (Jason)
---
src/compiler/spirv/vtn_glsl450.c | 23 ++-
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index b54aeb9b217..f411d17cfe4 100644
---
The PRM states that half-float operands are supported since gen9.
Reviewed-by: Topi Pohjolainen
---
src/intel/compiler/brw_eu_emit.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index
The 16-bit polynomial execution doesn't meet Khronos precision requirements.
Also, the half-float denorm range starts at 2^(-14) and with asin taking input
values in the range [0, 1], polynomial approximations can lead to flushing
relatively easily.
An alternative is to use the atan2 formula to
v2:
- use nir_fadd_imm and nir_fmul_imm helpers (Jason)
- rebased on top of new sized boolean opcodes
- use nir_b2f helper
---
src/compiler/spirv/vtn_glsl450.c | 23 +++
1 file changed, 11 insertions(+), 12 deletions(-)
diff --git a/src/compiler/spirv/vtn_glsl450.c
v2
- use nir_fmul_imm helper (Jason)
Reviewed-by: Jason Ekstrand
---
src/compiler/spirv/vtn_glsl450.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index b8b534397a2..ec91d9308c5 100644
---
Reviewed-by: Jason Ekstrand
---
src/compiler/nir/nir.h| 1 +
src/compiler/nir/nir_opt_algebraic.py | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 96a18d9c9bd..2f9df3dfe83 100644
--- a/src/compiler/nir/nir.h
+++
This is available since gen8.
---
src/intel/compiler/brw_reg_type.c | 35 +++
1 file changed, 31 insertions(+), 4 deletions(-)
diff --git a/src/intel/compiler/brw_reg_type.c
b/src/intel/compiler/brw_reg_type.c
index 60240ba1513..72295a2bd75 100644
---
We were assuming 32-bit elements. Also, in SIMD8 we pack 2 vector components
in a single SIMD register, so for example, component Y of a 16-bit vec2
starts at byte offset 16B. This means that when we compute the offset of
the elements to be differentiated we should not stomp whatever base
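
The packing described above can be illustrated with a small C sketch. The
function name and parameters are hypothetical, and the 32-byte register size
is an assumption stated in the code; this is not the code from the patch,
only the arithmetic behind the "offset 16B" remark.

```c
#include <assert.h>

/* Assumed size in bytes of one SIMD register. */
#define REG_SIZE_BYTES 32u

/* Hypothetical helper: byte offset, within its register, at which a
 * vector component starts when components are laid out back to back.
 * Each component occupies exec_size * type_bytes bytes. */
static unsigned
component_byte_offset(unsigned component, unsigned type_bytes,
                      unsigned exec_size)
{
   assert(type_bytes > 0 && exec_size > 0);
   return (component * exec_size * type_bytes) % REG_SIZE_BYTES;
}
```

With 16-bit elements in SIMD8, components X and Z start at offset 0 and
components Y and W start at offset 16, whereas 32-bit components always start
at offset 0. This is why code that assumed 32-bit elements could ignore the
base offset.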
There are some hardware restrictions that brw_nir_lower_conversions should
have taken care of before we get here.
---
src/intel/compiler/brw_fs_nir.cpp | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
The hardware only has two bits to specify the horizontal stride, so the
maximum horizontal stride we can use is 4. The pass calculates strides
based on the sizes of the types involved, and for conversions between
64-bit and 8-bit types that can lead to strides of 8.
The compiler should make sure
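
As a rough illustration of the stride problem, here is a C sketch. The
function names and the ratio-based stride formula are assumptions (the text
only says strides are computed from the sizes of the types involved); the
limit of 4 comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Largest horizontal stride that fits the 2-bit hardware field. */
#define MAX_HSTRIDE 4u

/* Hypothetical helper: stride, in units of the smaller type, needed to
 * keep a narrow value aligned with a wider one. */
static unsigned
conversion_stride(unsigned src_bytes, unsigned dst_bytes)
{
   assert(src_bytes > 0 && dst_bytes > 0);
   unsigned wide = src_bytes > dst_bytes ? src_bytes : dst_bytes;
   unsigned narrow = src_bytes > dst_bytes ? dst_bytes : src_bytes;
   return wide / narrow;
}

/* True if the stride can be encoded in the instruction. */
static bool
stride_is_encodable(unsigned stride)
{
   return stride <= MAX_HSTRIDE;
}
```

Under this assumption a conversion between 64-bit and 8-bit types yields a
stride of 8, which cannot be encoded, while 64-bit to 16-bit yields 4 and is
fine.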
These are not directly supported in hardware and brw_nir_lower_conversions
should have taken care of that before we get here.
---
src/intel/compiler/brw_fs_nir.cpp | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
Going forward having these split is a bit more convenient since these two
groups have different restrictions.
---
src/intel/compiler/brw_fs_nir.cpp | 8
1 file changed, 8 insertions(+)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index
Broadwell hardware has a bug that manifests in SIMD8 executions of
16-bit MAD instructions when any of the sources is a Y or W component.
We pack these components in the same SIMD register as components X and
Z respectively, but starting at offset 16B (so they live in the second
half of the
From the Skylake PRM, Extended Math Function:
"The execution size must be no more than 8 when half-floats
are used in source or destination operand."
Earlier generations do not support Extended Math with half-float.
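
The PRM restriction quoted above can be expressed as a small C sketch. The
function name and its parameters are hypothetical and only model the quoted
rule; how the backend actually splits wider instructions is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: largest execution size an extended math
 * instruction may use, given the size the caller asked for. */
static unsigned
math_max_exec_size(unsigned requested, bool uses_half_float)
{
   assert(requested == 8 || requested == 16 || requested == 32);

   /* "The execution size must be no more than 8 when half-floats are
    * used in source or destination operand." */
   if (uses_half_float && requested > 8)
      return 8;

   return requested;
}
```

A SIMD16 half-float math operation would therefore have to be emitted as two
SIMD8 instructions.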
v2
- Rewrite the code to make it more readable (Jason).
Reviewed-by:
Reviewed-by: Jason Ekstrand
---
src/compiler/spirv/vtn_glsl450.c | 48 ++--
1 file changed, 46 insertions(+), 2 deletions(-)
diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index c8400d6c80f..2540331b6cc 100644
---
Since we handle booleans as integers this makes more sense.
v2:
- rebased to incorporate new boolean conversion opcodes
Reviewed-by: Topi Pohjolainen (v1)
Reviewed-by: Jason Ekstrand (v1)
---
src/intel/compiler/brw_fs_nir.cpp | 20 ++--
1 file changed, 10 insertions(+), 10
Extended math doesn't support half-float on these generations.
Reviewed-by: Jason Ekstrand
---
src/intel/compiler/brw_nir.c | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 8801f7f77b0..0641b659979
v2:
- fix huge_val for 16-bit, it was meant to be 2^14 not 10^14.
v3:
- rebase on top of new bool sized opcodes
- use nir_b2f helper
- use nir_fmul_imm helper
---
src/compiler/spirv/vtn_glsl450.c | 18 +++---
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git
v2:
- make 16-bit be its own separate case (Jason)
Reviewed-by: Topi Pohjolainen (v1)
---
src/intel/compiler/brw_fs_nir.cpp | 18 +-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index
v2:
- use nir_imm_fmul helper (Jason)
Reviewed-by: Jason Ekstrand
---
src/compiler/nir/nir_builtin_builder.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/compiler/nir/nir_builtin_builder.h
b/src/compiler/nir/nir_builtin_builder.h
index 0e5b9db462a..d1435b37fd4
This version rebases the series on top of a more recent master and addresses
review feedback to v1.
The main change is the rewrite of the type conversion patches to reduce the
growing complexity of the backend following discussions with Jason. The main
actions I took in the end are:
1) Moved the
Some conversions are not directly supported in hardware and need to be
split in two conversion instructions going through an intermediary type.
Doing this at the NIR level reduces the complexity of the backend a bit.
---
src/intel/Makefile.sources| 1 +
v2:
- use nir_fadd_imm and nir_fmul_imm helpers (Jason)
---
src/compiler/spirv/vtn_glsl450.c | 44 +++-
1 file changed, 26 insertions(+), 18 deletions(-)
diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index ec91d9308c5..c8400d6c80f
---
src/compiler/spirv/vtn_glsl450.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index f411d17cfe4..6b471efda2b 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@
---
src/compiler/nir/nir_builder.h | 12
1 file changed, 12 insertions(+)
diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 826e549019a..74ecde798d5 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -985,6 +985,18 @@
---
src/compiler/nir/nir_builder.h | 12
1 file changed, 12 insertions(+)
diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 74ecde798d5..14f3baab20b 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -571,6 +571,18 @@
On 17.12.18 04:23, Marek Olšák wrote:
> The definitions weren't changed, but the values were. The names need to
> be different, so that si_debug.c prints both the GFX6 and GFX9 values.
You're right. I have some larger changes of the debug printing that is
smarter about which fields to print
On 30/11/18 17:32, Ian Romanick wrote:
> On 11/29/2018 03:53 PM, Eric Anholt wrote:
>> Erik Faye-Lund writes:
>>
>>> On Wed, 2018-11-28 at 13:43 -0800, Eric Anholt wrote:
Jordan Justen writes:
> This adds the "Developer's Certificate of
If we are outputting the same value to more than one output
component rewrite the inputs to read from a single component.
This will allow the duplicate varying components to be optimised
away by the existing opts.
shader-db results i965 (SKL):
total instructions in shared programs: 12869230 ->
This will be reused by the following patch.
Reviewed-by: Marek Olšák
---
src/compiler/nir/nir_linking_helpers.c | 16 ++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/src/compiler/nir/nir_linking_helpers.c
b/src/compiler/nir/nir_linking_helpers.c
index
This will help the new opt introduced in the following patches
allowing us to remove extra duplicate varyings.
Reviewed-by: Marek Olšák
---
src/gallium/drivers/radeonsi/si_shader_nir.c | 2 --
src/mesa/state_tracker/st_glsl_to_nir.cpp| 4 +++-
2 files changed, 3 insertions(+), 3
The following patches will add support for an additional
optimisation so this function will no longer just optimise varying
constants.
Reviewed-by: Marek Olšák
---
src/amd/vulkan/radv_pipeline.c| 4 ++--
src/compiler/nir/nir.h| 2 +-
This just cleans things up a little and makes things safer for
derefs.
---
src/compiler/nir/nir_linking_helpers.c | 28 +++---
1 file changed, 12 insertions(+), 16 deletions(-)
diff --git a/src/compiler/nir/nir_linking_helpers.c
b/src/compiler/nir/nir_linking_helpers.c
https://bugs.freedesktop.org/show_bug.cgi?id=109086
--- Comment #8 from Dmitry ---
Works good with my applications. When will this patch be included in the
release?
An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
in Core profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.
---
The former expects to see SSA-only things, but the latter injects registers.
The assertions in the lowering were not seeing this because they asserted
on the bit_size values only, not on the is_ssa field, so add that assertion
too.
Fixes: 11dc1307794e "nir: Add a bool to int32 lowering pass"
Part 3, wherein I regroup, and once again present an option where
Signed-off-by is optional. (Or ... required :)
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/31
I turned it into 3 patches.
> 21f1070b6ef docs: Add developer-certificate-of-origin.txt
Adds the DCO 1.1 as a separate