If further optimization cause some of the operands to either be the same
or be constants, undoing the transformation can lead to further savings.
I don't know why these patterns were not added in those patches. I did
not check to see which specific patterns actually helped. I just added
all of them for symmetry. This prevents some loop unrolling regressions
Plane Shift caused by Samuel's "nir: implement the GLSL equivalent of if
simplication in nir_opt_if" patch.
Skylake and Broadwell had similar results. (Skylake shown)
total instructions in shared programs: 14369768 -> 14369557 (<.01%)
instructions in affected programs: 44076 -> 43865 (-0.48%)
helped: 141
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.50 x̃: 1
helped stats (rel) min: 0.07% max: 1.52% x̄: 0.66% x̃: 0.60%
95% mean confidence interval for instructions value: -1.67 -1.32
95% mean confidence interval for instructions %-change: -0.72% -0.59%
Instructions are helped.
total cycles in shared programs: 532430629 -> 532425772 (<.01%)
cycles in affected programs: 1170832 -> 1165975 (-0.41%)
helped: 101
HURT: 5
helped stats (abs) min: 1 max: 160 x̄: 48.54 x̃: 32
helped stats (rel) min: <.01% max: 8.49% x̄: 2.76% x̃: 2.03%
HURT stats (abs) min: 2 max: 22 x̄: 9.20 x̃: 4
HURT stats (rel) min: <.01% max: 0.05% x̄: 0.02% x̃: <.01%
95% mean confidence interval for cycles value: -53.64 -38.00
95% mean confidence interval for cycles %-change: -3.06% -2.20%
Cycles are helped.
Signed-off-by: Ian Romanick <[email protected]>
Cc: Samuel Pitoiset <[email protected]>
Cc: Timothy Arceri <[email protected]>
---
src/compiler/nir/nir_opt_algebraic.py | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/src/compiler/nir/nir_opt_algebraic.py
b/src/compiler/nir/nir_opt_algebraic.py
index ba788f221a3..f153570105b 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -292,6 +292,8 @@ optimizations = [
(('~fge', ('fmax', a, b), a), True),
(('~flt', a, ('fmin', b, a)), False),
(('~flt', ('fmax', a, b), a), False),
+ (('~fge', a, ('fmax', b, a)), ('fge', a, b)),
+ (('~fge', ('fmin', a, b), a), ('fge', b, a)),
(('ilt', a, ('imax', b, a)), ('ilt', a, b)),
(('ilt', ('imin', a, b), a), ('ilt', b, a)),
@@ -301,7 +303,23 @@ optimizations = [
(('ult', ('umin', a, b), a), ('ult', b, a)),
(('uge', a, ('umin', b, a)), True),
(('uge', ('umax', a, b), a), True),
-
+ (('ilt', a, ('imin', b, a)), False),
+ (('ilt', ('imax', a, b), a), False),
+ (('ige', a, ('imax', b, a)), ('ige', a, b)),
+ (('ige', ('imin', a, b), a), ('ige', b, a)),
+ (('ult', a, ('umin', b, a)), False),
+ (('ult', ('umax', a, b), a), False),
+ (('uge', a, ('umax', b, a)), ('uge', a, b)),
+ (('uge', ('umin', a, b), a), ('uge', b, a)),
+
+ (('ilt', '#a', ('imax', '#b', c)), ('ior', ('ilt', a, b), ('ilt', a, c))),
+ (('ilt', ('imin', '#a', b), '#c'), ('ior', ('ilt', a, c), ('ilt', b, c))),
+ (('ige', '#a', ('imin', '#b', c)), ('ior', ('ige', a, b), ('ige', a, c))),
+ (('ige', ('imax', '#a', b), '#c'), ('ior', ('ige', a, c), ('ige', b, c))),
+ (('ult', '#a', ('umax', '#b', c)), ('ior', ('ult', a, b), ('ult', a, c))),
+ (('ult', ('umin', '#a', b), '#c'), ('ior', ('ult', a, c), ('ult', b, c))),
+ (('uge', '#a', ('umin', '#b', c)), ('ior', ('uge', a, b), ('uge', a, c))),
+ (('uge', ('umax', '#a', b), '#c'), ('ior', ('uge', a, c), ('uge', b, c))),
(('ilt', '#a', ('imin', '#b', c)), ('iand', ('ilt', a, b), ('ilt', a, c))),
(('ilt', ('imax', '#a', b), '#c'), ('iand', ('ilt', a, c), ('ilt', b, c))),
(('ige', '#a', ('imax', '#b', c)), ('iand', ('ige', a, b), ('ige', a, c))),