Re: [Mesa-dev] [PATCH 17/22] nir: Narrow some dot product operations
On 03/05/2018 04:35 PM, Matt Turner wrote:
> On Fri, Feb 23, 2018 at 3:56 PM, Ian Romanick wrote:
>> From: Ian Romanick
>>
>> On vector platforms, this helps elide some constant loads.
>>
>> No changes on Broadwell or Skylake.
>>
>> Haswell
>> total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
>> instructions in affected programs: 1277532 -> 1243902 (-2.63%)
>> helped: 13216
>> HURT: 95
>
> What's going on in the hurt shaders?

I'm not completely sure. All of these shaders are negatively affected by
the DPH transformation. Only one of the shaders is small enough (19
instructions) to easily examine. In that case, it looks like a couple of
things end up not getting loaded via VF. Only one of those is the DPH
operand.

There appear to be changes in the constant loading in the others as
well, which cause the shaders to diverge slightly after about 5
instructions, making comparisons between the 70+ instruction shaders
frustrating at best. Many of the shorter shaders had flow control, so
that exacerbated the issue.

I tried a couple of modifications to the DPH pattern, including
'vec4(is_used_once)' and 'c(is_not_const)'. These had mixed results in
cycles, and didn't consistently help the instruction counts in the 95.

I did discover that I should have listed the transformations in the
opposite order. As is, code that matches the last fdot4 pattern will
never become a multiply (speculation) because the previous
transformations will gradually convert it to an fdot2. Flipping the
order helped instructions in 1 program but hurt cycles. Looking at the
changed shader, it appears that flipping the order allows an fdot4 to be
converted to an fdot2 instead of an fdot3. This allows CSE to eliminate
the (new) fdot2.

Oddly, flipping the order made a shader-db run slightly slower...
1.113±0.711 seconds (0.429%±0.274%) at n=10 for a HSW run on my quadcore
HSW desktop. I would have expected it to be slightly faster.
*shrug*
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
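To make the ordering point in the reply above concrete, here is a small toy rewriter. It is only a sketch under stated assumptions: expressions are nested tuples, the rule variables are the strings 'a' through 'd', and the first matching rule in list order wins. That is a simplified model, not how nir_search is actually implemented, and the names match, build, rewrite, PATCH_ORDER and FLIPPED are made up for the illustration.

```python
# Toy model only (assumption): first matching rule in list order wins,
# rules are applied at the root of the expression until none matches.
VARS = {'a', 'b', 'c', 'd'}

def match(pattern, expr, env):
    """Return True if expr matches pattern, binding rule variables in env."""
    if isinstance(pattern, str) and pattern in VARS:
        if pattern in env:
            return env[pattern] == expr
        env[pattern] = expr
        return True
    if isinstance(pattern, tuple):
        return (isinstance(expr, tuple) and len(expr) == len(pattern)
                and all(match(p, e, env) for p, e in zip(pattern, expr)))
    return pattern == expr  # opcode name or literal constant

def build(template, env):
    """Instantiate a replacement template with the bound variables."""
    if isinstance(template, str) and template in VARS:
        return env[template]
    if isinstance(template, tuple):
        return tuple(build(t, env) for t in template)
    return template

def rewrite(expr, rules):
    """Repeatedly apply the first matching rule until nothing matches."""
    changed = True
    while changed:
        changed = False
        for search, replace in rules:
            env = {}
            if match(search, expr, env):
                expr = build(replace, env)
                changed = True
                break
    return expr

# The six rules from the patch, in the order they were posted.
PATCH_ORDER = [
    (('fdot4', ('vec4', 'a', 'b', 'c', 1.0), 'd'), ('fdph', ('vec3', 'a', 'b', 'c'), 'd')),
    (('fdot4', ('vec4', 'a', 'b', 'c', 0.0), 'd'), ('fdot3', ('vec3', 'a', 'b', 'c'), 'd')),
    (('fdot4', ('vec4', 'a', 'b', 0.0, 0.0), 'c'), ('fdot2', ('vec2', 'a', 'b'), 'c')),
    (('fdot4', ('vec4', 'a', 0.0, 0.0, 0.0), 'b'), ('fmul', 'a', 'b')),
    (('fdot3', ('vec3', 'a', 'b', 0.0), 'c'), ('fdot2', ('vec2', 'a', 'b'), 'c')),
    (('fdot3', ('vec3', 'a', 0.0, 0.0), 'b'), ('fmul', 'a', 'b')),
]
# Hypothetical "opposite order": most specific patterns are tried first.
FLIPPED = list(reversed(PATCH_ORDER))

# 'x' and 'v' stand in for arbitrary non-constant values.
expr = ('fdot4', ('vec4', 'x', 0.0, 0.0, 0.0), 'v')
print(rewrite(expr, PATCH_ORDER))  # stops at an fdot2
print(rewrite(expr, FLIPPED))      # reaches the fmul
```

In this model the posted order narrows the fdot4 to an fdot3 and then to an fdot2, and there is no fdot2 rule to finish the job, while the flipped order hits the fmul rule directly.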
Re: [Mesa-dev] [PATCH 17/22] nir: Narrow some dot product operations
On Fri, Feb 23, 2018 at 3:56 PM, Ian Romanick wrote:
> From: Ian Romanick
>
> On vector platforms, this helps elide some constant loads.
>
> No changes on Broadwell or Skylake.
>
> Haswell
> total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
> instructions in affected programs: 1277532 -> 1243902 (-2.63%)
> helped: 13216
> HURT: 95

What's going on in the hurt shaders?
Re: [Mesa-dev] [PATCH 17/22] nir: Narrow some dot product operations
Reviewed-by: Samuel Iglesias Gonsálvez

On 24/02/18 00:56, Ian Romanick wrote:
> From: Ian Romanick
>
> On vector platforms, this helps elide some constant loads.
>
> No changes on Broadwell or Skylake.
>
> Haswell
> total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
> instructions in affected programs: 1277532 -> 1243902 (-2.63%)
> helped: 13216
> HURT: 95
> helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
> helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
> HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1
> HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
> 95% mean confidence interval for instructions value: -2.57 -2.49
> 95% mean confidence interval for instructions %-change: -3.65% -3.54%
> Instructions are helped.
>
> total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
> cycles in affected programs: 71730652 -> 71418296 (-0.44%)
> helped: 9898
> HURT: 2352
> helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
> helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
> HURT stats (abs) min: 2 max: 276 x̄: 23.25 x̃: 6
> HURT stats (rel) min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
> 95% mean confidence interval for cycles value: -33.19 -17.80
> 95% mean confidence interval for cycles %-change: -4.50% -4.26%
> Cycles are helped.
>
> total fills in shared programs: 82059 -> 82052 (<.01%)
> fills in affected programs: 21 -> 14 (-33.33%)
> helped: 7
> HURT: 0
>
> Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
> total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
> instructions in affected programs: 1155007 -> 1123761 (-2.71%)
> helped: 12304
> HURT: 95
> helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
> helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
> HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1
> HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
> 95% mean confidence interval for instructions value: -2.56 -2.48
> 95% mean confidence interval for instructions %-change: -3.71% -3.59%
> Instructions are helped.
>
> total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
> cycles in affected programs: 71999580 -> 71697976 (-0.42%)
> helped: 9155
> HURT: 2380
> helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
> helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
> HURT stats (abs) min: 2 max: 290 x̄: 21.14 x̃: 4
> HURT stats (rel) min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
> 95% mean confidence interval for cycles value: -34.32 -17.97
> 95% mean confidence interval for cycles %-change: -4.55% -4.29%
> Cycles are helped.
>
> GM45 and Iron Lake had nearly identical results (Iron Lake shown)
> total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
> instructions in affected programs: 373781 -> 366975 (-1.82%)
> helped: 3715
> HURT: 47
> helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
> helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
> HURT stats (abs) min: 1 max: 6 x̄: 2.55 x̃: 2
> HURT stats (rel) min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
> 95% mean confidence interval for instructions value: -1.85 -1.77
> 95% mean confidence interval for instructions %-change: -2.91% -2.73%
> Instructions are helped.
>
> total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
> cycles in affected programs: 7227666 -> 7208482 (-0.27%)
> helped: 3349
> HURT: 301
> helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
> helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
> HURT stats (abs) min: 2 max: 42 x̄: 9.13 x̃: 10
> HURT stats (rel) min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
> 95% mean confidence interval for cycles value: -5.52 -4.99
> 95% mean confidence interval for cycles %-change: -0.81% -0.73%
> Cycles are helped.
>
> Signed-off-by: Ian Romanick
> ---
>  src/compiler/nir/nir_opt_algebraic.py | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index 26ddf10..3366a43 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -125,6 +125,14 @@ optimizations = [
>     (('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
>     (('~fadd', ('fmul', a, b), c), ('ffma', a, b, c), 'options->fuse_ffma'),
>
> +   (('fdot4', ('vec4', a, b, c, 1.0), d), ('fdph', ('vec3', a, b, c), d)),
> +   (('fdot4', ('vec4', a, b, c, 0.0), d), ('fdot3', ('vec3', a, b, c), d)),
> +   (('fdot4', ('vec4', a, b, 0.0, 0.0), c), ('fdot2', ('vec2', a, b), c)),
> +   (('fdot4', ('vec4', a, 0.0, 0.0, 0.0), b), ('fmul', a, b)),
> +
> +   (('fdot3', ('vec3', a, b, 0.0), c), ('fdot2', ('vec2', a, b), c)),
> +   (('fdot3', ('vec3', a, 0.0, 0.0), b), ('fmul', a, b)),
> +
>     # (a * #b + #c) << #d
>     # ((a * #b) << #d) + (#c << #d)
>     # (a * (#b << #d)) + (#c << #d)
[Mesa-dev] [PATCH 17/22] nir: Narrow some dot product operations
From: Ian Romanick

On vector platforms, this helps elide some constant loads.

No changes on Broadwell or Skylake.

Haswell
total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
instructions in affected programs: 1277532 -> 1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.

total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
cycles in affected programs: 71730652 -> 71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs) min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel) min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.

total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0

Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
instructions in affected programs: 1155007 -> 1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.

total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
cycles in affected programs: 71999580 -> 71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs) min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel) min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.

GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs) min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel) min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.

total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
cycles in affected programs: 7227666 -> 7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs) min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel) min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.

Signed-off-by: Ian Romanick
---
 src/compiler/nir/nir_opt_algebraic.py | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index 26ddf10..3366a43 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -125,6 +125,14 @@ optimizations = [
    (('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
    (('~fadd', ('fmul', a, b), c), ('ffma', a, b, c), 'options->fuse_ffma'),

+   (('fdot4', ('vec4', a, b, c, 1.0), d), ('fdph', ('vec3', a, b, c), d)),
+   (('fdot4', ('vec4', a, b, c, 0.0), d), ('fdot3', ('vec3', a, b, c), d)),
+   (('fdot4', ('vec4', a, b, 0.0, 0.0), c), ('fdot2', ('vec2', a, b), c)),
+   (('fdot4', ('vec4', a, 0.0, 0.0, 0.0), b), ('fmul', a, b)),
+
+   (('fdot3', ('vec3', a, b, 0.0), c), ('fdot2', ('vec2', a, b), c)),
+   (('fdot3', ('vec3', a, 0.0, 0.0), b), ('fmul', a, b)),
+
    # (a * #b + #c) << #d
    # ((a * #b) << #d) + (#c << #d)
    # (a * (#b << #d)) + (#c << #d)
--
2.9.5
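
For anyone who wants to convince themselves of the arithmetic behind the six patterns in the patch above, here is a small numeric sanity check in plain Python. It is only a sketch: fdot, fdph and check_identities are helper names invented for this example (not Mesa code), inputs are restricted to finite values, and the fmul case assumes the scalar multiply reads the first component of the vector operand.

```python
import math
import random

def fdot(x, y):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(x, y))

def fdph(x3, y4):
    """Homogeneous dot product: x3 is treated as (x, y, z, 1.0)."""
    return fdot(x3, y4[:3]) + y4[3]

def check_identities(trials=1000, seed=0):
    """Check each rewrite from the patch on random finite inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b, c = (rng.uniform(-10.0, 10.0) for _ in range(3))
        d = [rng.uniform(-10.0, 10.0) for _ in range(4)]
        checks = [
            # fdot4(vec4(a, b, c, 1.0), d) -> fdph(vec3(a, b, c), d)
            (fdot((a, b, c, 1.0), d), fdph((a, b, c), d)),
            # fdot4(vec4(a, b, c, 0.0), d) -> fdot3(vec3(a, b, c), d)
            (fdot((a, b, c, 0.0), d), fdot((a, b, c), d[:3])),
            # fdot4(vec4(a, b, 0.0, 0.0), d) -> fdot2(vec2(a, b), d)
            (fdot((a, b, 0.0, 0.0), d), fdot((a, b), d[:2])),
            # fdot4(vec4(a, 0.0, 0.0, 0.0), d) -> fmul(a, d.x)  (assumed swizzle)
            (fdot((a, 0.0, 0.0, 0.0), d), a * d[0]),
            # fdot3(vec3(a, b, 0.0), d) -> fdot2(vec2(a, b), d)
            (fdot((a, b, 0.0), d[:3]), fdot((a, b), d[:2])),
            # fdot3(vec3(a, 0.0, 0.0), d) -> fmul(a, d.x)  (assumed swizzle)
            (fdot((a, 0.0, 0.0), d[:3]), a * d[0]),
        ]
        if not all(math.isclose(lhs, rhs) for lhs, rhs in checks):
            return False
    return True

print(check_identities())
```

The check says nothing about instruction counts or cycles; it only confirms that dropping the constant 0.0 and 1.0 lanes leaves the computed value unchanged for finite inputs.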