Module: Mesa
Branch: main
Commit: a1c765a44c032fe505e4863297b6bbefa0ce426c
URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=a1c765a44c032fe505e4863297b6bbefa0ce426c

Author: Emma Anholt <[email protected]>
Date:   Tue Aug  9 17:48:14 2022 -0700

freedreno/ir3: Consistently lower mediump inputs to 16-bit (when we can).

If every use was a conversion to 16, then ir3_cf would fold it into the
bary instruction.  But if something had generated a highp comparison of
the mediump input with a mediump op result, it would get stuck as highp,
even though we could have used 16-bit values without upconverting.

This fixes dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_fragment on
ANGLE on turnip, closing #7043.  fossil-db results are mixed:

fossil-db:
Totals from 697 (4.65% of 14988) affected shaders:
MaxWaves: 10712 -> 10736 (+0.22%)
Instrs: 82394 -> 83572 (+1.43%); split: -1.31%, +2.74%
CodeSize: 178280 -> 180118 (+1.03%); split: -0.46%, +1.49%
NOPs: 15887 -> 16067 (+1.13%); split: -7.48%, +8.61%
MOVs: 1297 -> 1328 (+2.39%); split: -6.86%, +9.25%
Full: 3730 -> 3842 (+3.00%); split: -1.80%, +4.80%
(ss): 1877 -> 1849 (-1.49%); split: -5.59%, +4.10%
(sy): 1249 -> 1255 (+0.48%); split: -1.04%, +1.52%
(ss)-stall: 6809 -> 6364 (-6.54%); split: -13.85%, +7.31%
(sy)-stall: 17059 -> 17257 (+1.16%); split: -6.51%, +7.67%
Cat0: 17220 -> 17400 (+1.05%); split: -6.90%, +7.94%
Cat1: 5307 -> 6366 (+19.95%); split: -6.93%, +26.89%
Cat2: 39138 -> 39101 (-0.09%); split: -0.31%, +0.22%
Cat3: 16772 -> 16741 (-0.18%)
Cat5: 1269 -> 1276 (+0.55%)

I tried to pick some apps to test that looked the most impacted, and
indeed the results are mixed:

cookie_run_kingdom:         +0.275514% +/- 0.0883816% (n=68)
trex_200:                   +0.0943847% +/- 0.0297073% (n=1463)
command_and_conquer_rivals: no difference (n=131)
war_planet_online:          no difference (n=120)
lego_legacy:                -0.192131% +/- 0.152083% (n=99)
among_us:                   -0.625227% +/- 0.385419% (n=60)

Given that the perf results are small and go both ways, and apparently
we're an outlier in not always lowering mediump inputs to 16-bit, just do
it for consistency with other drivers.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18506>

---

 .../freedreno-a630-skqp-gles_rendertests-fails.txt |  2 ++
 src/freedreno/ci/traces-freedreno.yml              |  6 ++--
 src/freedreno/ir3/ir3_nir.c                        | 34 ++++++++++++++++++++++
 3 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/src/freedreno/ci/freedreno-a630-skqp-gles_rendertests-fails.txt 
b/src/freedreno/ci/freedreno-a630-skqp-gles_rendertests-fails.txt
index 29dc326836a..baef6016744 100644
--- a/src/freedreno/ci/freedreno-a630-skqp-gles_rendertests-fails.txt
+++ b/src/freedreno/ci/freedreno-a630-skqp-gles_rendertests-fails.txt
@@ -1178,6 +1178,8 @@ image_from_yuv_textures,-1
 image_scale_aligned,-1
 imagealphathreshold_image,-1
 imageblur,-1
+# A few pixels at the edge of the blur, nothing more noticeable at those 
points than the jaggedness elsewhere.
+imageblurclampmode,180
 imagefilters_xfermodes,-1
 imagefiltersbase,-1
 imagefiltersclipped,-1
diff --git a/src/freedreno/ci/traces-freedreno.yml 
b/src/freedreno/ci/traces-freedreno.yml
index 30cb6ba0bb9..c44e29b727b 100644
--- a/src/freedreno/ci/traces-freedreno.yml
+++ b/src/freedreno/ci/traces-freedreno.yml
@@ -207,7 +207,7 @@ traces:
     freedreno-a530:
       checksum: 70e18ba06d56fea277cd3fb000729879
     freedreno-a630:
-      checksum: 9eb6d261c0c5946d8aeb0f41b2b7c1b1
+      checksum: 92944a4996ea019f51cfcdeaf4fc6926
 
   glmark2/desktop:windows=4:effect=shadow-v2.trace:
     freedreno-a306:
@@ -223,7 +223,7 @@ traces:
     freedreno-a530:
       checksum: 7712c8143a244c56922124f4ac207722
     freedreno-a630:
-      checksum: 7712c8143a244c56922124f4ac207722
+      checksum: 8a3b04aff4848fd36de9d956f24fac2f
 
   glmark2/effect2d:kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;-v2.trace:
     freedreno-a306:
@@ -231,7 +231,7 @@ traces:
     freedreno-a530:
       checksum: 50f1841f2bb96905c9fcd1815c4a95c0
     freedreno-a630:
-      checksum: 50f1841f2bb96905c9fcd1815c4a95c0
+      checksum: 05e7d72609840c897f734dca03748ead
 
   glmark2/function:fragment-steps=5:fragment-complexity=low-v2.trace:
     freedreno-a306:
diff --git a/src/freedreno/ir3/ir3_nir.c b/src/freedreno/ir3/ir3_nir.c
index a63b66f64bc..f9a35d31eb4 100644
--- a/src/freedreno/ir3/ir3_nir.c
+++ b/src/freedreno/ir3/ir3_nir.c
@@ -463,6 +463,40 @@ ir3_nir_post_finalize(struct ir3_shader *shader)
 
    if (compiler->gen >= 6 && s->info.stage == MESA_SHADER_FRAGMENT &&
        !(ir3_shader_debug & IR3_DBG_NOFP16)) {
+      /* Lower FS mediump inputs to 16-bit. If you declared it mediump, you
+       * probably want 16-bit instructions (and have set
+       * mediump/RelaxedPrecision on most of the rest of the shader's
+       * instructions).  If we don't lower it in NIR, then comparisons of the
+       * results of mediump ALU ops with the mediump input will happen in 
highp,
+       * causing extra conversions (and, incidentally, causing
+       * dEQP-GLES2.functional.shaders.algorithm.rgb_to_hsl_fragment on ANGLE 
to
+       * fail)
+       *
+       * However, we can't do flat inputs because flat.b doesn't have the
+       * destination type for how to downconvert the
+       * 32-bit-in-the-varyings-interpolator value. (also, even if it did, 
watch
+       * out for how gl_nir_lower_packed_varyings packs all flat-interpolated
+       * things together as ivec4s, so when we lower a formerly-float input
+       * you'd end up with an incorrect f2f16(i2i32(load_input())) instead of
+       * load_input).
+       */
+      uint64_t mediump_varyings = 0;
+      nir_foreach_shader_in_variable(var, s) {
+         if ((var->data.precision == GLSL_PRECISION_MEDIUM ||
+              var->data.precision == GLSL_PRECISION_LOW) &&
+             var->data.interpolation != INTERP_MODE_FLAT) {
+            mediump_varyings |= BITFIELD64_BIT(var->data.location);
+         }
+      }
+
+      if (mediump_varyings) {
+         NIR_PASS_V(s, nir_lower_mediump_io,
+                  nir_var_shader_in,
+                  mediump_varyings,
+                  false);
+      }
+
+      /* This should come after input lowering, to opportunistically lower 
non-mediump outputs. */
       NIR_PASS_V(s, nir_lower_mediump_io, nir_var_shader_out, 0, false);
    }
 

Reply via email to