On 02.10.2016 20:26, Ilia Mirkin wrote:
That's very odd. LoadPropagation should have picked that up even in
its current form. Should try to figure out why it didn't and that is
likely to "fix" a *lot* more situations.

Actually i was coming from an, given really constrained, addition to the LoadPropagation pass, where i was told to fix it within OP_MAD :/

On Sun, Oct 2, 2016 at 2:24 PM, Tobias Klausmann
<[email protected]> wrote:

On 02.10.2016 20:03, Ilia Mirkin wrote:
On Sun, Oct 2, 2016 at 1:58 PM, Tobias Klausmann
<[email protected]> wrote:
Previously we'd end up with an unnecessary mov for the thirs immediate
value.

total instructions in shared programs : 851881 -> 851864 (-0.00%)
total gprs used in shared programs    : 110295 -> 110295 (0.00%)
total local used in shared programs   : 1020 -> 1020 (0.00%)

                  local        gpr       inst      bytes
      helped           0           0          17          17
        hurt           0           0           0           0

Suggested-by: Karol Herbst <[email protected]>
Signed-off-by: Tobias Klausmann <[email protected]>
---
   src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 15
++++++++++++---
   1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 9875738..8bb5cf9 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1008,13 +1008,22 @@ ConstantFolding::opnd(Instruction *i,
ImmediateValue &imm0, int s)
         break;
      case OP_MAD:
         if (imm0.isInteger(0)) {
+         ImmediateValue imm1;
            i->setSrc(0, i->getSrc(2));
            i->src(0).mod = i->src(2).mod;
            i->setSrc(1, NULL);
            i->setSrc(2, NULL);
-         i->op = i->src(0).mod.getOp();
-         if (i->op != OP_CVT)
-            i->src(0).mod = 0;
+         if (i->src(0).getImmediate(imm1)) {
+            bld.setPosition(i, false);
+            newi = bld.mkMov(i->getDef(0), bld.mkImm(imm1.reg.data.u64),
+                             i->dType);
+            delete_Instruction(prog, i);
What's an example of a situation where this helps? It shouldn't
matter, the mov's should get cleaned up. [Clearly 17 shaders
disagree...] Is this just a side-effect of the fact that we don't run
the opts to a fixed point?

It is a second mov that causes a problem for later folding in the imm, here
output of a testshader[1]:

   0: nop u32 %r56 (0)
   1: ld  u32 %r31 c0[0x0] (0)
   2: ld  u32 %r37 c0[0x140] (0)
   3: mov u32 %r38 0x00000000 (0)
   4: mov u32 %r39 0x3f800000 (0)
   5: mad f32 %r40 %r37 %r38 %r39 (0)
   6: mad f32 %r44 %r37 %r38 %r38 (0)
   7: add f32 %r53 %r31 %r40 (0)
   8: add f32 %r54 %r31 %r44 (0)
   9: add f32 %r57 %r56 %r44 (0)

Constantfolding...

MAIN:-1 ()
BB:0 (14 instructions) - df = { }
  -> BB:1 (tree)
   0: nop u32 %r56 (0)
   1: ld  u32 %r31 c0[0x0] (0)
   2: ld  u32 %r37 c0[0x140] (0)
   3: mov u32 %r38 0x00000000 (0)
   4: mov u32 %r39 0x3f800000 (0)
   5: mov f32 %r40 %r39 (0)
   6: mov f32 %r44 %r38 (0)
   7: add f32 %r53 %r31 %r40 (0)
   8: mov f32 %r54 %r31 (0)
   9: mov f32 %r57 %r56 (0)


The outcome:
   0: ld  u32 $r2 c0[0x0] (8)
   1: mov u32 $r0 0x3f800000 (8)
   2: add ftz f32 $r0 $r2 $r0 (8)
   3: mov f32 $r3 $r1 (8)
   4: mov u32 $r1 $r2 (8)
   5: export b128 # o[0x0] $r0q (8)

With patch:
   0: ld  u32 $r2 c0[0x0] (8)
   1: add ftz f32 $r0 $r2 1.000000 (8)
   2: mov f32 $r3 $r1 (8)
   3: mov u32 $r1 $r2 (8)
   4: export b128 # o[0x0] $r0q (8)


[1]:
VERT
PROPERTY NEXT_SHADER FRAG
DCL OUT[0], GENERIC[0]
DCL CONST[0]
DCL TEMP[0..1], LOCAL
IMM[0] FLT32 {    0.0078,    -1.0000,     0.0000,     0.5000}
IMM[1] FLT32 {    1.0000,     0.0000, 65535.0000,     0.0100}
   0: MOV TEMP[0].xyz, CONST[0].xxxx
  39: MAD TEMP[1], CONST[20].xxxx, IMM[1].yyyy, IMM[1].xyyy
  41: ADD TEMP[1], TEMP[0], TEMP[1]
208: MOV OUT[0], TEMP[1]
211: END




+         }
+         else {
+            i->op = i->src(0).mod.getOp();
+            if (i->op != OP_CVT)
+               i->src(0).mod = 0;
+         }
         } else
         if (i->subOp != NV50_IR_SUBOP_MUL_HIGH &&
             (imm0.isInteger(1) || imm0.isInteger(-1))) {
--
2.10.0

_______________________________________________
Nouveau mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/nouveau


_______________________________________________
Nouveau mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/nouveau

Reply via email to