Issue 185175
Summary [DAG] Missing funnel shift optimisation in load-local-v3i129.ll test
Labels missed-optimization, llvm:SelectionDAG
Assignees
Reporter RKSimon
    https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/load-local-v3i129.ll

"FAST-SHLD" targets retain the funnel shift (SHLDQ) instructions, but "SLOW-SHLD" targets expand them back to or(shl,srl) patterns, which are then simplified further until all of the bit-shift patterns are removed:

FAST:
```
Legalized selection DAG: %bb.0 '_start:Entry'
SelectionDAG has 33 nodes:
  t0: ch,glue = EntryToken
  t66: i64 = add FrameIndex:i64<0>, Constant:i64<32>
  t186: i64 = add nuw FrameIndex:i64<0>, Constant:i64<16>
  t376: i64 = add FrameIndex:i64<0>, Constant:i64<40>
  t169: i64,ch = load<(load (s64) from %ir.y + 32, align 32, basealign 64)> t0, t66, undef:i64
  t210: i64,ch = load<(load (s64) from %ir.y + 16, align 16, basealign 64)> t0, t186, undef:i64
  t172: i64,ch = load<(load (s64) from %ir.y + 40, basealign 64)> t0, t376, undef:i64
        t374: i64 = add nuw FrameIndex:i64<0>, Constant:i64<24>
      t310: ch = store<(store (s64) into %ir.y + 24, basealign 64)> t0, Constant:i64<-1>, t374, undef:i64
        t304: i64 = or t210, Constant:i64<-2>
      t313: ch = store<(store (s64) into %ir.y + 16, align 16, basealign 64)> t210:1, t304, t186, undef:i64
          t183: i64 = srl t172, Constant:i8<2>
          t180: i64 = shl t172, Constant:i8<62>
        t363: i64 = fshl t183, t180, Constant:i8<2>
      t322: ch = store<(store (s64) into %ir.y + 40, basealign 64)> t172:1, t363, t376, undef:i64
          t331: i64 = and t169, Constant:i64<-4>
        t286: i64 = or t331, Constant:i64<1>
      t325: ch = store<(store (s64) into %ir.y + 32, align 32, basealign 64)> t169:1, t286, t66, undef:i64
    t372: ch = TokenFactor t310, t313, t322, t325
  t11: ch = X86ISD::RET_GLUE t372, TargetConstant:i32<0>
```

SLOW:
```
Legalized selection DAG: %bb.0 '_start:Entry'
SelectionDAG has 35 nodes:
  t0: ch,glue = EntryToken
  t66: i64 = add FrameIndex:i64<0>, Constant:i64<32>
  t186: i64 = add nuw FrameIndex:i64<0>, Constant:i64<16>
  t376: i64 = add FrameIndex:i64<0>, Constant:i64<40>
  t169: i64,ch = load<(load (s64) from %ir.y + 32, align 32, basealign 64)> t0, t66, undef:i64
  t210: i64,ch = load<(load (s64) from %ir.y + 16, align 16, basealign 64)> t0, t186, undef:i64
  t172: i64,ch = load<(load (s64) from %ir.y + 40, basealign 64)> t0, t376, undef:i64
        t374: i64 = add nuw FrameIndex:i64<0>, Constant:i64<24>
      t310: ch = store<(store (s64) into %ir.y + 24, basealign 64)> t0, Constant:i64<-1>, t374, undef:i64
        t304: i64 = or t210, Constant:i64<-2>
      t313: ch = store<(store (s64) into %ir.y + 16, align 16, basealign 64)> t210:1, t304, t186, undef:i64
            t183: i64 = srl t172, Constant:i8<2>
          t378: i64 = shl t183, Constant:i8<2>
            t180: i64 = shl t172, Constant:i8<62>
          t379: i64 = srl t180, Constant:i8<62>
        t380: i64 = or t378, t379
      t322: ch = store<(store (s64) into %ir.y + 40, basealign 64)> t172:1, t380, t376, undef:i64
          t331: i64 = and t169, Constant:i64<-4>
        t286: i64 = or t331, Constant:i64<1>
      t325: ch = store<(store (s64) into %ir.y + 32, align 32, basealign 64)> t169:1, t286, t66, undef:i64
    t372: ch = TokenFactor t310, t313, t322, t325
  t11: ch = X86ISD::RET_GLUE t372, TargetConstant:i32<0>
```

SLOW exposes "zext_inreg" shl+srl style patterns that the FAST fshl node prevents us from handling.
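
To make the missed fold concrete, here is a rough sketch in Python (a toy 64-bit model, not LLVM code) of the value stored to %ir.y + 40 in both dumps. The helper names `shl`, `srl`, `fshl`, `fast_form`, `slow_form` and `masked_form` are made up for illustration, and `fshl` is assumed to follow the usual funnel-shift-left definition (concatenate the two operands, shift left, keep the high half). Under those assumptions both forms reduce to the original loaded value, but only the expanded or(shl,srl) form exposes the shl+srl pairs that fold to masks.

```python
# Toy 64-bit model of the DAG nodes in the dumps above (illustrative only).
MASK64 = (1 << 64) - 1


def shl(x, n):
    """Logical shift left, truncated to 64 bits."""
    return (x << n) & MASK64


def srl(x, n):
    """Logical shift right on a 64-bit value."""
    return (x & MASK64) >> n


def fshl(hi, lo, n):
    """Funnel shift left: high 64 bits of ((hi:lo) << (n mod 64))."""
    n %= 64
    return ((((hi << 64) | lo) << n) >> 64) & MASK64


def fast_form(x):
    # FAST: t363 = fshl (srl t172, 2), (shl t172, 62), 2
    return fshl(srl(x, 2), shl(x, 62), 2)


def slow_form(x):
    # SLOW: t380 = or (shl (srl t172, 2), 2), (srl (shl t172, 62), 62)
    return shl(srl(x, 2), 2) | srl(shl(x, 62), 62)


def masked_form(x):
    # Each shl/srl pair is a mask: (x & -4) | (x & 3)
    return (x & (MASK64 ^ 3)) | (x & 3)


for x in (0, 1, 3, 0x8000000000000001, 0x123456789ABCDEF0, MASK64):
    assert fast_form(x) == slow_form(x) == masked_form(x) == x
```

The loop checks a handful of values only; it is a sanity check of the identity, not a proof that the combine is valid in every context.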

Noticed while triaging #184016