[llvm-bugs] [Bug 56088] [AArch64][SVE] Suboptimal code-gen for fmul(index)

LLVM Bugs via llvm-bugs Fri, 17 Jun 2022 06:27:36 -0700

Issue	56088
Summary	[AArch64][SVE] Suboptimal code-gen for fmul(index)
Labels
Assignees
Reporter	stevesuzuki-arm

    https://godbolt.org/z/6E4Ts6zc7
```
define <vscale x 8 x half> @fmul_index_nxv8(half %a, <vscale x 8 x half> %b) #0 {
    %1 = fsub reassoc nnan ninf nsz contract afn half 0xH3C00, %a
    %2 = insertelement <vscale x 8 x half> undef, half %1, i64 0
    %3 = shufflevector <vscale x 8 x half> %2, <vscale x 8 x half> undef, <vscale x 8 x i32> zeroinitializer
    %4 = fmul reassoc nnan ninf nsz contract afn <vscale x 8 x half> %b, %3
    ret <vscale x 8 x half> %4
}
```
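With SVE2, more instructions are generated than with Neon: the Neon code uses the indexed form of `fmul` (`fmul v0.8h, v1.8h, v0.h[0]`), while the SVE2 code first broadcasts the scalar with `mov` and then uses a plain vector `fmul`.
Options: `-mattr=+sve2 -O3`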
```
fmul_index_nxv8:                        // @fmul_index_nxv8
        fmov    h2, #1.00000000
        fsub    h0, h2, h0
        mov     z0.h, h0
        fmul    z0.h, z1.h, z0.h
        ret
fmul_index_v8:                          // @fmul_index_v8
        fmov    h2, #1.00000000
        fsub    h0, h2, h0
        fmul    v0.8h, v1.8h, v0.h[0]
        ret
```
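
The assembly above includes `fmul_index_v8`, but the report only shows IR for the scalable function. The following is a sketch of what the fixed-width counterpart presumably looks like, assuming it mirrors `fmul_index_nxv8` with `<8 x half>` in place of `<vscale x 8 x half>`. It is reconstructed from the function name and the register usage in the output, not copied from the Godbolt link, so the actual source may differ. The `#0` attribute group is omitted because its definition is not shown in the report.

```
; Hypothetical fixed-width counterpart, reconstructed from the assembly output.
; The actual source is behind the Godbolt link and may differ.
define <8 x half> @fmul_index_v8(half %a, <8 x half> %b) {
    ; 1.0 - a, computed as a scalar (0xH3C00 is half 1.0)
    %1 = fsub reassoc nnan ninf nsz contract afn half 0xH3C00, %a
    ; splat the scalar result across all 8 lanes
    %2 = insertelement <8 x half> undef, half %1, i64 0
    %3 = shufflevector <8 x half> %2, <8 x half> undef, <8 x i32> zeroinitializer
    ; multiply by the splat; for Neon this is selected as the by-element fmul
    %4 = fmul reassoc nnan ninf nsz contract afn <8 x half> %b, %3
    ret <8 x half> %4
}
```

Feeding both functions to `llc` with the options listed above should allow the two lowerings to be compared side by side.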

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
