Issue 56088
Summary [AArch64][SVE] Suboptimal code-gen for fmul(index)
Labels
Assignees
Reporter stevesuzuki-arm
    https://godbolt.org/z/6E4Ts6zc7
```
define <vscale x 8 x half> @fmul_index_nxv8(half %a, <vscale x 8 x half> %b) #0 {
    %1 = fsub reassoc nnan ninf nsz contract afn half 0xH3C00, %a
    %2 = insertelement <vscale x 8 x half> undef, half %1, i64 0
    %3 = shufflevector <vscale x 8 x half> %2, <vscale x 8 x half> undef, <vscale x 8 x i32> zeroinitializer
    %4 = fmul reassoc nnan ninf nsz contract afn <vscale x 8 x half> %b, %3
    ret <vscale x 8 x half> %4
}
```
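The assembly output in this report also lists a Neon function, `fmul_index_v8`, whose IR is only available through the Godbolt link. As a sketch for comparison, a fixed-width counterpart would presumably look like the following. This is an assumption, obtained by replacing `<vscale x 8 x half>` with `<8 x half>` and dropping the `#0` attribute group (which is not defined in this report); it is not copied from the link.

```llvm
; Assumed fixed-width (Neon) counterpart of @fmul_index_nxv8.
; Not taken from the Godbolt link; reconstructed by analogy.
define <8 x half> @fmul_index_v8(half %a, <8 x half> %b) {
    ; 1.0 - a, computed as a scalar (0xH3C00 is half-precision 1.0)
    %1 = fsub reassoc nnan ninf nsz contract afn half 0xH3C00, %a
    ; splat the scalar: insert into lane 0, then shuffle with an all-zero mask
    %2 = insertelement <8 x half> undef, half %1, i64 0
    %3 = shufflevector <8 x half> %2, <8 x half> undef, <8 x i32> zeroinitializer
    ; multiply every lane of %b by the splatted value
    %4 = fmul reassoc nnan ninf nsz contract afn <8 x half> %b, %3
    ret <8 x half> %4
}
```

The only intended difference from the scalable function is the vector type, so any difference in the generated code should come from how the splat followed by `fmul` is selected.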
With SVE2, more instructions are generated than with Neon: the Neon version uses the indexed (by-element) form of `fmul`, whereas the SVE2 version emits `mov` + `fmul`.
Options: `-mattr=+sve2 -O3`
```
fmul_index_nxv8:                        // @fmul_index_nxv8
        fmov    h2, #1.00000000
        fsub    h0, h2, h0
        mov     z0.h, h0
        fmul    z0.h, z1.h, z0.h
        ret
fmul_index_v8:                          // @fmul_index_v8
        fmov    h2, #1.00000000
        fsub    h0, h2, h0
        fmul    v0.8h, v1.8h, v0.h[0]
        ret
```
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
