Issue 86785
Summary [LoopVectorize] LoopVectorize produces redundant instructions due to IV widening in IndVarSimplify
Labels new issue
Assignees
Reporter LittleMeepo
```
void fun1(int* restrict A, int* restrict B) {
    int i;
    for (i = 1; i <= 1000; i++) {
        if (i % 2 == 0) {
            A[i] = B[i] * B[i];
        } else {
            A[i] = B[i] + B[i];
        }
    }
}
```

Comparison of GCC 13.2 and LLVM output:

https://godbolt.org/z/4v993aq9j

With `-mllvm --indvars-widen-indvars=false`, the merge of the predicate (p) registers is simplified. The cause is that IndVarSimplify widens the induction variable (IV) from i32 to i64.
GCC uses `.s` instructions when `i` is `int` and `.d` instructions when `i` is `long long`.
Can LLVM do something similar? Having to pass the option manually does not seem like a good solution.
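
For reference, here is a minimal sketch of the two source variants being compared. It assumes the `long long` version differs from `fun1` only in the type of `i`. The names `fun1_int`, `fun1_ll`, `N`, and `count_mismatches` are hypothetical and only serve to make the comparison self-contained. Both variants compute the same result, so the element width of the parity compare should not need to follow the widened IV.

```c
#define N 1001  /* indices 1..1000 are written, as in fun1 */

/* Form from the report: 32-bit induction variable. */
void fun1_int(int *restrict A, int *restrict B) {
    int i;
    for (i = 1; i <= 1000; i++) {
        if (i % 2 == 0) {
            A[i] = B[i] * B[i];
        } else {
            A[i] = B[i] + B[i];
        }
    }
}

/* Hypothetical variant: identical body, 64-bit induction variable. */
void fun1_ll(int *restrict A, int *restrict B) {
    long long i;
    for (i = 1; i <= 1000; i++) {
        if (i % 2 == 0) {
            A[i] = B[i] * B[i];
        } else {
            A[i] = B[i] + B[i];
        }
    }
}

/* Runs both variants on the same input and returns the number of
   elements on which they disagree. */
int count_mismatches(void) {
    static int B[N], A1[N], A2[N];
    int k, bad = 0;
    for (k = 0; k < N; k++) {
        B[k] = k - 500;      /* small values, so B[k] * B[k] cannot overflow */
        A1[k] = A2[k] = 0;
    }
    fun1_int(A1, B);
    fun1_ll(A2, B);
    for (k = 0; k < N; k++) {
        if (A1[k] != A2[k]) {
            bad++;
        }
    }
    return bad;
}
```

Compiling the two functions for an SVE target, with and without `-mllvm --indvars-widen-indvars=false`, should make it possible to compare the predicate handling side by side.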

```
.LBB0_1:                                // =>This Inner Loop Header: Depth=1
        mov     z3.d, z1.d
        add     z4.d, z0.d, z2.d
        and     z0.d, z0.d, #0x1
        add     z1.d, z1.d, z2.d
        and     z3.d, z3.d, #0x1
        cmpeq   p2.d, p0/z, z0.d, #0               // <-- redundant inst
        ld1w    { z0.s }, p1/z, [x11, x8, lsl #2]
        cmpeq   p3.d, p0/z, z3.d, #0               // <-- redundant inst
        mul     z3.s, z0.s, z0.s
        lsl     z0.s, z0.s, #1
        uzp1    p2.s, p2.s, p3.s                   // <-- redundant inst
        mov     z0.s, p2/m, z3.s
        st1w    { z0.s }, p1, [x12, x8, lsl #2]
        mov     z0.d, z4.d
        incw    x8
        cmp     x10, x8
        b.ne    .LBB0_1
```