https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122103

--- Comment #14 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <[email protected]>:

https://gcc.gnu.org/g:7fcd3ed36c68d39b1d51137d5bdf0bd91b99be60

commit r16-6510-g7fcd3ed36c68d39b1d51137d5bdf0bd91b99be60
Author: Tamar Christina <[email protected]>
Date:   Mon Jan 5 20:55:34 2026 +0000

    vect: teach if-convert to predicate __builtin calls [PR122103]

    The following testcase

    void f (float *__restrict c, int *__restrict d, int n)
    {
        for (int i = 0; i < n; i++)
        {
          if (d[i] > 1000)
            c[i] = __builtin_sqrtf (c[i]);
        }
    }

    compiled with -O3 -march=armv9-a -fno-math-errno -ftrapping-math needs to be
    predicated on the conditional.  It's invalid to execute the branch and use a
    select to extract it later unless using -fno-trapping-math.

    This change in if-conversion changes what we used to generate:

      _26 = _4 > 1000;
      _34 = _33 + _2;
      _5 = (float *) _34;
      _6 = .MASK_LOAD (_5, 32B, _26, 0.0);
      _7 = __builtin_sqrtf (_6);
      .MASK_STORE (_5, 32B, _26, _7);

    into

      _26 = _4 > 1000;
      _34 = _33 + _2;
      _5 = (float *) _34;
      _6 = .MASK_LOAD (_5, 32B, _26, 0.0);
      _7 = .COND_SQRT (_26, _6, _6);
      .MASK_STORE (_5, 32B, _26, _7);

    which correctly results in

    .L3:
            ld1w    z0.s, p7/z, [x1, x3, lsl 2]
            cmpgt   p7.s, p7/z, z0.s, z31.s
            ld1w    z30.s, p7/z, [x0, x3, lsl 2]
            fsqrt   z30.s, p7/m, z30.s
            st1w    z30.s, p7, [x0, x3, lsl 2]
            incw    x3
            whilelo p7.s, w3, w2
            b.any   .L3

    instead of

    .L3:
            ld1w    z0.s, p7/z, [x1, x3, lsl 2]
            cmpgt   p7.s, p7/z, z0.s, z31.s
            ld1w    z30.s, p7/z, [x0, x3, lsl 2]
            fsqrt   z30.s, p6/m, z30.s
            st1w    z30.s, p7, [x0, x3, lsl 2]
            incw    x3
            whilelo p7.s, w3, w2
            b.any   .L3

    gcc/ChangeLog:

            PR tree-optimization/122103
            * tree-if-conv.cc (ifcvt_can_predicate): Support
            gimple_call_builtin_p.
            (if_convertible_stmt_p, predicate_rhs_code,
            predicate_statements): Likewise.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/122103
            * gcc.target/aarch64/sve/pr122103_1.c: New test.
            * gcc.target/aarch64/sve/pr122103_2.c: New test.
            * gcc.target/aarch64/sve/pr122103_3.c: New test.
