Re: [PATCH] Fix missed SLP reduction detection for commutative operations

Richard Biener Sat, 18 Oct 2025 16:48:46 -0700

On Mon, Oct 13, 2025 at 9:22 AM Eric Botcazou <[email protected]> wrote:
>
> Hi,
>
> the attached Ada testcase compiled for x86_64-w64-mingw32 with -O3 -msse2 -
> gnatn exposes an instability in the SLP reduction detection: in 32-bit mode,
> the inner loop is effectively vectorized (after being unrolled by cunrolli):
>
> r1.adb:6:9: optimized: loop vectorized using 16 byte vectors and unroll factor
> 1
>
> whereas it is not in 64-bit mode:
>
> r1.adb:6:9: missed: couldn't vectorize loop
>
> despite almost similar .ifcvt tree dumps.  The difference stems from this:
>
>   <bb 3> [local count: 268435456]:
>   # i_1 = PHI <0(5), i_8(6)>
>   # ret$F$1_17 = PHI <ret$F$1_31(5), _27(6)>
>   # ret$F$2_30 = PHI <ret$F$2_32(5), _35(6)>
>   # ret$F$3_21 = PHI <ret$F$3_37(5), _43(6)>
>   # ret$F$4_24 = PHI <ret$F$4_39(5), _16(6)>
>   i_8 = i_1 + 1;
>   _27 = ret$F$1_17 + ret$F$1_31;
>   _35 = ret$F$2_30 + ret$F$2_32;
>   _43 = ret$F$3_21 + ret$F$3_37;
>   _16 = ret$F$4_24 + ret$F$4_39;
>
> in the 32-bit .ifcvt tree dump and this:
>
>   <bb 3> [local count: 268435456]:
>   # i_1 = PHI <0(5), i_9(6)>
>   # ret$F$1_31 = PHI <ret$F$1_20(5), _38(6)>
>   # ret$F$2_29 = PHI <ret$F$2_19(5), _46(6)>
>   # ret$F$3_21 = PHI <ret$F$3_49(5), _54(6)>
>   # ret$F$4_47 = PHI <ret$F$4_41(5), _18(6)>
>   i_9 = i_1 + 1;
>   _38 = ret$F$1_20 + ret$F$1_31;
>   _46 = ret$F$2_19 + ret$F$2_29;
>   _54 = ret$F$3_21 + ret$F$3_49;
>   _18 = ret$F$4_41 + ret$F$4_47;
>
> in the 64-bit .ifcvt tree dump.  The first pattern is accepted as a reduction
> by vect_build_slp_tree_1 but the second is not:
>
> r1.adb:6:9: note:   Build SLP for _18 = ret$F$4_41 + ret$F$4_47;
> r1.adb:6:9: note:   Build SLP for _54 = ret$F$3_21 + ret$F$3_49;
> r1.adb:6:9: missed:   Build SLP failed: different reduc_idx 0 instead of 1 in
> _54 = ret$F$3_21 + ret$F$3_49;
> r1.adb:6:9: note:   Build SLP for _46 = ret$F$2_19 + ret$F$2_29;
> r1.adb:6:9: note:   Build SLP for _38 = ret$F$1_20 + ret$F$1_31;
>
> which indicates that, if the operands of the addition had been swapped in:
>
> _54 = ret$F$3_21 + ret$F$3_49;
>
> then the pattern would have been recognized.
>
> vect_build_slp_tree_1 already deals with this kind of issues for COND_EXPR:
>
>    Note COND_EXPR is possibly isomorphic to another one after swapping its
>    operands.  Set SWAP[i] to 1 if stmt I is COND_EXPR and isomorphic to
>    the first stmt by swapping the two operands of comparison; set SWAP[i]
>    to 2 if stmt I is isormorphic to the first stmt by inverting the code
>    of comparison.  Take A1 >= B1 ? X1 : Y1 as an exmple, it can be swapped
>    to (B1 <= A1 ? X1 : Y1); or be inverted to (A1 < B1) ? Y1 : X1.  */
>
> but not for commutative operations.
>
> Does this mean that some form of canonicalization should have happened earlier
> to make the reduction detection more robust for them?  Or does the function
> need to cope with them specifically, for example as in the attached patch that
> fixes in the instability for this case?  Or else should the reduction have
> been detected earlier?


I think the patch is wrong, it ignores the difference.  But then the
issue is that
in vect_build_slp_tree_2 when we try swapping operands of mismatched
stmts we do not handle swapping for the reduc-idx (because that wasn't possible
in the past).  I _think_ it should be possible to relax this now, aka

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 950c6fb798e..00250251881 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2915,9 +2915,7 @@ out:
          && (nops == 2 || nops == 3)
          && oprnds_info[1]->first_dt == vect_internal_def
          && (is_gimple_assign (stmt_info->stmt)
-             || is_gimple_call (stmt_info->stmt))
-         /* Swapping operands for reductions breaks assumptions later on.  */
-         && STMT_VINFO_REDUC_IDX (stmt_info) == -1)
+             || is_gimple_call (stmt_info->stmt)))
        {
          /* See whether we can swap the matching or the non-matching
             stmt operands.  */


>
>
>         * tree-vect-slp.cc (vect_build_slp_tree_1): Accept swapped operands
>         for commutative operations in reduction statements.
>
>
> --
> Eric Botcazou

Re: [PATCH] Fix missed SLP reduction detection for commutative operations

Reply via email to