Hi Richard, Thanks for your comments and tips. fold_binary_op_with_conditional_arg performs the reverse transformation to this patch in certain situations:
/* Transform `a + (b ? x : y)' into `b ? (a + x) : (a + y)'. ... */ static tree fold_binary_op_with_conditional_arg (location_t loc, ... /* This transformation is only worthwhile if we don't have to wrap ARG in a SAVE_EXPR and the operation can be simplified without recursing on at least one of the branches once its pushed inside the COND_EXPR. */ if (!TREE_CONSTANT (arg) && (TREE_SIDE_EFFECTS (arg) ...) return NULL_TREE; ... For instance, this causes infinite recursion in gcc.dg/vect/fast-math-vect-call-2 because ARG is a float literal. Regards, Yuliang -----Original Message----- From: Richard Biener <richard.guent...@gmail.com> Sent: 20 September 2019 13:02 To: Yuliang Wang <yuliang.w...@arm.com> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Sandiford <richard.sandif...@arm.com> Subject: Re: [PATCH] Reduction of conditional operations for vectorization On Fri, Sep 20, 2019 at 10:09 AM Yuliang Wang <yuliang.w...@arm.com> wrote: > > Hi, > > ifcvt transforms the following conditional operation reduction pattern: > > if ( condition ) > a = a OP b; > else > a = a OP c; > > Into: > > a_1 = a OP b; > a_2 = a OP c; > a = condition ? a_1 : a_2; > > Where OP is one of { plus minus mult min max and ior eor }. > > This patch further optimizes the above to: > > a_0 = condition ? b : c; > a = a OP a_0; > > Which enables vectorization on AArch64. > Also supported are permutations of the above operand ordering subject to > commutativity of OP. > > Added new tests. Built and regression tested on aarch64-none-elf and > aarch64-linux-gnu. @@ -3206,7 +3206,41 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* !A ? B : C -> A ? C : B. */ (simplify (cnd (logical_inverted_value truth_valued_p@0) @1 @2) - (cnd @0 @2 @1))) + (cnd @0 @2 @1)) + + /* !A ? B : C -> A ? C : B. */ + (simplify + (cnd (logical_inverted_value truth_valued_p@0) @1 @2) (cnd @0 @2 + @1)) + looks like you duplicate the above pattern. Should have raised a warning in the genmatch run. The patch header shows you are not working against trunk? + (for op (plus minus mult + min max + bit_and bit_ior bit_xor) + (simplify + (cnd @0 (op @1 @2) (op @1 @3)) + (op @1 (cnd @0 @2 @3))) + (simplify + (cnd @0 (op @1 @2) (op @3 @2)) + (op (cnd @0 @1 @3) @2)) + (if (op != MINUS_EXPR) + (simplify + (cnd @0 (op @1 @2) (op @3 @1)) + (op @1 (cnd @0 @2 @3))) + (simplify + (cnd @0 (op @2 @1) (op @1 @3)) + (op @1 (cnd @0 @2 @3))))) if you would have dropped minus handling this simplifies to (for op (...) (simpify (cnd @0 (op:c @1 @2) (op:c @1 @3)) (op @1 (cnd @0 @2 @3))) you can then add minus special-cases if they are important (simplify (cnd @0 (minus @1 @2) (minus @1 @3)) ... (simplify (cnd @0 (minus @2 @1) (minus @3 @1)) I think that's clearer. + /* Hack: generic-match causes infinite recursion + by reverting this transformation when + i) -fno-trapping-math is enabled, and + ii) the common operand does not need to be wrapped in a SAVE_EXPR. + */ What's the specific transform that causes this? Yes, there are some left in fold-const.c. Thanks, Richard. > Best Regards, > Yuliang Wang > > > gcc/ChangeLog: > > 2019-09-19 Yuliang Wang <yuliang.w...@arm.com> > > * match.pd (for cnd (cond vec_cond)): New match statements for the > above patterns. > * doc/sourcebuild.texi (vect_condred_si): Document new target > selector. > > gcc/testsuite/ChangeLog: > > 2019-09-19 Yuliang Wang <yuliang.w...@arm.com> > > * gcc.target/aarch64/sve2/condred_1.c: New test. > * gcc.dg/vect/vect-condred-1.c: As above. > * gcc.dg/vect/vect-condred-2.c: As above. > * gcc.dg/vect/vect-condred-3.c: As above. > * gcc.dg/vect/vect-condred-4.c: As above. > * gcc.dg/vect/vect-condred-5.c: As above. > * gcc.dg/vect/vect-condred-6.c: As above. > * gcc.dg/vect/vect-condred-7.c: As above. > * gcc.dg/vect/vect-condred-8.c: As above. > * lib/target-supports.exp (check_effective_target_vect_condred_si): > Return true for AArch64 without SVE.