RE: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold
Committed, thanks Richard. Pan -Original Message- From: Gcc-patches On Behalf Of Richard Sandiford via Gcc-patches Sent: Thursday, August 24, 2023 6:34 PM To: Juzhe-Zhong Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de Subject: Re: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold Juzhe-Zhong writes: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define TEST_TYPE(TYPE) > \ > __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, > \ > TYPE *__restrict a, \ > TYPE *__restrict b, int n) \ > { > \ > for (int i = 0; i < n; i++) > \ > dst[i] -= a[i] * b[i]; \ > } > > #define TEST_ALL() > \ > TEST_TYPE (float) > \ > > TEST_ALL () > > Gimple IR for RVV: > > ... > _39 = -vect__8.14_26; > vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, > vect__4.8_34, vect__4.8_34, _46, 0); > ... > > This is because this following piece of codes in tree-ssa-math-opts.cc: > > if (len) > fma_stmt > = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2, > addop, else_value, len, bias); > else if (cond) > fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1, > op2, addop, else_value); > else > fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop); > gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt)); > gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun, > use_stmt)); > gsi_replace (&gsi, fma_stmt, true); > /* Follow all SSA edges so that we generate FMS, FNMA and FNMS >regardless of where the negation occurs. */ > gimple *orig_stmt = gsi_stmt (gsi); > if (fold_stmt (&gsi, follow_all_ssa_edges)) > { > if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi))) > gcc_unreachable (); > update_stmt (gsi_stmt (gsi)); > } > > 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA > COND_LEN_FNMA. > > This patch support STMT fold into: > > vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, > vect__4.8_34, { 0.0, ... }, _46, 0); > > Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments. > > Extend maximum num ops: > - static const unsigned int MAX_NUM_OPS = 5; > + static const unsigned int MAX_NUM_OPS = 7; > > Bootstrap and Regtest on X86 passed. > Tested on aarch64 Qemu. > > Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend. > > > gcc/ChangeLog: > > * genmatch.cc (decision_tree::gen): Support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. > * gimple-match-exports.cc (gimple_simplify): Ditto. > (gimple_resimplify6): New function. > (gimple_resimplify7): New function. > (gimple_match_op::resimplify): Support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. > (convert_conditional_op): Ditto. > (build_call_internal): Ditto. > (try_conditional_simplification): Ditto. > (gimple_extract): Ditto. > * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto. > * internal-fn.cc (CASE): Ditto. OK, thanks. Richard > > --- > gcc/genmatch.cc | 2 +- > gcc/gimple-match-exports.cc | 123 ++-- > gcc/gimple-match.h | 16 - > gcc/internal-fn.cc | 7 +- > 4 files changed, 138 insertions(+), 10 deletions(-) > > diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc > index f46d2e1520d..a1925a747a7 100644 > --- a/gcc/genmatch.cc > +++ b/gcc/genmatch.cc > @@ -4052,7 +4052,7 @@ decision_tree::gen (vec &files, bool gimple) > } >fprintf (stderr, "removed %u duplicate tails\n", rcnt); > > - for (unsigned n = 1; n <=
Re: [PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. > It's supported in tree-ssa-math-opts.cc. However, GCC failed to support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. > > Consider this following case: > #define TEST_TYPE(TYPE) > \ > __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, > \ > TYPE *__restrict a, \ > TYPE *__restrict b, int n) \ > { > \ > for (int i = 0; i < n; i++) > \ > dst[i] -= a[i] * b[i]; \ > } > > #define TEST_ALL() > \ > TEST_TYPE (float) > \ > > TEST_ALL () > > Gimple IR for RVV: > > ... > _39 = -vect__8.14_26; > vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, > vect__4.8_34, vect__4.8_34, _46, 0); > ... > > This is because this following piece of codes in tree-ssa-math-opts.cc: > > if (len) > fma_stmt > = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2, > addop, else_value, len, bias); > else if (cond) > fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1, > op2, addop, else_value); > else > fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop); > gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt)); > gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun, > use_stmt)); > gsi_replace (&gsi, fma_stmt, true); > /* Follow all SSA edges so that we generate FMS, FNMA and FNMS >regardless of where the negation occurs. */ > gimple *orig_stmt = gsi_stmt (gsi); > if (fold_stmt (&gsi, follow_all_ssa_edges)) > { > if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi))) > gcc_unreachable (); > update_stmt (gsi_stmt (gsi)); > } > > 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA > COND_LEN_FNMA. > > This patch support STMT fold into: > > vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, > vect__4.8_34, { 0.0, ... }, _46, 0); > > Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments. > > Extend maximum num ops: > - static const unsigned int MAX_NUM_OPS = 5; > + static const unsigned int MAX_NUM_OPS = 7; > > Bootstrap and Regtest on X86 passed. > Tested on aarch64 Qemu. > > Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend. > > > gcc/ChangeLog: > > * genmatch.cc (decision_tree::gen): Support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. > * gimple-match-exports.cc (gimple_simplify): Ditto. > (gimple_resimplify6): New function. > (gimple_resimplify7): New function. > (gimple_match_op::resimplify): Support > COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. > (convert_conditional_op): Ditto. > (build_call_internal): Ditto. > (try_conditional_simplification): Ditto. > (gimple_extract): Ditto. > * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto. > * internal-fn.cc (CASE): Ditto. OK, thanks. Richard > > --- > gcc/genmatch.cc | 2 +- > gcc/gimple-match-exports.cc | 123 ++-- > gcc/gimple-match.h | 16 - > gcc/internal-fn.cc | 7 +- > 4 files changed, 138 insertions(+), 10 deletions(-) > > diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc > index f46d2e1520d..a1925a747a7 100644 > --- a/gcc/genmatch.cc > +++ b/gcc/genmatch.cc > @@ -4052,7 +4052,7 @@ decision_tree::gen (vec &files, bool gimple) > } >fprintf (stderr, "removed %u duplicate tails\n", rcnt); > > - for (unsigned n = 1; n <= 5; ++n) > + for (unsigned n = 1; n <= 7; ++n) > { >bool has_kids_p = false; > > diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc > index 7aeb4ddb152..b36027b0bad 100644 > --- a/gcc/gimple-match-exports.cc > +++ b/gcc/gimple-match-exports.cc > @@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq > *, tree (*)(tree), >code_helper, tree, tree, tree, tree, tree); > extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), >code_helper, tree, tree, tree, tree, tree, tree); > +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), > + code_helper, tre
[PATCH V2] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold
Hi, Richard and Richi. Currently, GCC support COND_LEN_FMA for floating-point **NO** -ffast-math. It's supported in tree-ssa-math-opts.cc. However, GCC failed to support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. Consider this following case: #define TEST_TYPE(TYPE)\ __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,\ TYPE *__restrict a, \ TYPE *__restrict b, int n) \ {\ for (int i = 0; i < n; i++)\ dst[i] -= a[i] * b[i]; \ } #define TEST_ALL() \ TEST_TYPE (float)\ TEST_ALL () Gimple IR for RVV: ... _39 = -vect__8.14_26; vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, vect__4.8_34, vect__4.8_34, _46, 0); ... This is because this following piece of codes in tree-ssa-math-opts.cc: if (len) fma_stmt = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2, addop, else_value, len, bias); else if (cond) fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1, op2, addop, else_value); else fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop); gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt)); gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun, use_stmt)); gsi_replace (&gsi, fma_stmt, true); /* Follow all SSA edges so that we generate FMS, FNMA and FNMS regardless of where the negation occurs. */ gimple *orig_stmt = gsi_stmt (gsi); if (fold_stmt (&gsi, follow_all_ssa_edges)) { if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi))) gcc_unreachable (); update_stmt (gsi_stmt (gsi)); } 'fold_stmt' failed to fold NEGATE_EXPR + COND_LEN_FMA > COND_LEN_FNMA. This patch support STMT fold into: vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, vect__4.8_34, { 0.0, ... }, _46, 0); Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments. Extend maximum num ops: - static const unsigned int MAX_NUM_OPS = 5; + static const unsigned int MAX_NUM_OPS = 7; Bootstrap and Regtest on X86 passed. Tested on aarch64 Qemu. Fully tested COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS on RISC-V backend. gcc/ChangeLog: * genmatch.cc (decision_tree::gen): Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. * gimple-match-exports.cc (gimple_simplify): Ditto. (gimple_resimplify6): New function. (gimple_resimplify7): New function. (gimple_match_op::resimplify): Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold. (convert_conditional_op): Ditto. (build_call_internal): Ditto. (try_conditional_simplification): Ditto. (gimple_extract): Ditto. * gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto. * internal-fn.cc (CASE): Ditto. --- gcc/genmatch.cc | 2 +- gcc/gimple-match-exports.cc | 123 ++-- gcc/gimple-match.h | 16 - gcc/internal-fn.cc | 7 +- 4 files changed, 138 insertions(+), 10 deletions(-) diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc index f46d2e1520d..a1925a747a7 100644 --- a/gcc/genmatch.cc +++ b/gcc/genmatch.cc @@ -4052,7 +4052,7 @@ decision_tree::gen (vec &files, bool gimple) } fprintf (stderr, "removed %u duplicate tails\n", rcnt); - for (unsigned n = 1; n <= 5; ++n) + for (unsigned n = 1; n <= 7; ++n) { bool has_kids_p = false; diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc index 7aeb4ddb152..b36027b0bad 100644 --- a/gcc/gimple-match-exports.cc +++ b/gcc/gimple-match-exports.cc @@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), code_helper, tree, tree, tree, tree, tree); extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), code_helper, tree, tree, tree, tree, tree, tree); +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), +code_helper, tree, tree, tree, tree, tree, tree, +tree); +extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree), +code_helper, tree, tree, tree, tree, tree, tree, +