On Fri, Feb 27, 2026 at 1:15 AM Andrew Pinski <[email protected]> wrote:
>
> In this case, early phiopt would get rid of the user-provided predictor
> for hot/cold, as it would remove the basic blocks. The easiest and best
> option is for early phiopt not to do the optimization if the middle basic
> block(s) have either a hot or cold predict statement. Then, after inlining,
> jump threading will most likely happen, and that will keep the predictor
> around.
>
> Note this only needs to be done for match_simplify_replacement and not the
> other phiopt functions, because currently only match_simplify_replacement
> is able to skip a middle bb with predictor statements in it.
>
> This still allows MIN/MAX/ABS/NEG even with the predictors there, as
> those are less likely to be jump threaded later on. The main things that
> are rejected are SSA names that stand alone, where one of the comparison's
> operands is that name, or cases where phiopt produces a comparison.
>
> Changes since v1:
> * v2: Only reject if the result was the comparison.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu.
OK.

Thanks,
Richard.

>         PR tree-optimization/117935
>
> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc (contains_hot_cold_predict): New function.
>         (match_simplify_replacement): Return early if early_p and one of
>         the middle bb(s) have a hot/cold predict statement.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/predict-24.c: New test.
>         * gcc.dg/predict-25.c: New test.
>
> Signed-off-by: Andrew Pinski <[email protected]>
> ---
>  gcc/testsuite/gcc.dg/predict-24.c | 24 ++++++++++++++
>  gcc/testsuite/gcc.dg/predict-25.c | 24 ++++++++++++++
>  gcc/tree-ssa-phiopt.cc            | 52 +++++++++++++++++++++++++++++++
>  3 files changed, 100 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/predict-24.c
>  create mode 100644 gcc/testsuite/gcc.dg/predict-25.c
>
> diff --git a/gcc/testsuite/gcc.dg/predict-24.c b/gcc/testsuite/gcc.dg/predict-24.c
> new file mode 100644
> index 00000000000..b2c8a77c323
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/predict-24.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> +/* PR tree-optimization/117935 */
> +
> +static inline bool has_value(bool b)
> +{
> +  if (b)
> +    {
> +      [[gnu::hot, gnu::unused]] label1:
> +      return true;
> +    }
> +  else
> +    return false;
> +}
> +/* The hot label should last until it gets inlined into value_or and jump_threaded.  */
> +int value_or(bool b, int def0, int def1)
> +{
> +  if (has_value(b))
> +    return def0;
> +  else
> +    return def1;
> +}
> +/* { dg-final { scan-tree-dump-times "first match heuristics: 90.00%" 2 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "hot label heuristics of edge \[0-9\]+->\[0-9]+: 90.00%" 2 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-25.c b/gcc/testsuite/gcc.dg/predict-25.c
> new file mode 100644
> index 00000000000..027774ca746
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/predict-25.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> +/* PR tree-optimization/117935 */
> +
> +static inline bool has_value(int b)
> +{
> +  if (b)
> +    {
> +      [[gnu::hot, gnu::unused]] label1:
> +      return true;
> +    }
> +  else
> +    return false;
> +}
> +/* The hot label should last until it gets inlined into value_or and jump_threaded.  */
> +int value_or(int b, int def0, int def1)
> +{
> +  if (has_value(b))
> +    return def0;
> +  else
> +    return def1;
> +}
> +/* { dg-final { scan-tree-dump-times "first match heuristics: 90.00%" 2 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "hot label heuristics of edge \[0-9\]+->\[0-9]+: 90.00%" 2 "profile_estimate"} } */
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index fcf44136d0a..0bf7e58b8f0 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -55,6 +55,7 @@ along with GCC; see the file COPYING3. If not see
>  #include "tree-ssa-propagate.h"
>  #include "tree-ssa-dce.h"
>  #include "tree-ssa-loop-niter.h"
> +#include "gimple-predict.h"
>
>  /* Return the singleton PHI in the SEQ of PHIs for edges E0 and E1. */
>
> @@ -913,6 +914,27 @@ auto_flow_sensitive::~auto_flow_sensitive ()
>      p.second.restore (p.first);
>  }
>
> +/* Returns true if BB contains an user provided predictor
> +   (PRED_HOT_LABEL/PRED_COLD_LABEL).  */
> +
> +static bool
> +contains_hot_cold_predict (basic_block bb)
> +{
> +  gimple_stmt_iterator gsi;
> +  gsi = gsi_start_nondebug_after_labels_bb (bb);
> +  for (; !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
> +    {
> +      gimple *s = gsi_stmt (gsi);
> +      if (gimple_code (s) != GIMPLE_PREDICT)
> +        continue;
> +      auto predict = gimple_predict_predictor (s);
> +      if (predict == PRED_HOT_LABEL
> +          || predict == PRED_COLD_LABEL)
> +        return true;
> +    }
> +  return false;
> +}
> +
>  /* The function match_simplify_replacement does the main work of doing the
>     replacement using match and simplify. Return true if the replacement is done.
>     Otherwise return false.
> @@ -1006,6 +1028,36 @@ match_simplify_replacement (basic_block cond_bb, basic_block middle_bb,
>                                      &seq);
>      }
>
> +  /* For early phiopt, we don't want to lose user generated predictors
> +     if the phiopt is converting `if (a)` into `a` as that might
> +     be jump threaded later on so we want to keep around the
> +     predictors.  */
> +  if (early_p && result && TREE_CODE (result) == SSA_NAME)
> +    {
> +      bool check_it = false;
> +      tree cmp0 = gimple_cond_lhs (stmt);
> +      tree cmp1 = gimple_cond_rhs (stmt);
> +      if (result == cmp0 || result == cmp1)
> +        check_it = true;
> +      else if (gimple_seq_singleton_p (seq))
> +        {
> +          gimple *stmt = gimple_seq_first_stmt (seq);
> +          if (is_gimple_assign (stmt)
> +              && result == gimple_assign_lhs (stmt)
> +              && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
> +                 == tcc_comparison)
> +            check_it = true;
> +        }
> +      if (!check_it)
> +        ;
> +      else if (contains_hot_cold_predict (middle_bb))
> +        return false;
> +      else if (threeway_p
> +               && middle_bb != middle_bb_alt
> +               && contains_hot_cold_predict (middle_bb_alt))
> +        return false;
> +    }
> +
>    if (!result)
>      {
>        /* If we don't get back a MIN/MAX_EXPR still make sure the expression
> --
> 2.43.0
>
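For anyone reading the thread without a GCC tree at hand, the check the patch adds can be modelled outside the compiler. What follows is only a minimal sketch in plain C, not GCC code: toy_stmt, toy_bb, the TOY_* enumerators and both toy_* functions are hypothetical stand-ins for gimple statements, basic_block and the PRED_* predictors. It assumes the logic quoted above, namely that non-predict statements are skipped, only hot/cold label predictors count, and early phiopt backs off only when the replacement result is the comparison itself. The three-way case (middle_bb_alt) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gimple statement codes and predictors.  */
enum toy_code { TOY_ASSIGN, TOY_PREDICT };
enum toy_predictor
{
  TOY_PRED_NONE,
  TOY_PRED_HOT_LABEL,
  TOY_PRED_COLD_LABEL,
  TOY_PRED_OTHER
};

struct toy_stmt
{
  enum toy_code code;
  enum toy_predictor predictor;
};

/* A toy basic block: just an array of statements.  */
struct toy_bb
{
  const struct toy_stmt *stmts;
  size_t n;
};

/* Model of contains_hot_cold_predict: true if BB holds a predict
   statement whose predictor is a hot or cold label.  */
static bool
toy_contains_hot_cold_predict (const struct toy_bb *bb)
{
  for (size_t i = 0; i < bb->n; i++)
    {
      const struct toy_stmt *s = &bb->stmts[i];
      if (s->code != TOY_PREDICT)
        continue;
      if (s->predictor == TOY_PRED_HOT_LABEL
          || s->predictor == TOY_PRED_COLD_LABEL)
        return true;
    }
  return false;
}

/* Model of the new guard: only early phiopt, and only when the
   replacement result is the comparison, refuses the transformation.  */
static bool
toy_should_skip_phiopt (bool early_p, bool result_is_comparison,
                        const struct toy_bb *middle_bb)
{
  return early_p
         && result_is_comparison
         && toy_contains_hot_cold_predict (middle_bb);
}

/* Sample blocks: one with a hot label predictor, one without.  */
static const struct toy_stmt toy_hot_stmts[] = {
  { TOY_PREDICT, TOY_PRED_HOT_LABEL },
  { TOY_ASSIGN, TOY_PRED_NONE }
};
static const struct toy_stmt toy_plain_stmts[] = {
  { TOY_PREDICT, TOY_PRED_OTHER },
  { TOY_ASSIGN, TOY_PRED_NONE }
};
static const struct toy_bb toy_hot_bb = { toy_hot_stmts, 2 };
static const struct toy_bb toy_plain_bb = { toy_plain_stmts, 2 };
```

With these sample blocks, toy_should_skip_phiopt (true, true, &toy_hot_bb) models the has_value case from the new tests, where the hot label should survive until inlining and jump threading. Passing false for result_is_comparison models the MIN/MAX/ABS/NEG cases, which the patch still allows.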
