> v2 Changes:
> 1. Enable proportion orig_sum to the new nodes for self recursive node:
>    new_sum = (orig_sum + new_sum) \
>    * self_recursive_probability * (1 / param_ipa_cp_max_recursive_depth).
> 2. Add value range for param_ipa_cp_max_recursive_depth.
> 
> The performance of exchange2 built with PGO will decrease ~28% by r278808
> due to profile count set incorrectly.  The cloned nodes are updated to a
> very small count caused later pass cunroll fail to unroll the recursive
> function in exchange2.
> 
> digits_2 ->
> digits_2.constprop.0, digits_2.constprop.1, etc.
> 
> gcc/ChangeLog:
> 
>       2019-12-26  Luo Xiong Hu  <luo...@linux.ibm.com>
> 
>       * ipa-cp.c (update_profiling_info): Check self_scc node.
>       * params.opt (ipa-cp-max-recursive-depth): Set param range.
> ---
>  gcc/ipa-cp.c   | 25 +++++++++++++++++++++++++
>  gcc/params.opt |  2 +-
>  2 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 14064ae0034..947bf7c7199 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -4272,6 +4272,31 @@ update_profiling_info (struct cgraph_node *orig_node,
>                                             false);
>    new_sum = stats.count_sum;
>  
> +  class ipa_node_params *info = IPA_NODE_REF (orig_node);
> +  if (info && info->node_is_self_scc)
> +    {
> +      profile_count self_recursive_count;
> +
> +      /* The self recursive edge is on the orig_node.  */
> +      for (cs = orig_node->callees; cs; cs = cs->next_callee)
> +     if (ipa_edge_within_scc (cs))
> +       {
> +         cgraph_node *callee = cs->callee->function_symbol ();
> +         if (callee && cs->caller == cs->callee)
> +           {
> +             self_recursive_count = cs->count;
> +             break;
> +           }
> +       }

What happens here when there are multiple self recursive calls or when
the SCC has two mutually recursive functions?

I am still confused by the logic of this function.  I will take a deeper
look at your previous email.
> +
> +      /* Proportion count for self recursive node from all callers.  */
> +      new_sum
> +     = (orig_sum + new_sum).apply_scale (self_recursive_count, orig_sum);
> +
> +      /* Proportion count for specialized cloned node.  */
> +      new_sum = new_sum.apply_scale (1, param_ipa_cp_max_recursive_depth);
> +    }
> +
>    if (orig_node_count < orig_sum + new_sum)
>      {
>        if (dump_file)
> diff --git a/gcc/params.opt b/gcc/params.opt
> index d88ae0c468b..40a03b04580 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -199,7 +199,7 @@ Common Joined UInteger Var(param_ipa_cp_loop_hint_bonus) 
> Init(64) Param
>  Compile-time bonus IPA-CP assigns to candidates which make loop bounds or 
> strides known.
>  
>  -param=ipa-cp-max-recursive-depth=
> -Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(8) Param
> +Common Joined UInteger Var(param_ipa_cp_max_recursive_depth) Init(8) 
> IntegerRange(1, 10) Param
>  Maximum depth of recursive cloning for self-recursive function.

The values stats from 0 but I also wonder why 10 is a good upper bound?
If the function calls itself with one type of value (like depth-1) then
we may clone well over 10 times, if it calls itself with two different
sets then 8 is already quite high I would say...

I suppose the call probabilities will eventually drop to be very low,
but I am not quite sure about that because we do not handle any IP
frequency propagation.  Do we have some way to treat wide trees? Or do
we clone only all self recursive calls are the same?

Honza
>  
>  -param=ipa-cp-min-recursive-probability=
> -- 
> 2.21.0.777.g83232e3864
> 

Reply via email to