Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-12-01 Thread Feng Xue OS
Done.   -Thanks

Feng



From: Jeff Law 
Sent: Monday, December 2, 2019 4:33 AM
To: Feng Xue OS; Martin Jambor; Jan Hubicka; Richard Biener
Cc: luoxhu; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

On 11/26/19 6:44 PM, Feng Xue OS wrote:
> Hi, Richard,
>
>   This patch is a not bugfix, while it is small. Martin and Honza are fine 
> with it.
> But now we are in stage 3, is it ok to commit?
Yes.  It was posted well in advance of the stage1->stage3 change.
Please commit it ASAP so the testers can pick it up.

Thanks,
jeff



Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-12-01 Thread Jeff Law
On 11/26/19 6:44 PM, Feng Xue OS wrote:
> Hi, Richard,
> 
>   This patch is a not bugfix, while it is small. Martin and Honza are fine 
> with it.
> But now we are in stage 3, is it ok to commit?
Yes.  It was posted well in advance of the stage1->stage3 change.
Please commit it ASAP so the testers can pick it up.

Thanks,
jeff



Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-27 Thread Feng Xue OS
Have no a complete idea for new cost model, but there are something we can try:
o.  adjust estimated profile for recursive function so that we can get a higher 
threshold.
o.  introduce a strength level for ipa-cp-clone, by default, it is same as 
current configuration,
and with larger number, ipa-cp works more aggressively.
o. integrate frequency information to computation of prop_time_benefit.

Feng


From: Jan Hubicka 
Sent: Wednesday, November 27, 2019 10:27 PM
To: Feng Xue OS
Cc: Martin Jambor; Richard Biener; luoxhu; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

> 2019-11-15  Feng Xue 
>
> PR ipa/92133
> * doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
> (ipa-cp-min-recursive-probability): Likewise.
> * params.opt (ipa-cp-max-recursive-depth): New.
> (ipa-cp-min-recursive-probability): Likewise.
> * ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
> val_p and unlimited.
> (self_recursively_generated_p): New function.
> (get_val_across_arith_op): Likewise.
> (propagate_vals_across_arith_jfunc): Add constant propagation for
> self-recursive function.
> (incorporate_penalties): Do not penalize pure self-recursive function.
> (good_cloning_opportunity_p): Dump node_is_self_scc flag.
> (propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
> (get_info_about_necessary_edges): Relax hotness check for edge to
> self-recursive function.
> * ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.

OK, thanks!
do you have some plans on the better cost model for the recursive
cloning? Also it would be nice to have this info available in recursive
inliner and give it a higher priority when inlining is going to turn
previously recrusive call into non-recursive.

Honza


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-27 Thread Jan Hubicka
> 2019-11-15  Feng Xue 
> 
> PR ipa/92133
> * doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
> (ipa-cp-min-recursive-probability): Likewise.
> * params.opt (ipa-cp-max-recursive-depth): New.
> (ipa-cp-min-recursive-probability): Likewise.
> * ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
> val_p and unlimited.
> (self_recursively_generated_p): New function.
> (get_val_across_arith_op): Likewise.
> (propagate_vals_across_arith_jfunc): Add constant propagation for
> self-recursive function.
> (incorporate_penalties): Do not penalize pure self-recursive function.
> (good_cloning_opportunity_p): Dump node_is_self_scc flag.
> (propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
> (get_info_about_necessary_edges): Relax hotness check for edge to
> self-recursive function.
> * ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.

OK, thanks!
do you have some plans on the better cost model for the recursive
cloning? Also it would be nice to have this info available in recursive
inliner and give it a higher priority when inlining is going to turn
previously recrusive call into non-recursive.

Honza


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-26 Thread Feng Xue OS
Hi, Richard,

  This patch is a not bugfix, while it is small. Martin and Honza are fine with 
it.
But now we are in stage 3, is it ok to commit?

Thanks,
Feng


From: Feng Xue OS 
Sent: Monday, November 25, 2019 10:17 PM
To: Martin Jambor; Jan Hubicka
Cc: luoxhu; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

Martin,

Thanks for your review. I updated the patch with your comments.

Feng
---
2019-11-15  Feng Xue 

PR ipa/92133
* doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
(ipa-cp-min-recursive-probability): Likewise.
* params.opt (ipa-cp-max-recursive-depth): New.
(ipa-cp-min-recursive-probability): Likewise.
* ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
val_p and unlimited.
(self_recursively_generated_p): New function.
(get_val_across_arith_op): Likewise.
(propagate_vals_across_arith_jfunc): Add constant propagation for
self-recursive function.
(incorporate_penalties): Do not penalize pure self-recursive function.
(good_cloning_opportunity_p): Dump node_is_self_scc flag.
(propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
(get_info_about_necessary_edges): Relax hotness check for edge to
self-recursive function.
* ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
---

> The least important comment: Thanks for providing the ChangeLog but
> sending ChangeLog in the patch itself makes it difficult for people to
> apply the patch because the ChangeLog hunks will never apply cleanly.
> That's why people send them in the email body when they email
> gcc-patches.  Hopefully we'll be putting ChangeLogs only into the commit
> message soon and all of this will be a non-issue.
Ok.

>> +  if (val_pos_p)
>> +{
>> +  for (val = values; val && val != *val_pos_p; val = val->next);

> Please put empty statements on a separate line (I think there is one
> more in the patch IIRC).
Done.

>> +  if (val_pos_p)
>> +{
>> +  val->next = (*val_pos_p)->next;
>> +  (*val_pos_p)->next = val;
>> +  *val_pos_p = val;
>> +}

> I must say I am not a big fan of the val_pos_p parameter.  Would simply
> putting new values always at the end of the list work to reduce the
> iterations?  At this point the function has traversed all the values
> already when searching if it is present, so it can remember the last
> one and just add stuff after it.
We need a parameter to record address of the added value, which will be used
in constructing next one. To place new value at the end of list is a good idea,
thus meaning of val_pos_p becomes simpler, it is only an out parameter now.

>> +  if (!src->val || cs->caller != cs->callee->function_symbol ()
>> +   || src->val == val)
>> + return false;

> I'd suggest putting the condition calling function_symbol last.
Original condition order can reduce unnecessary evaluations on src->val==val,
which is false in most situation, while cs->caller != 
cs->callee->function_symbol ()
is true.

>> +  for (src_val = src_lat->values; src_val && src_val != val;
>> +src_val = src_val->next);

> I think that GNU code style dictates that when a for does not fit on a
> single line, the three bits have to be on a separate line each.  Plus
> please put the empty statement on its own line too.
Done.

>> +static tree
>> +get_val_across_arith_op (enum tree_code opcode,
>> +  tree opnd1_type,
>> +  tree opnd2,
>> +  ipcp_value *src_val,
>> +  tree res_type)

> The function should get at least a brief comment.
Done.

>> +   for (ipcp_value_source *s = src_val->sources; s;
>> +s = s->next)

> I'm afraid you'll have to reformat this for loop too.
Done.
>> + if (self_recursively_generated_p (src_val))
>> +   continue;

> I think you are missing a necessary call to
> dest_lat->set_contains_variable() here.  Either call it or initialize
> cstval to zero and call get_val_across_arith_op only when
> self_recursively_generated_p returns false;
Yes, this is a bug. Fixed.


0001-Enable-recursive-function-versioning.patch
Description: 0001-Enable-recursive-function-versioning.patch


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-25 Thread Feng Xue OS
Martin,

Thanks for your review. I updated the patch with your comments.

Feng
---
2019-11-15  Feng Xue 

PR ipa/92133
* doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
(ipa-cp-min-recursive-probability): Likewise.
* params.opt (ipa-cp-max-recursive-depth): New.
(ipa-cp-min-recursive-probability): Likewise.
* ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
val_p and unlimited.
(self_recursively_generated_p): New function.
(get_val_across_arith_op): Likewise.
(propagate_vals_across_arith_jfunc): Add constant propagation for
self-recursive function.
(incorporate_penalties): Do not penalize pure self-recursive function.
(good_cloning_opportunity_p): Dump node_is_self_scc flag.
(propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
(get_info_about_necessary_edges): Relax hotness check for edge to
self-recursive function.
* ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
---

> The least important comment: Thanks for providing the ChangeLog but
> sending ChangeLog in the patch itself makes it difficult for people to
> apply the patch because the ChangeLog hunks will never apply cleanly.
> That's why people send them in the email body when they email
> gcc-patches.  Hopefully we'll be putting ChangeLogs only into the commit
> message soon and all of this will be a non-issue.
Ok.

>> +  if (val_pos_p)
>> +{
>> +  for (val = values; val && val != *val_pos_p; val = val->next);

> Please put empty statements on a separate line (I think there is one
> more in the patch IIRC).
Done.

>> +  if (val_pos_p)
>> +{
>> +  val->next = (*val_pos_p)->next;
>> +  (*val_pos_p)->next = val;
>> +  *val_pos_p = val;
>> +}

> I must say I am not a big fan of the val_pos_p parameter.  Would simply
> putting new values always at the end of the list work to reduce the
> iterations?  At this point the function has traversed all the values
> already when searching if it is present, so it can remember the last
> one and just add stuff after it.
We need a parameter to record address of the added value, which will be used
in constructing next one. To place new value at the end of list is a good idea,
thus meaning of val_pos_p becomes simpler, it is only an out parameter now.

>> +  if (!src->val || cs->caller != cs->callee->function_symbol ()
>> +   || src->val == val)
>> + return false;

> I'd suggest putting the condition calling function_symbol last.
Original condition order can reduce unnecessary evaluations on src->val==val,
which is false in most situation, while cs->caller != 
cs->callee->function_symbol ()
is true. 

>> +  for (src_val = src_lat->values; src_val && src_val != val;
>> +src_val = src_val->next);

> I think that GNU code style dictates that when a for does not fit on a
> single line, the three bits have to be on a separate line each.  Plus
> please put the empty statement on its own line too.
Done.

>> +static tree
>> +get_val_across_arith_op (enum tree_code opcode,
>> +  tree opnd1_type,
>> +  tree opnd2,
>> +  ipcp_value *src_val,
>> +  tree res_type)

> The function should get at least a brief comment.
Done.

>> +   for (ipcp_value_source *s = src_val->sources; s;
>> +s = s->next)

> I'm afraid you'll have to reformat this for loop too.
Done.
>> + if (self_recursively_generated_p (src_val))
>> +   continue;

> I think you are missing a necessary call to
> dest_lat->set_contains_variable() here.  Either call it or initialize
> cstval to zero and call get_val_across_arith_op only when
> self_recursively_generated_p returns false;
Yes, this is a bug. Fixed.


0001-Enable-recursive-function-versioning.patch
Description: 0001-Enable-recursive-function-versioning.patch


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-22 Thread Martin Jambor
Hi,

On Fri, Nov 15 2019, Feng Xue OS wrote:
> Honza,
>
> I made some changes: do not penalize self-recursive function, and
> add --param ipa-cp-min-recursive-probability, similar to recursive
> inline. Please review this new one.

The patch and its effect on exchange is intriguing, I only have a few
comments, see below, otherwise I am happy about it.

> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index fe8daf40888..c84f4d73ed6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,23 @@
> +2019-11-15  Feng Xue 
> +
> + PR ipa/92133
> + * doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
> + (ipa-cp-min-recursive-probability): Likewise.
> + * params.opt (ipa-cp-max-recursive-depth): New.
> + (ipa-cp-min-recursive-probability): Likewise.
> + * ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
> + val_pos_p and unlimited.
> + (self_recursively_generated_p): New function.
> + (get_val_across_arith_op): Likewise.
> + (propagate_vals_across_arith_jfunc): Add constant propagation for
> + self-recursive function.
> + (incorporate_penalties): Do not penalize pure self-recursive function.
> + (good_cloning_opportunity_p): Dump node_is_self_scc flag.
> + (propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
> + (get_info_about_necessary_edges): Relax hotness check for edge to
> + self-recursive function.
> + * ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
> +

The least important comment: Thanks for providing the ChangeLog but
sending ChangeLog in the patch itself makes it difficult for people to
apply the patch because the ChangeLog hunks will never apply cleanly.
That's why people send them in the email body when they email
gcc-patches.  Hopefully we'll be putting ChangeLogs only into the commit
message soon and all of this will be a non-issue.

>  2019-11-15  Feng Xue  
>  
>   PR ipa/92528
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index fe79ca2247a..c30adfd31c2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12045,6 +12045,13 @@ IPA-CP calculates its own score of cloning 
> profitability heuristics
>  and performs those cloning opportunities with scores that exceed
>  @option{ipa-cp-eval-threshold}.
>  
> +@item ipa-cp-max-recursive-depth
> +Maximum depth of recursive cloning for self-recursive function.
> +
> +@item ipa-cp-min-recursive-probability
> +Recursive cloning only when the probability of call being executed exceeds
> +the parameter.
> +
>  @item ipa-cp-recursion-penalty
>  Percentage penalty the recursive functions will receive when they
>  are evaluated for cloning.
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 8372dfaa771..bbf508bca6c 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -228,7 +228,9 @@ public:
>inline bool set_contains_variable ();
>bool add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val = NULL,
> -   int src_idx = 0, HOST_WIDE_INT offset = -1);
> +   int src_idx = 0, HOST_WIDE_INT offset = -1,
> +   ipcp_value **val_pos_p = NULL,
> +   bool unlimited = false);
>void print (FILE * f, bool dump_sources, bool dump_benefits);
>  };
>  
> @@ -1817,22 +1819,37 @@ allocate_and_init_ipcp_value 
> (ipa_polymorphic_call_context source)
>  /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it.  
> CS,
> SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
> meaning.  OFFSET -1 means the source is scalar and not a part of an
> -   aggregate.  */
> +   aggregate.  If non-NULL, VAL_POS_P specifies position in value list,
> +   after which newly created ipcp_value will be inserted, and it is also
> +   used to record address of the added ipcp_value before function returns.
> +   UNLIMITED means whether value count should not exceed the limit given
> +   by PARAM_IPA_CP_VALUE_LIST_SIZE.  */
>  
>  template 
>  bool
>  ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val,
> -   int src_idx, HOST_WIDE_INT offset)
> +   int src_idx, HOST_WIDE_INT offset,
> +   ipcp_value **val_pos_p,
> +   bool unlimited)
>  {
>ipcp_value *val;
>  
> +  if (val_pos_p)
> +{
> +  for (val = values; val && val != *val_pos_p; val = val->next);

Please put empty statements on a separate line (I think there is one
more in the patch IIRC).

> +  gcc_checking_assert (val);
> +}
> +
>if (bottom)
>  return false;
>  
>for (val = values; val; val = val->next)
>  if (values_equal_for_ipcp_p (val->value, newval))
>{
> + if (val_pos_p)
> +   *val_pos_p = val;
> +
>   if (ipa_edge_within_scc (cs))
> {
>   ipcp_value_source *s;
> @@ -1847,7 +1864,7 @@ ipcp_lattice::add_value 

Ping: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-21 Thread Feng Xue OS
Honza, Martin,

  Hope your more comments on this patch. Not sure you base option on it. I 
think this can be a start point for
recursive versioning. And later we definitely need further cost-model tuning 
not only on this, but also whole
 ipa-cp to enable more aggressive cloning.

Thanks,
Feng


From: Feng Xue OS 
Sent: Friday, November 15, 2019 11:32 PM
To: Jan Hubicka
Cc: luoxhu; gcc-patches@gcc.gnu.org; Martin Jambor
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

Honza,

   I made some changes: do not penalize self-recursive function, and add 
--param ipa-cp-min-recursive-probability, similar to recursive inline. Please 
review this new one.

Thanks,
Feng


From: Jan Hubicka 
Sent: Friday, November 15, 2019 4:33 AM
To: Feng Xue OS
Cc: luoxhu; gcc-patches@gcc.gnu.org; Martin Jambor
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

> >> Cost model used by self-recursive cloning is mainly based on existing 
> >> stuffs
> >> in ipa-cp cloning, size growth and time benefit are considered. But since
> >> recursive cloning is a more aggressive cloning, we will actually have 
> >> another
> >> problem, which is opposite to your concern.  By default, current parameter
> >> set used to control ipa-cp and recursive-inliner gives priority to code 
> >> size,
> >> not completely for performance. This makes ipa-cp behave somewhat
>
> > Yes, for a while the cost model is quite off.  On Firefox it does just
> > few clonings where code size increases so it desprately needs retuning.
>
> > But since rescursive cloning is quite a different case from normal one,
> > perhaps having independent set of limits would help in particular ...
> I did consider this way, but this seems to be contradictory for normal
> and recursive cloning.

We could definitly discuss cost model incrementally. It is safe to do
what you do currently (rely on the existing ipa-cp's overconservative
heuristics).

>
> > > Do you have some data on code size/performance effects of this change?
> > For spec2017, no obvious code size and performance change with default 
> > setting.
> > Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
> > ipcp-unit-growth=80,
> > performance +31%, size +7%, on aarch64.
>
> > ... it will help here since ipa-cp-eval-threshold value needed are quite 
> > off of what we need to do.
>
> > I wonder about the 80% of unit growth which is also more than we can
> > enable by default.  How it comes the overal size change is only 7%?
> 343624 -> 365632 (this contains debug info, -g)recursion-depth=8
> 273488 -> 273760 (no debug info)   recursion-depth=8

What seems bit odd is that ipcp's metrics ends up with 80% code growth.
I will try to look into it and see if I can think better what to do
about the costs.

Honza


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-15 Thread Feng Xue OS
Honza,

   I made some changes: do not penalize self-recursive function, and add 
--param ipa-cp-min-recursive-probability, similar to recursive inline. Please 
review this new one.

Thanks,
Feng


From: Jan Hubicka 
Sent: Friday, November 15, 2019 4:33 AM
To: Feng Xue OS
Cc: luoxhu; gcc-patches@gcc.gnu.org; Martin Jambor
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

> >> Cost model used by self-recursive cloning is mainly based on existing 
> >> stuffs
> >> in ipa-cp cloning, size growth and time benefit are considered. But since
> >> recursive cloning is a more aggressive cloning, we will actually have 
> >> another
> >> problem, which is opposite to your concern.  By default, current parameter
> >> set used to control ipa-cp and recursive-inliner gives priority to code 
> >> size,
> >> not completely for performance. This makes ipa-cp behave somewhat
>
> > Yes, for a while the cost model is quite off.  On Firefox it does just
> > few clonings where code size increases so it desprately needs retuning.
>
> > But since rescursive cloning is quite a different case from normal one,
> > perhaps having independent set of limits would help in particular ...
> I did consider this way, but this seems to be contradictory for normal
> and recursive cloning.

We could definitly discuss cost model incrementally. It is safe to do
what you do currently (rely on the existing ipa-cp's overconservative
heuristics).

>
> > > Do you have some data on code size/performance effects of this change?
> > For spec2017, no obvious code size and performance change with default 
> > setting.
> > Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
> > ipcp-unit-growth=80,
> > performance +31%, size +7%, on aarch64.
>
> > ... it will help here since ipa-cp-eval-threshold value needed are quite 
> > off of what we need to do.
>
> > I wonder about the 80% of unit growth which is also more than we can
> > enable by default.  How it comes the overal size change is only 7%?
> 343624 -> 365632 (this contains debug info, -g)recursion-depth=8
> 273488 -> 273760 (no debug info)   recursion-depth=8

What seems bit odd is that ipcp's metrics ends up with 80% code growth.
I will try to look into it and see if I can think better what to do
about the costs.

Honza


0001-recursive-clone.patch
Description: 0001-recursive-clone.patch


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-14 Thread Jan Hubicka
> >> Cost model used by self-recursive cloning is mainly based on existing 
> >> stuffs
> >> in ipa-cp cloning, size growth and time benefit are considered. But since
> >> recursive cloning is a more aggressive cloning, we will actually have 
> >> another
> >> problem, which is opposite to your concern.  By default, current parameter
> >> set used to control ipa-cp and recursive-inliner gives priority to code 
> >> size,
> >> not completely for performance. This makes ipa-cp behave somewhat
> 
> > Yes, for a while the cost model is quite off.  On Firefox it does just
> > few clonings where code size increases so it desprately needs retuning.
> 
> > But since rescursive cloning is quite a different case from normal one,
> > perhaps having independent set of limits would help in particular ...
> I did consider this way, but this seems to be contradictory for normal
> and recursive cloning.

We could definitly discuss cost model incrementally. It is safe to do
what you do currently (rely on the existing ipa-cp's overconservative
heuristics).  

> 
> > > Do you have some data on code size/performance effects of this change?
> > For spec2017, no obvious code size and performance change with default 
> > setting.
> > Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
> > ipcp-unit-growth=80,
> > performance +31%, size +7%, on aarch64.
> 
> > ... it will help here since ipa-cp-eval-threshold value needed are quite 
> > off of what we need to do.
> 
> > I wonder about the 80% of unit growth which is also more than we can
> > enable by default.  How it comes the overal size change is only 7%?
> 343624 -> 365632 (this contains debug info, -g)recursion-depth=8
> 273488 -> 273760 (no debug info)   recursion-depth=8

What seems bit odd is that ipcp's metrics ends up with 80% code growth.
I will try to look into it and see if I can think better what to do
about the costs.

Honza


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-14 Thread Feng Xue OS
>> Cost model used by self-recursive cloning is mainly based on existing stuffs
>> in ipa-cp cloning, size growth and time benefit are considered. But since
>> recursive cloning is a more aggressive cloning, we will actually have another
>> problem, which is opposite to your concern.  By default, current parameter
>> set used to control ipa-cp and recursive-inliner gives priority to code size,
>> not completely for performance. This makes ipa-cp behave somewhat

> Yes, for a while the cost model is quite off.  On Firefox it does just
> few clonings where code size increases so it desprately needs retuning.

> But since rescursive cloning is quite a different case from normal one,
> perhaps having independent set of limits would help in particular ...
I did consider this way, but this seems to be contradictory for normal
and recursive cloning.

> > Do you have some data on code size/performance effects of this change?
> For spec2017, no obvious code size and performance change with default 
> setting.
> Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
> ipcp-unit-growth=80,
> performance +31%, size +7%, on aarch64.

> ... it will help here since ipa-cp-eval-threshold value needed are quite off 
> of what we need to do.

> I wonder about the 80% of unit growth which is also more than we can
> enable by default.  How it comes the overal size change is only 7%?
343624 -> 365632 (this contains debug info, -g)recursion-depth=8
273488 -> 273760 (no debug info)   recursion-depth=8

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-14 Thread Jan Hubicka
> Thanks for your review.
> 
> > In general the patch looks good to me, but I would like Martin Jambor to
> > comment on the ipa-prop/cp interfaces. However...
> 
> > +@item ipa-cp-max-recursion-depth
> > +Maximum depth of recursive cloning for self-recursive function.
> > +
> 
> > ... I believe we will need more careful cost model for this.  I think
> > we want to limit the overall growth for all the clones and also probably
> > enable this only when ipa-predicates things the individual clones will
> > actualy be faster by some non-trivial percentage. For recursive inliner
> > we have:
> 
> Cost model used by self-recursive cloning is mainly based on existing stuffs
> in ipa-cp cloning, size growth and time benefit are considered. But since
> recursive cloning is a more aggressive cloning, we will actually have another
> problem, which is opposite to your concern.  By default, current parameter
> set used to control ipa-cp and recursive-inliner gives priority to code size,
> not completely for performance. This makes ipa-cp behave somewhat

Yes, for a while the cost model is quite off.  On Firefox it does just
few clonings where code size increases so it desprately needs retuning.

But since rescursive cloning is quite a different case from normal one,
perhaps having independent set of limits would help in particular ...
> conservatively, and as a result, it can not trigger expected recursive cloning
> for the case in SPEC2017.exchange2 with default setting, blocked by both
> ipa-cp-eval-threshold and ipcp-unit-growth. The former is due to improper
> static profile estimation, and the latter is conflicted to aggressiveness of
> recursive cloning. Thus, we have to explicitly lower the threshold and raise
> percentage of unit-growth.
> 
> We might not reach the destination in one leap. This patch is just first step
> to enable recursive function versioning. And next, we still need further
> elaborate tuning on this.
> 
> > --param max-inline-recursive-depth which has similar meaning to your 
> > parameter
> >  (so perhaps similar name would be good)
> > --param min-inline-recursive-probability
> >  which requires the inlining to happen only across edges which are
> >  known to be taken with reasonable chance
> > --param max-inline-insns-recursive
> >  which specifies overall size after all the recursive inlining
> 
> > Those parameters are not parituclarly well tought out or tested, but
> > they may be good start.
> 
> > Do you have some data on code size/performance effects of this change?
> For spec2017, no obvious code size and performance change with default 
> setting.
> Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
> ipcp-unit-growth=80,
> performance +31%, size +7%, on aarch64.

... it will help here since ipa-cp-eval-threshold value needed are quite off of 
what we need to do.

I wonder about the 80% of unit growth which is also more than we can
enable by default.  How it comes the overal size change is only 7%?

Honza
> 
> Feng


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-14 Thread Feng Xue OS
Thanks for your review.

> In general the patch looks good to me, but I would like Martin Jambor to
> comment on the ipa-prop/cp interfaces. However...

> +@item ipa-cp-max-recursion-depth
> +Maximum depth of recursive cloning for self-recursive function.
> +

> ... I believe we will need more careful cost model for this.  I think
> we want to limit the overall growth for all the clones and also probably
> enable this only when ipa-predicates things the individual clones will
> actualy be faster by some non-trivial percentage. For recursive inliner
> we have:

Cost model used by self-recursive cloning is mainly based on existing stuffs
in ipa-cp cloning, size growth and time benefit are considered. But since
recursive cloning is a more aggressive cloning, we will actually have another
problem, which is opposite to your concern.  By default, current parameter
set used to control ipa-cp and recursive-inliner gives priority to code size,
not completely for performance. This makes ipa-cp behave somewhat
conservatively, and as a result, it can not trigger expected recursive cloning
for the case in SPEC2017.exchange2 with default setting, blocked by both
ipa-cp-eval-threshold and ipcp-unit-growth. The former is due to improper
static profile estimation, and the latter is conflicted to aggressiveness of
recursive cloning. Thus, we have to explicitly lower the threshold and raise
percentage of unit-growth.

We might not reach the destination in one leap. This patch is just first step
to enable recursive function versioning. And next, we still need further
elaborate tuning on this.

> --param max-inline-recursive-depth which has similar meaning to your parameter
>  (so perhaps similar name would be good)
> --param min-inline-recursive-probability
>  which requires the inlining to happen only across edges which are
>  known to be taken with reasonable chance
> --param max-inline-insns-recursive
>  which specifies overall size after all the recursive inlining

> Those parameters are not parituclarly well tought out or tested, but
> they may be good start.

> Do you have some data on code size/performance effects of this change?
For spec2017, no obvious code size and performance change with default setting.
Specifically, for exchange2, with ipa-cp-eval-threshold=1 and 
ipcp-unit-growth=80,
performance +31%, size +7%, on aarch64.

Feng

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-14 Thread Jan Hubicka
Hi,
I think the patch generally looks reasonable

+2019-11-13  Feng Xue 
+
+   PR ipa/92133
+   * doc/invoke.texi (ipa-cp-max-recursion-depth): Document new option.
+   * params.opt (ipa-cp-max-recursion-depth): New.
+   * ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
+   val_pos_p and unlimited.
+   (self_recursively_generated_p): New function.
+   (get_val_across_arith_op): Likewise.
+   (propagate_vals_across_arith_jfunc): Add constant propagation for
+   self-recursive function.
+   (incorporate_penalties): Do not penalize pure self-recursive function.
+   (good_cloning_opportunity_p): Dump node_is_self_scc flag.
+   (propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
+   (get_info_about_necessary_edges): Relax hotness check for edge to
+   self-recursive function.
+   * ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
+

In general the patch looks good to me, but I would like Martin Jambor to
comment on the ipa-prop/cp interfaces. However...
 
+@item ipa-cp-max-recursion-depth
+Maximum depth of recursive cloning for self-recursive function.
+

... I believe we will need more careful cost model for this.  I think
we want to limit the overall growth for all the clones and also probably
enable this only when ipa-predicates things the individual clones will
actualy be faster by some non-trivial percentage. For recursive inliner
we have:

--param max-inline-recursive-depth which has similar meaning to your parameter
  (so perhaps similar name would be good)
--param min-inline-recursive-probability
  which requires the inlining to happen only across edges which are
  known to be taken with reasonable chance
--param max-inline-insns-recursive
  which specifies overall size after all the recursive inlining

Those parameters are not parituclarly well tought out or tested, but
they may be good start.

Do you have some data on code size/performance effects of this change?

Honza


Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-24 Thread Feng Xue OS
Hi,

  Actually, this patch is not final one. Since the previous cp-by-ref patch is 
still on the way to the trunk,
I tailored it to only handle by-value recursion, so that it is decoupled with 
the previous patch, and can
be reviewed independently. I attached the final patch, you can have a try.

 CP propagation stage might generate values outside of recursion ranges, but 
cloning stage will not
make a duplicate for invalid recursion value, with which we do not need extra 
code for recursion range
analysis.

Feng


From: luoxhu 
Sent: Thursday, October 24, 2019 1:44 PM
To: Feng Xue OS; gcc-patches@gcc.gnu.org; Jan Hubicka; Martin Jambor
Subject: Re: [PATCH] Support multi-versioning on self-recursive function 
(ipa/92133)

Hi,


On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
>
> recur_fn (i)
> {
>if ( !terminate_recursion (i))
>  {
>...
>recur_fn (i + 1);
>...
>  }
>...
> }
>
> This patch is composed to enable multi-versioning for self-recursive function,
> and versioning copies is limited by a specified option.
>
> Feng
> ---
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 045072e02ec..6255a766e4d 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -229,7 +229,9 @@ public:
> inline bool set_contains_variable ();
> bool add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val = NULL,
> -   int src_idx = 0, HOST_WIDE_INT offset = -1);
> +   int src_idx = 0, HOST_WIDE_INT offset = -1,
> +   ipcp_value **val_pos_p = NULL,
> +   bool unlimited = false);
> void print (FILE * f, bool dump_sources, bool dump_benefits);
>   };
>
> @@ -1579,22 +1581,37 @@ allocate_and_init_ipcp_value 
> (ipa_polymorphic_call_context source)
>   /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it.  
> CS,
>  SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
>  meaning.  OFFSET -1 means the source is scalar and not a part of an
> -   aggregate.  */
> +   aggregate.  If non-NULL, VAL_POS_P specifies position in value list,
> +   after which newly created ipcp_value will be inserted, and it is also
> +   used to record address of the added ipcp_value before function returns.
> +   UNLIMITED means whether value count should not exceed the limit given
> +   by PARAM_IPA_CP_VALUE_LIST_SIZE.  */
>
>   template 
>   bool
>   ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val,
> -   int src_idx, HOST_WIDE_INT offset)
> +   int src_idx, HOST_WIDE_INT offset,
> +   ipcp_value **val_pos_p,
> +   bool unlimited)
>   {
> ipcp_value *val;
>
> +  if (val_pos_p)
> +{
> +  for (val = values; val && val != *val_pos_p; val = val->next);
> +  gcc_checking_assert (val);
> +}
> +
> if (bottom)
>   return false;
>
> for (val = values; val; val = val->next)
>   if (values_equal_for_ipcp_p (val->value, newval))
> {
> + if (val_pos_p)
> +   *val_pos_p = val;
> +
>   if (ipa_edge_within_scc (cs))
> {
>   ipcp_value_source *s;
> @@ -1609,7 +1626,8 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   return false;
> }
>
> -  if (values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
> +  if (!unlimited
> +  && values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
>   {
> /* We can only free sources, not the values themselves, because 
> sources
>of other values in this SCC might point to them.   */
> @@ -1623,6 +1641,9 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   }
>   }
>
> +  if (val_pos_p)
> + *val_pos_p = NULL;
> +
> values = NULL;
> return set_to_bottom ();
>   }
> @@ -1630,8 +1651,54 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
> values_count++;
> val = allocate_and_init_ipcp_value (newval);
> val->add_source (cs, src_val, src_idx, offset);
> -  val->next = values;
> -  values = val;
> +  if (val_pos_p)
> +{
> +  val->next = (*val_pos_p)->next;
> +  (*val_pos_p)->next = val;
> +  *val_pos_p = val;
> +}
> +  else
> +{
> +  val->next = values;
> +  values = val;
> +}
> +
> +  return true;
> +}
> +
> +

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-23 Thread luoxhu
Hi,


On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
> 
> recur_fn (i)
> {
>if ( !terminate_recursion (i))
>  {
>...
>recur_fn (i + 1);
>...
>  }
>...
> }
> 
> This patch is composed to enable multi-versioning for self-recursive function,
> and versioning copies is limited by a specified option.
> 
> Feng
> ---
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 045072e02ec..6255a766e4d 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -229,7 +229,9 @@ public:
> inline bool set_contains_variable ();
> bool add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val = NULL,
> -   int src_idx = 0, HOST_WIDE_INT offset = -1);
> +   int src_idx = 0, HOST_WIDE_INT offset = -1,
> +   ipcp_value **val_pos_p = NULL,
> +   bool unlimited = false);
> void print (FILE * f, bool dump_sources, bool dump_benefits);
>   };
>   
> @@ -1579,22 +1581,37 @@ allocate_and_init_ipcp_value 
> (ipa_polymorphic_call_context source)
>   /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it.  
> CS,
>  SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
>  meaning.  OFFSET -1 means the source is scalar and not a part of an
> -   aggregate.  */
> +   aggregate.  If non-NULL, VAL_POS_P specifies position in value list,
> +   after which newly created ipcp_value will be inserted, and it is also
> +   used to record address of the added ipcp_value before function returns.
> +   UNLIMITED means whether value count should not exceed the limit given
> +   by PARAM_IPA_CP_VALUE_LIST_SIZE.  */
>   
>   template 
>   bool
>   ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val,
> -   int src_idx, HOST_WIDE_INT offset)
> +   int src_idx, HOST_WIDE_INT offset,
> +   ipcp_value **val_pos_p,
> +   bool unlimited)
>   {
> ipcp_value *val;
>   
> +  if (val_pos_p)
> +{
> +  for (val = values; val && val != *val_pos_p; val = val->next);
> +  gcc_checking_assert (val);
> +}
> +
> if (bottom)
>   return false;
>   
> for (val = values; val; val = val->next)
>   if (values_equal_for_ipcp_p (val->value, newval))
> {
> + if (val_pos_p)
> +   *val_pos_p = val;
> +
>   if (ipa_edge_within_scc (cs))
> {
>   ipcp_value_source *s;
> @@ -1609,7 +1626,8 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   return false;
> }
>   
> -  if (values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
> +  if (!unlimited
> +  && values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
>   {
> /* We can only free sources, not the values themselves, because 
> sources
>of other values in this SCC might point to them.   */
> @@ -1623,6 +1641,9 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   }
>   }
>   
> +  if (val_pos_p)
> + *val_pos_p = NULL;
> +
> values = NULL;
> return set_to_bottom ();
>   }
> @@ -1630,8 +1651,54 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
> values_count++;
> val = allocate_and_init_ipcp_value (newval);
> val->add_source (cs, src_val, src_idx, offset);
> -  val->next = values;
> -  values = val;
> +  if (val_pos_p)
> +{
> +  val->next = (*val_pos_p)->next;
> +  (*val_pos_p)->next = val;
> +  *val_pos_p = val;
> +}
> +  else
> +{
> +  val->next = values;
> +  values = val;
> +}
> +
> +  return true;
> +}
> +
> +/* Return true, if a ipcp_value VAL is orginated from parameter value of
> +   self-feeding recursive function by applying non-passthrough arithmetic
> +   transformation.  */
> +
> +static bool
> +self_recursively_generated_p (ipcp_value *val)
> +{
> +  class ipa_node_params *info = NULL;
> +
> +  for (ipcp_value_source *src = val->sources; src; src = src->next)
> +{
> +  cgraph_edge *cs = src->cs;
> +
> +  if (!src->val || cs->caller != cs->callee->function_symbol ()
> +   || src->val == val)
> + return false;
> +
> +  if (!info)
> + info = IPA_NODE_REF (cs->caller);
> +
> +  class ipcp_param_lattices *plats = ipa_get_parm_lattices (info,
> + src->index);
> +  ipcp_lattice *src_lat = src->offset == -1 ? >itself
> +   : plats->aggs;

Thanks for the patch.
This function doesn't handle the by-ref case after rebasing this patch to your
previous ipa-cp by-ref of arith patch as below (Also some conflicts need fix):

foo (int * p) { ...  return foo(*() + 1); }

It will cause value explosion.

Secondly, the self_recursive 

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-17 Thread Feng Xue OS
> I noticed similar issue when analyzing the SPEC, self-recursive function is
> not versioned and posted my observations in 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074.

> Generally, this could be implemented well by your patch, while I am
> wondering whether it is OK to convert the recursive function to
> non-recursive function in a independent pass after ipa-cp and ipa-sra instead
> of reuse the ipa-cp framework?
> The reason is sometimes the argument is passed-by-reference, and
> ipa-sra runs after ipa-cp, so this kind of optimization may not be done in
> WPA.  What's your idea about this, please?   Thanks.

Function versioning is done in ipa-cp, there is nothing special for recursive 
function.
So adding a dedicated pass for recursive seems to be redundant.

We might not need to resort to ipa-sra to resolve concern you mentioned. 
Original
ipa-cp already supports a simple kind of propagation on by-ref argument, who 
must
be defined by constant. And for an extended form as:  *arg = *param OP 
constant,  I've
created a tracker PR91682,  also composed a patch:
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01189.html.

Feng

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-17 Thread luoxhu
Hi Feng,

On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
> 
> recur_fn (i)
> {
>if ( !terminate_recursion (i))
>  {
>...
>recur_fn (i + 1);
>...
>  }
>...
> }
> 
> This patch is composed to enable multi-versioning for self-recursive function,
> and versioning copies is limited by a specified option.

I noticed similar issue when analyzing the SPEC, self-recursive function is
not versioned and posted my observations in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074.

Generally, this could be implemented well by your patch, while I am
wondering whether it is OK to convert the recursive function to
non-recursive function in a independent pass after ipa-cp and ipa-sra instead
of reuse the ipa-cp framework?
The reason is sometimes the argument is passed-by-reference, and
ipa-sra runs after ipa-cp, so this kind of optimization may not be done in
WPA.  What's your idea about this, please?   Thanks. 


Thanks
Xiong Hu

> 
> Feng
> ---
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 045072e02ec..6255a766e4d 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -229,7 +229,9 @@ public:
> inline bool set_contains_variable ();
> bool add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val = NULL,
> -   int src_idx = 0, HOST_WIDE_INT offset = -1);
> +   int src_idx = 0, HOST_WIDE_INT offset = -1,
> +   ipcp_value **val_pos_p = NULL,
> +   bool unlimited = false);
> void print (FILE * f, bool dump_sources, bool dump_benefits);
>   };
>   
> @@ -1579,22 +1581,37 @@ allocate_and_init_ipcp_value 
> (ipa_polymorphic_call_context source)
>   /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it.  
> CS,
>  SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
>  meaning.  OFFSET -1 means the source is scalar and not a part of an
> -   aggregate.  */
> +   aggregate.  If non-NULL, VAL_POS_P specifies position in value list,
> +   after which newly created ipcp_value will be inserted, and it is also
> +   used to record address of the added ipcp_value before function returns.
> +   UNLIMITED means whether value count should not exceed the limit given
> +   by PARAM_IPA_CP_VALUE_LIST_SIZE.  */
>   
>   template 
>   bool
>   ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val,
> -   int src_idx, HOST_WIDE_INT offset)
> +   int src_idx, HOST_WIDE_INT offset,
> +   ipcp_value **val_pos_p,
> +   bool unlimited)
>   {
> ipcp_value *val;
>   
> +  if (val_pos_p)
> +{
> +  for (val = values; val && val != *val_pos_p; val = val->next);
> +  gcc_checking_assert (val);
> +}
> +
> if (bottom)
>   return false;
>   
> for (val = values; val; val = val->next)
>   if (values_equal_for_ipcp_p (val->value, newval))
> {
> + if (val_pos_p)
> +   *val_pos_p = val;
> +
>   if (ipa_edge_within_scc (cs))
> {
>   ipcp_value_source *s;
> @@ -1609,7 +1626,8 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   return false;
> }
>   
> -  if (values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
> +  if (!unlimited
> +  && values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
>   {
> /* We can only free sources, not the values themselves, because 
> sources
>of other values in this SCC might point to them.   */
> @@ -1623,6 +1641,9 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
>   }
>   }
>   
> +  if (val_pos_p)
> + *val_pos_p = NULL;
> +
> values = NULL;
> return set_to_bottom ();
>   }
> @@ -1630,8 +1651,54 @@ ipcp_lattice::add_value (valtype newval, 
> cgraph_edge *cs,
> values_count++;
> val = allocate_and_init_ipcp_value (newval);
> val->add_source (cs, src_val, src_idx, offset);
> -  val->next = values;
> -  values = val;
> +  if (val_pos_p)
> +{
> +  val->next = (*val_pos_p)->next;
> +  (*val_pos_p)->next = val;
> +  *val_pos_p = val;
> +}
> +  else
> +{
> +  val->next = values;
> +  values = val;
> +}
> +
> +  return true;
> +}
> +
> +/* Return true, if a ipcp_value VAL is orginated from parameter value of
> +   self-feeding recursive function by applying non-passthrough arithmetic
> +   transformation.  */
> +
> +static bool
> +self_recursively_generated_p (ipcp_value *val)
> +{
> +  class ipa_node_params *info = NULL;
> +
> +  for (ipcp_value_source *src = val->sources; src; src = src->next)
> +{
> +  cgraph_edge *cs = src->cs;
> +
> +  if (!src->val || cs->caller != cs->callee->function_symbol ()
> +   || src->val == val)
> + return false;
> +
> + 

[PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-10-17 Thread Feng Xue OS
IPA does not allow constant propagation on parameter that is used to control
function recursion. 

recur_fn (i)
{
  if ( !terminate_recursion (i))
{
  ...  
  recur_fn (i + 1);
  ...
}
  ...
}

This patch is composed to enable multi-versioning for self-recursive function,
and versioning copies is limited by a specified option.

Feng
---
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 045072e02ec..6255a766e4d 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -229,7 +229,9 @@ public:
   inline bool set_contains_variable ();
   bool add_value (valtype newval, cgraph_edge *cs,
  ipcp_value *src_val = NULL,
- int src_idx = 0, HOST_WIDE_INT offset = -1);
+ int src_idx = 0, HOST_WIDE_INT offset = -1,
+ ipcp_value **val_pos_p = NULL,
+ bool unlimited = false);
   void print (FILE * f, bool dump_sources, bool dump_benefits);
 };
 
@@ -1579,22 +1581,37 @@ allocate_and_init_ipcp_value 
(ipa_polymorphic_call_context source)
 /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it.  CS,
SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
meaning.  OFFSET -1 means the source is scalar and not a part of an
-   aggregate.  */
+   aggregate.  If non-NULL, VAL_POS_P specifies position in value list,
+   after which newly created ipcp_value will be inserted, and it is also
+   used to record address of the added ipcp_value before function returns.
+   UNLIMITED means whether value count should not exceed the limit given
+   by PARAM_IPA_CP_VALUE_LIST_SIZE.  */
 
 template 
 bool
 ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
  ipcp_value *src_val,
- int src_idx, HOST_WIDE_INT offset)
+ int src_idx, HOST_WIDE_INT offset,
+ ipcp_value **val_pos_p,
+ bool unlimited)
 {
   ipcp_value *val;
 
+  if (val_pos_p)
+{
+  for (val = values; val && val != *val_pos_p; val = val->next);
+  gcc_checking_assert (val);
+}
+
   if (bottom)
 return false;
 
   for (val = values; val; val = val->next)
 if (values_equal_for_ipcp_p (val->value, newval))
   {
+   if (val_pos_p)
+ *val_pos_p = val;
+
if (ipa_edge_within_scc (cs))
  {
ipcp_value_source *s;
@@ -1609,7 +1626,8 @@ ipcp_lattice::add_value (valtype newval, 
cgraph_edge *cs,
return false;
   }
 
-  if (values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
+  if (!unlimited
+  && values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
 {
   /* We can only free sources, not the values themselves, because sources
 of other values in this SCC might point to them.   */
@@ -1623,6 +1641,9 @@ ipcp_lattice::add_value (valtype newval, 
cgraph_edge *cs,
}
}
 
+  if (val_pos_p)
+   *val_pos_p = NULL;
+
   values = NULL;
   return set_to_bottom ();
 }
@@ -1630,8 +1651,54 @@ ipcp_lattice::add_value (valtype newval, 
cgraph_edge *cs,
   values_count++;
   val = allocate_and_init_ipcp_value (newval);
   val->add_source (cs, src_val, src_idx, offset);
-  val->next = values;
-  values = val;
+  if (val_pos_p)
+{
+  val->next = (*val_pos_p)->next;
+  (*val_pos_p)->next = val;
+  *val_pos_p = val;
+}
+  else
+{
+  val->next = values;
+  values = val;
+}
+
+  return true;
+}
+
+/* Return true, if a ipcp_value VAL is orginated from parameter value of
+   self-feeding recursive function by applying non-passthrough arithmetic
+   transformation.  */
+
+static bool
+self_recursively_generated_p (ipcp_value *val)
+{
+  class ipa_node_params *info = NULL;
+
+  for (ipcp_value_source *src = val->sources; src; src = src->next)
+{
+  cgraph_edge *cs = src->cs;
+
+  if (!src->val || cs->caller != cs->callee->function_symbol ()
+ || src->val == val)
+   return false;
+
+  if (!info)
+   info = IPA_NODE_REF (cs->caller);
+
+  class ipcp_param_lattices *plats = ipa_get_parm_lattices (info,
+   src->index);
+  ipcp_lattice *src_lat = src->offset == -1 ? >itself
+ : plats->aggs;
+  ipcp_value *src_val;
+
+  for (src_val = src_lat->values; src_val && src_val != val;
+  src_val = src_val->next);
+
+  if (!src_val)
+   return false;
+}
+
   return true;
 }
 
@@ -1649,20 +1716,72 @@ propagate_vals_across_pass_through (cgraph_edge *cs, 
ipa_jump_func *jfunc,
   ipcp_value *src_val;
   bool ret = false;
 
-  /* Do not create new values when propagating within an SCC because if there
- are arithmetic functions with circular dependencies, there is infinite
- number of them and we would just make lattices bottom.  If this condition
- is ever relaxed