Re: Patch ping (Re: [PATCH PR82965/PR83991]Fix invalid profile count in vectorization peeling)

2018-03-11 Thread Paul Hua
On Fri, Mar 9, 2018 at 10:51 PM, Bin.Cheng  wrote:
> On Fri, Mar 9, 2018 at 10:25 AM, Paul Hua  wrote:
>> It's looks fixed
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82965#c12  on mips64el.
> Hmm, is it fixed? or is it exposed now on mips64el?  I read the latter
> from the comment.

It is fixed, Bootstrap and test passed on mips64el.

Thanks,

Paul Hua

> I think the issue is like explained, but haven't dug into when/why it
> is triggered in vect_peeling only for some targets.
> Honza asked couple questions about my patch offline, I will revisit
> the patch, see how to address
> his concern.
>
> Thanks,
> bin
>>
>> Thanks.
>>
>> On Mon, Feb 26, 2018 at 8:02 PM, Bin.Cheng  wrote:
>>> Ping^2
>>>
>>> Thanks,
>>> bin
>>>
>>> On Mon, Feb 19, 2018 at 5:14 PM, Jakub Jelinek  wrote:
 Hi!

 Honza, do you think you could have a look at this P1 fix?

 Thanks.

 On Wed, Jan 31, 2018 at 10:03:51AM +, Bin Cheng wrote:
> Hi,
> This patch fixes invalid profile count information in vectorization 
> peeling.
> Current implementation is a bit confusing for me since it tries to compute
> an overall probability based on scaling probability and change of 
> estimated
> niters.  This patch does it in two steps.  Firstly it does the scaling, 
> then
> it adjusts to new estimated niters by simply adjusting loop's latch count
> information; scaling the loop's count information by the proportion
> new_estimated_niters/old_estimate_niters.  Of course we have to adjust 
> loop
> latch's count information back after scaling.
>
> Bootstrap and test on x86_64 and AArch64.  gcc.dg/vect/pr79347.c is fixed
> for both PR82965 and PR83991.  Is this OK?
>
> Thanks,
> bin
>
> 2018-01-30  Bin Cheng  
>
>   PR tree-optimization/82965
>   PR tree-optimization/83991
>   * cfgloopmanip.c (scale_loop_profile): Further scale loop's profile
>   information if the loop was predicted to iterate too many times.

> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
> index b9b76d8..1f560b8 100644
> --- a/gcc/cfgloopmanip.c
> +++ b/gcc/cfgloopmanip.c
> @@ -509,7 +509,7 @@ scale_loop_profile (struct loop *loop, 
> profile_probability p,
>   gcov_type iteration_bound)
>  {
>gcov_type iterations = expected_loop_iterations_unbounded (loop);
> -  edge e;
> +  edge e, preheader_e;
>edge_iterator ei;
>
>if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -521,77 +521,66 @@ scale_loop_profile (struct loop *loop, 
> profile_probability p,
>  (int)iteration_bound, (int)iterations);
>  }
>
> +  /* Scale the probabilities.  */
> +  scale_loop_frequencies (loop, p);
> +
>/* See if loop is predicted to iterate too many times.  */
> -  if (iteration_bound && iterations > 0
> -  && p.apply (iterations) > iteration_bound)
> +  if (iteration_bound == 0 || iterations <= 0
> +  || p.apply (iterations) <= iteration_bound)
> +return;
> +
> +  e = single_exit (loop);
> +  preheader_e = loop_preheader_edge (loop);
> +  profile_count count_in = preheader_e->count ();
> +  if (e && preheader_e
> +  && count_in > profile_count::zero ()
> +  && loop->header->count.initialized_p ())
>  {
> -  /* Fixing loop profile for different trip count is not trivial; 
> the exit
> -  probabilities has to be updated to match and frequencies 
> propagated down
> -  to the loop body.
> -
> -  We fully update only the simple case of loop with single exit that 
> is
> -  either from the latch or BB just before latch and leads from BB 
> with
> -  simple conditional jump.   This is OK for use in vectorizer.  */
> -  e = single_exit (loop);
> -  if (e)
> - {
> -   edge other_e;
> -   profile_count count_delta;
> +  edge other_e;
> +  profile_count count_delta;
>
> -  FOR_EACH_EDGE (other_e, ei, e->src->succs)
> - if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
> - && e != other_e)
> -   break;
> +  FOR_EACH_EDGE (other_e, ei, e->src->succs)
> + if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
> + && e != other_e)
> +   break;
>
> -   /* Probability of exit must be 1/iterations.  */
> -   count_delta = e->count ();
> -   e->probability = profile_probability::always ()
> +  /* Probability of exit must be 1/iterations.  */
> +  count_delta = e->count ();
> +  e->probability = profile_probability::always ()
>   .apply_scale (1, 

Re: Patch ping (Re: [PATCH PR82965/PR83991]Fix invalid profile count in vectorization peeling)

2018-03-09 Thread Bin.Cheng
On Fri, Mar 9, 2018 at 10:25 AM, Paul Hua  wrote:
> It's looks fixed
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82965#c12  on mips64el.
Hmm, is it fixed? or is it exposed now on mips64el?  I read the latter
from the comment.
I think the issue is like explained, but haven't dug into when/why it
is triggered in vect_peeling only for some targets.
Honza asked couple questions about my patch offline, I will revisit
the patch, see how to address
his concern.

Thanks,
bin
>
> Thanks.
>
> On Mon, Feb 26, 2018 at 8:02 PM, Bin.Cheng  wrote:
>> Ping^2
>>
>> Thanks,
>> bin
>>
>> On Mon, Feb 19, 2018 at 5:14 PM, Jakub Jelinek  wrote:
>>> Hi!
>>>
>>> Honza, do you think you could have a look at this P1 fix?
>>>
>>> Thanks.
>>>
>>> On Wed, Jan 31, 2018 at 10:03:51AM +, Bin Cheng wrote:
 Hi,
 This patch fixes invalid profile count information in vectorization 
 peeling.
 Current implementation is a bit confusing for me since it tries to compute
 an overall probability based on scaling probability and change of estimated
 niters.  This patch does it in two steps.  Firstly it does the scaling, 
 then
 it adjusts to new estimated niters by simply adjusting loop's latch count
 information; scaling the loop's count information by the proportion
 new_estimated_niters/old_estimate_niters.  Of course we have to adjust loop
 latch's count information back after scaling.

 Bootstrap and test on x86_64 and AArch64.  gcc.dg/vect/pr79347.c is fixed
 for both PR82965 and PR83991.  Is this OK?

 Thanks,
 bin

 2018-01-30  Bin Cheng  

   PR tree-optimization/82965
   PR tree-optimization/83991
   * cfgloopmanip.c (scale_loop_profile): Further scale loop's profile
   information if the loop was predicted to iterate too many times.
>>>
 diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
 index b9b76d8..1f560b8 100644
 --- a/gcc/cfgloopmanip.c
 +++ b/gcc/cfgloopmanip.c
 @@ -509,7 +509,7 @@ scale_loop_profile (struct loop *loop, 
 profile_probability p,
   gcov_type iteration_bound)
  {
gcov_type iterations = expected_loop_iterations_unbounded (loop);
 -  edge e;
 +  edge e, preheader_e;
edge_iterator ei;

if (dump_file && (dump_flags & TDF_DETAILS))
 @@ -521,77 +521,66 @@ scale_loop_profile (struct loop *loop, 
 profile_probability p,
  (int)iteration_bound, (int)iterations);
  }

 +  /* Scale the probabilities.  */
 +  scale_loop_frequencies (loop, p);
 +
/* See if loop is predicted to iterate too many times.  */
 -  if (iteration_bound && iterations > 0
 -  && p.apply (iterations) > iteration_bound)
 +  if (iteration_bound == 0 || iterations <= 0
 +  || p.apply (iterations) <= iteration_bound)
 +return;
 +
 +  e = single_exit (loop);
 +  preheader_e = loop_preheader_edge (loop);
 +  profile_count count_in = preheader_e->count ();
 +  if (e && preheader_e
 +  && count_in > profile_count::zero ()
 +  && loop->header->count.initialized_p ())
  {
 -  /* Fixing loop profile for different trip count is not trivial; the 
 exit
 -  probabilities has to be updated to match and frequencies propagated 
 down
 -  to the loop body.
 -
 -  We fully update only the simple case of loop with single exit that 
 is
 -  either from the latch or BB just before latch and leads from BB with
 -  simple conditional jump.   This is OK for use in vectorizer.  */
 -  e = single_exit (loop);
 -  if (e)
 - {
 -   edge other_e;
 -   profile_count count_delta;
 +  edge other_e;
 +  profile_count count_delta;

 -  FOR_EACH_EDGE (other_e, ei, e->src->succs)
 - if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
 - && e != other_e)
 -   break;
 +  FOR_EACH_EDGE (other_e, ei, e->src->succs)
 + if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
 + && e != other_e)
 +   break;

 -   /* Probability of exit must be 1/iterations.  */
 -   count_delta = e->count ();
 -   e->probability = profile_probability::always ()
 +  /* Probability of exit must be 1/iterations.  */
 +  count_delta = e->count ();
 +  e->probability = profile_probability::always ()
   .apply_scale (1, iteration_bound);
 -   other_e->probability = e->probability.invert ();
 -   count_delta -= e->count ();
 -
 -   /* If latch exists, change its count, since we changed
 -  probability of exit.  Theoretically we should update everything 
 from
 -  

Re: Patch ping (Re: [PATCH PR82965/PR83991]Fix invalid profile count in vectorization peeling)

2018-03-09 Thread Paul Hua
It's looks fixed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82965#c12  on mips64el.

Thanks.

On Mon, Feb 26, 2018 at 8:02 PM, Bin.Cheng  wrote:
> Ping^2
>
> Thanks,
> bin
>
> On Mon, Feb 19, 2018 at 5:14 PM, Jakub Jelinek  wrote:
>> Hi!
>>
>> Honza, do you think you could have a look at this P1 fix?
>>
>> Thanks.
>>
>> On Wed, Jan 31, 2018 at 10:03:51AM +, Bin Cheng wrote:
>>> Hi,
>>> This patch fixes invalid profile count information in vectorization peeling.
>>> Current implementation is a bit confusing for me since it tries to compute
>>> an overall probability based on scaling probability and change of estimated
>>> niters.  This patch does it in two steps.  Firstly it does the scaling, then
>>> it adjusts to new estimated niters by simply adjusting loop's latch count
>>> information; scaling the loop's count information by the proportion
>>> new_estimated_niters/old_estimate_niters.  Of course we have to adjust loop
>>> latch's count information back after scaling.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  gcc.dg/vect/pr79347.c is fixed
>>> for both PR82965 and PR83991.  Is this OK?
>>>
>>> Thanks,
>>> bin
>>>
>>> 2018-01-30  Bin Cheng  
>>>
>>>   PR tree-optimization/82965
>>>   PR tree-optimization/83991
>>>   * cfgloopmanip.c (scale_loop_profile): Further scale loop's profile
>>>   information if the loop was predicted to iterate too many times.
>>
>>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>>> index b9b76d8..1f560b8 100644
>>> --- a/gcc/cfgloopmanip.c
>>> +++ b/gcc/cfgloopmanip.c
>>> @@ -509,7 +509,7 @@ scale_loop_profile (struct loop *loop, 
>>> profile_probability p,
>>>   gcov_type iteration_bound)
>>>  {
>>>gcov_type iterations = expected_loop_iterations_unbounded (loop);
>>> -  edge e;
>>> +  edge e, preheader_e;
>>>edge_iterator ei;
>>>
>>>if (dump_file && (dump_flags & TDF_DETAILS))
>>> @@ -521,77 +521,66 @@ scale_loop_profile (struct loop *loop, 
>>> profile_probability p,
>>>  (int)iteration_bound, (int)iterations);
>>>  }
>>>
>>> +  /* Scale the probabilities.  */
>>> +  scale_loop_frequencies (loop, p);
>>> +
>>>/* See if loop is predicted to iterate too many times.  */
>>> -  if (iteration_bound && iterations > 0
>>> -  && p.apply (iterations) > iteration_bound)
>>> +  if (iteration_bound == 0 || iterations <= 0
>>> +  || p.apply (iterations) <= iteration_bound)
>>> +return;
>>> +
>>> +  e = single_exit (loop);
>>> +  preheader_e = loop_preheader_edge (loop);
>>> +  profile_count count_in = preheader_e->count ();
>>> +  if (e && preheader_e
>>> +  && count_in > profile_count::zero ()
>>> +  && loop->header->count.initialized_p ())
>>>  {
>>> -  /* Fixing loop profile for different trip count is not trivial; the 
>>> exit
>>> -  probabilities has to be updated to match and frequencies propagated 
>>> down
>>> -  to the loop body.
>>> -
>>> -  We fully update only the simple case of loop with single exit that is
>>> -  either from the latch or BB just before latch and leads from BB with
>>> -  simple conditional jump.   This is OK for use in vectorizer.  */
>>> -  e = single_exit (loop);
>>> -  if (e)
>>> - {
>>> -   edge other_e;
>>> -   profile_count count_delta;
>>> +  edge other_e;
>>> +  profile_count count_delta;
>>>
>>> -  FOR_EACH_EDGE (other_e, ei, e->src->succs)
>>> - if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
>>> - && e != other_e)
>>> -   break;
>>> +  FOR_EACH_EDGE (other_e, ei, e->src->succs)
>>> + if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
>>> + && e != other_e)
>>> +   break;
>>>
>>> -   /* Probability of exit must be 1/iterations.  */
>>> -   count_delta = e->count ();
>>> -   e->probability = profile_probability::always ()
>>> +  /* Probability of exit must be 1/iterations.  */
>>> +  count_delta = e->count ();
>>> +  e->probability = profile_probability::always ()
>>>   .apply_scale (1, iteration_bound);
>>> -   other_e->probability = e->probability.invert ();
>>> -   count_delta -= e->count ();
>>> -
>>> -   /* If latch exists, change its count, since we changed
>>> -  probability of exit.  Theoretically we should update everything 
>>> from
>>> -  source of exit edge to latch, but for vectorizer this is enough. 
>>>  */
>>> -   if (loop->latch
>>> -   && loop->latch != e->src)
>>> - {
>>> -   loop->latch->count += count_delta;
>>> - }
>>> - }
>>> +  other_e->probability = e->probability.invert ();
>>>
>>>/* Roughly speaking we want to reduce the loop body profile by the
>>>difference of loop iterations.  We however can do better if
>>>we look at the actual profile, if it is available.  */
>>> -  p = p.apply_scale 

Re: Patch ping (Re: [PATCH PR82965/PR83991]Fix invalid profile count in vectorization peeling)

2018-02-26 Thread Bin.Cheng
Ping^2

Thanks,
bin

On Mon, Feb 19, 2018 at 5:14 PM, Jakub Jelinek  wrote:
> Hi!
>
> Honza, do you think you could have a look at this P1 fix?
>
> Thanks.
>
> On Wed, Jan 31, 2018 at 10:03:51AM +, Bin Cheng wrote:
>> Hi,
>> This patch fixes invalid profile count information in vectorization peeling.
>> Current implementation is a bit confusing for me since it tries to compute
>> an overall probability based on scaling probability and change of estimated
>> niters.  This patch does it in two steps.  Firstly it does the scaling, then
>> it adjusts to new estimated niters by simply adjusting loop's latch count
>> information; scaling the loop's count information by the proportion
>> new_estimated_niters/old_estimate_niters.  Of course we have to adjust loop
>> latch's count information back after scaling.
>>
>> Bootstrap and test on x86_64 and AArch64.  gcc.dg/vect/pr79347.c is fixed
>> for both PR82965 and PR83991.  Is this OK?
>>
>> Thanks,
>> bin
>>
>> 2018-01-30  Bin Cheng  
>>
>>   PR tree-optimization/82965
>>   PR tree-optimization/83991
>>   * cfgloopmanip.c (scale_loop_profile): Further scale loop's profile
>>   information if the loop was predicted to iterate too many times.
>
>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>> index b9b76d8..1f560b8 100644
>> --- a/gcc/cfgloopmanip.c
>> +++ b/gcc/cfgloopmanip.c
>> @@ -509,7 +509,7 @@ scale_loop_profile (struct loop *loop, 
>> profile_probability p,
>>   gcov_type iteration_bound)
>>  {
>>gcov_type iterations = expected_loop_iterations_unbounded (loop);
>> -  edge e;
>> +  edge e, preheader_e;
>>edge_iterator ei;
>>
>>if (dump_file && (dump_flags & TDF_DETAILS))
>> @@ -521,77 +521,66 @@ scale_loop_profile (struct loop *loop, 
>> profile_probability p,
>>  (int)iteration_bound, (int)iterations);
>>  }
>>
>> +  /* Scale the probabilities.  */
>> +  scale_loop_frequencies (loop, p);
>> +
>>/* See if loop is predicted to iterate too many times.  */
>> -  if (iteration_bound && iterations > 0
>> -  && p.apply (iterations) > iteration_bound)
>> +  if (iteration_bound == 0 || iterations <= 0
>> +  || p.apply (iterations) <= iteration_bound)
>> +return;
>> +
>> +  e = single_exit (loop);
>> +  preheader_e = loop_preheader_edge (loop);
>> +  profile_count count_in = preheader_e->count ();
>> +  if (e && preheader_e
>> +  && count_in > profile_count::zero ()
>> +  && loop->header->count.initialized_p ())
>>  {
>> -  /* Fixing loop profile for different trip count is not trivial; the 
>> exit
>> -  probabilities has to be updated to match and frequencies propagated 
>> down
>> -  to the loop body.
>> -
>> -  We fully update only the simple case of loop with single exit that is
>> -  either from the latch or BB just before latch and leads from BB with
>> -  simple conditional jump.   This is OK for use in vectorizer.  */
>> -  e = single_exit (loop);
>> -  if (e)
>> - {
>> -   edge other_e;
>> -   profile_count count_delta;
>> +  edge other_e;
>> +  profile_count count_delta;
>>
>> -  FOR_EACH_EDGE (other_e, ei, e->src->succs)
>> - if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
>> - && e != other_e)
>> -   break;
>> +  FOR_EACH_EDGE (other_e, ei, e->src->succs)
>> + if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
>> + && e != other_e)
>> +   break;
>>
>> -   /* Probability of exit must be 1/iterations.  */
>> -   count_delta = e->count ();
>> -   e->probability = profile_probability::always ()
>> +  /* Probability of exit must be 1/iterations.  */
>> +  count_delta = e->count ();
>> +  e->probability = profile_probability::always ()
>>   .apply_scale (1, iteration_bound);
>> -   other_e->probability = e->probability.invert ();
>> -   count_delta -= e->count ();
>> -
>> -   /* If latch exists, change its count, since we changed
>> -  probability of exit.  Theoretically we should update everything 
>> from
>> -  source of exit edge to latch, but for vectorizer this is enough.  
>> */
>> -   if (loop->latch
>> -   && loop->latch != e->src)
>> - {
>> -   loop->latch->count += count_delta;
>> - }
>> - }
>> +  other_e->probability = e->probability.invert ();
>>
>>/* Roughly speaking we want to reduce the loop body profile by the
>>difference of loop iterations.  We however can do better if
>>we look at the actual profile, if it is available.  */
>> -  p = p.apply_scale (iteration_bound, iterations);
>> -
>> -  if (loop->header->count.initialized_p ())
>> - {
>> -   profile_count count_in = profile_count::zero ();
>> +  p = profile_probability::always ();
>>
>> -   FOR_EACH_EDGE (e, ei, loop->header->preds)
>> - if (e->src != loop->latch)
>> 

Patch ping (Re: [PATCH PR82965/PR83991]Fix invalid profile count in vectorization peeling)

2018-02-19 Thread Jakub Jelinek
Hi!

Honza, do you think you could have a look at this P1 fix?

Thanks.

On Wed, Jan 31, 2018 at 10:03:51AM +, Bin Cheng wrote:
> Hi,
> This patch fixes invalid profile count information in vectorization peeling.
> Current implementation is a bit confusing for me since it tries to compute
> an overall probability based on scaling probability and change of estimated
> niters.  This patch does it in two steps.  Firstly it does the scaling, then
> it adjusts to new estimated niters by simply adjusting loop's latch count
> information; scaling the loop's count information by the proportion
> new_estimated_niters/old_estimate_niters.  Of course we have to adjust loop
> latch's count information back after scaling.
> 
> Bootstrap and test on x86_64 and AArch64.  gcc.dg/vect/pr79347.c is fixed
> for both PR82965 and PR83991.  Is this OK?
> 
> Thanks,
> bin
> 
> 2018-01-30  Bin Cheng  
> 
>   PR tree-optimization/82965
>   PR tree-optimization/83991
>   * cfgloopmanip.c (scale_loop_profile): Further scale loop's profile
>   information if the loop was predicted to iterate too many times.

> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
> index b9b76d8..1f560b8 100644
> --- a/gcc/cfgloopmanip.c
> +++ b/gcc/cfgloopmanip.c
> @@ -509,7 +509,7 @@ scale_loop_profile (struct loop *loop, 
> profile_probability p,
>   gcov_type iteration_bound)
>  {
>gcov_type iterations = expected_loop_iterations_unbounded (loop);
> -  edge e;
> +  edge e, preheader_e;
>edge_iterator ei;
>  
>if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -521,77 +521,66 @@ scale_loop_profile (struct loop *loop, 
> profile_probability p,
>  (int)iteration_bound, (int)iterations);
>  }
>  
> +  /* Scale the probabilities.  */
> +  scale_loop_frequencies (loop, p);
> +
>/* See if loop is predicted to iterate too many times.  */
> -  if (iteration_bound && iterations > 0
> -  && p.apply (iterations) > iteration_bound)
> +  if (iteration_bound == 0 || iterations <= 0
> +  || p.apply (iterations) <= iteration_bound)
> +return;
> +
> +  e = single_exit (loop);
> +  preheader_e = loop_preheader_edge (loop);
> +  profile_count count_in = preheader_e->count ();
> +  if (e && preheader_e
> +  && count_in > profile_count::zero ()
> +  && loop->header->count.initialized_p ())
>  {
> -  /* Fixing loop profile for different trip count is not trivial; the 
> exit
> -  probabilities has to be updated to match and frequencies propagated 
> down
> -  to the loop body.
> -
> -  We fully update only the simple case of loop with single exit that is
> -  either from the latch or BB just before latch and leads from BB with
> -  simple conditional jump.   This is OK for use in vectorizer.  */
> -  e = single_exit (loop);
> -  if (e)
> - {
> -   edge other_e;
> -   profile_count count_delta;
> +  edge other_e;
> +  profile_count count_delta;
>  
> -  FOR_EACH_EDGE (other_e, ei, e->src->succs)
> - if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
> - && e != other_e)
> -   break;
> +  FOR_EACH_EDGE (other_e, ei, e->src->succs)
> + if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
> + && e != other_e)
> +   break;
>  
> -   /* Probability of exit must be 1/iterations.  */
> -   count_delta = e->count ();
> -   e->probability = profile_probability::always ()
> +  /* Probability of exit must be 1/iterations.  */
> +  count_delta = e->count ();
> +  e->probability = profile_probability::always ()
>   .apply_scale (1, iteration_bound);
> -   other_e->probability = e->probability.invert ();
> -   count_delta -= e->count ();
> -
> -   /* If latch exists, change its count, since we changed
> -  probability of exit.  Theoretically we should update everything 
> from
> -  source of exit edge to latch, but for vectorizer this is enough.  
> */
> -   if (loop->latch
> -   && loop->latch != e->src)
> - {
> -   loop->latch->count += count_delta;
> - }
> - }
> +  other_e->probability = e->probability.invert ();
>  
>/* Roughly speaking we want to reduce the loop body profile by the
>difference of loop iterations.  We however can do better if
>we look at the actual profile, if it is available.  */
> -  p = p.apply_scale (iteration_bound, iterations);
> -
> -  if (loop->header->count.initialized_p ())
> - {
> -   profile_count count_in = profile_count::zero ();
> +  p = profile_probability::always ();
>  
> -   FOR_EACH_EDGE (e, ei, loop->header->preds)
> - if (e->src != loop->latch)
> -   count_in += e->count ();
> -
> -   if (count_in > profile_count::zero () )
> - {
> -   p = count_in.probability_in (loop->header->count.apply_scale
> -   

[PATCH PR82965/PR83991]Fix invalid profile count in vectorization peeling

2018-01-31 Thread Bin Cheng
Hi,
This patch fixes invalid profile count information in vectorization peeling.
Current implementation is a bit confusing for me since it tries to compute
an overall probability based on scaling probability and change of estimated
niters.  This patch does it in two steps.  Firstly it does the scaling, then
it adjusts to new estimated niters by simply adjusting loop's latch count
information; scaling the loop's count information by the proportion
new_estimated_niters/old_estimate_niters.  Of course we have to adjust loop
latch's count information back after scaling.

Bootstrap and test on x86_64 and AArch64.  gcc.dg/vect/pr79347.c is fixed
for both PR82965 and PR83991.  Is this OK?

Thanks,
bin

2018-01-30  Bin Cheng  

PR tree-optimization/82965
PR tree-optimization/83991
* cfgloopmanip.c (scale_loop_profile): Further scale loop's profile
information if the loop was predicted to iterate too many times.diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index b9b76d8..1f560b8 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -509,7 +509,7 @@ scale_loop_profile (struct loop *loop, profile_probability 
p,
gcov_type iteration_bound)
 {
   gcov_type iterations = expected_loop_iterations_unbounded (loop);
-  edge e;
+  edge e, preheader_e;
   edge_iterator ei;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -521,77 +521,66 @@ scale_loop_profile (struct loop *loop, 
profile_probability p,
   (int)iteration_bound, (int)iterations);
 }
 
+  /* Scale the probabilities.  */
+  scale_loop_frequencies (loop, p);
+
   /* See if loop is predicted to iterate too many times.  */
-  if (iteration_bound && iterations > 0
-  && p.apply (iterations) > iteration_bound)
+  if (iteration_bound == 0 || iterations <= 0
+  || p.apply (iterations) <= iteration_bound)
+return;
+
+  e = single_exit (loop);
+  preheader_e = loop_preheader_edge (loop);
+  profile_count count_in = preheader_e->count ();
+  if (e && preheader_e
+  && count_in > profile_count::zero ()
+  && loop->header->count.initialized_p ())
 {
-  /* Fixing loop profile for different trip count is not trivial; the exit
-probabilities has to be updated to match and frequencies propagated 
down
-to the loop body.
-
-We fully update only the simple case of loop with single exit that is
-either from the latch or BB just before latch and leads from BB with
-simple conditional jump.   This is OK for use in vectorizer.  */
-  e = single_exit (loop);
-  if (e)
-   {
- edge other_e;
- profile_count count_delta;
+  edge other_e;
+  profile_count count_delta;
 
-  FOR_EACH_EDGE (other_e, ei, e->src->succs)
-   if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
-   && e != other_e)
- break;
+  FOR_EACH_EDGE (other_e, ei, e->src->succs)
+   if (!(other_e->flags & (EDGE_ABNORMAL | EDGE_FAKE))
+   && e != other_e)
+ break;
 
- /* Probability of exit must be 1/iterations.  */
- count_delta = e->count ();
- e->probability = profile_probability::always ()
+  /* Probability of exit must be 1/iterations.  */
+  count_delta = e->count ();
+  e->probability = profile_probability::always ()
.apply_scale (1, iteration_bound);
- other_e->probability = e->probability.invert ();
- count_delta -= e->count ();
-
- /* If latch exists, change its count, since we changed
-probability of exit.  Theoretically we should update everything 
from
-source of exit edge to latch, but for vectorizer this is enough.  
*/
- if (loop->latch
- && loop->latch != e->src)
-   {
- loop->latch->count += count_delta;
-   }
-   }
+  other_e->probability = e->probability.invert ();
 
   /* Roughly speaking we want to reduce the loop body profile by the
 difference of loop iterations.  We however can do better if
 we look at the actual profile, if it is available.  */
-  p = p.apply_scale (iteration_bound, iterations);
-
-  if (loop->header->count.initialized_p ())
-   {
- profile_count count_in = profile_count::zero ();
+  p = profile_probability::always ();
 
- FOR_EACH_EDGE (e, ei, loop->header->preds)
-   if (e->src != loop->latch)
- count_in += e->count ();
-
- if (count_in > profile_count::zero () )
-   {
- p = count_in.probability_in (loop->header->count.apply_scale
-(iteration_bound, 1));
-   }
-   }
+  count_in = count_in.apply_scale (iteration_bound, 1);
+  p = count_in.probability_in (loop->header->count);
   if (!(p > profile_probability::never ()))
p = profile_probability::very_unlikely ();
-}