On Sat, Jul 21, 2018 at 3:28 AM Bin.Cheng <amker.ch...@gmail.com> wrote:
>
> On Tue, Jul 17, 2018 at 2:08 AM, Kelvin Nilsen <kdnil...@linux.ibm.com> wrote:
> > Thanks for looking at this for me.  In simplifying the test case for a bug 
> > report, I've narrowed the "problem" to integer overflow considerations.  My 
> > len variable is declared int, and the target has 64-bit pointers.  I'm 
> > gathering that the "manual transformation" I quoted below is not considered 
> > "equivalent" to the original source code due to different integer overflow 
> > behaviors.  If I redeclare len to be unsigned long long, then I 
> > automatically get the optimizations that I was originally expecting.
> >
> > I suppose this is really NOT a bug?
> As your test case demonstrates, it is caused by wraparound of the
> unsigned 32-bit value.
> >
> > Is there a compiler optimization flag that allows the optimizer to ignore 
> > array index integer overflow in considering legal optimizations?
> I am not aware of one for unsigned integers, and I guess one won't be
> introduced in the future either.

We've had -funsafe-loop-optimizations in the past but that only
concerned niter analysis, not scalar evolution analysis
which is likely required here.

And no, there's no plan to re-introduce those.
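
For readers following the overflow argument, here is a minimal sketch in C
of why a 32-bit unsigned index gets in the way. It is not taken from the
original test case: the helper names are hypothetical, and it assumes a
32-bit index type on a target with 64-bit addresses. With a 32-bit index
the offset applied to pb can wrap back to 0, so &pb[len] is not guaranteed
to advance by one element per iteration; with a 64-bit index it is.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Offset used by pb[len] after ++len when len is a 32-bit unsigned index.
   Unsigned arithmetic wraps modulo 2^32, so the offset can drop back to 0. */
static uint64_t next_offset_u32(uint32_t len)
{
    ++len;                /* well-defined wraparound at UINT32_MAX */
    return (uint64_t)len; /* zero-extended before being added to the pointer */
}

/* Same step with a 64-bit index: the offset keeps growing by 1. */
static uint64_t next_offset_u64(uint64_t len)
{
    ++len;
    return len;
}

/* Self-check: the two index widths agree except at the 32-bit boundary. */
static void check_offsets(void)
{
    assert(next_offset_u32(41u) == next_offset_u64(41u));
    assert(next_offset_u32(UINT32_MAX) == 0u);
    assert(next_offset_u64(UINT32_MAX) == UINT64_C(4294967296));
}
```

Because the compiler has to preserve the wrapped result in the 32-bit case,
it cannot simply replace the index with a pointer that is bumped on every
iteration unless it can prove the wrap never happens.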

Richard.

> Thanks,
> bin
> >
> >
> >
> > On 7/13/18 9:14 PM, Bin.Cheng wrote:
> >> On Fri, Jul 13, 2018 at 6:04 AM, Kelvin Nilsen <kdnil...@linux.ibm.com> 
> >> wrote:
> >>> A somewhat old "issue report" pointed me to the code generated for a 
> >>> 4-fold manually unrolled version of the following loop:
> >>>
> >>>>                       while (++len != len_limit) /* this is loop */
> >>>>                               if (pb[len] != cur[len])
> >>>>                                       break;
> >>>
> >>> As unrolled, the loop appears as:
> >>>
> >>>>                 while (++len != len_limit) /* this is loop */ {
> >>>>                   if (pb[len] != cur[len])
> >>>>                     break;
> >>>>                   if (++len == len_limit)  /* unrolled 2nd iteration */
> >>>>                     break;
> >>>>                   if (pb[len] != cur[len])
> >>>>                     break;
> >>>>                   if (++len == len_limit)  /* unrolled 3rd iteration */
> >>>>                     break;
> >>>>                   if (pb[len] != cur[len])
> >>>>                     break;
> >>>>                   if (++len == len_limit)  /* unrolled 4th iteration */
> >>>>                     break;
> >>>>                   if (pb[len] != cur[len])
> >>>>                     break;
> >>>>                 }
> >>>
> >>> In examining the behavior of tree-ssa-loop-ivopts.c, I've discovered the 
> >>> only induction variable candidates that are being considered are all 
> >>> forms of the len variable.  We are not considering any induction 
> >>> variables to represent the address expressions &pb[len] and &cur[len].
> >>>
> >>> I rewrote the source code for this loop to make the addressing 
> >>> expressions more explicit, as in the following:
> >>>
> >>>>       cur++;
> >>>>       while (++pb != last_pb) /* this is loop */ {
> >>>>       if (*pb != *cur)
> >>>>         break;
> >>>>       ++cur;
> >>>>       if (++pb == last_pb)  /* unrolled 2nd iteration */
> >>>>         break;
> >>>>       if (*pb != *cur)
> >>>>         break;
> >>>>       ++cur;
> >>>>       if (++pb == last_pb)  /* unrolled 3rd iteration */
> >>>>         break;
> >>>>       if (*pb != *cur)
> >>>>         break;
> >>>>       ++cur;
> >>>>       if (++pb == last_pb)  /* unrolled 4th iteration */
> >>>>         break;
> >>>>       if (*pb != *cur)
> >>>>         break;
> >>>>       ++cur;
> >>>>       }
> >>>
> >>> Now, gcc does a better job of identifying the "address expression 
> >>> induction variables".  This version of the loop runs about 10% faster 
> >>> than the original on my target architecture.
> >>>
> >>> This would seem to be a textbook pattern for the induction variable 
> >>> analysis.  Does anyone have any thoughts on the best way to add these 
> >>> candidates to the set of induction variables that are considered by 
> >>> tree-ssa-loop-ivopts.c?
> >>>
> >>> Thanks in advance for any suggestions.
> >>>
> >> Hi,
> >> Could you please file a bug with your original slow test code
> >> attached?  I tried to construct a meaningful test case from your code
> >> snippet but was not successful.  There is a difference in the generated
> >> assembly, but it's not that fundamental.  So a bug with a preprocessed
> >> test would be highly appreciated.
> >> I think there are two potential issues in the cost computation for such
> >> a case: invariant expressions, and iv uses outside of the loop being
> >> handled as inside uses.
> >>
> >> Thanks,
> >> bin
> >>
> >>
