Re: [RFC] Induction variable candidates not sufficiently general

Bin.Cheng Fri, 20 Jul 2018 18:29:21 -0700

On Tue, Jul 17, 2018 at 2:08 AM, Kelvin Nilsen <kdnil...@linux.ibm.com> wrote:
> Thanks for looking at this for me.  In simplifying the test case for a bug 
> report, I've narrowed the "problem" to integer overflow considerations.  My 
> len variable is declared int, and the target has 64-bit pointers.  I'm 
> gathering that the "manual transformation" I quoted below is not considered 
> "equivalent" to the original source code due to different integer overflow 
> behaviors.  If I redeclare len to be unsigned long long, then I automatically 
> get the optimizations that I was originally expecting.
>
> I suppose this is really NOT a bug?
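As your test case demonstrates, it is caused by wrap-around of the unsigned
32-bit index.  Unsigned arithmetic is defined to wrap modulo 2^32, so the
compiler cannot assume that &pb[len] advances by one element per iteration.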
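To make the wrapping concrete, here is a small sketch.  It is not taken from
the bug report; the names offset_u32, offset_u64 and check_offsets are made up
for illustration, and it assumes byte-sized elements.  It computes the offset
that pb[len] would use after ++len, once with a 32-bit unsigned index and once
with a 64-bit one:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset used by pb[len] after ++len when the
   index is a 32-bit unsigned value on a target with 64-bit pointers. */
uint64_t offset_u32(uint32_t len)
{
    ++len;                 /* wraps modulo 2^32; well defined for unsigned */
    return (uint64_t)len;  /* zero-extended to pointer width */
}

/* Same computation with a 64-bit index: no wrap at the 32-bit boundary. */
uint64_t offset_u64(uint64_t len)
{
    ++len;
    return len;
}

/* At the 32-bit boundary the two versions address different elements,
   so the narrow index is not simply "previous offset plus one". */
void check_offsets(void)
{
    assert(offset_u32(UINT32_MAX) == 0);
    assert(offset_u64(UINT32_MAX) == 4294967296ULL);
}
```

Because the two offsets differ at the boundary, replacing the index by a
pointer that is bumped by one each iteration is only valid if the index is
known not to wrap.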
>
> Is there a compiler optimization flag that allows the optimizer to ignore 
> array index integer overflow in considering legal optimizations?
I am not aware of one for unsigned integers, and I doubt one will be
introduced in the future either.
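Without such a flag, the source-level change you describe (widening the index)
is the way to go.  A minimal sketch of the original loop with a pointer-width
index follows.  The function name match_len and its signature are hypothetical,
and size_t is my assumption for a pointer-width type; you used unsigned long
long:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper around the loop from the original mail.
   Scans from index len + 1 and returns the first index at which pb and
   cur differ, or len_limit if they agree up to there.
   Precondition: len < len_limit. */
size_t match_len(const uint8_t *pb, const uint8_t *cur,
                 size_t len, size_t len_limit)
{
    assert(len < len_limit);
    while (++len != len_limit) /* this is loop */
        if (pb[len] != cur[len])
            break;
    return len;
}
```

With the index as wide as a pointer, &pb[len] and &cur[len] no longer involve a
32-bit wrap followed by an extension, which matches your observation that the
expected optimizations then happen automatically.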


Thanks,
bin
>
>
>
> On 7/13/18 9:14 PM, Bin.Cheng wrote:
>> On Fri, Jul 13, 2018 at 6:04 AM, Kelvin Nilsen <kdnil...@linux.ibm.com> 
>> wrote:
>>> A somewhat old "issue report" pointed me to the code generated for a 4-fold 
>>> manually unrolled version of the following loop:
>>>
>>>>                       while (++len != len_limit) /* this is loop */
>>>>                               if (pb[len] != cur[len])
>>>>                                       break;
>>>
>>> As unrolled, the loop appears as:
>>>
>>>>                 while (++len != len_limit) /* this is loop */ {
>>>>                   if (pb[len] != cur[len])
>>>>                     break;
>>>>                   if (++len == len_limit)  /* unrolled 2nd iteration */
>>>>                     break;
>>>>                   if (pb[len] != cur[len])
>>>>                     break;
>>>>                   if (++len == len_limit)  /* unrolled 3rd iteration */
>>>>                     break;
>>>>                   if (pb[len] != cur[len])
>>>>                     break;
>>>>                   if (++len == len_limit)  /* unrolled 4th iteration */
>>>>                     break;
>>>>                   if (pb[len] != cur[len])
>>>>                     break;
>>>>                 }
>>>
>>> In examining the behavior of tree-ssa-loop-ivopts.c, I've discovered the 
>>> only induction variable candidates that are being considered are all forms 
>>> of the len variable.  We are not considering any induction variables to 
>>> represent the address expressions &pb[len] and &cur[len].
>>>
>>> I rewrote the source code for this loop to make the addressing expressions 
>>> more explicit, as in the following:
>>>
>>>>       cur++;
>>>>       while (++pb != last_pb) /* this is loop */ {
>>>>         if (*pb != *cur)
>>>>           break;
>>>>         ++cur;
>>>>         if (++pb == last_pb)  /* unrolled 2nd iteration */
>>>>           break;
>>>>         if (*pb != *cur)
>>>>           break;
>>>>         ++cur;
>>>>         if (++pb == last_pb)  /* unrolled 3rd iteration */
>>>>           break;
>>>>         if (*pb != *cur)
>>>>           break;
>>>>         ++cur;
>>>>         if (++pb == last_pb)  /* unrolled 4th iteration */
>>>>           break;
>>>>         if (*pb != *cur)
>>>>           break;
>>>>         ++cur;
>>>>       }
>>>
>>> Now, gcc does a better job of identifying the "address expression induction 
>>> variables".  This version of the loop runs about 10% faster than the 
>>> original on my target architecture.
>>>
>>> This would seem to be a textbook pattern for the induction variable 
>>> analysis.  Does anyone have any thoughts on the best way to add these 
>>> candidates to the set of induction variables that are considered by 
>>> tree-ssa-loop-ivopts.c?
>>>
>>> Thanks in advance for any suggestions.
>>>
>> Hi,
>> Could you please file a bug with your original slow test code
>> attached?  I tried to construct a meaningful test case from your code
>> snippet but was not successful.  There is a difference in the generated
>> assembly, but it's not that fundamental.  So a bug with a preprocessed
>> test would be highly appreciated.
>> I think there are two potential issues in the cost computation for such
>> a case: invariant expressions, and iv uses outside of the loop being
>> handled as inside uses.
>>
>> Thanks,
>> bin
>>
>>
