On 11/10/2021 1:35 AM, Richard Biener via Gcc-patches wrote:
On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi <navidrah...@microsoft.com> wrote:
Hi Richard,

Thank you so much for your detailed feedback. I am attaching another version of 
the patch which does include all the changes you mentioned.

Bellow you can see my response to your feedbacks:

the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
constants come last - you can then remove the 'c'
Fixed. I was not aware of the canonical order.

you can use INTEGRAL_TYPE_P (type).
Fixed. Didn't know about "type" either.

this test is not necessary
Fixed.

But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
it looks like a conflict with the x * (1 + b) <-> x + x * b transform
(fold_plusminus_mult)?  That said, the special case of one
definitely makes the result cheaper (one less operation).
For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible 
to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". 
Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me 
when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was 
not able to make any operation more performant. Either direction, looks like same amount of computation.

1) https://compiler-explorer.com/z/dWsq7Tzf4

Please move the pattern next to the most relatest which I think is
Fixed.

the return value of 1 is an unreliable way to fail, please instead call
__builtin_abort ();
Fixed.

do we really need -O3 for this?  I think that -O should be enough here?
We don't specifically need that. But I realized that the optimization can 
happen in two different level at the compiler. It seems if you spread the 
computation over multiple statement like:
   int c = a/b;
   int y = b * (1 + c);
   return y - a;

instead of :
   return b * (1 + a / b) - a;

Then you have to have at least -O1 to have it optimized. Granted, I am not 
doing that in the testcase. In the new patch I am changing it to -O. Let me 
know if you have any suggestions.
-O is fine, generally at -O0 we shouldn't expect such transforms to
happen (but they still do, of course).

The patch looks OK now.
I've pushed this to the trunk for Navid.

jeff

Reply via email to