Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

Richard Biener via Gcc-patches Wed, 28 Jul 2021 02:59:35 -0700

On Tue, Jun 29, 2021 at 1:10 AM Victor Tong <vit...@microsoft.com> wrote:
>
> Thanks Richard and Marc.
>
> I wrote the following test case to compare the outputs of fn1() and 
> fn1NoOpt() below with my extra pattern being applied. I tested the two 
> functions with all of the integers from INT_MIN to INT_MAX.
>
> long
> fn1 (int x)
> {
>   return 42L - (long)(42 - x);
> }
>
> #pragma GCC push_options
> #pragma GCC optimize ("O0")
> long
> fn1NoOpt (int x)
> {
>   volatile int y = (42 - x);
>   return 42L - (long)y;
> }
> #pragma GCC pop_options
>
> int main ()
> {
>         for (long i=INT_MIN; i<=INT_MAX;i++)
>         {
>                 auto valNoOpt = fn1NoOpt(i);
>                 auto valOpt = fn1(i);
>                 if (valNoOpt != valOpt)
>                         printf("valOpt=%ld, valNoOpt=%ld\n", valOpt, 
> valNoOpt);
>         }
>         return 0;
> }
>
> I saw that the return values of fn1() and fn1NoOpt() differed when the input 
> was between INT_MIN and INT_MIN+42 inclusive. When passing values in this 
> range to fn1NoOpt(), a signed overflow is triggered which causes the value to 
> differ (undefined behavior). This seems to go in line with what Marc 
> described and I think the transformation is correct in the scenario above. I 
> do think that type casts that result in truncation (i.e. from a higher 
> precision to a lower one) or with unsigned types will result in an incorrect 
> transformation so those scenarios need to be avoided.
>
> Given that the extra pattern I'm adding is taking advantage the undefined 
> behavior of signed integer overflow, I'm considering keeping the existing 
> nop_convert pattern in place and adding a new pattern to cover these new 
> cases. I'd also like to avoid touching nop_convert given that it's used in a 
> number of other patterns.
>
> This is the pattern I have currently:
>
>   (simplify
>     (minus (convert1? @0) (convert2? (minus (convert3? @2) @1)))
>     (if (operand_equal_p(@0, @2, 0)


The operand_equal_p should be reflected by using @0 in place of @2.

>         && INTEGRAL_TYPE_P (type)
>         && TYPE_OVERFLOW_UNDEFINED(type)
>         && !TYPE_OVERFLOW_SANITIZED(type)
>         && INTEGRAL_TYPE_P (TREE_TYPE(@1))
>         && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@1))
>         && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@1))
>         && !TYPE_UNSIGNED (TREE_TYPE (@1))
>         && !TYPE_UNSIGNED (type)

please group checks on common argument.  I think a single test
on INTEGRAL_TYPE_P (type) is enough, it could be ANY_INTEGRAL_TYPE_P
to include vector and complex types.

>         && TYPE_PRECISION (TREE_TYPE (@1)) <= TYPE_PRECISION (type)
>         && INTEGRAL_TYPE_P (TREE_TYPE(@0))
>         && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))

why is this testing TREE_TYPE (@0)?  The outer subtract is using 'type',
the inner subtract uses TREE_TYPE (@1) though you could place
a capture on the minus to make what you test more obvious, like

  (minus (convert1? @0) (convert2? (minus@3 (convert3? @2) @1)))

TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3))

there's one set of checks too much I think.

>         && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))
>         && !TYPE_UNSIGNED (TREE_TYPE (@0))

we only ever have TYPE_OVERFLOW_UNDEFINED on singed types so
the !TYPE_UNSIGNED checks are superfluous.

>         && TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
>         && TREE_TYPE(@1) == TREE_TYPE(@2))

This check means that convert3 is never present since a MINUS requires
compatible types.

Sorry for the late reply.

Note the pattern will necessarily be partly redundant with the

  (simplify
   (minus (nop_convert1? (minus (nop_convert2? @0) @1)) @0)
   (if (!ANY_INTEGRAL_TYPE_P (type)
        || TYPE_OVERFLOW_WRAPS (type))
   (negate (view_convert @1))
   (view_convert (negate @1))))

pattern.  Once we'd "inline" nop_convert genmatch would complain
about this.

>     (convert @1)))
>
> Is there a more concise/better way of writing the pattern? I was looking for 
> similar checks in match.pd and I couldn't find anything that I could leverage.
>
> I also kept my pattern to the specific scenario I'm seeing with the 
> regression to lower the risk of something breaking. I've limited @1 and @2 to 
> have the same type.
>
> I'm also in favor of adding/running computer verification to make sure the 
> transformation is legal. I've written some tests to verify that the pattern 
> is being applied in the right scenarios and not being applied in others, but 
> I think there are too many possibilities to manually write them all. Is there 
> anything in GCC that can be used to verify that match.pd transformations are 
> correct? I'm thinking of something like Alive 
> https://github.com/AliveToolkit/alive2.
>
> Thanks,
> Victor
>
>
>
> From: Richard Biener <richard.guent...@gmail.com>
> Sent: Monday, June 21, 2021 12:08 AM
> To: Marc Glisse <marc.gli...@inria.fr>
> Cc: Victor Tong <vit...@microsoft.com>; gcc-patches@gcc.gnu.org 
> <gcc-patches@gcc.gnu.org>
> Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division 
> followed by multiply [PR95176]
>
> On Sat, Jun 19, 2021 at 7:05 PM Marc Glisse <marc.gli...@inria.fr> wrote:
> >
> > On Fri, 18 Jun 2021, Richard Biener wrote:
> >
> > >> Option 2: Add a new pattern to support scenarios that the existing 
> > >> nop_convert pattern bails out on.
> > >>
> > >> Existing pattern:
> > >>
> > >> (simplify
> > >>    (minus (nop_convert1? @0) (nop_convert2? (minus (nop_convert3? @@0) 
> > >> @1)))
> > >>    (view_convert @1))
> >
> > I tried to check with a program when
> >
> > T3 x;
> > T1 y;
> > (T2)x-(T2)((T1)x-y)
> >
> > can be safely replaced with
> >
> > (T2)y
> >
> > From the output, it looks like this is safe when T1 is at least as large
> > as T2. It is wrong when T1 is unsigned and smaller than T2. And when T1 is
> > signed and smaller than T2, it is ok if T3 is the same type as T1 (signed
> > then) or has strictly less precision (any sign), and not in other cases.
> >
> > Note that this is when signed implies undefined overflow and unsigned
> > implies wrapping, and I wouldn't put too much faith in this recently
> > dusted program. And it doesn't say how to write the match.pd pattern with
> > '?', "@@", disabling it if TYPE_OVERFLOW_SANITIZED, etc.
> >
> > Mostly, I wanted to say that if we are going to go handle more than
> > nop_convert for more than just 1 or 2 easy transformations, I think some
> > kind of computer verification would be useful, it would save a lot of time
> > and headaches.
>
> True.  I wonder if auto-generating such tests from match.pd rules would
> be a good project to work on (GSoC?).  When there's define_match
> involved things get a little tricky, but then one item on the TODO list
> is "inlining" those at the use site (exploding the decision tree even more).
>
> Richard.
>
> > (I just check by brute force all possible precisions (from 1 to 6) and
> > signedness for T1, T2 and T3, all possible values for x and y, compute
> > the before and after formulas, accepting if there is UB before, rejecting
> > if there is UB after (and not before), and manually try to see a pattern
> > in the list of types that work)
> >
> > --
> > Marc Glisse

Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

Reply via email to