Hi Richard,

Thanks for reviewing my patch. I did a search online and you're right -- there isn't a vector modulo instruction. I'll stop both the new X * (Y / X) --> Y - (Y % X) pattern and the existing X - (X / Y) * Y --> X % Y pattern from triggering on vector types.
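As a sanity check, the scalar identities behind these two patterns can be sketched in plain C, whose integer division truncates toward zero like trunc_div. This is only an illustration: the helper names below are made up for this example and are not part of the patch or of match.pd.

```c
#include <assert.h>

/* Left-hand side of the new pattern: X * (Y / X), using C's truncating
   integer division.  */
static int mul_of_div(int x, int y)
{
    return x * (y / x);
}

/* Right-hand side of the new pattern: Y - (Y % X).  */
static int sub_of_mod(int x, int y)
{
    return y - (y % x);
}

/* Left-hand side of the existing pattern: X - (X / Y) * Y.  */
static int sub_of_div_mul(int x, int y)
{
    return x - (x / y) * y;
}

/* Assert both identities for one pair of operands.  The caller must pass
   a nonzero x and avoid the INT_MIN / -1 overflow case.  */
static void check_identities(int x, int y)
{
    /* X * (Y / X) == Y - (Y % X) */
    assert(mul_of_div(x, y) == sub_of_mod(x, y));
    /* Y - (Y / X) * X == Y % X (the existing pattern with its operands renamed) */
    assert(sub_of_div_mul(y, x) == y % x);
}
```

Calling check_identities(7, 45), for instance, exercises both identities on scalar int operands. It says nothing about vector types, which is the case being dropped here.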
I looked into why the following pattern isn't triggering:

  (simplify
   (minus @0 (nop_convert1? (minus (nop_convert2? @0) @1)))
   (view_convert @1))

The nop_converts expand into tree_nop_conversion_p checks. In fn2() of testsuite/gcc.dg/fold-minus-6.c, the expression during generic matching looks like:

  42 - (long int) (42 - 42 % x)

On the right-hand side of the expression, that is, (long int) (42 - 42 % x), the tree_nop_conversion_p check fails because of the difference in type precision: the expression inside the cast has 32-bit precision and the outer expression has 64-bit precision.

I looked around at other patterns and it seems like nop_convert and view_convert are used because of underflow/overflow concerns. I'm not familiar with the two constructs. What's the difference between using them and checking TYPE_OVERFLOW_UNDEFINED? In the scenario above, since TYPE_OVERFLOW_UNDEFINED is true, the second pattern that I added (X - (X - Y) --> Y) gets triggered.

Thanks,
Victor

From: Richard Biener <richard.guent...@gmail.com>
Sent: Tuesday, April 27, 2021 1:29 AM
To: Victor Tong <vit...@microsoft.com>
Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

On Thu, Apr 1, 2021 at 1:03 AM Victor Tong via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> Hello,
>
> This patch fixes PR tree-optimization/95176. A new pattern in match.pd was
> added to transform "a * (b / a)" --> "b - (b % a)". A new test case was also
> added to cover this scenario.
>
> The new pattern interfered with the existing pattern of "X - (X / Y) * Y". In
> some cases (such as in fn4() in gcc/testsuite/gcc.dg/fold-minus-6.c), the new
> pattern is applied causing the existing pattern to no longer apply. This
> results in worse code generation because the expression is left as
> "X - (X - Y)".
> An additional subtraction pattern of "X - (X - Y) --> Y" was added to this
> patch to avoid this regression.
>
> I also didn't remove the existing pattern because it triggered in more cases
> than the new pattern because of a tree_invariant_p check that's inserted by
> genmatch for the new pattern.

Yes, we do not handle using Y multiple times when it might contain side-effects in GENERIC folding (comments in genmatch suggest we can use save_expr but we don't implement this [anymore]).

On GIMPLE there's also the issue that your new pattern creates a complex expression, which makes it fail to be used by value-numbering, for example, where the old pattern was OK (eventually, if no conversion was required).

So indeed it looks OK to preserve both.

I wonder why you needed the

+/* X - (X - Y) --> Y */
+(simplify
+ (minus (convert1? @0) (convert2? (minus @@0 @1)))
+ (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type)) && TYPE_OVERFLOW_UNDEFINED(type))
+  (convert @1)))

pattern since it should be handled by

/* Match patterns that allow contracting a plus-minus pair
   irrespective of overflow issues.  */
/* (A +- B) - A   ->  +- B */
/* (A +- B) -+ B  ->  A */
/* A - (A +- B)   ->  -+ B */
/* A +- (B -+ A)  ->  +- B */

in particular

  (simplify
   (minus @0 (nop_convert1? (minus (nop_convert2? @0) @1)))
   (view_convert @1))

If there are supported cases missing, I'd rather extend this pattern than replicate it.

+/* X * (Y / X) is the same as Y - (Y % X). */
+(simplify
+ (mult:c (convert1? @0) (convert2? (trunc_div @1 @@0)))
+ (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
+  (minus (convert @1) (convert (trunc_mod @1 @0)))))

Note that if you're allowing vector types you have to use (view_convert ...) in the transform and you also need to make sure that the target can expand the modulo - I suspect that's an issue with the existing pattern as well. I don't know of any vector ISA that supports modulo (or integer division, that is).
Restricting the patterns to integer types is probably the most sensible solution.

Thanks,
Richard.

> I verified that all "make -k check" tests pass when targeting
> x86_64-pc-linux-gnu.
>
> 2021-03-31 Victor Tong <vit...@microsoft.com>
>
> gcc/ChangeLog:
>
> * match.pd: Two new patterns: One to optimize division followed by
>   multiply and the other to avoid a regression as explained above
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/20030807-10.c: Update existing test to look for a
>   subtraction because a shift is no longer emitted
> * gcc.dg/pr95176.c: New test to cover optimizing division followed by
>   multiply
>
> I don't have write access to the GCC repo but I've completed the FSF
> paperwork as I plan to make more contributions in the future. I'm looking for
> a sponsorship from an existing GCC maintainer before applying for write
> access.
>
> Thanks,
> Victor