On Tue, Aug 29, 2017 at 1:35 PM, Jackson Woodruff <jackson.woodr...@foss.arm.com> wrote: > Hi all, > > Apologies again to those CC'ed, who (again) received this twice. > > Joseph: Yes you are correct. I misread the original thread, now fixed. > > Richard: I've moved the optimizations out of fold-const.c. One has been > replicated in match.pd, and the other (x / C +- y / C -> (x +- y) / C) I've > deleted as it only introduced a new optimization when running with the flags > '-O0 -funsafe-math-optimizations'.
Hmm, how did you verify that, that it only adds sth with -O0 -funsafe-math-optimizations? Is that because in GIMPLE the reassoc pass should do this transform? You added +/* Simplify x / (- y) to -x / y. */ +(simplify + (rdiv @0 (negate @1)) + (rdiv (negate @0) @1)) for /* (-A) / (-B) -> A / B */ if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (arg1)) return fold_build2_loc (loc, RDIV_EXPR, type, TREE_OPERAND (arg0, 0), negate_expr (arg1)); if (TREE_CODE (arg1) == NEGATE_EXPR && negate_expr_p (arg0)) return fold_build2_loc (loc, RDIV_EXPR, type, negate_expr (arg0), TREE_OPERAND (arg1, 0)); presumably? This isn't equivalent. It's more like (simplify (rdiv (negate_expr_p @0) (negate @1)) (rdiv (negate @0) @1)) (simplify (rdiv (negate @0) (negate_expr_p @1)) (rdiv @0 (negate @1))) and you should remove the corresponding fold-const.c code. Please do these changes independently to aid bisecting in case of issues. (if (flag_reciprocal_math) - /* Convert (A/B)/C to A/(B*C) */ + /* Convert (A/B)/C to A/(B*C) where neither B nor C are constant. */ (simplify (rdiv (rdiv:s @0 @1) @2) - (rdiv @0 (mult @1 @2))) + (if (TREE_CODE (@1) != REAL_CST && TREE_CODE (@1) != REAL_CST) + (rdiv @0 (mult @1 @2)))) why? I guess to avoid ping-poning with + /* Simplify x / (C * y) to (x / C) / y where C is a constant. */ + (simplify + (rdiv @0 + (mult @1 REAL_CST@2)) + (rdiv (rdiv @0 @2) @1)) ? If so why not just disallow for @1 == REAL_CST? > On O1 and up, the pattern that replaces 'x / C' with 'x * (1 / C)' > is enabled and then the pattern is covered by that and > (x * C +- y * C -> C * (x +- y)) (which is already present in match.pd) > > I have also updated the testcase for those optimizations to use 'O1' to > avoid that case. > > > On 08/24/2017 10:06 PM, Jeff Law wrote: >> >> On 08/17/2017 03:55 AM, Wilco Dijkstra wrote: >>> >>> Richard Biener wrote: >>>> >>>> On Tue, Aug 15, 2017 at 4:11 PM, Wilco Dijkstra <wilco.dijks...@arm.com> >>>> wrote: >>>>> >>>>> Richard Biener wrote: >>>>>>> >>>>>>> We also change the association of >>>>>>> >>>>>>> x / (y * C) -> (x / C) / y >>>>>>> >>>>>>> If C is a constant. >>>>>> >>>>>> >>>>>> Why's that profitable? >>>>> >>>>> >>>>> It enables (x * C1) / (y * C2) -> (x * C1/C2) / y for example. >>>>> Also 1/y is now available to the reciprocal optimization, see >>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026 for details. >>>> >>>> >>>> Sure, but on its own it's going to be slower. So this isn't the >>>> correct way to enable those followup transforms. >>> >>> >>> How can it be any slower? It's one division and one multiply in both >>> cases. >> >> x / (y * C) -> (x / C) / y >> >> Goes from one division and one multiplication to two divisions. I'm >> guessing that's what Richi is (reasonably) concerned about. >> >> So it may be the case that we need more complex pattern matching here >> for when to perform the first transformation to ultimately enable the >> second. >> > > The only case where we don't remove the division but still execute this > pattern is when run with'-O0 -freciprocal-math'. > > As long as we have 'O1' or greater (and -freciprocal-math), that is enough > to enable the removal of the reciprocal. I don't see this. I presume you mean this happens in the recip pass? But we don't optimize this when optimizing for size (but the above pattern still applies) or when targetm.min_divisions_for_recip_mul is too large. So I still think this is the wrong place to do this and instead the recip pass should be extended. > > Jackson. > >> >> Jeff >> >