[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 Martin Liška changed: What|Removed |Added Status|NEW |RESOLVED CC||marxin at gcc dot gnu.org Resolution|--- |FIXED --- Comment #30 from Martin Liška --- Implemented on trunk.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #29 from Martin Liška --- Author: marxin Date: Mon Sep 16 14:22:16 2019 New Revision: 275749 URL: https://gcc.gnu.org/viewcvs?rev=275749=gcc=rev Log: Fix PR88784, middle end is missing some optimizations about unsigned 2019-09-16 Li Jia He Qi Feng PR middle-end/88784 * match.pd (x > y && x != XXX_MIN): Optimize into 'x > y'. (x > y && x == XXX_MIN): Optimize into 'false'. (x <= y && x == XXX_MIN): Optimize into 'x == XXX_MIN'. (x < y && x != XXX_MAX): Optimize into 'x < y'. (x < y && x == XXX_MAX): Optimize into 'false'. (x >= y && x == XXX_MAX): Optimize into 'x == XXX_MAX'. (x > y || x != XXX_MIN): Optimize into 'x != XXX_MIN'. (x <= y || x != XXX_MIN): Optimize into 'true'. (x <= y || x == XXX_MIN): Optimize into 'x <= y'. (x < y || x != XXX_MAX): Optimize into 'x != XXX_MAX'. (x >= y || x != XXX_MAX): Optimize into 'true'. (x >= y || x == XXX_MAX): Optimize into 'x >= y'. 2019-09-16 Li Jia He Qi Feng PR middle-end/88784 * gcc.dg/pr88784-1.c: New testcase. * gcc.dg/pr88784-2.c: New testcase. * gcc.dg/pr88784-3.c: New testcase. * gcc.dg/pr88784-4.c: New testcase. * gcc.dg/pr88784-5.c: New testcase. * gcc.dg/pr88784-6.c: New testcase. * gcc.dg/pr88784-7.c: New testcase. * gcc.dg/pr88784-8.c: New testcase. * gcc.dg/pr88784-9.c: New testcase. * gcc.dg/pr88784-10.c: New testcase. * gcc.dg/pr88784-11.c: New testcase. * gcc.dg/pr88784-12.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/pr88784-1.c trunk/gcc/testsuite/gcc.dg/pr88784-10.c trunk/gcc/testsuite/gcc.dg/pr88784-11.c trunk/gcc/testsuite/gcc.dg/pr88784-12.c trunk/gcc/testsuite/gcc.dg/pr88784-2.c trunk/gcc/testsuite/gcc.dg/pr88784-3.c trunk/gcc/testsuite/gcc.dg/pr88784-4.c trunk/gcc/testsuite/gcc.dg/pr88784-5.c trunk/gcc/testsuite/gcc.dg/pr88784-6.c trunk/gcc/testsuite/gcc.dg/pr88784-7.c trunk/gcc/testsuite/gcc.dg/pr88784-8.c trunk/gcc/testsuite/gcc.dg/pr88784-9.c Modified: trunk/gcc/ChangeLog trunk/gcc/match.pd trunk/gcc/testsuite/ChangeLog
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #28 from rguenther at suse dot de --- On Tue, 18 Jun 2019, helijia at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #27 from Li Jia He --- > Created attachment 46495 > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46495=edit > [v2] try to fix this issue in ifcombine(and_comparisons_1 and > or_comparisons_1) > > This patch is similar to the previous patch, try to fix this issue in > ifcombine(and_comparisons_1 and or_comparisons_1). Would you like to help me > see what kind of question this patch might have again ? It looks reasonable from a quick look. Please post patches to gcc-patc...@gcc.gnu.org though.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #27 from Li Jia He --- Created attachment 46495 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46495=edit [v2] try to fix this issue in ifcombine(and_comparisons_1 and or_comparisons_1) This patch is similar to the previous patch, try to fix this issue in ifcombine(and_comparisons_1 and or_comparisons_1). Would you like to help me see what kind of question this patch might have again ?
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #26 from rguenther at suse dot de --- On Tue, 11 Jun 2019, helijia at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #25 from Li Jia He --- > Indeed, this patch cannot catch all variants that appear. > > I found that the optimize_vec_cond_expr function in the tree-ssa-reassoc.c > file > will > call maybe_fold_and_comparisons and maybe_fold_or_comparisons, so just this > patch > can also handle the non-branchy cases without adding those pattern to > match.pd. > > Indeed if we add the corresponding pattern to match.pd file and it would be > better to let ifcombine identify these patterns. I will try to re-writing > ifcombine to identify these patterns. Note this is a complex task without either building the non-brancy version in GENERIC or GIMPLE which is what the code wanted to avoid in the first place. I think improving both match.pd and maybe_fold_and_comparisons is OK at this point.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #25 from Li Jia He --- Indeed, this patch cannot catch all variants that appear. I found that the optimize_vec_cond_expr function in the tree-ssa-reassoc.c file will call maybe_fold_and_comparisons and maybe_fold_or_comparisons, so just this patch can also handle the non-branchy cases without adding those pattern to match.pd. Indeed if we add the corresponding pattern to match.pd file and it would be better to let ifcombine identify these patterns. I will try to re-writing ifcombine to identify these patterns.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #24 from rguenther at suse dot de --- On Tue, 11 Jun 2019, helijia at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #23 from Li Jia He --- > Created attachment 46477 > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46477=edit > try to fix this issue in ifcombine(and_comparisons_1 and or_comparisons_1) > > I am trying to solve this issue directly in ifcombine. Would you like to help > me see what kind of question this patch might have ? In principle it looks good. It might not catch all variants that appear, it might arrive with Y < X && X != 0 for example. The ifcombine code also only handles the branchy case so the match.pd patterns are still required to handle the non-branchy cases. I understand you don't want to deal with re-writing ifcombine to not rely on its hand-written patterns.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #23 from Li Jia He --- Created attachment 46477 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46477=edit try to fix this issue in ifcombine(and_comparisons_1 and or_comparisons_1) I am trying to solve this issue directly in ifcombine. Would you like to help me see what kind of question this patch might have ?
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #22 from Qi Feng --- Two more similar ones: x <= y && x == ( 0 or XXX_MIN ) --> x == ( 0 or XXX_MIN ) x >= y && x == ( UXXX_MAX or XXX_MAX ) --> x == ( UXXX_MAX or XXX_MAX )
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #21 from rguenther at suse dot de --- On Fri, 31 May 2019, ffengqi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #20 from Qi Feng --- > I have tried to merge signed and unsigned together: > > /* x > y && x != ( 0 or XXX_MIN ) --> x > y */ > (for and (truth_and bit_and) > (simplify > (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) > (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))) >(if ((TYPE_UNSIGNED (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@1)) > && integer_zerop (@2)) > || (!TYPE_UNSIGNED (TREE_TYPE(@0)) && !TYPE_UNSIGNED (TREE_TYPE(@1)) > && wi::eq_p (wi::to_wide (@2), > wi::min_value (TYPE_PRECISION (TREE_TYPE (@2)), > SIGNED > @3 > > /* x > y || x != ( 0 or XXX_MIN ) --> x != ( 0 or XXX_MIN ) */ > (for or (truth_or bit_ior) > (simplify > (or:c (gt:c @0 @1) (ne@3 @0 INTEGER_CST@2)) > (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))) >(if ((TYPE_UNSIGNED (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@1)) > && integer_zerop (@2)) > || (!TYPE_UNSIGNED (TREE_TYPE(@0)) && !TYPE_UNSIGNED (TREE_TYPE(@1)) > && wi::eq_p (wi::to_wide (@2), > wi::min_value (TYPE_PRECISION (TREE_TYPE (@2)), > SIGNED > @3 > > /* x < y && x != ( UXXX_MAX or XXX_MAX ) --> x < y */ > (for and (truth_and bit_and) > (simplify > (and:c (lt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) > (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))) >(if ((TYPE_UNSIGNED (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@1)) > && wi::eq_p (wi::to_wide (@2), > wi::max_value (TYPE_PRECISION (TREE_TYPE (@2)), > UNSIGNED))) > || (!TYPE_UNSIGNED (TREE_TYPE(@0)) && !TYPE_UNSIGNED (TREE_TYPE(@1)) > && wi::eq_p (wi::to_wide (@2), > wi::max_value (TYPE_PRECISION (TREE_TYPE (@2)), > SIGNED > @3 > > (That's not all of them, for that would be too long and I think it's not > necessary.) > > I also tried it on a x86 laptop, seems it does work. But I got a bootstrap > issue. I don't know if it's caused by my patch or the version of gcc I used to > compile. You can always try to bootstrap the same revision without your patch. > Another problem is that I can't craft some c/c++ code to test truth_and. Maybe > it's used by other languages? Is it necessary to use truth_and along side > bit_and? The C family will end up with truth_andif. > I have to make this work on a ppc64le machine, could you give me some hints of > where to look into? Upthread I mentioned LOGICAL_OP_NON_SHORT_CIRCUIT. For the testsuite we have --param logical-op-non-short-circuit=1 which you could put into dg-options. Upthread I said that the ifcombine workers should probably be rewritten to support the patterns from match.pd. An initial patch doesn't have to solve everything ;)
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #20 from Qi Feng --- I have tried to merge signed and unsigned together: /* x > y && x != ( 0 or XXX_MIN ) --> x > y */ (for and (truth_and bit_and) (simplify (and:c (gt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))) (if ((TYPE_UNSIGNED (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@1)) && integer_zerop (@2)) || (!TYPE_UNSIGNED (TREE_TYPE(@0)) && !TYPE_UNSIGNED (TREE_TYPE(@1)) && wi::eq_p (wi::to_wide (@2), wi::min_value (TYPE_PRECISION (TREE_TYPE (@2)), SIGNED @3 /* x > y || x != ( 0 or XXX_MIN ) --> x != ( 0 or XXX_MIN ) */ (for or (truth_or bit_ior) (simplify (or:c (gt:c @0 @1) (ne@3 @0 INTEGER_CST@2)) (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))) (if ((TYPE_UNSIGNED (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@1)) && integer_zerop (@2)) || (!TYPE_UNSIGNED (TREE_TYPE(@0)) && !TYPE_UNSIGNED (TREE_TYPE(@1)) && wi::eq_p (wi::to_wide (@2), wi::min_value (TYPE_PRECISION (TREE_TYPE (@2)), SIGNED @3 /* x < y && x != ( UXXX_MAX or XXX_MAX ) --> x < y */ (for and (truth_and bit_and) (simplify (and:c (lt:c@3 @0 @1) (ne @0 INTEGER_CST@2)) (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1))) (if ((TYPE_UNSIGNED (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@1)) && wi::eq_p (wi::to_wide (@2), wi::max_value (TYPE_PRECISION (TREE_TYPE (@2)), UNSIGNED))) || (!TYPE_UNSIGNED (TREE_TYPE(@0)) && !TYPE_UNSIGNED (TREE_TYPE(@1)) && wi::eq_p (wi::to_wide (@2), wi::max_value (TYPE_PRECISION (TREE_TYPE (@2)), SIGNED @3 (That's not all of them, for that would be too long and I think it's not necessary.) I also tried it on a x86 laptop, seems it does work. But I got a bootstrap issue. I don't know if it's caused by my patch or the version of gcc I used to compile. Another problem is that I can't craft some c/c++ code to test truth_and. Maybe it's used by other languages? Is it necessary to use truth_and along side bit_and? I have to make this work on a ppc64le machine, could you give me some hints of where to look into? BTW, the following tests may be useful if you want test it on your machine: #include /* x > y && x != ( 0 or INT_MIN ) --> x > y */ _Bool f0 (unsigned x, unsigned y) { return x > y && x != 0; } _Bool f1 (unsigned x, unsigned y) { return y < x && x != 0; } _Bool f2 (unsigned x, unsigned y) { return x != 0 && x > y; } _Bool f3 (unsigned x, unsigned y) { return x != 0 && y < x; } _Bool f4 (int x, int y) { return x > y && x != INT_MIN; } _Bool f5 (int x, int y) { return y < x && x != INT_MIN; } _Bool f6 (int x, int y) { return x != INT_MIN && x > y; } _Bool f7 (int x, int y) { return x != INT_MIN && y < x; } _Bool f8 (unsigned char x, unsigned char y) { return x > y && x != 0; } /* x > y || x != ( 0 or XXX_MIN ) --> x != ( 0 or XXX_MIN ) */ _Bool f10 (unsigned x, unsigned y) { return x > y || x != 0; } _Bool f11 (int x, int y) { return x > y || x != INT_MIN; } _Bool f12 (unsigned char x, unsigned char y) { return x > y || x != 0; } _Bool f13 (signed char x, signed char y) { return x > y || x != SCHAR_MIN; } /* x < y && x != ( UXXX_MAX or XXX_MAX ) --> x < y */ _Bool f20 (unsigned x, unsigned y) { return x < y && x != UINT_MAX; } _Bool f21 (int x, int y) { return x < y && x != INT_MAX; } _Bool f22 (unsigned char x, unsigned char y) { return x < y && x != UCHAR_MAX; } _Bool f23 (signed char x, signed char y) { return x < y && x != SCHAR_MIN; }
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #19 from rguenther at suse dot de --- On Mon, 27 May 2019, ffengqi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #18 from Qi Feng --- > I only learned gcc for about 2 months, and I have to say that I don't fully > understand what you guys were saying. Is match.pd the right place to fix this > issue? Am I in the wrong direction? Yes, match.pd is the correct place to add this kind of simplifications. No, please don't use *_{and,or}if codes there if possible. Yes, as I said upthread it might be necessary to rewrite some of GCCs infrastructure to get the pattern applied in all situations.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #18 from Qi Feng --- I only learned gcc for about 2 months, and I have to say that I don't fully understand what you guys were saying. Is match.pd the right place to fix this issue? Am I in the wrong direction?
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #17 from rguenther at suse dot de --- On Fri, 24 May 2019, glisse at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #16 from Marc Glisse --- > (In reply to rguent...@suse.de from comment #15) > > It can matter because we might have gimplified the && to > > control flow. > > Indeed. You want to look at the forwprop1 dump to see what gimple-match would > need to match (and does match on x86_64). > > When X > Y && X != 0 is represented with control flow, in theory, VRP could > deduce a range [1,inf] for X from X>Y, but in practice it uses a symbolic > range > [y_3(D) + 1, +INF] from which we fail to deduce X!=0. I think the new ranger > thing that doesn't try to do symbolic ranges would handle that better ;-) Note that in theory the ifcombine pass should pick up the opportunity to simplify the comparison via maybe_fold_and_comparisons. It looks like that code is quite old and could need a rewrite to match-and-simplify. But of course it exists to avoid creating trees for the combination parts which still has a point in match-and-simplify times. An opportunity for an alternate code-generator from match.pd since the patterns are still valid sources for the transform just the input is bigger exploded forms.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #16 from Marc Glisse --- (In reply to rguent...@suse.de from comment #15) > It can matter because we might have gimplified the && to > control flow. Indeed. You want to look at the forwprop1 dump to see what gimple-match would need to match (and does match on x86_64). When X > Y && X != 0 is represented with control flow, in theory, VRP could deduce a range [1,inf] for X from X>Y, but in practice it uses a symbolic range [y_3(D) + 1, +INF] from which we fail to deduce X!=0. I think the new ranger thing that doesn't try to do symbolic ranges would handle that better ;-)
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #15 from rguenther at suse dot de --- On Fri, 24 May 2019, ffengqi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #14 from Qi Feng --- > Checking .original and .optimized file is just a quick method I use to check > whether an optimization happened (if not happen in first and last pass, > probably it didn't happen). I didn't mean or think the transform should > happen > in these passes. > > Using the second pattern, the result of transform already showed in .original > file, so I think it maybe happened earlier (like in GENERIC). > > And I tried the first pattern again using a totally clean debug build > (bootstrap disabled), still didn't see the transformation happen. I don't > know > what went wrong. :( > > (I tested it on a ppc64le machine, don't know if that matters) It can matter because we might have gimplified the && to control flow.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #14 from Qi Feng --- Checking .original and .optimized file is just a quick method I use to check whether an optimization happened (if not happen in first and last pass, probably it didn't happen). I didn't mean or think the transform should happen in these passes. Using the second pattern, the result of transform already showed in .original file, so I think it maybe happened earlier (like in GENERIC). And I tried the first pattern again using a totally clean debug build (bootstrap disabled), still didn't see the transformation happen. I don't know what went wrong. :( (I tested it on a ppc64le machine, don't know if that matters)
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #13 from rguenther at suse dot de --- On Fri, 24 May 2019, glisse at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #12 from Marc Glisse --- > (In reply to Qi Feng from comment #11) > > I tried 2 patterns for the following test > > > > /* 1. x > y && x != 0 --> x > y */ > > /* 2. y < x && x != 0 --> y < x */ > > /* 3. x != 0 && x > y --> x > y */ > > /* 4. x != 0 && y < x --> y < x */ > > > > _Bool f1 (unsigned x, unsigned y) > > { > > return x > y && x != 0; > > } > > > > _Bool f2 (unsigned x, unsigned y) > > { > > return y < x && x != 0; > > } > > > > _Bool f3 (unsigned x, unsigned y) > > { > > return x != 0 && x > y; > > } > > > > _Bool f4 (unsigned x, unsigned y) > > { > > return x != 0 && y < x; > > } > > > > The first one is > > > > (for and (truth_and bit_and) > > (simplify > > (and:c (gt:c@2 @0 @1) (ne @0 integer_zerop)) > > (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0)) > >&& INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED > > (TREE_TYPE(@1))) > >@2))) > > > > The transformations did not happen as I checked the dump files of > > -fdump-tree-{original,optimized}. > > It isn't supposed to be done in original anyway. It does work in optimized > (even forwprop1) for me. Did you forget to pass -O? Did you look at some old > dump file? > > (I think you could use ANY_INTEGRAL_TYPE_P, this seems valid for vectors) I would have expected fold to first change the TRUTH_ANDIF to a TRUTH_AND and then your pattern match. But maybe I misremember that we have such transformation for cases where the 2nd operand doesn't have side-effects. While genmatch inserts checks for and rejects operands with side-effects I still wouldn't use truth_andif here. As Marc says, expecting the transform in .original is probably premature. OTOH whether it is applicable on GIMPLE in the end might depend on LOGICAL_OP_NON_SHORT_CIRCUIT and BRANCH_COST.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #12 from Marc Glisse --- (In reply to Qi Feng from comment #11) > I tried 2 patterns for the following test > > /* 1. x > y && x != 0 --> x > y */ > /* 2. y < x && x != 0 --> y < x */ > /* 3. x != 0 && x > y --> x > y */ > /* 4. x != 0 && y < x --> y < x */ > > _Bool f1 (unsigned x, unsigned y) > { > return x > y && x != 0; > } > > _Bool f2 (unsigned x, unsigned y) > { > return y < x && x != 0; > } > > _Bool f3 (unsigned x, unsigned y) > { > return x != 0 && x > y; > } > > _Bool f4 (unsigned x, unsigned y) > { > return x != 0 && y < x; > } > > The first one is > > (for and (truth_and bit_and) > (simplify > (and:c (gt:c@2 @0 @1) (ne @0 integer_zerop)) > (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0)) >&& INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED > (TREE_TYPE(@1))) >@2))) > > The transformations did not happen as I checked the dump files of > -fdump-tree-{original,optimized}. It isn't supposed to be done in original anyway. It does work in optimized (even forwprop1) for me. Did you forget to pass -O? Did you look at some old dump file? (I think you could use ANY_INTEGRAL_TYPE_P, this seems valid for vectors)
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #11 from Qi Feng --- I tried 2 patterns for the following test /* 1. x > y && x != 0 --> x > y */ /* 2. y < x && x != 0 --> y < x */ /* 3. x != 0 && x > y --> x > y */ /* 4. x != 0 && y < x --> y < x */ _Bool f1 (unsigned x, unsigned y) { return x > y && x != 0; } _Bool f2 (unsigned x, unsigned y) { return y < x && x != 0; } _Bool f3 (unsigned x, unsigned y) { return x != 0 && x > y; } _Bool f4 (unsigned x, unsigned y) { return x != 0 && y < x; } The first one is (for and (truth_and bit_and) (simplify (and:c (gt:c@2 @0 @1) (ne @0 integer_zerop)) (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1))) @2))) The transformations did not happen as I checked the dump files of -fdump-tree-{original,optimized}. And the second one is (simplify (truth_andif:C (gt:c@2 @0 @1) (ne @0 integer_zerop)) (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1))) @2)) The transformations did happen for all the 4 functions. And the dump of -fdump-tree-original showes they already happened, so I guess the pattern is matched before the process get to GIMPLE.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #10 from Richard Biener --- (In reply to Qi Feng from comment #9) > And there's another problem. Take `x > y && x != 0 --> x > y' for > example, I would also like to do > >x < y && y != 0 --> x < y >x != 0 && x > y --> x > y >y != 0 && x < y --> x < y > > If the constant always comes in as the second operand is incorrect, these > would have to be doubled. > > I tried to add :c to truth_andif, but got the `operation is not commutative' > error. I also tried to make truth_andif commutative by modifying > genmatch.c, but again, I don't know it well, afraid that I would break > something. > > The patterns I wrote looks like: > > /* x > y && x != 0 --> x > y >Only for unsigned x and y. */ > (simplify > (truth_andif:c (gt@2 @0 @1) (ne @0 integer_zerop)) > (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0)) > && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED > (TREE_TYPE(@1))) >@2)) > > I have to wrote 4 of this with minor modification for a single > transformation. If there's better way to do it, please do leave a comment. I think first of all you do _not_ want to use truth_andif since that will only prevail iff x or y have side-effects. To match on GIMPLE you want bit_and instead/as well since all truth_ stuff doesn't prevail there. And obviously truth_andif is _not_ commutative. You can use :C if you want to force it though. Both truth_and and bit_and are commutative. So sth like (for and (truth_and bit_and) (for ltgtop (lt le) (simplify (and:c (ltgtop:c@2 @0 @1) (ne @0 integer_zerop)) (if (...) @2))) should cover all of x < y && y != 0 --> x < y x != 0 && x > y --> x > y y != 0 && x < y --> x < y x < y && y != 0 --> x < y note that from (and (lt:c@2 @0 @1) (ne @0 integer_zerop)) we generate (and (lt@2 @0 @1) (ne @0 integer_zerop)) (and (gt@2 @1 @0) (ne @0 integer_zerop)) so :c will ensure the semantically same operation will be present with swapped operands. As opposed to :C which would do lt@2 @1 @0.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #9 from Qi Feng --- And there's another problem. Take `x > y && x != 0 --> x > y' for example, I would also like to do x < y && y != 0 --> x < y x != 0 && x > y --> x > y y != 0 && x < y --> x < y If the constant always comes in as the second operand is incorrect, these would have to be doubled. I tried to add :c to truth_andif, but got the `operation is not commutative' error. I also tried to make truth_andif commutative by modifying genmatch.c, but again, I don't know it well, afraid that I would break something. The patterns I wrote looks like: /* x > y && x != 0 --> x > y Only for unsigned x and y. */ (simplify (truth_andif:c (gt@2 @0 @1) (ne @0 integer_zerop)) (if (INTEGRAL_TYPE_P (TREE_TYPE(@0)) && TYPE_UNSIGNED (TREE_TYPE(@0)) && INTEGRAL_TYPE_P (TREE_TYPE(@1)) && TYPE_UNSIGNED (TREE_TYPE(@1))) @2)) I have to wrote 4 of this with minor modification for a single transformation. If there's better way to do it, please do leave a comment.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #8 from rguenther at suse dot de --- On Wed, 22 May 2019, ffengqi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 > > --- Comment #7 from Qi Feng --- > I add some patterns in match.pd which handles the original 5 transformations. > But I don't the language used in match.pd well, the patterns I wrote are very > similar. Sometimes iteration helps but sometimes it obfuscates things too much. match.pd also runs through the preprocessor (OK, I didn't really suggest to use macros to merge things ;)) > And I haven't found predicates for constant values other than zero (INT_MIN, > LONG_MIN, LLONG_MIN etc.). wi:eq_p (wi::to_wide (@1), wi::max_value (TYPE_PRECISION (TREE_TYPE (@1)), SIGNED)) (ok, a bit long...)
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #7 from Qi Feng --- I add some patterns in match.pd which handles the original 5 transformations. But I don't the language used in match.pd well, the patterns I wrote are very similar. And I haven't found predicates for constant values other than zero (INT_MIN, LONG_MIN, LLONG_MIN etc.).
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #6 from Fredrik Hederstierna --- Created attachment 46397 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46397=edit Some more patterns Looking into this I found some more places where it seems to be non-optimal code, maybe separate issue, but these are also example of equal evaluation for unsigned types ? Test 1 (x > y) || (x > (y / N)) equal to (x > (y / N)) Test 2 (x > y) || (x > (y >> N)) equal to (x > (y >> N)) Test 3 (x > y) && (x > (y / N)) equal to (x > y) Test 4 (x > y) && (x > (y >> N)) equal to (x > y) One thing to consider here maybe that depending on optimizing for size or speed, then the order of evaluation can be changed, so like if some operation is costy, then it could be avoided to obtain higher speed possibly assuming it will accept arguments prior in list I guess. But when optimizing for size, then I think always the more simplified expression would apply? Example for arm using above expressions, (code attached) : 0: b510push{r4, lr} 2: 000bmovsr3, r1 4: 0004movsr4, r0 6: 2001movsr0, #1 8: 428ccmp r4, r1 a: d807bhi.n 1c c: 2103movsr1, #3 e: 0018movsr0, r3 10: f7ff fffe bl 0 <__aeabi_uidiv> 14: b2c0uxtbr0, r0 16: 42a0cmp r0, r4 18: 4180sbcsr0, r0 1a: 4240negsr0, r0 1c: bd10pop {r4, pc} 001e : 1e: b510push{r4, lr} 20: 0004movsr4, r0 22: 0008movsr0, r1 24: 2103movsr1, #3 26: f7ff fffe bl 0 <__aeabi_uidiv> 2a: b2c0uxtbr0, r0 2c: 42a0cmp r0, r4 2e: 4180sbcsr0, r0 30: 4240negsr0, r0 32: bd10pop {r4, pc} 0034 : 34: 0003movsr3, r0 36: 2001movsr0, #1 38: 428bcmp r3, r1 3a: d803bhi.n 44 3c: 08c9lsrsr1, r1, #3 3e: 4299cmp r1, r3 40: 4189sbcsr1, r1 42: 4248negsr0, r1 44: 4770bx lr 0046 : 46: 08c9lsrsr1, r1, #3 48: 4281cmp r1, r0 4a: 4180sbcsr0, r0 4c: 4240negsr0, r0 4e: 4770bx lr 0050 : 50: b510push{r4, lr} 52: 000bmovsr3, r1 54: 0004movsr4, r0 56: 2000movsr0, #0 58: 428ccmp r4, r1 5a: d907bls.n 6c 5c: 2103movsr1, #3 5e: 0018movsr0, r3 60: f7ff fffe bl 0 <__aeabi_uidiv> 64: b2c0uxtbr0, r0 66: 42a0cmp r0, r4 68: 4180sbcsr0, r0 6a: 4240negsr0, r0 6c: bd10pop {r4, pc} 006e : 6e: 4281cmp r1, r0 70: 4180sbcsr0, r0 72: 4240negsr0, r0 74: 4770bx lr 0076 : 76: 0003movsr3, r0 78: 2000movsr0, #0 7a: 428bcmp r3, r1 7c: d903bls.n 86 7e: 08c9lsrsr1, r1, #3 80: 4299cmp r1, r3 82: 4189sbcsr1, r1 84: 4248negsr0, r1 86: 4770bx lr 0088 : 88: 4281cmp r1, r0 8a: 4180sbcsr0, r0 8c: 4240negsr0, r0 8e: 4770bx lr
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #5 from Qi Feng --- (In reply to Qi Feng from comment #4) > The fourth to the last should be: > > x < y || x != INT_MAX --> x != UINT_MAX > > sorry for the typo. x < y || x != INT_MAX --> x != INT_MAX typo again...
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #4 from Qi Feng --- The fourth to the last should be: x < y || x != INT_MAX --> x != UINT_MAX sorry for the typo.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 --- Comment #3 from Qi Feng --- I have extended the transformations as following, the first five are the original ones: * unsigned Use UINT_MAX for demonstration, similar for UCHAR_MAX, USHRT_MAX, UINT_MAX, ULONG_MAX, ULLONG_MAX x > y && x != 0 --> x > y x > y || x != 0 --> x != 0 x <= y || x != 0 --> true x <= y || x == 0 --> x <= y x > y && x == 0 --> false x < y && x != UINT_MAX--> x < y x < y || x != UINT_MAX--> x != UINT_MAX x >= y || x != UINT_MAX--> true x >= y || x == UINT_MAX--> x >= y x < y && x == UINT_MAX--> false * signed Use INT_MIN, INT_MAX for demonstration, similar for SCHAR_MIN, SHRT_MIN, INT_MIN, LONG_MIN, LLONG_MIN, SCHAR_MAX, SHRT_MAX, INT_MAX, LONG_MAX, LLONG_MAX x > y && x != INT_MIN --> x > y x > y || x != INT_MIN --> x != INT_MIN x <= y || x != INT_MIN --> true x <= y || x == INT_MIN --> x <= y x > y && x == INT_MIN --> false x < y && x != INT_MAX --> x < y x < y || x != INT_MAX --> x != UINT_MAX x >= y || x != INT_MAX --> true x >= y || x == INT_MAX --> x >= y x < y && x == INT_MAX --> false Want to know if I missed something.
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 Qi Feng changed: What|Removed |Added CC||ffengqi at gcc dot gnu.org --- Comment #2 from Qi Feng --- I tried some modifications in and_comparisons_1 and or_comparisons_1. It seems that `||' is transformed to `&&' somewhere. And that makes the changes in or_comparisons_1 noneffective. Should I find the place where the transformation happens and make changes before that?
[Bug middle-end/88784] Middle end is missing some optimizations about unsigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88784 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-10 Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Confirmed. Similar for singed and X > Y && X != INT_MIN, etc. ifcombine is one place to fix (maybe_fold_and_comparisons and friends), match.pd in case this gets lowered to BIT_AND/IOR.