[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Martin Liška changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #14 from Martin Liška --- Fixed.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 --- Comment #13 from CVS Commits --- The master branch has been updated by Martin Liska : https://gcc.gnu.org/g:7e846b0f13b8a111484eb3a330044726b9d7ad79 commit r12-5625-g7e846b0f13b8a111484eb3a330044726b9d7ad79 Author: Martin Liska Date: Mon Nov 29 16:24:12 2021 +0100 Change if-to-switch-conversion test. PR tree-optimization/103278 gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/if-to-switch-5.c: Make the test acceptable by targets with no jump-tables.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 --- Comment #12 from rguenther at suse dot de --- works for me
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 --- Comment #11 from Martin Liška --- (In reply to Richard Biener from comment #9) > So for > > rl78-elf > if-to-switch-5 > > we fail the JT because it's not enabled for the target. Yes. > w/o the CD-DCE > change we build a bit-test from > > ;; Canonical GIMPLE case clusters: 34 39 44 46 58 59 60 62 92 > > which we then expand into a new switch without any bit tests(?). No, the message just tells what's the canonical switch form and the transformation doesn't happen. > After the CD-DCE change we have improved(?) clusters: > > ;; Canonical GIMPLE case clusters: 34 39 44 46 58-60 62 92 Which is nicer! > > and thus the output is no longer(?) beneficial. Yes, because BT counts the number of edges in the costing model and this is definitelly more precise. Does not make sense doing a BT for 10-30 even though it's theoretically 21 edges. > > I wonder if the merging of clusters done as part of the sorting is > counter-productive since it distorts the profitability? If we check against > the > size of the unmerged clusters like with the following the testcase > will pass again. I also notice we don't take advantage of the > merging and create The costing is not improved by the change. > > switch (c_3(D)) [INV], case 34: [INV], case 39: > [INV], case 44: [INV], case 46: [INV], case 58: > [INV], case 59: [INV], case 60: [INV], case 62: [INV], > case 92: [INV]> > > instead of one 'case 58-60:' > > diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc > index 16fabef7ca0..b64a1915e15 100644 > --- a/gcc/gimple-if-to-switch.cc > +++ b/gcc/gimple-if-to-switch.cc > @@ -240,7 +240,7 @@ if_chain::is_beneficial () > >vec output > = jump_table_cluster::find_jump_tables (filtered_clusters); > - bool r = output.length () < filtered_clusters.length (); > + bool r = output.length () < clusters.length (); >if (r) > { >dump_clusters (&output, "JT can be built"); > @@ -251,7 +251,7 @@ if_chain::is_beneficial () > output.release (); > >output = bit_test_cluster::find_bit_tests (filtered_clusters); > - r = output.length () < filtered_clusters.length (); > + r = output.length () < clusters.length (); >if (r) > dump_clusters (&output, "BT can be built"); > > > Martin, can you please look into this? Sure, I tend to change the test in the following way: gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c index ceeae908821..54771e64e59 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c @@ -4,8 +4,8 @@ int crud (unsigned char c) { return (((int) c == 46) || (int) c == 44) -|| (int) c == 58) || (int) c == 59) || (int) c == 60) - || (int) c == 62) || (int) c == 34) || (int) c == 92) +|| (int) c == 58) || (int) c == 60) || (int) c == 62) + || (int) c == 64) || (int) c == 34) || (int) c == 92) || (int) c == 39) != 0); }
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Martin Liška changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #10 from Martin Liška --- Lemme take a look.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Richard Biener changed: What|Removed |Added Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|RESOLVED|NEW Resolution|FIXED |--- --- Comment #9 from Richard Biener --- So for rl78-elf if-to-switch-5 we fail the JT because it's not enabled for the target. w/o the CD-DCE change we build a bit-test from ;; Canonical GIMPLE case clusters: 34 39 44 46 58 59 60 62 92 which we then expand into a new switch without any bit tests(?). After the CD-DCE change we have improved(?) clusters: ;; Canonical GIMPLE case clusters: 34 39 44 46 58-60 62 92 and thus the output is no longer(?) beneficial. I wonder if the merging of clusters done as part of the sorting is counter-productive since it distorts the profitability? If we check against the size of the unmerged clusters like with the following the testcase will pass again. I also notice we don't take advantage of the merging and create switch (c_3(D)) [INV], case 34: [INV], case 39: [INV], case 44: [INV], case 46: [INV], case 58: [INV], case 59: [INV], case 60: [INV], case 62: [INV], case 92: [INV]> instead of one 'case 58-60:' diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc index 16fabef7ca0..b64a1915e15 100644 --- a/gcc/gimple-if-to-switch.cc +++ b/gcc/gimple-if-to-switch.cc @@ -240,7 +240,7 @@ if_chain::is_beneficial () vec output = jump_table_cluster::find_jump_tables (filtered_clusters); - bool r = output.length () < filtered_clusters.length (); + bool r = output.length () < clusters.length (); if (r) { dump_clusters (&output, "JT can be built"); @@ -251,7 +251,7 @@ if_chain::is_beneficial () output.release (); output = bit_test_cluster::find_bit_tests (filtered_clusters); - r = output.length () < filtered_clusters.length (); + r = output.length () < clusters.length (); if (r) dump_clusters (&output, "BT can be built"); Martin, can you please look into this?
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 --- Comment #8 from Jeffrey A. Law --- Adjusting the threshold didn't help the tests on the other targets. I'll have to dig a little deeper into those.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Richard Biener --- So I believe other if-to-switch fails on "weird" targets are caused by the same issue - Jeff, maybe you can peek into the dumps to figure the magic case-values-threshold value to use there? Original reported case fixed.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 --- Comment #6 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:e28afbb90f8eca31d0a658e92e2007eb0db2a964 commit r12-5365-ge28afbb90f8eca31d0a658e92e2007eb0db2a964 Author: Richard Biener Date: Thu Nov 18 09:49:38 2021 +0100 testsuite/103278 - adjust gcc.dg/tree-ssa/if-to-switch-3.c Analysis shows that after the CD-DCE change we produce better code which makes if-to-switch run into case-values-threshold on some architectures, thus the switch is deemed to simple to be worth generating. The following statically provides --param case-values-threshold to make the testcase less target dependent. 2021-11-18 Richard Biener PR testsuite/103278 * gcc.dg/tree-ssa/if-to-switch-3.c: Supply --param case-values-threshold=4.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Richard Biener changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #5 from Richard Biener --- So the main issue is that while CD-DCE tries to undo the factorization by running cleanup CFG, this process has the choice between two forwarder blocks to remove - one pre-existing to disambiguate the two edges from the last test into the PHI and the one we created. But CFG cleanup simply picks the first candidate which, in some cases is not the newly created one. That results in different handling of if-to-switch cluster processing. In particular the ->m_has_forward_bb handling seems important. If we do diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc index 16fabef7ca0..157c5f6f10b 100644 --- a/gcc/gimple-if-to-switch.cc +++ b/gcc/gimple-if-to-switch.cc @@ -219,8 +219,7 @@ if_chain::is_beneficial () { simple_cluster *right = static_cast (clusters[i]); tree type = TREE_TYPE (left->get_low ()); - if (!left->m_has_forward_bb - && !right->m_has_forward_bb + if (left->m_has_forward_bb == right->m_has_forward_bb && left->m_case_bb == right->m_case_bb) { if (wi::eq_p (wi::to_wide (right->get_low ()) - wi::to_wide which seems more natural then even with the original IL we don't get any if-to-switch as we seem to fail JT building because the number of clusters is then just 4 which is lower than case_values_threshold () which is 5. If we supply --param case-values-threshold=4 to the testcase it is optimized (that overrides the target default) with the result switch (aChar_10(D)) [INV], case 9 ... 10: [INV], case 12: [INV], case 13: [INV], case 32: [INV], case 48: [INV]> : : : # iftmp.0_9 = PHI <1(3), 0(2)> : return iftmp.0_9; compared to the following before the CD-DCE change which looks clearly worse (but even the above has unmerged case 12 and 13?!) switch (aChar_10(D)) [INV], case 9 ... 10: [INV], case 12: [INV], case 13: [INV], case 32: [INV], case 48: [INV]> : : goto ; [100.00%] : : goto ; [100.00%] : : : # iftmp.0_9 = PHI <1(4), 0(2), 1(3), 1(5)> : return iftmp.0_9; so this looks like a testcase issue to me.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #4 from Richard Biener --- I will have a look
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 --- Comment #3 from Jeffrey A. Law --- Note we also see these regressions: rl78-elf if-to-switch-5 if-to-switch-9 xstormy16-elf if-to-switch-9 sh3-linux-gnu sh3eb-linux-gnu gcc.target/sh/pr51244-19.c, but I think this is fixable with a trivial sh.md change
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Andrew Pinski changed: What|Removed |Added Keywords||missed-optimization Target Milestone|--- |12.0 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2021-11-16 --- Comment #2 from Andrew Pinski --- Confirmed. >;; Canonical GIMPLE case clusters: 9-10 12-13 32 48 I suspect there is some cost issue going on here. DCE is now able to combine empty bb's into one which causes iftoswitch to have a cost fit.
[Bug tree-optimization/103278] [12 Regression] Recent change to cddce inhibits switch optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103278 Andrew Pinski changed: What|Removed |Added CC||seurer at gcc dot gnu.org --- Comment #1 from Andrew Pinski --- *** Bug 103293 has been marked as a duplicate of this bug. ***