Re: [PATCH] Add one more pass_convert_switch late.
On 11/19/19 3:48 PM, Richard Biener wrote: Sure. As said in the if-to-switch pass "review" all these (now three) passes should share an intermediate representation and costing they then generate code from (either switch-converted lookups, switches or optimized if-chains). There's no reason to actually generate a switch from an if-chain if we just want to optimize the if-chain (it might be convenient for the code generator though, not sure). Likewise if the switch ends up switch-converted there's no reason to produce the switch first. But first and foremost sharing costing and transform decision code which means having some unified representation of the IL semantics is what is needed for this. Hi. I'm slowly but surely planning to continue working on that in next stage1. There are concerns that should be addressed: 1) if-chain to switch conversion pass made conversion unconditionally without analysis if a switch can be transformed in a clever way (cswitch, jump-table (JT), bit-test (BT)). That can be solved by having an abstract analysis model. One problematic thing is cswitch conversion that requires various CFG constraints and I'm not sure that early tree pass can prove that a switch can eventually become CSWITCH. 2) we observed that current switch expansion can't support tree reassoc tricks: optimize_range_tests_xor and optimize_range_tests_diff I can add that to switch expansion but I see one problem. The switch expansion builds top-level decision tree that contains JT, BT or a simple equality comparison nodes. And each node handles only a consecutive interval. On the contrary the XOR and DIFF tricks can handle any 2 comparisons. Moreover, these tests generate a if-chain that goes at the beginning of a comparison. There's one counter-example which presents the problem: For foo we get: Optimizing range tests a_20(D) -[33, 33] and -[41, 41] into (a_20(D) & -9) != 33 Optimizing range tests a_20(D) -[80, 80] and -[88, 88] into (a_20(D) & -9) != 80 Optimizing range tests a_20(D) -[99, 99] and -[115, 115] into (a_20(D) & -17) != 99 Optimizing range tests a_20(D) -[130, 130] and -[162, 162] into (a_20(D) & -33) != 130 Optimizing range tests a_20(D) -[200, 200] and -[216, 216] into (a_20(D) & -17) != 200 Optimizing range tests a_20(D) -[3, 3] and -[5, 5] into ((unsigned int) a_20(D) + 4294967293 & 4294967293) != 0 so a chain of 6 comparisons. For bar we can generate a 3 BTs with the following balanced tree on the top: ;; BT:3-41 21.7% (guessed) subtree: 21.7% (guessed)) ;;BT:80-130 27.1% (guessed) subtree: 65.0% (guessed)) ;; BT:162-216 16.3% (guessed) subtree: 16.3% (guessed)) There can be also situations where a simple balanced tree can be faster than a series of the XOR and DIFF tests. So it's not an easy task to integrate XOR and DIFF tricks into switch conversion with a reasonable costing model. 3) I can imagine a huge number if-else chain can benefit from simple balanced tree expansion. It can be even better with PGO. So maybe we can transform these huge if-chains into switch unconditionally? Examples: gcc/c-family/c-common.c:2302 gcc/c-family/c-common.c:2871 Thoughts? Thanks, Martin #define T(v, step) a == v || a == (v + step) #define C(v, step) case v: case (v + step) int foo(int a) { if (T(3, 2) || T(33, 8) || T(80, 8) || T(99, 16) || T(130, 32) || T(200, 16)) return 1; else return 4; } int bar(int a) { switch (a) { C(3, 2): C(33, 8): C(80, 8): C(99, 16): C(130, 32): C(200, 16): return 1; default: return 4; } }
Re: [PATCH] Add one more pass_convert_switch late.
On Tue, Nov 19, 2019 at 2:54 PM Martin Liška wrote: > > On 11/19/19 10:25 AM, Richard Biener wrote: > > On Mon, Nov 18, 2019 at 10:17 AM Martin Liška wrote: > >> > >> On 11/14/19 1:15 PM, Richard Biener wrote: > >>> Hmm. I was thinking of moving the pass instead of adding another one. > >>> What's the reason to run switch-conversion during early optimization > >>> again? > >> > >> I'm adding CC, as he's the author of cswitch pass and he probably knows > >> more about the pass placement. But I bet early conversion to array access > >> rapidly simplifies CFG and other passes can benefit from that. > >> > >>> > >>> But then it's probably not too bad... (and somehow I'd still like to see > >>> switch-conversion, switch lowering and if-to-switch be "integrated" > >>> somehow, analyzing the IL and then outputting optimized if/switch > >>> according to the same cost metric). > >> > >> Well, my intuition is that gswitch statement is a canonical representation > >> of a switch. And so that one wants to have if-to-switch early in the > >> pipeline > >> and switch lowering very late? > > > > Just to throw in a tiny thing here - while a switch seems a natural > > canonical > > representation it is actually a poor one for most GCC passes since they're > > going to give up on switches and it's CFG complexity. GCCs switch > > representation also isn't exactly good (remember all that label lookup > > stuff you have to do). Switches are a quite high-level representation > > and to me only "good" if we are actually going to emit a jump-table. > > I agree with you here. My simple counter example would be a user input > where a switch will be explicitly used :) > > > Given converting a series of ifs to a switch is "hard" (the opposite > > is easy) going for a switch early (before CFG transforms mess up > > things too much) kind-of makes sense, but lowering back non-jump table > > ones soon is equally important. > > What we can do is to share decision login in between iftoswitch pass and > cswitch/switch lowering. That way we can convert chains of if statements > that will be interesting for any of the later passes. > > One comment here is that 'if-elseif -> gswitch -> lowering to ifs' can > still produce a faster code as be expand smart to a decision tree. Sure. As said in the if-to-switch pass "review" all these (now three) passes should share an intermediate representation and costing they then generate code from (either switch-converted lookups, switches or optimized if-chains). There's no reason to actually generate a switch from an if-chain if we just want to optimize the if-chain (it might be convenient for the code generator though, not sure). Likewise if the switch ends up switch-converted there's no reason to produce the switch first. But first and foremost sharing costing and transform decision code which means having some unified representation of the IL semantics is what is needed for this. > The question still remains what do we want for GCC 10? A selective > iftoswitch conversion? Not sure if we really need it for GCC 10 - was there any pressing reason other than PRxyz asks for it? Thanks, Richard. > Thanks, > Martin > > > > > That's of course all unrelated to adding another switch-conversion pass. > > (and yes, switch-converted IL is more friendly than a switch) > > > > Richard. > > > >> Martin > >> >
Re: [PATCH] Add one more pass_convert_switch late.
On 11/19/19 11:25 AM, Jakub Jelinek wrote: The point I'm trying to make is that with if-to-switch, if cswitch doesn't handle it, we have a regression. If user writes it as a switch, it is a missed optimization if we generate worse code than we could, but not a regression. Sure, but the question is how common are these conditions that can reassoc transform using the fancy popcount techniques? On the other handy, if-to-switch can optimize more switch statements to e.g. jump tables. As mentioned in the previous email, I can imagine doing gswitch from a if chain only if cswitch/switch lowering can handle it. Martin
Re: [PATCH] Add one more pass_convert_switch late.
On 11/19/19 10:25 AM, Richard Biener wrote: On Mon, Nov 18, 2019 at 10:17 AM Martin Liška wrote: On 11/14/19 1:15 PM, Richard Biener wrote: Hmm. I was thinking of moving the pass instead of adding another one. What's the reason to run switch-conversion during early optimization again? I'm adding CC, as he's the author of cswitch pass and he probably knows more about the pass placement. But I bet early conversion to array access rapidly simplifies CFG and other passes can benefit from that. But then it's probably not too bad... (and somehow I'd still like to see switch-conversion, switch lowering and if-to-switch be "integrated" somehow, analyzing the IL and then outputting optimized if/switch according to the same cost metric). Well, my intuition is that gswitch statement is a canonical representation of a switch. And so that one wants to have if-to-switch early in the pipeline and switch lowering very late? Just to throw in a tiny thing here - while a switch seems a natural canonical representation it is actually a poor one for most GCC passes since they're going to give up on switches and it's CFG complexity. GCCs switch representation also isn't exactly good (remember all that label lookup stuff you have to do). Switches are a quite high-level representation and to me only "good" if we are actually going to emit a jump-table. I agree with you here. My simple counter example would be a user input where a switch will be explicitly used :) Given converting a series of ifs to a switch is "hard" (the opposite is easy) going for a switch early (before CFG transforms mess up things too much) kind-of makes sense, but lowering back non-jump table ones soon is equally important. What we can do is to share decision login in between iftoswitch pass and cswitch/switch lowering. That way we can convert chains of if statements that will be interesting for any of the later passes. One comment here is that 'if-elseif -> gswitch -> lowering to ifs' can still produce a faster code as be expand smart to a decision tree. The question still remains what do we want for GCC 10? A selective iftoswitch conversion? Thanks, Martin That's of course all unrelated to adding another switch-conversion pass. (and yes, switch-converted IL is more friendly than a switch) Richard. Martin
Re: [PATCH] Add one more pass_convert_switch late.
On Tue, Nov 19, 2019 at 11:20:16AM +0100, Martin Liška wrote: > On 11/19/19 10:12 AM, Jakub Jelinek wrote: > > On Mon, Nov 18, 2019 at 03:34:35PM +0100, Martin Liška wrote: > > > > Now, I believe with the if to gswitch optimization these will only > > > > rarely > > > > kick in, because the IL will have switches that reassoc doesn't handle, > > > > instead of series of ifs. > > > > > > Yes, so my question is whether reassoc can handle gswitch statements > > > similar > > > to series of if statements? Note that use can write explicitly something > > > like: > > > > I don't see how. Either the cswitch pass runs early, and then you need to > > take it into account when deciding whether to expand it as series of ifs or > > table lookup or bittest (say, realize you'd need just a single condition to > > cover everything and go with if series), or it runs late and then on the > > other side if reassoc turned gswitch into if series without taking into > > account other possibilities like table lookup, it could pessimize stuff. > > So, IMHO cswitch should he able to do this too. > > I agree with you that we should teach switch lowering to also consider the > transformation that reassoc does. I understand it as a generalized bit test, > and I would be happy to implement that on gswitch level. Which leads us > back to need for if-to-switch conversion pass that will enable such > optimization. The point I'm trying to make is that with if-to-switch, if cswitch doesn't handle it, we have a regression. If user writes it as a switch, it is a missed optimization if we generate worse code than we could, but not a regression. Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On 11/19/19 10:12 AM, Jakub Jelinek wrote: On Mon, Nov 18, 2019 at 03:34:35PM +0100, Martin Liška wrote: Now, I believe with the if to gswitch optimization these will only rarely kick in, because the IL will have switches that reassoc doesn't handle, instead of series of ifs. Yes, so my question is whether reassoc can handle gswitch statements similar to series of if statements? Note that use can write explicitly something like: I don't see how. Either the cswitch pass runs early, and then you need to take it into account when deciding whether to expand it as series of ifs or table lookup or bittest (say, realize you'd need just a single condition to cover everything and go with if series), or it runs late and then on the other side if reassoc turned gswitch into if series without taking into account other possibilities like table lookup, it could pessimize stuff. So, IMHO cswitch should he able to do this too. I agree with you that we should teach switch lowering to also consider the transformation that reassoc does. I understand it as a generalized bit test, and I would be happy to implement that on gswitch level. Which leads us back to need for if-to-switch conversion pass that will enable such optimization. My counter-example is following: diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c index 944362ad076..58f9737b918 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c @@ -6,10 +6,15 @@ int test (int a, int b, int c) { - if ( a == 10 || a == 12 || a == 26) -return b; - else -return c; + switch (a) +{ +case 10: +case 12: +case 26: + return b; +default: + return c; +} } int main () Note that I see both representation in the source code equivalent. It's probably just a matter of taste ;) Martin switch (a) case 0...30: return 1; case 32: return 1; case 34: return 1; which won't be optimized by reassoc. While it can handle something 0<=A<=30|A==32|A==34. Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On Mon, Nov 18, 2019 at 10:17 AM Martin Liška wrote: > > On 11/14/19 1:15 PM, Richard Biener wrote: > > Hmm. I was thinking of moving the pass instead of adding another one. > > What's the reason to run switch-conversion during early optimization again? > > I'm adding CC, as he's the author of cswitch pass and he probably knows > more about the pass placement. But I bet early conversion to array access > rapidly simplifies CFG and other passes can benefit from that. > > > > > But then it's probably not too bad... (and somehow I'd still like to see > > switch-conversion, switch lowering and if-to-switch be "integrated" > > somehow, analyzing the IL and then outputting optimized if/switch > > according to the same cost metric). > > Well, my intuition is that gswitch statement is a canonical representation > of a switch. And so that one wants to have if-to-switch early in the pipeline > and switch lowering very late? Just to throw in a tiny thing here - while a switch seems a natural canonical representation it is actually a poor one for most GCC passes since they're going to give up on switches and it's CFG complexity. GCCs switch representation also isn't exactly good (remember all that label lookup stuff you have to do). Switches are a quite high-level representation and to me only "good" if we are actually going to emit a jump-table. Given converting a series of ifs to a switch is "hard" (the opposite is easy) going for a switch early (before CFG transforms mess up things too much) kind-of makes sense, but lowering back non-jump table ones soon is equally important. That's of course all unrelated to adding another switch-conversion pass. (and yes, switch-converted IL is more friendly than a switch) Richard. > Martin >
Re: [PATCH] Add one more pass_convert_switch late.
On Mon, Nov 18, 2019 at 03:34:35PM +0100, Martin Liška wrote: > > Now, I believe with the if to gswitch optimization these will only rarely > > kick in, because the IL will have switches that reassoc doesn't handle, > > instead of series of ifs. > > Yes, so my question is whether reassoc can handle gswitch statements similar > to series of if statements? Note that use can write explicitly something like: I don't see how. Either the cswitch pass runs early, and then you need to take it into account when deciding whether to expand it as series of ifs or table lookup or bittest (say, realize you'd need just a single condition to cover everything and go with if series), or it runs late and then on the other side if reassoc turned gswitch into if series without taking into account other possibilities like table lookup, it could pessimize stuff. So, IMHO cswitch should he able to do this too. > > switch (a) >case 0...30: > return 1; >case 32: > return 1; >case 34: > return 1; > > which won't be optimized by reassoc. While it can handle something > 0<=A<=30|A==32|A==34. Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On 11/18/19 3:07 PM, Jakub Jelinek wrote: On Mon, Nov 18, 2019 at 02:39:12PM +0100, Martin Liška wrote: On 11/18/19 11:06 AM, Jakub Jelinek wrote: On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote: On 11/14/19 1:15 PM, Richard Biener wrote: Hmm. I was thinking of moving the pass instead of adding another one. What's the reason to run switch-conversion during early optimization again? I'm adding CC, as he's the author of cswitch pass and he probably knows more about the pass placement. But I bet early conversion to array access rapidly simplifies CFG and other passes can benefit from that. Yes, I think the early placement is to take it into account in inlining decisions, on the other side it causes various issues, I think we have several PRs open where the early cswitch just makes a good decision based on what it knows at that point, but later on we find better VRP ranges for the switch condition and it is too late to undo it and handle it differently. I bet we must have multiple similar problems where one transformation makes a transformation that other pass is not happy about. This is something that happens quite often. Typical example is where we commit to doing a an constant array load and later the range is narrowed and it would be better done either as a bitmask test, or say the constant array load with sometimes much shorter needed arrays. I agree that it can be quite often optimization. I don't like the suggested approach much. There can be optimization in between that can leverage the conversion to a constant array. For what exactly? Dunno :) Well, that said we can move the cswitch pass later in the pipeline.. Another thing is perform the optimizations we do in reassoc (but aren't What kind of optimizations do you mean? I believe the cswitch expansion can handle well the case where certain range can be expressed as ((unsigned) x - const1) <= const2) conditional, but reassoc has also optimize_range_tests_xor, optimize_range_tests_diff, both of which can be applied multiple times. To quote the comments explaining what it does: /* Optimize X == CST1 || X == CST2 if popcount (CST1 ^ CST2) == 1 into (X & ~(CST1 ^ CST2)) == (CST1 & ~(CST1 ^ CST2)). Similarly for ranges. E.g. X != 2 && X != 3 && X != 10 && X != 11 will be transformed by the previous optimization into !((X - 2U) <= 1U || (X - 10U) <= 1U) and this loop can transform that into !(((X & ~8) - 2U) <= 1U). */ and /* Optimize X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0. Similarly for ranges. E.g. X == 43 || X == 76 || X == 44 || X == 78 || X == 77 || X == 46 || X == 75 || X == 45 will be transformed by the previous optimization into (X - 43U) <= 3U || (X - 75U) <= 3U and this loop can transform that into ((X - 43U) & ~(75U - 43U)) <= 3U. */ Now, I believe with the if to gswitch optimization these will only rarely kick in, because the IL will have switches that reassoc doesn't handle, instead of series of ifs. Yes, so my question is whether reassoc can handle gswitch statements similar to series of if statements? Note that use can write explicitly something like: switch (a) case 0...30: return 1; case 32: return 1; case 34: return 1; which won't be optimized by reassoc. While it can handle something 0<=A<=30|A==32|A==34. So, I believe cswitch needs to handle these cases too and consider it in the decision whether to use const array loads or bittest or if tree. Right now, it differentiates in between const array loads and if tree. Usage of bittest is probably situation which switch lowering can be taught. Martin Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On Mon, Nov 18, 2019 at 02:39:12PM +0100, Martin Liška wrote: > On 11/18/19 11:06 AM, Jakub Jelinek wrote: > > On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote: > > > On 11/14/19 1:15 PM, Richard Biener wrote: > > > > Hmm. I was thinking of moving the pass instead of adding another one. > > > > What's the reason to run switch-conversion during early optimization > > > > again? > > > > > > I'm adding CC, as he's the author of cswitch pass and he probably knows > > > more about the pass placement. But I bet early conversion to array access > > > rapidly simplifies CFG and other passes can benefit from that. > > > > Yes, I think the early placement is to take it into account in inlining > > decisions, on the other side it causes various issues, I think we have > > several PRs open where the early cswitch just makes a good decision based on > > what it knows at that point, but later on we find better VRP ranges for the > > switch condition and it is too late to undo it and handle it differently. > > I bet we must have multiple similar problems where one transformation makes > a transformation that other pass is not happy about. This is something that happens quite often. Typical example is where we commit to doing a an constant array load and later the range is narrowed and it would be better done either as a bitmask test, or say the constant array load with sometimes much shorter needed arrays. > I don't like the suggested approach much. There can be optimization in between > that can leverage the conversion to a constant array. For what exactly? > > Another thing is perform the optimizations we do in reassoc (but aren't > > What kind of optimizations do you mean? I believe the cswitch expansion can handle well the case where certain range can be expressed as ((unsigned) x - const1) <= const2) conditional, but reassoc has also optimize_range_tests_xor, optimize_range_tests_diff, both of which can be applied multiple times. To quote the comments explaining what it does: /* Optimize X == CST1 || X == CST2 if popcount (CST1 ^ CST2) == 1 into (X & ~(CST1 ^ CST2)) == (CST1 & ~(CST1 ^ CST2)). Similarly for ranges. E.g. X != 2 && X != 3 && X != 10 && X != 11 will be transformed by the previous optimization into !((X - 2U) <= 1U || (X - 10U) <= 1U) and this loop can transform that into !(((X & ~8) - 2U) <= 1U). */ and /* Optimize X == CST1 || X == CST2 if popcount (CST2 - CST1) == 1 into ((X - CST1) & ~(CST2 - CST1)) == 0. Similarly for ranges. E.g. X == 43 || X == 76 || X == 44 || X == 78 || X == 77 || X == 46 || X == 75 || X == 45 will be transformed by the previous optimization into (X - 43U) <= 3U || (X - 75U) <= 3U and this loop can transform that into ((X - 43U) & ~(75U - 43U)) <= 3U. */ Now, I believe with the if to gswitch optimization these will only rarely kick in, because the IL will have switches that reassoc doesn't handle, instead of series of ifs. So, I believe cswitch needs to handle these cases too and consider it in the decision whether to use const array loads or bittest or if tree. Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On 11/18/19 11:06 AM, Jakub Jelinek wrote: On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote: On 11/14/19 1:15 PM, Richard Biener wrote: Hmm. I was thinking of moving the pass instead of adding another one. What's the reason to run switch-conversion during early optimization again? I'm adding CC, as he's the author of cswitch pass and he probably knows more about the pass placement. But I bet early conversion to array access rapidly simplifies CFG and other passes can benefit from that. Yes, I think the early placement is to take it into account in inlining decisions, on the other side it causes various issues, I think we have several PRs open where the early cswitch just makes a good decision based on what it knows at that point, but later on we find better VRP ranges for the switch condition and it is too late to undo it and handle it differently. I bet we must have multiple similar problems where one transformation makes a transformation that other pass is not happy about. So, perhaps if we could during early cswitch just annotate the gswitch that it would be optimized that way (table lookup, ...), but not actually modify the switch and let inlininer use that info and then perform a late gswitch that would actually optimize it, or let early cswitch perform the transformation, but annotate it so that late cswitch could undo it and perform it differently. I don't like the suggested approach much. There can be optimization in between that can leverage the conversion to a constant array. Another thing is perform the optimizations we do in reassoc (but aren't What kind of optimizations do you mean? Thanks, Martin effective anymore after your if to gswitch changes) on switches too, most likely during the (late?) cswitch pass. Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote: > On 11/14/19 1:15 PM, Richard Biener wrote: > > Hmm. I was thinking of moving the pass instead of adding another one. > > What's the reason to run switch-conversion during early optimization again? > > I'm adding CC, as he's the author of cswitch pass and he probably knows > more about the pass placement. But I bet early conversion to array access > rapidly simplifies CFG and other passes can benefit from that. Yes, I think the early placement is to take it into account in inlining decisions, on the other side it causes various issues, I think we have several PRs open where the early cswitch just makes a good decision based on what it knows at that point, but later on we find better VRP ranges for the switch condition and it is too late to undo it and handle it differently. So, perhaps if we could during early cswitch just annotate the gswitch that it would be optimized that way (table lookup, ...), but not actually modify the switch and let inlininer use that info and then perform a late gswitch that would actually optimize it, or let early cswitch perform the transformation, but annotate it so that late cswitch could undo it and perform it differently. Another thing is perform the optimizations we do in reassoc (but aren't effective anymore after your if to gswitch changes) on switches too, most likely during the (late?) cswitch pass. Jakub
Re: [PATCH] Add one more pass_convert_switch late.
On 11/14/19 1:15 PM, Richard Biener wrote: Hmm. I was thinking of moving the pass instead of adding another one. What's the reason to run switch-conversion during early optimization again? I'm adding CC, as he's the author of cswitch pass and he probably knows more about the pass placement. But I bet early conversion to array access rapidly simplifies CFG and other passes can benefit from that. But then it's probably not too bad... (and somehow I'd still like to see switch-conversion, switch lowering and if-to-switch be "integrated" somehow, analyzing the IL and then outputting optimized if/switch according to the same cost metric). Well, my intuition is that gswitch statement is a canonical representation of a switch. And so that one wants to have if-to-switch early in the pipeline and switch lowering very late? Martin
Re: [PATCH] Add one more pass_convert_switch late.
On Thu, Nov 14, 2019 at 1:06 PM Martin Liška wrote: > > Hi. > > As mentioned in the PR, the patch adds one more late pass_convert_switch > just before switch lowering. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? Hmm. I was thinking of moving the pass instead of adding another one. What's the reason to run switch-conversion during early optimization again? But then it's probably not too bad... (and somehow I'd still like to see switch-conversion, switch lowering and if-to-switch be "integrated" somehow, analyzing the IL and then outputting optimized if/switch according to the same cost metric). Richard. > Thanks, > Martin > > gcc/ChangeLog: > > 2019-11-13 Martin Liska > > PR tree-optimization/92005 > * passes.def: Add pass_convert_switch late. > * tree-switch-conversion.c: Define clone > method. > > gcc/testsuite/ChangeLog: > > 2019-11-13 Martin Liska > > PR tree-optimization/92005 > * g++.dg/tree-ssa/pr92005.C: New test. > * gcc.dg/tree-ssa/cswtch-2.c: Update dump file name. > * gcc.dg/tree-ssa/cswtch-3.c: Likewise. > * gcc.dg/tree-ssa/cswtch-4.c: Likewise. > * gcc.dg/tree-ssa/cswtch-5.c: Likewise. > * gcc.dg/tree-ssa/cswtch.c: Likewise. > * gcc.dg/tree-ssa/pr36881.c: Likewise. > * gcc.dg/tree-ssa/pr84436-1.c: Likewise. > * gcc.dg/tree-ssa/pr84436-2.c: Likewise. > * gcc.dg/tree-ssa/pr84436-3.c: Likewise. > * gcc.dg/tree-ssa/pr84436-4.c: Likewise. > * gcc.dg/tree-ssa/pr84436-5.c: Likewise. > * gcc.dg/tree-ssa/pr88753.c: Likewise. > * gcc.target/i386/pr45830.c: Likewise. > --- > gcc/passes.def| 1 + > gcc/testsuite/g++.dg/tree-ssa/pr92005.C | 50 +++ > gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c | 4 +- > gcc/testsuite/gcc.dg/tree-ssa/cswtch-4.c | 4 +- > gcc/testsuite/gcc.dg/tree-ssa/cswtch-5.c | 4 +- > gcc/testsuite/gcc.dg/tree-ssa/cswtch.c| 2 +- > gcc/testsuite/gcc.dg/tree-ssa/pr36881.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/pr84436-1.c | 4 +- > gcc/testsuite/gcc.dg/tree-ssa/pr84436-2.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/pr84436-3.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/pr84436-4.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/pr84436-5.c | 2 +- > gcc/testsuite/gcc.dg/tree-ssa/pr88753.c | 2 +- > gcc/testsuite/gcc.target/i386/pr45830.c | 2 +- > gcc/tree-switch-conversion.c | 1 + > 16 files changed, 69 insertions(+), 17 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr92005.C > >
[PATCH] Add one more pass_convert_switch late.
Hi. As mentioned in the PR, the patch adds one more late pass_convert_switch just before switch lowering. Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin gcc/ChangeLog: 2019-11-13 Martin Liska PR tree-optimization/92005 * passes.def: Add pass_convert_switch late. * tree-switch-conversion.c: Define clone method. gcc/testsuite/ChangeLog: 2019-11-13 Martin Liska PR tree-optimization/92005 * g++.dg/tree-ssa/pr92005.C: New test. * gcc.dg/tree-ssa/cswtch-2.c: Update dump file name. * gcc.dg/tree-ssa/cswtch-3.c: Likewise. * gcc.dg/tree-ssa/cswtch-4.c: Likewise. * gcc.dg/tree-ssa/cswtch-5.c: Likewise. * gcc.dg/tree-ssa/cswtch.c: Likewise. * gcc.dg/tree-ssa/pr36881.c: Likewise. * gcc.dg/tree-ssa/pr84436-1.c: Likewise. * gcc.dg/tree-ssa/pr84436-2.c: Likewise. * gcc.dg/tree-ssa/pr84436-3.c: Likewise. * gcc.dg/tree-ssa/pr84436-4.c: Likewise. * gcc.dg/tree-ssa/pr84436-5.c: Likewise. * gcc.dg/tree-ssa/pr88753.c: Likewise. * gcc.target/i386/pr45830.c: Likewise. --- gcc/passes.def| 1 + gcc/testsuite/g++.dg/tree-ssa/pr92005.C | 50 +++ gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c | 4 +- gcc/testsuite/gcc.dg/tree-ssa/cswtch-4.c | 4 +- gcc/testsuite/gcc.dg/tree-ssa/cswtch-5.c | 4 +- gcc/testsuite/gcc.dg/tree-ssa/cswtch.c| 2 +- gcc/testsuite/gcc.dg/tree-ssa/pr36881.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/pr84436-1.c | 4 +- gcc/testsuite/gcc.dg/tree-ssa/pr84436-2.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/pr84436-3.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/pr84436-4.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/pr84436-5.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/pr88753.c | 2 +- gcc/testsuite/gcc.target/i386/pr45830.c | 2 +- gcc/tree-switch-conversion.c | 1 + 16 files changed, 69 insertions(+), 17 deletions(-) create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr92005.C diff --git a/gcc/passes.def b/gcc/passes.def index 798a391bd35..48204af1009 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -305,6 +305,7 @@ along with GCC; see the file COPYING3. If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_convert_switch); NEXT_PASS (pass_lower_switch); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc, false /* insert_powi_p */); diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr92005.C b/gcc/testsuite/g++.dg/tree-ssa/pr92005.C new file mode 100644 index 000..ce228b0a20c --- /dev/null +++ b/gcc/testsuite/g++.dg/tree-ssa/pr92005.C @@ -0,0 +1,50 @@ +/* PR tree-optimization/92005 */ +/* { dg-options "-O2 -fdump-tree-optimized -std=c++17" } */ + +template struct overloaded : Ts... { using Ts::operator()...; }; +template overloaded(Ts...) -> overloaded; + +struct T0 {}; +struct T1 {}; +struct T2 {}; +struct T3 {}; +struct T4 {}; + +struct variant +{ +unsigned index_; + +union +{ +T0 t0_; +T1 t1_; +T2 t2_; +T3 t3_; +T4 t4_; +}; +}; + +template int visit( F f, variant const& v ) +{ +switch( v.index_ ) +{ +case 0: return f( v.t0_ ); +case 1: return f( v.t1_ ); +case 2: return f( v.t2_ ); +case 3: return f( v.t3_ ); +case 4: return f( v.t4_ ); +default: __builtin_unreachable(); +} +} + +int do_visit(variant const& v) { + return visit(overloaded{ +[](T0 val) { return 3; }, +[](T1 val) { return 5; }, +[](T2 val) { return 8; }, +[](T3 val) { return 9; }, +[](T4 val) { return 10; } +}, v); +} + +/* { dg-final { scan-tree-dump "CSWTCH" "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c index 87ed7bba517..44e25ebbac1 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c @@ -17,4 +17,4 @@ int h1 (X x) } } -/* { dg-final { scan-tree-dump-times "CSWTCH" 0 "switchconv" } } */ +/* { dg-final { scan-tree-dump-times "CSWTCH" 0 "switchconv1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c index b983c8fbe92..25c2e75c5a2 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c @@ -326,5 +326,5 @@ main () T (f5 (231, 0), 2, { 80, 0 }); } -/* { dg-final { scan-tree-dump-times "Switch converted" 5 "switchconv" } } */ -/* { dg-final { scan-tree-dump-times "= CSWTCH" 8 "switchconv" } } */ +/* { dg-final { scan-tree-dump-times "Switch converted" 5 "switchconv1" } } */ +/* { dg-final { scan-tree-dump-times "= CSWTCH" 8 "switchconv1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-4.c