Re: [PATCH] Add one more pass_convert_switch late.

2020-03-27 Thread Martin Liška

On 11/19/19 3:48 PM, Richard Biener wrote:

Sure.  As said in the if-to-switch pass "review" all these (now three)
passes should share an intermediate representation and costing they
then generate code from (either switch-converted lookups, switches
or optimized if-chains).  There's no reason to actually generate a switch
from an if-chain if we just want to optimize the if-chain (it might be
convenient for the code generator though, not sure).  Likewise if the
switch ends up switch-converted there's no reason to produce the
switch first.

But first and foremost sharing costing and transform decision code
which means having some unified representation of the IL semantics
is what is needed for this.


Hi.

I'm slowly but surely planning to continue working on that in next stage1.
There are concerns that should be addressed:

1) if-chain to switch conversion pass made conversion unconditionally without
analysis if a switch can be transformed in a clever way (cswitch, jump-table 
(JT), bit-test (BT)).

That can be solved by having an abstract analysis model. One problematic thing 
is cswitch
conversion that requires various CFG constraints and I'm not sure that early 
tree pass
can prove that a switch can eventually become CSWITCH.

2) we observed that current switch expansion can't support tree reassoc tricks:
optimize_range_tests_xor and optimize_range_tests_diff

I can add that to switch expansion but I see one problem. The switch expansion 
builds
top-level decision tree that contains JT, BT or a simple equality comparison 
nodes.
And each node handles only a consecutive interval. On the contrary the XOR and 
DIFF tricks
can handle any 2 comparisons. Moreover, these tests generate a if-chain that 
goes at the beginning
of a comparison.

There's one counter-example which presents the problem:

For foo we get:

Optimizing range tests a_20(D) -[33, 33] and -[41, 41]
 into (a_20(D) & -9) != 33
Optimizing range tests a_20(D) -[80, 80] and -[88, 88]
 into (a_20(D) & -9) != 80
Optimizing range tests a_20(D) -[99, 99] and -[115, 115]
 into (a_20(D) & -17) != 99
Optimizing range tests a_20(D) -[130, 130] and -[162, 162]
 into (a_20(D) & -33) != 130
Optimizing range tests a_20(D) -[200, 200] and -[216, 216]
 into (a_20(D) & -17) != 200
Optimizing range tests a_20(D) -[3, 3] and -[5, 5]
 into ((unsigned int) a_20(D) + 4294967293 & 4294967293) != 0

so a chain of 6 comparisons.

For bar we can generate a 3 BTs with the following balanced tree on the top:

;;   BT:3-41 21.7% (guessed) subtree: 21.7% (guessed))
;;BT:80-130 27.1% (guessed) subtree: 65.0% (guessed))
;;   BT:162-216 16.3% (guessed) subtree: 16.3% (guessed))

There can be also situations where a simple balanced tree can be faster than a 
series of
the XOR and DIFF tests.

So it's not an easy task to integrate XOR and DIFF tricks into switch 
conversion with a reasonable
costing model.

3) I can imagine a huge number if-else chain can benefit from simple balanced 
tree expansion. It can
be even better with PGO.

So maybe we can transform these huge if-chains into switch unconditionally? 
Examples:

gcc/c-family/c-common.c:2302
gcc/c-family/c-common.c:2871

Thoughts?
Thanks,
Martin
#define T(v, step) a == v || a == (v + step)
#define C(v, step) case v: case (v + step)

int foo(int a)
{
  if (T(3, 2) || T(33, 8) || T(80, 8) || T(99, 16) || T(130, 32) || T(200, 16))
return 1;
  else
return 4;
}

int bar(int a)
{
  switch (a)
  {
C(3, 2):
C(33, 8):
C(80, 8):
C(99, 16):
C(130, 32):
C(200, 16):
  return 1;
default:
return 4;
  }
}


Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Richard Biener
On Tue, Nov 19, 2019 at 2:54 PM Martin Liška  wrote:
>
> On 11/19/19 10:25 AM, Richard Biener wrote:
> > On Mon, Nov 18, 2019 at 10:17 AM Martin Liška  wrote:
> >>
> >> On 11/14/19 1:15 PM, Richard Biener wrote:
> >>> Hmm.  I was thinking of moving the pass instead of adding another one.
> >>> What's the reason to run switch-conversion during early optimization 
> >>> again?
> >>
> >> I'm adding CC, as he's the author of cswitch pass and he probably knows
> >> more about the pass placement. But I bet early conversion to array access
> >> rapidly simplifies CFG and other passes can benefit from that.
> >>
> >>>
> >>> But then it's probably not too bad... (and somehow I'd still like to see
> >>> switch-conversion, switch lowering and if-to-switch be "integrated"
> >>> somehow, analyzing the IL and then outputting optimized if/switch
> >>> according to the same cost metric).
> >>
> >> Well, my intuition is that gswitch statement is a canonical representation
> >> of a switch. And so that one wants to have if-to-switch early in the 
> >> pipeline
> >> and switch lowering very late?
> >
> > Just to throw in a tiny thing here - while a switch seems a natural 
> > canonical
> > representation it is actually a poor one for most GCC passes since they're
> > going to give up on switches and it's CFG complexity.  GCCs switch
> > representation also isn't exactly good (remember all that label lookup
> > stuff you have to do).  Switches are a quite high-level representation
> > and to me only "good" if we are actually going to emit a jump-table.
>
> I agree with you here. My simple counter example would be a user input
> where a switch will be explicitly used :)
>
> > Given converting a series of ifs to a switch is "hard" (the opposite
> > is easy) going for a switch early (before CFG transforms mess up
> > things too much) kind-of makes sense, but lowering back non-jump table
> > ones soon is equally important.
>
> What we can do is to share decision login in between iftoswitch pass and
> cswitch/switch lowering. That way we can convert chains of if statements
> that will be interesting for any of the later passes.
>
> One comment here is that 'if-elseif -> gswitch -> lowering to ifs' can
> still produce a faster code as be expand smart to a decision tree.

Sure.  As said in the if-to-switch pass "review" all these (now three)
passes should share an intermediate representation and costing they
then generate code from (either switch-converted lookups, switches
or optimized if-chains).  There's no reason to actually generate a switch
from an if-chain if we just want to optimize the if-chain (it might be
convenient for the code generator though, not sure).  Likewise if the
switch ends up switch-converted there's no reason to produce the
switch first.

But first and foremost sharing costing and transform decision code
which means having some unified representation of the IL semantics
is what is needed for this.

> The question still remains what do we want for GCC 10? A selective
> iftoswitch conversion?

Not sure if we really need it for GCC 10 - was there any pressing reason
other than PRxyz asks for it?

Thanks,
Richard.

> Thanks,
> Martin
>
> >
> > That's of course all unrelated to adding another switch-conversion pass.
> > (and yes, switch-converted IL is more friendly than a switch)
> >
> > Richard.
> >
> >> Martin
> >>
>


Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Martin Liška

On 11/19/19 11:25 AM, Jakub Jelinek wrote:

The point I'm trying to make is that with if-to-switch, if cswitch doesn't
handle it, we have a regression.  If user writes it as a switch, it is a
missed optimization if we generate worse code than we could, but not a
regression.


Sure, but the question is how common are these conditions that can reassoc
transform using the fancy popcount techniques? On the other handy, if-to-switch
can optimize more switch statements to e.g. jump tables.

As mentioned in the previous email, I can imagine doing gswitch from a if chain
only if cswitch/switch lowering can handle it.

Martin


Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Martin Liška

On 11/19/19 10:25 AM, Richard Biener wrote:

On Mon, Nov 18, 2019 at 10:17 AM Martin Liška  wrote:


On 11/14/19 1:15 PM, Richard Biener wrote:

Hmm.  I was thinking of moving the pass instead of adding another one.
What's the reason to run switch-conversion during early optimization again?


I'm adding CC, as he's the author of cswitch pass and he probably knows
more about the pass placement. But I bet early conversion to array access
rapidly simplifies CFG and other passes can benefit from that.



But then it's probably not too bad... (and somehow I'd still like to see
switch-conversion, switch lowering and if-to-switch be "integrated"
somehow, analyzing the IL and then outputting optimized if/switch
according to the same cost metric).


Well, my intuition is that gswitch statement is a canonical representation
of a switch. And so that one wants to have if-to-switch early in the pipeline
and switch lowering very late?


Just to throw in a tiny thing here - while a switch seems a natural canonical
representation it is actually a poor one for most GCC passes since they're
going to give up on switches and it's CFG complexity.  GCCs switch
representation also isn't exactly good (remember all that label lookup
stuff you have to do).  Switches are a quite high-level representation
and to me only "good" if we are actually going to emit a jump-table.


I agree with you here. My simple counter example would be a user input
where a switch will be explicitly used :)


Given converting a series of ifs to a switch is "hard" (the opposite
is easy) going for a switch early (before CFG transforms mess up
things too much) kind-of makes sense, but lowering back non-jump table
ones soon is equally important.


What we can do is to share decision login in between iftoswitch pass and
cswitch/switch lowering. That way we can convert chains of if statements
that will be interesting for any of the later passes.

One comment here is that 'if-elseif -> gswitch -> lowering to ifs' can
still produce a faster code as be expand smart to a decision tree.

The question still remains what do we want for GCC 10? A selective
iftoswitch conversion?

Thanks,
Martin



That's of course all unrelated to adding another switch-conversion pass.
(and yes, switch-converted IL is more friendly than a switch)

Richard.


Martin





Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Jakub Jelinek
On Tue, Nov 19, 2019 at 11:20:16AM +0100, Martin Liška wrote:
> On 11/19/19 10:12 AM, Jakub Jelinek wrote:
> > On Mon, Nov 18, 2019 at 03:34:35PM +0100, Martin Liška wrote:
> > > > Now, I believe with the if to gswitch optimization these will only 
> > > > rarely
> > > > kick in, because the IL will have switches that reassoc doesn't handle,
> > > > instead of series of ifs.
> > > 
> > > Yes, so my question is whether reassoc can handle gswitch statements 
> > > similar
> > > to series of if statements? Note that use can write explicitly something 
> > > like:
> > 
> > I don't see how.  Either the cswitch pass runs early, and then you need to
> > take it into account when deciding whether to expand it as series of ifs or
> > table lookup or bittest (say, realize you'd need just a single condition to
> > cover everything and go with if series), or it runs late and then on the
> > other side if reassoc turned gswitch into if series without taking into
> > account other possibilities like table lookup, it could pessimize stuff.
> > So, IMHO cswitch should he able to do this too.
> 
> I agree with you that we should teach switch lowering to also consider the
> transformation that reassoc does. I understand it as a generalized bit test,
> and I would be happy to implement that on gswitch level. Which leads us
> back to need for if-to-switch conversion pass that will enable such 
> optimization.

The point I'm trying to make is that with if-to-switch, if cswitch doesn't
handle it, we have a regression.  If user writes it as a switch, it is a
missed optimization if we generate worse code than we could, but not a
regression.

Jakub



Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Martin Liška

On 11/19/19 10:12 AM, Jakub Jelinek wrote:

On Mon, Nov 18, 2019 at 03:34:35PM +0100, Martin Liška wrote:

Now, I believe with the if to gswitch optimization these will only rarely
kick in, because the IL will have switches that reassoc doesn't handle,
instead of series of ifs.


Yes, so my question is whether reassoc can handle gswitch statements similar
to series of if statements? Note that use can write explicitly something like:


I don't see how.  Either the cswitch pass runs early, and then you need to
take it into account when deciding whether to expand it as series of ifs or
table lookup or bittest (say, realize you'd need just a single condition to
cover everything and go with if series), or it runs late and then on the
other side if reassoc turned gswitch into if series without taking into
account other possibilities like table lookup, it could pessimize stuff.
So, IMHO cswitch should he able to do this too.


I agree with you that we should teach switch lowering to also consider the
transformation that reassoc does. I understand it as a generalized bit test,
and I would be happy to implement that on gswitch level. Which leads us
back to need for if-to-switch conversion pass that will enable such 
optimization.

My counter-example is following:

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c 
b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c
index 944362ad076..58f9737b918 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-32.c
@@ -6,10 +6,15 @@
 
 int test (int a, int b, int c)

 {
-  if ( a == 10 || a == 12 || a == 26)
-return b;
-  else
-return c;
+  switch (a)
+{
+case 10:
+case 12:
+case 26:
+  return b;
+default:
+  return c;
+}
 }
 
 int main ()


Note that I see both representation in the source code equivalent.
It's probably just a matter of taste ;)

Martin





switch (a)
case 0...30:
   return 1;
case 32:
   return 1;
case 34:
   return 1;

which won't be optimized by reassoc. While it can handle something 
0<=A<=30|A==32|A==34.


Jakub





Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Richard Biener
On Mon, Nov 18, 2019 at 10:17 AM Martin Liška  wrote:
>
> On 11/14/19 1:15 PM, Richard Biener wrote:
> > Hmm.  I was thinking of moving the pass instead of adding another one.
> > What's the reason to run switch-conversion during early optimization again?
>
> I'm adding CC, as he's the author of cswitch pass and he probably knows
> more about the pass placement. But I bet early conversion to array access
> rapidly simplifies CFG and other passes can benefit from that.
>
> >
> > But then it's probably not too bad... (and somehow I'd still like to see
> > switch-conversion, switch lowering and if-to-switch be "integrated"
> > somehow, analyzing the IL and then outputting optimized if/switch
> > according to the same cost metric).
>
> Well, my intuition is that gswitch statement is a canonical representation
> of a switch. And so that one wants to have if-to-switch early in the pipeline
> and switch lowering very late?

Just to throw in a tiny thing here - while a switch seems a natural canonical
representation it is actually a poor one for most GCC passes since they're
going to give up on switches and it's CFG complexity.  GCCs switch
representation also isn't exactly good (remember all that label lookup
stuff you have to do).  Switches are a quite high-level representation
and to me only "good" if we are actually going to emit a jump-table.
Given converting a series of ifs to a switch is "hard" (the opposite
is easy) going for a switch early (before CFG transforms mess up
things too much) kind-of makes sense, but lowering back non-jump table
ones soon is equally important.

That's of course all unrelated to adding another switch-conversion pass.
(and yes, switch-converted IL is more friendly than a switch)

Richard.

> Martin
>


Re: [PATCH] Add one more pass_convert_switch late.

2019-11-19 Thread Jakub Jelinek
On Mon, Nov 18, 2019 at 03:34:35PM +0100, Martin Liška wrote:
> > Now, I believe with the if to gswitch optimization these will only rarely
> > kick in, because the IL will have switches that reassoc doesn't handle,
> > instead of series of ifs.
> 
> Yes, so my question is whether reassoc can handle gswitch statements similar
> to series of if statements? Note that use can write explicitly something like:

I don't see how.  Either the cswitch pass runs early, and then you need to
take it into account when deciding whether to expand it as series of ifs or
table lookup or bittest (say, realize you'd need just a single condition to
cover everything and go with if series), or it runs late and then on the
other side if reassoc turned gswitch into if series without taking into
account other possibilities like table lookup, it could pessimize stuff.
So, IMHO cswitch should he able to do this too.

> 
> switch (a)
>case 0...30:
>   return 1;
>case 32:
>   return 1;
>case 34:
>   return 1;
> 
> which won't be optimized by reassoc. While it can handle something 
> 0<=A<=30|A==32|A==34.

Jakub



Re: [PATCH] Add one more pass_convert_switch late.

2019-11-18 Thread Martin Liška

On 11/18/19 3:07 PM, Jakub Jelinek wrote:

On Mon, Nov 18, 2019 at 02:39:12PM +0100, Martin Liška wrote:

On 11/18/19 11:06 AM, Jakub Jelinek wrote:

On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote:

On 11/14/19 1:15 PM, Richard Biener wrote:

Hmm.  I was thinking of moving the pass instead of adding another one.
What's the reason to run switch-conversion during early optimization again?


I'm adding CC, as he's the author of cswitch pass and he probably knows
more about the pass placement. But I bet early conversion to array access
rapidly simplifies CFG and other passes can benefit from that.


Yes, I think the early placement is to take it into account in inlining
decisions, on the other side it causes various issues, I think we have
several PRs open where the early cswitch just makes a good decision based on
what it knows at that point, but later on we find better VRP ranges for the
switch condition and it is too late to undo it and handle it differently.


I bet we must have multiple similar problems where one transformation makes
a transformation that other pass is not happy about.


This is something that happens quite often.  Typical example is where
we commit to doing a an constant array load and later the range is narrowed
and it would be better done either as a bitmask test, or say the constant
array load with sometimes much shorter needed arrays.


I agree that it can be quite often optimization.




I don't like the suggested approach much. There can be optimization in between
that can leverage the conversion to a constant array.


For what exactly?


Dunno :) Well, that said we can move the cswitch pass later in the pipeline..




Another thing is perform the optimizations we do in reassoc (but aren't


What kind of optimizations do you mean?


I believe the cswitch expansion can handle well the case where certain
range can be expressed as ((unsigned) x - const1) <= const2) conditional,
but reassoc has also optimize_range_tests_xor, optimize_range_tests_diff,
both of which can be applied multiple times.  To quote the comments
explaining what it does:
/* Optimize X == CST1 || X == CST2
if popcount (CST1 ^ CST2) == 1 into
(X & ~(CST1 ^ CST2)) == (CST1 & ~(CST1 ^ CST2)).
Similarly for ranges.  E.g.
X != 2 && X != 3 && X != 10 && X != 11
will be transformed by the previous optimization into
!((X - 2U) <= 1U || (X - 10U) <= 1U)
and this loop can transform that into
!(((X & ~8) - 2U) <= 1U).  */
and
/* Optimize X == CST1 || X == CST2
if popcount (CST2 - CST1) == 1 into
((X - CST1) & ~(CST2 - CST1)) == 0.
Similarly for ranges.  E.g.
X == 43 || X == 76 || X == 44 || X == 78 || X == 77 || X == 46
|| X == 75 || X == 45
will be transformed by the previous optimization into
(X - 43U) <= 3U || (X - 75U) <= 3U
and this loop can transform that into
((X - 43U) & ~(75U - 43U)) <= 3U.  */
Now, I believe with the if to gswitch optimization these will only rarely
kick in, because the IL will have switches that reassoc doesn't handle,
instead of series of ifs.


Yes, so my question is whether reassoc can handle gswitch statements similar
to series of if statements? Note that use can write explicitly something like:

switch (a)
   case 0...30:
  return 1;
   case 32:
  return 1;
   case 34:
  return 1;

which won't be optimized by reassoc. While it can handle something 
0<=A<=30|A==32|A==34.


So, I believe cswitch needs to handle these cases
too and consider it in the decision whether to use const array loads or
bittest or if tree.


Right now, it differentiates in between const array loads and if tree. Usage of 
bittest
is probably situation which switch lowering can be taught.

Martin



Jakub





Re: [PATCH] Add one more pass_convert_switch late.

2019-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2019 at 02:39:12PM +0100, Martin Liška wrote:
> On 11/18/19 11:06 AM, Jakub Jelinek wrote:
> > On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote:
> > > On 11/14/19 1:15 PM, Richard Biener wrote:
> > > > Hmm.  I was thinking of moving the pass instead of adding another one.
> > > > What's the reason to run switch-conversion during early optimization 
> > > > again?
> > > 
> > > I'm adding CC, as he's the author of cswitch pass and he probably knows
> > > more about the pass placement. But I bet early conversion to array access
> > > rapidly simplifies CFG and other passes can benefit from that.
> > 
> > Yes, I think the early placement is to take it into account in inlining
> > decisions, on the other side it causes various issues, I think we have
> > several PRs open where the early cswitch just makes a good decision based on
> > what it knows at that point, but later on we find better VRP ranges for the
> > switch condition and it is too late to undo it and handle it differently.
> 
> I bet we must have multiple similar problems where one transformation makes
> a transformation that other pass is not happy about.

This is something that happens quite often.  Typical example is where
we commit to doing a an constant array load and later the range is narrowed
and it would be better done either as a bitmask test, or say the constant
array load with sometimes much shorter needed arrays.

> I don't like the suggested approach much. There can be optimization in between
> that can leverage the conversion to a constant array.

For what exactly?

> > Another thing is perform the optimizations we do in reassoc (but aren't
> 
> What kind of optimizations do you mean?

I believe the cswitch expansion can handle well the case where certain
range can be expressed as ((unsigned) x - const1) <= const2) conditional,
but reassoc has also optimize_range_tests_xor, optimize_range_tests_diff,
both of which can be applied multiple times.  To quote the comments
explaining what it does:
/* Optimize X == CST1 || X == CST2
   if popcount (CST1 ^ CST2) == 1 into
   (X & ~(CST1 ^ CST2)) == (CST1 & ~(CST1 ^ CST2)).
   Similarly for ranges.  E.g.
   X != 2 && X != 3 && X != 10 && X != 11
   will be transformed by the previous optimization into
   !((X - 2U) <= 1U || (X - 10U) <= 1U)
   and this loop can transform that into
   !(((X & ~8) - 2U) <= 1U).  */
and
/* Optimize X == CST1 || X == CST2
   if popcount (CST2 - CST1) == 1 into
   ((X - CST1) & ~(CST2 - CST1)) == 0.
   Similarly for ranges.  E.g.
   X == 43 || X == 76 || X == 44 || X == 78 || X == 77 || X == 46
   || X == 75 || X == 45
   will be transformed by the previous optimization into
   (X - 43U) <= 3U || (X - 75U) <= 3U
   and this loop can transform that into
   ((X - 43U) & ~(75U - 43U)) <= 3U.  */
Now, I believe with the if to gswitch optimization these will only rarely
kick in, because the IL will have switches that reassoc doesn't handle,
instead of series of ifs.  So, I believe cswitch needs to handle these cases
too and consider it in the decision whether to use const array loads or
bittest or if tree.

Jakub



Re: [PATCH] Add one more pass_convert_switch late.

2019-11-18 Thread Martin Liška

On 11/18/19 11:06 AM, Jakub Jelinek wrote:

On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote:

On 11/14/19 1:15 PM, Richard Biener wrote:

Hmm.  I was thinking of moving the pass instead of adding another one.
What's the reason to run switch-conversion during early optimization again?


I'm adding CC, as he's the author of cswitch pass and he probably knows
more about the pass placement. But I bet early conversion to array access
rapidly simplifies CFG and other passes can benefit from that.


Yes, I think the early placement is to take it into account in inlining
decisions, on the other side it causes various issues, I think we have
several PRs open where the early cswitch just makes a good decision based on
what it knows at that point, but later on we find better VRP ranges for the
switch condition and it is too late to undo it and handle it differently.


I bet we must have multiple similar problems where one transformation makes
a transformation that other pass is not happy about.



So, perhaps if we could during early cswitch just annotate the gswitch
that it would be optimized that way (table lookup, ...), but not actually
modify the switch and let inlininer use that info and then perform a late
gswitch that would actually optimize it, or let early cswitch perform the
transformation, but annotate it so that late cswitch could undo it and
perform it differently.


I don't like the suggested approach much. There can be optimization in between
that can leverage the conversion to a constant array.



Another thing is perform the optimizations we do in reassoc (but aren't


What kind of optimizations do you mean?

Thanks,
Martin


effective anymore after your if to gswitch changes) on switches too,
most likely during the (late?) cswitch pass.

Jakub





Re: [PATCH] Add one more pass_convert_switch late.

2019-11-18 Thread Jakub Jelinek
On Mon, Nov 18, 2019 at 10:17:56AM +0100, Martin Liška wrote:
> On 11/14/19 1:15 PM, Richard Biener wrote:
> > Hmm.  I was thinking of moving the pass instead of adding another one.
> > What's the reason to run switch-conversion during early optimization again?
> 
> I'm adding CC, as he's the author of cswitch pass and he probably knows
> more about the pass placement. But I bet early conversion to array access
> rapidly simplifies CFG and other passes can benefit from that.

Yes, I think the early placement is to take it into account in inlining
decisions, on the other side it causes various issues, I think we have
several PRs open where the early cswitch just makes a good decision based on
what it knows at that point, but later on we find better VRP ranges for the
switch condition and it is too late to undo it and handle it differently.

So, perhaps if we could during early cswitch just annotate the gswitch
that it would be optimized that way (table lookup, ...), but not actually
modify the switch and let inlininer use that info and then perform a late
gswitch that would actually optimize it, or let early cswitch perform the
transformation, but annotate it so that late cswitch could undo it and
perform it differently.

Another thing is perform the optimizations we do in reassoc (but aren't
effective anymore after your if to gswitch changes) on switches too,
most likely during the (late?) cswitch pass.

Jakub



Re: [PATCH] Add one more pass_convert_switch late.

2019-11-18 Thread Martin Liška

On 11/14/19 1:15 PM, Richard Biener wrote:

Hmm.  I was thinking of moving the pass instead of adding another one.
What's the reason to run switch-conversion during early optimization again?


I'm adding CC, as he's the author of cswitch pass and he probably knows
more about the pass placement. But I bet early conversion to array access
rapidly simplifies CFG and other passes can benefit from that.



But then it's probably not too bad... (and somehow I'd still like to see
switch-conversion, switch lowering and if-to-switch be "integrated"
somehow, analyzing the IL and then outputting optimized if/switch
according to the same cost metric).


Well, my intuition is that gswitch statement is a canonical representation
of a switch. And so that one wants to have if-to-switch early in the pipeline
and switch lowering very late?

Martin



Re: [PATCH] Add one more pass_convert_switch late.

2019-11-14 Thread Richard Biener
On Thu, Nov 14, 2019 at 1:06 PM Martin Liška  wrote:
>
> Hi.
>
> As mentioned in the PR, the patch adds one more late pass_convert_switch
> just before switch lowering.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Hmm.  I was thinking of moving the pass instead of adding another one.
What's the reason to run switch-conversion during early optimization again?

But then it's probably not too bad... (and somehow I'd still like to see
switch-conversion, switch lowering and if-to-switch be "integrated"
somehow, analyzing the IL and then outputting optimized if/switch
according to the same cost metric).

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2019-11-13  Martin Liska  
>
> PR tree-optimization/92005
> * passes.def: Add pass_convert_switch late.
> * tree-switch-conversion.c: Define clone
> method.
>
> gcc/testsuite/ChangeLog:
>
> 2019-11-13  Martin Liska  
>
> PR tree-optimization/92005
> * g++.dg/tree-ssa/pr92005.C: New test.
> * gcc.dg/tree-ssa/cswtch-2.c: Update dump file name.
> * gcc.dg/tree-ssa/cswtch-3.c: Likewise.
> * gcc.dg/tree-ssa/cswtch-4.c: Likewise.
> * gcc.dg/tree-ssa/cswtch-5.c: Likewise.
> * gcc.dg/tree-ssa/cswtch.c: Likewise.
> * gcc.dg/tree-ssa/pr36881.c: Likewise.
> * gcc.dg/tree-ssa/pr84436-1.c: Likewise.
> * gcc.dg/tree-ssa/pr84436-2.c: Likewise.
> * gcc.dg/tree-ssa/pr84436-3.c: Likewise.
> * gcc.dg/tree-ssa/pr84436-4.c: Likewise.
> * gcc.dg/tree-ssa/pr84436-5.c: Likewise.
> * gcc.dg/tree-ssa/pr88753.c: Likewise.
> * gcc.target/i386/pr45830.c: Likewise.
> ---
>   gcc/passes.def|  1 +
>   gcc/testsuite/g++.dg/tree-ssa/pr92005.C   | 50 +++
>   gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c  |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c  |  4 +-
>   gcc/testsuite/gcc.dg/tree-ssa/cswtch-4.c  |  4 +-
>   gcc/testsuite/gcc.dg/tree-ssa/cswtch-5.c  |  4 +-
>   gcc/testsuite/gcc.dg/tree-ssa/cswtch.c|  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr36881.c   |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr84436-1.c |  4 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr84436-2.c |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr84436-3.c |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr84436-4.c |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr84436-5.c |  2 +-
>   gcc/testsuite/gcc.dg/tree-ssa/pr88753.c   |  2 +-
>   gcc/testsuite/gcc.target/i386/pr45830.c   |  2 +-
>   gcc/tree-switch-conversion.c  |  1 +
>   16 files changed, 69 insertions(+), 17 deletions(-)
>   create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr92005.C
>
>


[PATCH] Add one more pass_convert_switch late.

2019-11-14 Thread Martin Liška

Hi.

As mentioned in the PR, the patch adds one more late pass_convert_switch
just before switch lowering.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-11-13  Martin Liska  

PR tree-optimization/92005
* passes.def: Add pass_convert_switch late.
* tree-switch-conversion.c: Define clone
method.

gcc/testsuite/ChangeLog:

2019-11-13  Martin Liska  

PR tree-optimization/92005
* g++.dg/tree-ssa/pr92005.C: New test.
* gcc.dg/tree-ssa/cswtch-2.c: Update dump file name.
* gcc.dg/tree-ssa/cswtch-3.c: Likewise.
* gcc.dg/tree-ssa/cswtch-4.c: Likewise.
* gcc.dg/tree-ssa/cswtch-5.c: Likewise.
* gcc.dg/tree-ssa/cswtch.c: Likewise.
* gcc.dg/tree-ssa/pr36881.c: Likewise.
* gcc.dg/tree-ssa/pr84436-1.c: Likewise.
* gcc.dg/tree-ssa/pr84436-2.c: Likewise.
* gcc.dg/tree-ssa/pr84436-3.c: Likewise.
* gcc.dg/tree-ssa/pr84436-4.c: Likewise.
* gcc.dg/tree-ssa/pr84436-5.c: Likewise.
* gcc.dg/tree-ssa/pr88753.c: Likewise.
* gcc.target/i386/pr45830.c: Likewise.
---
 gcc/passes.def|  1 +
 gcc/testsuite/g++.dg/tree-ssa/pr92005.C   | 50 +++
 gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c  |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/cswtch-4.c  |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/cswtch-5.c  |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/cswtch.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr36881.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr84436-1.c |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr84436-2.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr84436-3.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr84436-4.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr84436-5.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr88753.c   |  2 +-
 gcc/testsuite/gcc.target/i386/pr45830.c   |  2 +-
 gcc/tree-switch-conversion.c  |  1 +
 16 files changed, 69 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr92005.C


diff --git a/gcc/passes.def b/gcc/passes.def
index 798a391bd35..48204af1009 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -305,6 +305,7 @@ along with GCC; see the file COPYING3.  If not see
   POP_INSERT_PASSES ()
   NEXT_PASS (pass_simduid_cleanup);
   NEXT_PASS (pass_lower_vector_ssa);
+  NEXT_PASS (pass_convert_switch);
   NEXT_PASS (pass_lower_switch);
   NEXT_PASS (pass_cse_reciprocals);
   NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr92005.C b/gcc/testsuite/g++.dg/tree-ssa/pr92005.C
new file mode 100644
index 000..ce228b0a20c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr92005.C
@@ -0,0 +1,50 @@
+/* PR tree-optimization/92005 */
+/* { dg-options "-O2 -fdump-tree-optimized -std=c++17" } */
+
+template struct overloaded : Ts... { using Ts::operator()...; };
+template overloaded(Ts...) -> overloaded;
+
+struct T0 {};
+struct T1 {};
+struct T2 {};
+struct T3 {};
+struct T4 {};
+
+struct variant
+{
+unsigned index_;
+
+union
+{
+T0 t0_;
+T1 t1_;
+T2 t2_;
+T3 t3_;
+T4 t4_;
+};
+};
+
+template int visit( F f, variant const& v )
+{
+switch( v.index_ )
+{
+case 0: return f( v.t0_ );
+case 1: return f( v.t1_ );
+case 2: return f( v.t2_ );
+case 3: return f( v.t3_ );
+case 4: return f( v.t4_ );
+default: __builtin_unreachable();
+}
+}
+
+int do_visit(variant const& v) {
+ return visit(overloaded{
+[](T0 val) { return 3; },
+[](T1 val) { return 5; },
+[](T2 val) { return 8; },
+[](T3 val) { return 9; },
+[](T4 val) { return 10; }
+}, v);
+}
+
+/* { dg-final { scan-tree-dump "CSWTCH" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c
index 87ed7bba517..44e25ebbac1 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-2.c
@@ -17,4 +17,4 @@ int h1 (X x)
 }
 }
 
-/* { dg-final { scan-tree-dump-times "CSWTCH" 0 "switchconv" } } */
+/* { dg-final { scan-tree-dump-times "CSWTCH" 0 "switchconv1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c
index b983c8fbe92..25c2e75c5a2 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cswtch-3.c
@@ -326,5 +326,5 @@ main ()
   T (f5 (231, 0), 2, { 80, 0 });
 }
 
-/* { dg-final { scan-tree-dump-times "Switch converted" 5 "switchconv" } } */
-/* { dg-final { scan-tree-dump-times "= CSWTCH" 8 "switchconv" } } */
+/* { dg-final { scan-tree-dump-times "Switch converted" 5 "switchconv1" } } */
+/* { dg-final { scan-tree-dump-times "= CSWTCH" 8 "switchconv1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch-4.c