Re: [RFC PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-02-15 Thread Jeff Law

On 02/15/2017 04:51 AM, Jakub Jelinek wrote:

On Wed, Feb 15, 2017 at 12:46:44PM +0100, Richard Biener wrote:

Possibly, but for GCC 8.


To both this switchconv patch and the potential improvement for loading
from const arrays (can create an enhancement PR for that), or just the
latter?


Both I think - the patch is pretty big.


Ok, I'll queue the patch for GCC8 then.


 Maybe we can instead make early
threading not mess this up?


Maybe, but not planning to do that myself, my knowledge about jump threading
is too limited.

The problem is at the point where we thread all we see is this:

  # s_1 = PHI <"foo"(2), "bar"(3), "spam"(4), 0B(5)>
:
  if (s_1 == 0B)
goto ;
  else
goto ;

Nothing more is needed for jump threading to do its job.  This doesn't 
look any different to the threader than any other simple jump threading 
opportunity.


I guess we could look to see if the PHI is the join point for a switch, 
but that seems rather hacky.


jeff


Re: [RFC PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-02-15 Thread Jakub Jelinek
On Wed, Feb 15, 2017 at 12:46:44PM +0100, Richard Biener wrote:
> >> Possibly, but for GCC 8.
> > 
> > To both this switchconv patch and the potential improvement for loading
> > from const arrays (can create an enhancement PR for that), or just the
> > latter?
> 
> Both I think - the patch is pretty big.

Ok, I'll queue the patch for GCC8 then.

>  Maybe we can instead make early
> threading not mess this up?

Maybe, but not planning to do that myself, my knowledge about jump threading
is too limited.

> >> can we teach EVRP about this?  It runs before switch conversion.
> > 
> > I guess so.  It is a matter of calling simplify_switch_using_ranges
> > and then doing some cleanup (you wrote that optimization)
> > - to_update_switch_stmts handling.
> 
> Sounds like a good thing to do (for GCC 8 as well).

Ok, will file enhancement PRs.

Jakub


Re: [RFC PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-02-15 Thread Richard Biener
On 15/02/17 08:17, Jakub Jelinek wrote:
> On Wed, Feb 15, 2017 at 08:06:16AM +0100, Richard Biener wrote:
>> On February 14, 2017 9:04:45 PM GMT+01:00, Jakub Jelinek  
>> wrote:
>>> Hi!
>>>
>>> The following patch is an attempt to fix a regression where we no
>>> longer
>>> switch convert one switch because earlier optimizations turn it into
>>> unsupported shape.
>>
>> Is that because of early threading?
> 
> Yes.
> 
>>> and expects to be optimized into return 3 by vrp1.  As switchconv is
>>> earlier
>>> than that, vrp1 sees:
>>>  _1 = a_3(D) & 1;
>>>  _4 = (unsigned int) _1;
>>>  _5 = CSWTCH.1[_4];
>>>  return _5;
>>> and doesn't optimize it.  If the testcase had say case 7: replaced with
>>> default:, it wouldn't pass already before.
>>
>> That looks odd...
> 
> Just a pass ordering issue.
> 
>>   If the patch is ok, what
>>> should
>>> we do with vrp40.c?  Change it in some way (e.g. return variable in one
>>> case) so that switchconv doesn't trigger, or add an optimization in vrp
>>> if we see a load from constant array with known initializer and the
>>> range
>>> is small enough and contains the same value for all those values,
>>> replace
>>> it with the value? 
>>
>> Possibly, but for GCC 8.
> 
> To both this switchconv patch and the potential improvement for loading
> from const arrays (can create an enhancement PR for that), or just the
> latter?

Both I think - the patch is pretty big.  Maybe we can instead make early
threading not mess this up?

>> can we teach EVRP about this?  It runs before switch conversion.
> 
> I guess so.  It is a matter of calling simplify_switch_using_ranges
> and then doing some cleanup (you wrote that optimization)
> - to_update_switch_stmts handling.

Sounds like a good thing to do (for GCC 8 as well).

Thanks,
Richard.

> 
>   Jakub
> 



Re: [RFC PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-02-14 Thread Jakub Jelinek
On Wed, Feb 15, 2017 at 08:06:16AM +0100, Richard Biener wrote:
> On February 14, 2017 9:04:45 PM GMT+01:00, Jakub Jelinek  
> wrote:
> >Hi!
> >
> >The following patch is an attempt to fix a regression where we no
> >longer
> >switch convert one switch because earlier optimizations turn it into
> >unsupported shape.
> 
> Is that because of early threading?

Yes.

> >and expects to be optimized into return 3 by vrp1.  As switchconv is
> >earlier
> >than that, vrp1 sees:
> >  _1 = a_3(D) & 1;
> >  _4 = (unsigned int) _1;
> >  _5 = CSWTCH.1[_4];
> >  return _5;
> >and doesn't optimize it.  If the testcase had say case 7: replaced with
> >default:, it wouldn't pass already before.
> 
> That looks odd...

Just a pass ordering issue.

>   If the patch is ok, what
> >should
> >we do with vrp40.c?  Change it in some way (e.g. return variable in one
> >case) so that switchconv doesn't trigger, or add an optimization in vrp
> >if we see a load from constant array with known initializer and the
> >range
> >is small enough and contains the same value for all those values,
> >replace
> >it with the value? 
> 
> Possibly, but for GCC 8.

To both this switchconv patch and the potential improvement for loading
from const arrays (can create an enhancement PR for that), or just the
latter?

> can we teach EVRP about this?  It runs before switch conversion.

I guess so.  It is a matter of calling simplify_switch_using_ranges
and then doing some cleanup (you wrote that optimization)
- to_update_switch_stmts handling.

Jakub


Re: [RFC PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-02-14 Thread Richard Biener
On February 14, 2017 9:04:45 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The following patch is an attempt to fix a regression where we no
>longer
>switch convert one switch because earlier optimizations turn it into
>unsupported shape.

Is that because of early threading?

>The patch contains two important changes (that can perhaps be split off
>separately):
>1) handle virtual PHIs; while because we require all the switch bbs
> to be empty, all the edges from the switch to final_bb should have the
>   same virtual op SSA_NAME, if the final_bb is reachable through other
>  edges from other code, it might have virtual PHI and switchconv would
>   unnecessarily give up
>2) if the switch cases form contiguous range, there is no need to
>require
>   anything about the default case, it can be abort, or arbitrary code
>   that can or might not fall through into final_bb, or it can e.g. be
>  empty bb and just the values from default bb might not be appropriate
>   constants; we emit an if (val is in range) vals = CSWTCH[...]; else
> anyway and the else can be anything; we still have to require standard
>  default case if the range is non-contiguous, because then the default
>   is used also for some values in the tables.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux.  It causes a
>single
>regression, vrp40.c, which looks like this:
>int f(int a) {
>switch (a & 1) { case 0: case 1: return 3; case 2: return 5; case 3:
>return 7;
>case 4: return 11; case 5: return 13; case 6: return 17; case 7: return
>19; }
>}
>and expects to be optimized into return 3 by vrp1.  As switchconv is
>earlier
>than that, vrp1 sees:
>  _1 = a_3(D) & 1;
>  _4 = (unsigned int) _1;
>  _5 = CSWTCH.1[_4];
>  return _5;
>and doesn't optimize it.  If the testcase had say case 7: replaced with
>default:, it wouldn't pass already before.

That looks odd...

  If the patch is ok, what
>should
>we do with vrp40.c?  Change it in some way (e.g. return variable in one
>case) so that switchconv doesn't trigger, or add an optimization in vrp
>if we see a load from constant array with known initializer and the
>range
>is small enough and contains the same value for all those values,
>replace
>it with the value? 

Possibly, but for GCC 8.

can we teach EVRP about this?  It runs before switch conversion.

Richard.


 It would help also with say:
>const int a[] = { 7, 8, 9, 1, 1, 1, 1, 2, 3, 4, 5, 6 };
>int foo (int x)
>{
>  if (x <= 2 || x >= 7) return 3;
>  return a[x];
>}
>turn it into
>int foo (int x)
>{
>  if (x <= 2 || x >= 7) return 3;
>  return 1;
>}
>
>2017-02-14  Jakub Jelinek  
>
>   PR tree-optimization/79472
>   * tree-switch-conversion.c (struct switch_conv_info): Add
>   contiguous_range and default_case_nonstandard fields.
>   (collect_switch_conv_info): Compute contiguous_range and
>   default_case_nonstandard fields, don't clear final_bb if
>   contiguous_range and only the default case doesn't have the required
>   structure.
>   (check_all_empty_except_final): Set default_case_nonstandard instead
>   of failing if contiguous_range and the default case doesn't have empty
>   block.
>   (check_final_bb): Add SWTCH argument, don't fail if contiguous_range
>   and only the default case doesn't have the required constants.  Skip
>   virtual phis.
>   (gather_default_values): Skip virtual phis.  Allow non-NULL CASE_LOW
>   if default_case_nonstandard.
>   (build_constructors): Build constant 1 just once.  Assert that default
>   values aren't inserted in between cases if contiguous_range.  Skip
>   virtual phis.
>   (build_arrays): Skip virtual phis.
>   (prune_bbs): Add DEFAULT_BB argument, don't remove that bb.
>   (fix_phi_nodes): Don't add e2f phi arg if default_case_nonstandard.
>   Handle virtual phis.
>   (gen_inbound_check): Handle default_case_nonstandard case.
>   (process_switch): Adjust check_final_bb caller.  Call
>   gather_default_values with the first non-default case instead of
>   default case if default_case_nonstandard.
>
>   * gcc.dg/tree-ssa/cswtch-3.c: New test.
>   * gcc.dg/tree-ssa/cswtch-4.c: New test.
>   * gcc.dg/tree-ssa/cswtch-5.c: New test.
>
>--- gcc/tree-switch-conversion.c.jj2017-02-14 14:54:08.020975500 +0100
>+++ gcc/tree-switch-conversion.c   2017-02-14 17:09:01.162826954 +0100
>@@ -592,6 +592,14 @@ struct switch_conv_info
>  dump file, if there is one.  */
>   const char *reason;
> 
>+  /* True if default case is not used for any value between range_min
>and
>+ range_max inclusive.  */
>+  bool contiguous_range;
>+
>+  /* True if default case does not have the required shape for other
>case
>+ labels.  */
>+  bool default_case_nonstandard;
>+
>   /* Parameters for expand_switch_using_bit_tests.  Should be computed
>  the same way as in expand_case.  */
>   unsigned int uniq;
>@@ -606,8 +614,9 @@ collect_switch_conv_info 

[RFC PATCH] Improve switchconv optimization (PR tree-optimization/79472)

2017-02-14 Thread Jakub Jelinek
Hi!

The following patch is an attempt to fix a regression where we no longer
switch convert one switch because earlier optimizations turn it into
unsupported shape.

The patch contains two important changes (that can perhaps be split off
separately):
1) handle virtual PHIs; while because we require all the switch bbs
   to be empty, all the edges from the switch to final_bb should have the
   same virtual op SSA_NAME, if the final_bb is reachable through other
   edges from other code, it might have virtual PHI and switchconv would
   unnecessarily give up
2) if the switch cases form contiguous range, there is no need to require
   anything about the default case, it can be abort, or arbitrary code
   that can or might not fall through into final_bb, or it can e.g. be
   empty bb and just the values from default bb might not be appropriate
   constants; we emit an if (val is in range) vals = CSWTCH[...]; else
   anyway and the else can be anything; we still have to require standard
   default case if the range is non-contiguous, because then the default
   is used also for some values in the tables.

Bootstrapped/regtested on x86_64-linux and i686-linux.  It causes a single
regression, vrp40.c, which looks like this:
int f(int a) {
 switch (a & 1) { case 0: case 1: return 3; case 2: return 5; case 3: return 7;
 case 4: return 11; case 5: return 13; case 6: return 17; case 7: return 19; }
}
and expects to be optimized into return 3 by vrp1.  As switchconv is earlier
than that, vrp1 sees:
  _1 = a_3(D) & 1;
  _4 = (unsigned int) _1;
  _5 = CSWTCH.1[_4];
  return _5;
and doesn't optimize it.  If the testcase had say case 7: replaced with
default:, it wouldn't pass already before.  If the patch is ok, what should
we do with vrp40.c?  Change it in some way (e.g. return variable in one
case) so that switchconv doesn't trigger, or add an optimization in vrp
if we see a load from constant array with known initializer and the range
is small enough and contains the same value for all those values, replace
it with the value?  It would help also with say:
const int a[] = { 7, 8, 9, 1, 1, 1, 1, 2, 3, 4, 5, 6 };
int foo (int x)
{
  if (x <= 2 || x >= 7) return 3;
  return a[x];
}
turn it into
int foo (int x)
{
  if (x <= 2 || x >= 7) return 3;
  return 1;
}

2017-02-14  Jakub Jelinek  

PR tree-optimization/79472
* tree-switch-conversion.c (struct switch_conv_info): Add
contiguous_range and default_case_nonstandard fields.
(collect_switch_conv_info): Compute contiguous_range and
default_case_nonstandard fields, don't clear final_bb if
contiguous_range and only the default case doesn't have the required
structure.
(check_all_empty_except_final): Set default_case_nonstandard instead
of failing if contiguous_range and the default case doesn't have empty
block.
(check_final_bb): Add SWTCH argument, don't fail if contiguous_range
and only the default case doesn't have the required constants.  Skip
virtual phis.
(gather_default_values): Skip virtual phis.  Allow non-NULL CASE_LOW
if default_case_nonstandard.
(build_constructors): Build constant 1 just once.  Assert that default
values aren't inserted in between cases if contiguous_range.  Skip
virtual phis.
(build_arrays): Skip virtual phis.
(prune_bbs): Add DEFAULT_BB argument, don't remove that bb.
(fix_phi_nodes): Don't add e2f phi arg if default_case_nonstandard.
Handle virtual phis.
(gen_inbound_check): Handle default_case_nonstandard case.
(process_switch): Adjust check_final_bb caller.  Call
gather_default_values with the first non-default case instead of
default case if default_case_nonstandard.

* gcc.dg/tree-ssa/cswtch-3.c: New test.
* gcc.dg/tree-ssa/cswtch-4.c: New test.
* gcc.dg/tree-ssa/cswtch-5.c: New test.

--- gcc/tree-switch-conversion.c.jj 2017-02-14 14:54:08.020975500 +0100
+++ gcc/tree-switch-conversion.c2017-02-14 17:09:01.162826954 +0100
@@ -592,6 +592,14 @@ struct switch_conv_info
  dump file, if there is one.  */
   const char *reason;
 
+  /* True if default case is not used for any value between range_min and
+ range_max inclusive.  */
+  bool contiguous_range;
+
+  /* True if default case does not have the required shape for other case
+ labels.  */
+  bool default_case_nonstandard;
+
   /* Parameters for expand_switch_using_bit_tests.  Should be computed
  the same way as in expand_case.  */
   unsigned int uniq;
@@ -606,8 +614,9 @@ collect_switch_conv_info (gswitch *swtch
   unsigned int branch_num = gimple_switch_num_labels (swtch);
   tree min_case, max_case;
   unsigned int count, i;
-  edge e, e_default;
+  edge e, e_default, e_first;
   edge_iterator ei;
+  basic_block first;
 
   memset (info, 0, sizeof (*info));
 
@@ -616,8 +625,8 @@ collect_switch_conv_info