On Mon, Jul 25, 2016 at 5:38 AM, Patrick Palka <patr...@parcs.ath.cx> wrote:
> On Fri, 22 Jul 2016, Patrick Palka wrote:
>
>> On Fri, 22 Jul 2016, Patrick Palka wrote:
>>
>> > On Fri, 22 Jul 2016, Patrick Palka wrote:
>> >
>> > > This patch teaches VRP to register along a default switch label
>> > > assertions that corresponds to the anti range of each case label.
>> > >
>> > > Does this look OK to commit after bootstrap + regtest on
>> > > x86_64-pc-linux-gnu?
>> >
>> > Forgot the changelog:
>> >
>> > gcc/ChangeLog:
>> >
>> >     PR tree-optimization/18046
>> >     * tree-vrp.c (find_switch_asserts): Register edge assertions
>> >     for the default label which correspond to the anti-ranges
>> >     of each non-case label.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >     PR tree-optimization/18046
>> >     * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Bump FSM count to 5.
>> >     * gcc.dg/tree-ssa/vrp103.c: New test.
>> >     * gcc.dg/tree-ssa/vrp104.c: New test.
>> >
>>
>> The patch failed to bootstrap due to a number -Warray-bounds warnings
>> getting emitted for the autogenerated header file insn-modes.h:
>>
>> In file included from /home/patrick/code/gcc/gcc/machmode.h:24:0,
>>                  from /home/patrick/code/gcc/gcc/coretypes.h:391,
>>                  from /home/patrick/code/gcc/gcc/lto-streamer-out.c:25:
>> ./insn-modes.h: In function ‘void produce_asm_for_decls()’:
>> ./insn-modes.h:638:36: warning: array subscript is outside array bounds 
>> [-Warray-bounds]
>>      default: return mode_inner[mode];
>>                      ~~~~~~~~~~~~~~~^
>>
>> These new warnings seem legitimate.  From what I can tell, this array
>> access along the default switch label will always be out of bounds.  The
>> code it's warning about basically looks like this:
>>
>>   enum A { a, b, c };
>>   int foo (A i)
>>   {
>>     int array[3];
>>     switch (i)
>>     {
>>       case a: return x;
>>       case b: return y;
>>       case c: return z;
>>       default: return array[i];
>>     }
>>   }
>>
>> and while the switch does exhaust every possible enumeration value of A,
>> there's nothing stopping the user from passing an arbitrary int to
>> foo() thus triggering the default case.  So this patch suppresses these
>> warnings by making genemit emit an assertion that verifies that mode is
>> within the bounds of the array.  This assertion won't affect performance
>> because the mode_*_inline functions are always called with constant
>> arguments.
>>
>> This version of the patch has the following changes:
>>
>> 1. Fixes the bootstrap failure as mentioned above.
>> 2. Checks if the switch operand is live along the default edge before
>> registering assertions.
>> 3. Combines contiguous case ranges to reduce the number of assert
>> expressions to insert.
>>
>> Bootstrap + regtesting in progress on x86_64-pc-linux-gnu.
>>
>> gcc/ChangeLog:
>>
>>       PR tree-optimization/18046
>>       * genmodes.c (emit_mode_size_inline): Emit an assert that
>>       verifies that mode is a valid array index.
>>       (emit_mode_nuinits_inline): Likewise.
>>       (emit_mode_inner_inline): Likewise.
>>       (emit_mode_unit_size_inline): Likewise.
>>       (emit_mode_unit_precision_inline): Likewise.
>>       * tree-vrp.c (compare_case_label_values): New qsort callback.
>>       (find_switch_asserts): Register edge assertions for
>>       the default label, which correspond to the anti-range of each
>>       non-case label.
>>
>> gcc/testsuite/ChangeLog:
>>
>>       PR tree-optimization/18046
>>       * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Bump FSM count to 5.
>>       * gcc.dg/tree-ssa/vrp103.c: New test.
>>       * gcc.dg/tree-ssa/vrp104.c: New test.
>
> Here's another version of the patch, which has the following changes:
>
> 1. Use wide-int arithmetic directly instead of tree arithmetic.
> 2. Don't bother re-sorting and re-using the ci vector.  Instead just
> inspect gimple_switch_label() directly since cases are already sorted
> there by CASE_LOW.
> 3. Add an insertion limit of 10 and a tunable param.
>
> Bootstrapped + regtested on x86_64-pc-linux-gnu.

Ok.

The only thing I wonder is whether VRP does a good job combining
non-adjacent anti-ranges or if the result is sth non-sensible.  ISTR
VRP simply chooses A when combining A and B which would mean
inserting asserts will only provide better ranges for more than one
asserts via equivalences (which might be good enough, of course).
I think the anti-range merge behavior is good but I for example wonder
about the behavior for ~[0, n] which will end up as [n, +INF] for
unsigned.  Also the combining limitation will make ranges derived
from the switch parameter less useful, like

  switch (x)
  {
....
  default:
     x++;
....

where x++ will only be derived from the merged anti-range.

All these are existing VRP range representation limitations of course.

Thanks,
Richard.

> gcc/ChangeLog:
>
>         PR tree-optimization/18046
>         * genmodes.c (emit_mode_size_inline): Emit an assert that
>         verifies that mode is a valid array index.
>         (emit_mode_nuinits_inline): Likewise.
>         (emit_mode_inner_inline): Likewise.
>         (emit_mode_unit_size_inline): Likewise.
>         (emit_mode_unit_precision_inline): Likewise.
>         * tree-vrp.c: Include params.h.
>         (find_switch_asserts): Register edge assertions for the default
>         label which correspond to the anti-ranges of each non-case
>         label.
>         * params.def (PARAM_MAX_VRP_SWITCH_ASSERTIONS): New.
>         * doc/invoke.texi: Document it.
>
> gcc/testsuite/ChangeLog:
>
>         PR tree-optimization/18046
>         * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Bump FSM count to 5.
>         * gcc.dg/tree-ssa/vrp103.c: New test.
>         * gcc.dg/tree-ssa/vrp104.c: New test.
> ---
>  gcc/doc/invoke.texi                              |  4 ++
>  gcc/genmodes.c                                   |  5 ++
>  gcc/params.def                                   |  6 +++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c |  2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/vrp103.c           | 30 ++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/vrp104.c           | 36 ++++++++++++++
>  gcc/tree-vrp.c                                   | 62 
> +++++++++++++++++++++++-
>  7 files changed, 142 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp103.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp104.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 9e0f07e..6eeecc4 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -9774,6 +9774,10 @@ enable it.
>  The maximum number of may-defs we analyze when looking for a must-def
>  specifying the dynamic type of an object that invokes a virtual call
>  we may be able to devirtualize speculatively.
> +
> +@item max-vrp-switch-assertions
> +The maximum number of assertions to add along the default edge of a switch
> +statement during VRP.  The default is 10.
>  @end table
>  @end table
>
> diff --git a/gcc/genmodes.c b/gcc/genmodes.c
> index 097cc80..1170d4f 100644
> --- a/gcc/genmodes.c
> +++ b/gcc/genmodes.c
> @@ -976,6 +976,7 @@ unsigned char\n\
>  mode_size_inline (machine_mode mode)\n\
>  {\n\
>    extern %sunsigned char mode_size[NUM_MACHINE_MODES];\n\
> +  gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>    switch (mode)\n\
>      {\n", adj_bytesize ? "" : "const ");
>
> @@ -1006,6 +1007,7 @@ unsigned char\n\
>  mode_nunits_inline (machine_mode mode)\n\
>  {\n\
>    extern const unsigned char mode_nunits[NUM_MACHINE_MODES];\n\
> +  gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>    switch (mode)\n\
>      {");
>
> @@ -1035,6 +1037,7 @@ unsigned char\n\
>  mode_inner_inline (machine_mode mode)\n\
>  {\n\
>    extern const unsigned char mode_inner[NUM_MACHINE_MODES];\n\
> +  gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>    switch (mode)\n\
>      {");
>
> @@ -1067,6 +1070,7 @@ mode_unit_size_inline (machine_mode mode)\n\
>  {\n\
>    extern CONST_MODE_UNIT_SIZE unsigned char 
> mode_unit_size[NUM_MACHINE_MODES];\
>  \n\
> +  gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>    switch (mode)\n\
>      {");
>
> @@ -1103,6 +1107,7 @@ unsigned short\n\
>  mode_unit_precision_inline (machine_mode mode)\n\
>  {\n\
>    extern const unsigned short mode_unit_precision[NUM_MACHINE_MODES];\n\
> +  gcc_assert (mode >= 0 && mode < NUM_MACHINE_MODES);\n\
>    switch (mode)\n\
>      {");
>
> diff --git a/gcc/params.def b/gcc/params.def
> index 166032e..79b7dd4 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -1246,6 +1246,12 @@ DEFPARAM (PARAM_MAX_SPECULATIVE_DEVIRT_MAYDEFS,
>           "Maximum number of may-defs visited when devirtualizing "
>           "speculatively", 50, 0, 0)
>
> +DEFPARAM (PARAM_MAX_VRP_SWITCH_ASSERTIONS,
> +         "max-vrp-switch-assertions",
> +         "Maximum number of assertions to add along the default "
> +         "edge of a switch statement during VRP",
> +         10, 0, 0)
> +
>  /*
>
>  Local variables:
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
> index 5ec4687..551fbac 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
> @@ -1,7 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fdump-tree-thread1-details 
> -fdump-tree-thread2-details" } */
>  /* { dg-final { scan-tree-dump-times "FSM" 3 "thread1" } } */
> -/* { dg-final { scan-tree-dump-times "FSM" 4 "thread2" } } */
> +/* { dg-final { scan-tree-dump-times "FSM" 5 "thread2" } } */
>
>  int sum0, sum1, sum2, sum3;
>  int foo (char *s, char **ret)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp103.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp103.c
> new file mode 100644
> index 0000000..eea98bb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp103.c
> @@ -0,0 +1,30 @@
> +/* PR tree-optimization/18046  */
> +/* { dg-options "-O2 -fdump-tree-vrp" }  */
> +/* { dg-final { scan-tree-dump-times "baz \\(0\\);" 4 "vrp1" } }  */
> +
> +void foo ();
> +void bar ();
> +void baz (int);
> +
> +void
> +test (int i)
> +{
> +  switch (i)
> +    {
> +    case 1:
> +    case 2:
> +    case 3:
> +      foo ();
> +      break;
> +    case 5:
> +      bar ();
> +      break;
> +    default:
> +      /* These tests should be folded to 0, resulting in 4 calls to bar (0). 
>  */
> +      baz (i == 1);
> +      baz (i == 2);
> +      baz (i == 3);
> +      baz (i == 5);
> +      break;
> +    }
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c
> new file mode 100644
> index 0000000..73dac36
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c
> @@ -0,0 +1,36 @@
> +/* PR tree-optimization/18046  */
> +/* { dg-options "-O2 -fdump-tree-optimized" }  */
> +/* { dg-final { scan-tree-dump-times "switch" 1 "optimized" } }  */
> +
> +void foo ();
> +void bar ();
> +void baz ();
> +
> +void
> +test (int i)
> +{
> +  switch (i)
> +    {
> +    case 1:
> +      foo ();
> +      break;
> +    case 2:
> +      bar ();
> +      break;
> +    default:
> +      break;
> +    }
> +
> +  /* This switch should be gone after threading/VRP.  */
> +  switch (i)
> +    {
> +    case 1:
> +      foo ();
> +      break;
> +    case 2:
> +      baz ();
> +      break;
> +    default:
> +      break;
> +    }
> +}
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index 1abc99a..77c3014 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "omp-low.h"
>  #include "target.h"
>  #include "case-cfn-macros.h"
> +#include "params.h"
>
>  /* Range of values that can be associated with an SSA_NAME after VRP
>     has executed.  */
> @@ -5919,6 +5920,7 @@ find_switch_asserts (basic_block bb, gswitch *last)
>        ci[idx].expr = gimple_switch_label (last, idx);
>        ci[idx].bb = label_to_block (CASE_LABEL (ci[idx].expr));
>      }
> +  edge default_edge = find_edge (bb, ci[0].bb);
>    qsort (ci, n, sizeof (struct case_info), compare_case_labels);
>
>    for (idx = 0; idx < n; ++idx)
> @@ -5947,8 +5949,8 @@ find_switch_asserts (basic_block bb, gswitch *last)
>             max = CASE_LOW (ci[idx].expr);
>         }
>
> -      /* Nothing to do if the range includes the default label until we
> -        can register anti-ranges.  */
> +      /* Can't extract a useful assertion out of a range that includes the
> +        default label.  */
>        if (min == NULL_TREE)
>         continue;
>
> @@ -5966,6 +5968,62 @@ find_switch_asserts (basic_block bb, gswitch *last)
>      }
>
>    XDELETEVEC (ci);
> +
> +  if (!live_on_edge (default_edge, op))
> +    return;
> +
> +  /* Now register along the default label assertions that correspond to the
> +     anti-range of each label.  */
> +  int insertion_limit = PARAM_VALUE (PARAM_MAX_VRP_SWITCH_ASSERTIONS);
> +  for (idx = 1; idx < n; idx++)
> +    {
> +      tree min, max;
> +      tree cl = gimple_switch_label (last, idx);
> +
> +      min = CASE_LOW (cl);
> +      max = CASE_HIGH (cl);
> +
> +      /* Combine contiguous case ranges to reduce the number of assertions
> +        to insert.  */
> +      for (idx = idx + 1; idx < n; idx++)
> +       {
> +         tree next_min, next_max;
> +         tree next_cl = gimple_switch_label (last, idx);
> +
> +         next_min = CASE_LOW (next_cl);
> +         next_max = CASE_HIGH (next_cl);
> +
> +         wide_int difference = wi::sub (next_min, max ? max : min);
> +         if (wi::eq_p (difference, 1))
> +           max = next_max ? next_max : next_min;
> +         else
> +           break;
> +       }
> +      idx--;
> +
> +      if (max == NULL_TREE)
> +       {
> +         /* Register the assertion OP != MIN.  */
> +         min = fold_convert (TREE_TYPE (op), min);
> +         register_edge_assert_for (op, default_edge, bsi, NE_EXPR, op, min);
> +       }
> +      else
> +       {
> +         /* Register the assertion (unsigned)OP - MIN > (MAX - MIN),
> +            which will give OP the anti-range ~[MIN,MAX].  */
> +         tree uop = fold_convert (unsigned_type_for (TREE_TYPE (op)), op);
> +         min = fold_convert (TREE_TYPE (uop), min);
> +         max = fold_convert (TREE_TYPE (uop), max);
> +
> +         tree lhs = fold_build2 (MINUS_EXPR, TREE_TYPE (uop), uop, min);
> +         tree rhs = int_const_binop (MINUS_EXPR, max, min);
> +         register_new_assert_for (op, lhs, GT_EXPR, rhs,
> +                                  NULL, default_edge, bsi);
> +       }
> +
> +      if (--insertion_limit == 0)
> +       break;
> +    }
>  }
>
>
> --
> 2.9.2.413.g76d2a70

Reply via email to