[PATCH] [v4][aarch64] Avoid tag collisions for falkor loads

2018-07-24 Thread Siddhesh Poyarekar
Hi,

This is a rewrite of the tag collision avoidance patch that Kugan had
written as a machine reorg pass back in February.

The falkor hardware prefetching system uses a combination of the
source, destination and offset to decide which prefetcher unit to
train with the load.  This is great when loads in a loop are
sequential, but sub-optimal if there are unrelated loads in a loop
that are tagged to the same prefetcher unit.

This pass attempts to rename the destination register of such colliding
loads using routines available in regrename.c so that their tags do
not collide.  This shows some performance gains with mcf and xalancbmk
(~5% each) and will be tweaked further.  The pass is placed near the
fag end of the pass list so that subsequent passes don't inadvertently
end up undoing the renames.

A full gcc bootstrap and testsuite ran successfully on aarch64, i.e. it
did not introduce any new regressions.  I also did a make-check with
-mcpu=falkor to ensure that there were no regressions.  The couple of
regressions I found were target-specific and were related to scheduling
and cost differences and are not correctness issues.

Changes from v3:
- Avoid renaming argument/return registers and registers that have a
  specific architectural meaning, i.e. stack pointer, frame pointer,
  etc.  Try renaming their aliases instead.

Changes from v2:
- Ignore SVE instead of asserting that falkor does not support SVE

Changes from v1:

- Fixed up issues pointed out by Kyrill
- Avoid renaming R0/V0 since they could be return values
- Fixed minor formatting issues.

2018-07-02  Siddhesh Poyarekar  
Kugan Vivekanandarajah  

* config/aarch64/falkor-tag-collision-avoidance.c: New file.
* config.gcc (extra_objs): Build it.
* config/aarch64/t-aarch64 (falkor-tag-collision-avoidance.o):
Likewise.
* config/aarch64/aarch64-passes.def
(pass_tag_collision_avoidance): New pass.
* config/aarch64/aarch64.c (qdf24xx_tunings): Add
AARCH64_EXTRA_TUNE_RENAME_LOAD_REGS to tuning_flags.
(aarch64_classify_address): Remove static qualifier.
(aarch64_address_info, aarch64_address_type): Move to...
* config/aarch64/aarch64-protos.h: ... here.
(make_pass_tag_collision_avoidance): New function.
* config/aarch64/aarch64-tuning-flags.def (rename_load_regs):
New tuning flag.

CC: james.greenha...@arm.com
CC: kyrylo.tkac...@foss.arm.com
---
 gcc/config.gcc|   2 +-
 gcc/config/aarch64/aarch64-passes.def |   1 +
 gcc/config/aarch64/aarch64-protos.h   |  49 +
 gcc/config/aarch64/aarch64-tuning-flags.def   |   2 +
 gcc/config/aarch64/aarch64.c  |  48 +-
 .../aarch64/falkor-tag-collision-avoidance.c  | 881 ++
 gcc/config/aarch64/t-aarch64  |   9 +
 7 files changed, 946 insertions(+), 46 deletions(-)
 create mode 100644 gcc/config/aarch64/falkor-tag-collision-avoidance.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 78e84c2b864..8f5e458e8a6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -304,7 +304,7 @@ aarch64*-*-*)
extra_headers="arm_fp16.h arm_neon.h arm_acle.h"
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
-   extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o"
+   extra_objs="aarch64-builtins.o aarch-common.o cortex-a57-fma-steering.o falkor-tag-collision-avoidance.o"
target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.c"
target_has_targetm_common=yes
;;
diff --git a/gcc/config/aarch64/aarch64-passes.def b/gcc/config/aarch64/aarch64-passes.def
index 87747b420b0..f61a8870aa1 100644
--- a/gcc/config/aarch64/aarch64-passes.def
+++ b/gcc/config/aarch64/aarch64-passes.def
@@ -19,3 +19,4 @@
.  */
 
 INSERT_PASS_AFTER (pass_regrename, 1, pass_fma_steering);
+INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance);
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index af5db9c5953..647ad7a9c37 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -288,6 +288,49 @@ struct tune_params
   const struct cpu_prefetch_tune *prefetch;
 };
 
+/* Classifies an address.
+
+   ADDRESS_REG_IMM
+   A simple base register plus immediate offset.
+
+   ADDRESS_REG_WB
+   A base register indexed by immediate offset with writeback.
+
+   ADDRESS_REG_REG
+   A base register indexed by (optionally scaled) register.
+
+   ADDRESS_REG_UXTW
+   A base register indexed by (optionally scaled) zero-extended register.
+
+   ADDRESS_REG_SXTW
+   A base register indexed by (optionally scaled) sign-extended register.
+
+   ADDRESS_LO_SUM
+   A LO_SUM rtx with a base register and "LO12" symbol relocation.
+
+   ADDRESS_SYMBOLIC:
+   A constant symbolic address, in pc-relative literal pool.  */
+
+enum 

[PATCH] Make strlen range computations more conservative

2018-07-24 Thread Bernd Edlinger
Hi!

This patch makes strlen range computations more conservative.

Firstly, if there is a visible type cast from type A to B before passing
the value to strlen, don't expect the type layout of B to restrict the
possible return value range of strlen.

Furthermore, use the outermost enclosing array instead of the
innermost one, because too-aggressive optimization will likely
convert harmless errors into security-relevant ones: as the existing
test cases demonstrate, this optimization actively defeats string
length checks in user code without giving any warnings.



Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.

gcc:
2018-07-24  Bernd Edlinger  

* gimple-fold.c (get_range_strlen): Add a check for type casts.
Use outermost enclosing array size instead of innermost one.
* tree-ssa-strlen.c (maybe_set_strlen_range): Likewise.

testsuite:
2018-07-24  Bernd Edlinger  

* gcc.dg/strlenopt-40.c: Adjust test expectations.
* gcc.dg/strlenopt-45.c: Likewise.
* gcc.dg/strlenopt-48.c: Likewise.
* gcc.dg/strlenopt-51.c: Likewise.
* gcc.dg/strlenopt-54.c: New test.
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c	(revision 262904)
+++ gcc/gimple-fold.c	(working copy)
@@ -1339,19 +1339,33 @@ get_range_strlen (tree arg, tree length[2], bitmap
 
 	  if (TREE_CODE (arg) == ARRAY_REF)
 	{
-	  tree type = TREE_TYPE (TREE_OPERAND (arg, 0));
+	  /* Avoid arrays of pointers.  */
+	  if (TREE_CODE (TREE_TYPE (arg)) == POINTER_TYPE)
+		return false;
 
-	  /* Determine the "innermost" array type.  */
-	  while (TREE_CODE (type) == ARRAY_TYPE
-		 && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE)
-		type = TREE_TYPE (type);
+	  /* Look for the outermost enclosing array.  */
+	  while (TREE_CODE (arg) == ARRAY_REF
+		 && TREE_CODE (TREE_TYPE (TREE_OPERAND (arg, 0)))
+			== ARRAY_TYPE)
+		arg = TREE_OPERAND (arg, 0);
 
-	  /* Avoid arrays of pointers.  */
-	  tree eltype = TREE_TYPE (type);
-	  if (TREE_CODE (type) != ARRAY_TYPE
-		  || !INTEGRAL_TYPE_P (eltype))
+	  tree base = arg;
+	  while (TREE_CODE (base) == ARRAY_REF
+		 || TREE_CODE (base) == ARRAY_RANGE_REF
+		 || TREE_CODE (base) == COMPONENT_REF)
+		base = TREE_OPERAND (base, 0);
+
+	  /* If this looks like a type cast don't assume anything.  */
+	  if ((TREE_CODE (base) == MEM_REF
+		   && (! integer_zerop (TREE_OPERAND (base, 1))
+		   || TREE_TYPE (TREE_TYPE (TREE_OPERAND (base, 0)))
+			  != TREE_TYPE (base)))
+		  || TREE_CODE (base) == VIEW_CONVERT_EXPR)
 		return false;
 
+	  tree type = TREE_TYPE (arg);
+
+	  /* Fail when the array bound is unknown or zero.  */
 	  val = TYPE_SIZE_UNIT (type);
 	  if (!val || integer_zerop (val))
 		return false;
@@ -1362,9 +1376,9 @@ get_range_strlen (tree arg, tree length[2], bitmap
 		 the array could have zero length.  */
 	  *minlen = ssize_int (0);
 
-	  if (TREE_CODE (TREE_OPERAND (arg, 0)) == COMPONENT_REF
-		  && type == TREE_TYPE (TREE_OPERAND (arg, 0))
-		  && array_at_struct_end_p (TREE_OPERAND (arg, 0)))
+	  if (TREE_CODE (arg) == COMPONENT_REF
+		  && type == TREE_TYPE (arg)
+		  && array_at_struct_end_p (arg))
 		*flexp = true;
 	}
 	  else if (TREE_CODE (arg) == COMPONENT_REF
@@ -1371,6 +1385,20 @@ get_range_strlen (tree arg, tree length[2], bitmap
 		   && (TREE_CODE (TREE_TYPE (TREE_OPERAND (arg, 1)))
 		   == ARRAY_TYPE))
 	{
+	  tree base = TREE_OPERAND (arg, 0);
+	  while (TREE_CODE (base) == ARRAY_REF
+		 || TREE_CODE (base) == ARRAY_RANGE_REF
+		 || TREE_CODE (base) == COMPONENT_REF)
+		base = TREE_OPERAND (base, 0);
+
+	  /* If this looks like a type cast don't assume anything.  */
+	  if ((TREE_CODE (base) == MEM_REF
+		   && (! integer_zerop (TREE_OPERAND (base, 1))
+		   || TREE_TYPE (TREE_TYPE (TREE_OPERAND (base, 0)))
+			  != TREE_TYPE (base)))
+		  || TREE_CODE (base) == VIEW_CONVERT_EXPR)
+		return false;
+
 	  /* Use the type of the member array to determine the upper
 		 bound on the length of the array.  This may be overly
 		 optimistic if the array itself isn't NUL-terminated and
@@ -1386,10 +1414,6 @@ get_range_strlen (tree arg, tree length[2], bitmap
 
 	  tree type = TREE_TYPE (arg);
 
-	  while (TREE_CODE (type) == ARRAY_TYPE
-		 && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE)
-		type = TREE_TYPE (type);
-
 	  /* Fail when the array bound is unknown or zero.  */
 	  val = TYPE_SIZE_UNIT (type);
 	  if (!val || integer_zerop (val))
Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 262904)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -1149,9 +1149,33 @@ maybe_set_strlen_range (tree lhs, tree src, tree b
 
   if (TREE_CODE (src) == ADDR_EXPR)
 {
+  src 

[PATCH] Introduce __builtin_expect_with_probability (PR target/83610).

2018-07-24 Thread Martin Liška
Hi.

This is an implementation of a new built-in that can be used for finer
tweaking of branch probability.  A micro-benchmark is attached as part of the PR.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2018-07-24  Martin Liska  

PR target/83610
* builtin-types.def (BT_FN_LONG_LONG_LONG_LONG): New type.
* builtins.c (expand_builtin_expect_with_probability):
New function.
(expand_builtin): Handle also BUILT_IN_EXPECT_WITH_PROBABILITY.
(build_builtin_expect_predicate): Likewise.
(fold_builtin_expect): Likewise.
(fold_builtin_2): Likewise.
(fold_builtin_3): Likewise.
* builtins.def (BUILT_IN_EXPECT_WITH_PROBABILITY): Define new
builtin.
* builtins.h (fold_builtin_expect): Add new argument
(probability).
* doc/extend.texi: Document the new builtin.
* doc/invoke.texi: Likewise.
* gimple-fold.c (gimple_fold_call): Pass new argument.
* ipa-fnsummary.c (find_foldable_builtin_expect):
Handle also BUILT_IN_EXPECT_WITH_PROBABILITY.
* predict.c (expr_expected_value): Add new out argument which
is probability.
(expr_expected_value_1): Likewise.
(tree_predict_by_opcode): Predict edge based on
provided probability.
(pass_strip_predict_hints::execute): Use newly added
DECL_BUILT_IN_P macro.
* predict.def (PRED_BUILTIN_EXPECT_WITH_PROBABILITY):
Define new predictor.
* tree.h (DECL_BUILT_IN_P): Define.

gcc/testsuite/ChangeLog:

2018-07-24  Martin Liska  

* gcc.dg/predict-16.c: New test.
---
 gcc/builtin-types.def |  2 +
 gcc/builtins.c| 65 ---
 gcc/builtins.def  |  1 +
 gcc/builtins.h|  2 +-
 gcc/doc/extend.texi   |  8 
 gcc/doc/invoke.texi   |  3 ++
 gcc/gimple-fold.c |  3 +-
 gcc/ipa-fnsummary.c   |  1 +
 gcc/predict.c | 61 ++---
 gcc/predict.def   |  5 +++
 gcc/testsuite/gcc.dg/predict-16.c | 13 +++
 gcc/tree.h|  6 +++
 12 files changed, 140 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/predict-16.c


diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index b01095c420f..6e87bcbbf1d 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -531,6 +531,8 @@ DEF_FUNCTION_TYPE_3 (BT_FN_ULONG_ULONG_ULONG_ULONG,
 		 BT_ULONG, BT_ULONG, BT_ULONG, BT_ULONG)
 DEF_FUNCTION_TYPE_3 (BT_FN_LONG_LONG_UINT_UINT,
 		 BT_LONG, BT_LONG, BT_UINT, BT_UINT)
+DEF_FUNCTION_TYPE_3 (BT_FN_LONG_LONG_LONG_LONG,
+		 BT_LONG, BT_LONG, BT_LONG, BT_LONG)
 DEF_FUNCTION_TYPE_3 (BT_FN_ULONG_ULONG_UINT_UINT,
 		 BT_ULONG, BT_ULONG, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_3 (BT_FN_STRING_CONST_STRING_CONST_STRING_INT,
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 539a6d17688..29d77d3d83b 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -148,6 +148,7 @@ static rtx expand_builtin_unop (machine_mode, tree, rtx, rtx, optab);
 static rtx expand_builtin_frame_address (tree, tree);
 static tree stabilize_va_list_loc (location_t, tree, int);
 static rtx expand_builtin_expect (tree, rtx);
+static rtx expand_builtin_expect_with_probability (tree, rtx);
 static tree fold_builtin_constant_p (tree);
 static tree fold_builtin_classify_type (tree);
 static tree fold_builtin_strlen (location_t, tree, tree);
@@ -5237,6 +5238,27 @@ expand_builtin_expect (tree exp, rtx target)
   return target;
 }
 
+/* Expand a call to __builtin_expect_with_probability.  We just return our
+   argument as the builtin_expect semantic should've been already executed by
+   tree branch prediction pass.  */
+
+static rtx
+expand_builtin_expect_with_probability (tree exp, rtx target)
+{
+  tree arg;
+
+  if (call_expr_nargs (exp) < 3)
+return const0_rtx;
+  arg = CALL_EXPR_ARG (exp, 0);
+
+  target = expand_expr (arg, target, VOIDmode, EXPAND_NORMAL);
+  /* When guessing was done, the hints should be already stripped away.  */
+  gcc_assert (!flag_guess_branch_prob
+	  || optimize == 0 || seen_error ());
+  return target;
+}
+
+
 /* Expand a call to __builtin_assume_aligned.  We just return our first
argument as the builtin_assume_aligned semantic should've been already
executed by CCP.  */
@@ -7494,6 +7516,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
   return expand_builtin_va_copy (exp);
 case BUILT_IN_EXPECT:
   return expand_builtin_expect (exp, target);
+case BUILT_IN_EXPECT_WITH_PROBABILITY:
+  return expand_builtin_expect_with_probability (exp, target);
 case BUILT_IN_ASSUME_ALIGNED:
   return expand_builtin_assume_aligned (exp, target);
 case BUILT_IN_PREFETCH:
@@ -8134,16 +8158,20 @@ fold_builtin_constant_p 

Re: [Patch-86512]: Subnormal float support in armv7(with -msoft-float) for intrinsics

2018-07-24 Thread Umesh Kalappa
Thank you all for the suggestions.  We tried running the GCC testsuite
and found that the fix introduces no regressions, and we also ran our
own regression base for conformance with no regressions.

Is it OK to commit with the below ChangeLog?
+++ libgcc/ChangeLog(working copy)
@@ -1,3 +1,9 @@
+2018-07-18  Umesh Kalappa 
+
+   PR libgcc/86512
+   * config/arm/ieee754-df.S: Don't normalise the denormal result.
+   * config/arm/ieee754-sf.S: Likewise.
+
+
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2018-07-18  Umesh Kalappa 
+
+   PR libgcc/86512
+   * gcc.target/arm/pr86512.c: New test.
+

On Mon, Jul 23, 2018 at 5:24 PM, Wilco Dijkstra  wrote:
> Umesh Kalappa wrote:
>
>> We tested on the SP and yes the problem persist on the SP too and
>> attached patch will fix the both SP and DP issues for the  denormal
>> resultant.
>
> The patch now looks correct to me (but I can't approve).
>
>> We bootstrapped the compiler ,look ok to us with minimal testing ,
>>
>> Any floating point test-suite to test for the attached patch ? any
>> recommendations or inputs  ?
>
> Running the GCC regression tests would be required since a bootstrap isn't
> useful for this kind of change. Assuming you use Linux, building and running
> GLIBC with the changed GCC would give additional test coverage as it tests
> all the math library functions.
>
> I don't know of any IEEE conformance testsuites in the GNU world, which is
> why I'm suggesting running some targeted and randomized tests. You could
> use the generic soft-float code in libgcc/soft-fp/adddf3.c to compare the 
> outputs.
>
>
 Index: libgcc/config/arm/ieee754-df.S
 ===
 --- libgcc/config/arm/ieee754-df.S   (revision 262850)
 +++ libgcc/config/arm/ieee754-df.S   (working copy)
 @@ -203,6 +203,7 @@
  #endif

  @ Determine how to normalize the result.
 +@ if result is denormal i.e (exp)=0,then don't normalise the result,
>
> Use a standard sentence here, eg. like:
>
> If exp is zero and the mantissa unnormalized, return a denormal.
>
> Wilco
>


pr86512.patch
Description: Binary data


[PATCH] Limit dump_flag enum values range (PR middle-end/86645).

2018-07-24 Thread Martin Liška
Hi.

That fixes many UBSAN issues that are caused by:

  {"all", dump_flags_t (~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_GRAPH
| TDF_STMTADDR | TDF_RHS_ONLY | TDF_NOUID
| TDF_ENUMERATE_LOCALS | TDF_SCEV | TDF_GIMPLE))},

That value goes outside the range given by:

  minv = TYPE_MIN_VALUE (TREE_TYPE (type));
  maxv = TYPE_MAX_VALUE (TREE_TYPE (type));

Thus I would like to limit the value of "all".

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
And UBSAN errors are gone.

Ready to be installed?
Martin


gcc/ChangeLog:

2018-07-23  Martin Liska  

PR middle-end/86645
* dumpfile.c: AND excluded values with TDF_ALL_VALUES.
* dumpfile.h (enum dump_flag): Define TDF_ALL_VALUES.
---
 gcc/dumpfile.c | 7 ---
 gcc/dumpfile.h | 5 -
 2 files changed, 8 insertions(+), 4 deletions(-)


diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 6c9920c6bd2..176c9b846d7 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -150,9 +150,10 @@ static const kv_pair dump_options[] =
   {"missed", MSG_MISSED_OPTIMIZATION},
   {"note", MSG_NOTE},
   {"optall", MSG_ALL},
-  {"all", dump_flags_t (~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_GRAPH
-			| TDF_STMTADDR | TDF_RHS_ONLY | TDF_NOUID
-			| TDF_ENUMERATE_LOCALS | TDF_SCEV | TDF_GIMPLE))},
+  {"all", dump_flags_t (TDF_ALL_VALUES
+			& ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_GRAPH
+			| TDF_STMTADDR | TDF_RHS_ONLY | TDF_NOUID
+			| TDF_ENUMERATE_LOCALS | TDF_SCEV | TDF_GIMPLE))},
   {NULL, TDF_NONE}
 };
 
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index ad14acdfc9a..1dbe3b85b7c 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -146,7 +146,10 @@ enum dump_flag
 	 | MSG_NOTE),
 
   /* Dumping for -fcompare-debug.  */
-  TDF_COMPARE_DEBUG = (1 << 25)
+  TDF_COMPARE_DEBUG = (1 << 25),
+
+  /* All values.  */
+  TDF_ALL_VALUES = (1 << 26) - 1
 };
 
 /* Dump flags type.  */



[PATCH] Fix expand_divmod (PR middle-end/86627)

2018-07-24 Thread Jakub Jelinek
Hi!

As the following testcase shows, expand_divmod stopped emitting int128
signed divisions by positive small (fitting into hwi) power of two constants
in my r242690 aka PR78416 fix, where I've added next to
EXACT_POWER_OF_2_OR_ZERO_P uses a check that either the bitsize is
smaller or equal to hwi, or the value is positive (because otherwise
the value is not a power of two, has say 65 bits set and 63 bits clear).
In this particular spot I've been changing:
else if (EXACT_POWER_OF_2_OR_ZERO_P (d)
+&& (size <= HOST_BITS_PER_WIDE_INT || d >= 0)
 && (rem_flag
 ? smod_pow2_cheap (speed, compute_mode)
 : sdiv_pow2_cheap (speed, compute_mode))
 /* We assume that cheap metric is true if the
optab has an expander for this mode.  */
 && ((optab_handler ((rem_flag ? smod_optab
  : sdiv_optab),
 compute_mode)
  != CODE_FOR_nothing)
 || (optab_handler (sdivmod_optab,
compute_mode)
 != CODE_FOR_nothing)))
  ;
-   else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d))
+   else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d)
+&& (size <= HOST_BITS_PER_WIDE_INT
+|| abs_d != (unsigned HOST_WIDE_INT) d))

The first change was correct, but I think I've failed to take into account
the large additional && there and that the positive power of two values of
d aren't really handled in the first else if, it is merely about not
optimizing it if division or modulo is fast, and the actual optimization
is only done in the second else if, where it handles both the cases of
d being a positive power of two, and the case where d is negative and is not
a power of two, but its negation abs_d is a positive power of two (handled
by doing additional negation afterwards).  The condition I've added allowed
for the > 64-bit bitsizes only the cases of negative d values where their
negation is a positive power of two (and disallowed the corner wrapping
case of abs_d == d).  This means with the above change we keep optimizing
signed int128 division by e.g. -2 or -0x400 into shifts, but
actually don't optimize division by 2 or 0x40.
Although d and abs_d are HOST_WIDE_INT and unsigned HOST_WIDE_INT, the
d >= 0 cases are always good even for int128: the higher bits are all zeros
and abs_d is the same as d.  For d < 0 and d != HOST_WIDE_INT_MIN it is also
ok, d is lots of sign bits followed by 63 arbitrary bits, but the absolute
value of that is still a number with the msb bit in hwi clear and in wider
precision all bits above it clear too.  So the only problematic case is
d equal to HOST_WIDE_INT_MIN, where we are dividing or doing modulo
by (signed __int128) 0x8000, and its negation
is still the same value when expressed in CONST_INT or HOST_WIDE_INT, but
the actual negated value should be (signed __int128) 0x8000ULL.

So, this patch punts for that single special case which we don't handle
properly (so it will be expanded as __divti3 likely), and allows again
the positive power of two d values.

Sorry for introducing this regression.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
release branches?

2018-07-24  Jakub Jelinek  

PR middle-end/86627
* expmed.c (expand_divmod): Punt if d == HOST_WIDE_INT_MIN
and size > HOST_BITS_PER_WIDE_INT.  For size > HOST_BITS_PER_WIDE_INT
and abs_d == d, do the power of two handling if profitable.

* gcc.target/i386/pr86627.c: New test.

--- gcc/expmed.c.jj 2018-07-16 23:24:29.0 +0200
+++ gcc/expmed.c2018-07-23 12:22:05.835272680 +0200
@@ -4480,6 +4480,11 @@ expand_divmod (int rem_flag, enum tree_c
HOST_WIDE_INT d = INTVAL (op1);
unsigned HOST_WIDE_INT abs_d;
 
+   /* Not prepared to handle division/remainder by
+  0x8000 etc.  */
+   if (d == HOST_WIDE_INT_MIN && size > HOST_BITS_PER_WIDE_INT)
+ break;
+
/* Since d might be INT_MIN, we have to cast to
   unsigned HOST_WIDE_INT before negating to avoid
   undefined signed overflow.  */
@@ -4522,9 +4527,7 @@ expand_divmod (int rem_flag, enum tree_c
 || (optab_handler (sdivmod_optab, int_mode)
 != CODE_FOR_nothing)))
  ;
-   else if (EXACT_POWER_OF_2_OR_ZERO_P (abs_d)
-&& (size <= HOST_BITS_PER_WIDE_INT
-|| abs_d != (unsigned HOST_WIDE_INT) 

[PATCH] gcov: Fix wrong usage of NAN in statistics (PR gcov-profile/86536).

2018-07-24 Thread Martin Liška
Hi.

We have situations where a branch can return more often than it is called (fork).
Thus I decided to rapidly simplify format_gcov and print ratios that are
provided. No extra values are handled now.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
If no objections, I'll install the patch in couple of days.

Martin

gcc/ChangeLog:

2018-07-20  Martin Liska  

PR gcov-profile/86536
* gcov.c (format_gcov): Use printf format %.*f directly
and do not handle special values.

gcc/testsuite/ChangeLog:

2018-07-20  Martin Liska  

PR gcov-profile/86536
* gcc.misc-tests/gcov-pr86536.c: New test.
---
 gcc/gcov.c  | 46 +
 gcc/testsuite/gcc.misc-tests/gcov-pr86536.c | 25 +++
 2 files changed, 35 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-pr86536.c


diff --git a/gcc/gcov.c b/gcc/gcov.c
index ad2de4d5b22..78a3e0e19e9 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -2203,50 +2203,24 @@ format_count (gcov_type count)
 }
 
 /* Format a GCOV_TYPE integer as either a percent ratio, or absolute
-   count.  If dp >= 0, format TOP/BOTTOM * 100 to DP decimal places.
-   If DP is zero, no decimal point is printed. Only print 100% when
-   TOP==BOTTOM and only print 0% when TOP=0.  If dp < 0, then simply
+   count.  If DECIMAL_PLACES >= 0, format TOP/BOTTOM * 100 to DECIMAL_PLACES.
+   If DECIMAL_PLACES is zero, no decimal point is printed. Only print 100% when
+   TOP==BOTTOM and only print 0% when TOP=0.  If DECIMAL_PLACES < 0, then simply
format TOP.  Return pointer to a static string.  */
 
 static char const *
-format_gcov (gcov_type top, gcov_type bottom, int dp)
+format_gcov (gcov_type top, gcov_type bottom, int decimal_places)
 {
   static char buffer[20];
 
-  /* Handle invalid values that would result in a misleading value.  */
-  if (bottom != 0 && top > bottom && dp >= 0)
+  if (decimal_places >= 0)
 {
-  sprintf (buffer, "NAN %%");
-  return buffer;
-}
+  float ratio = bottom ? 100.0f * top / bottom: 0;
 
-  if (dp >= 0)
-{
-  float ratio = bottom ? (float)top / bottom : 0;
-  int ix;
-  unsigned limit = 100;
-  unsigned percent;
-
-  for (ix = dp; ix--; )
-	limit *= 10;
-
-  percent = (unsigned) (ratio * limit + (float)0.5);
-  if (percent <= 0 && top)
-	percent = 1;
-  else if (percent >= limit && top != bottom)
-	percent = limit - 1;
-  ix = sprintf (buffer, "%.*u%%", dp + 1, percent);
-  if (dp)
-	{
-	  dp++;
-	  do
-	{
-	  buffer[ix+1] = buffer[ix];
-	  ix--;
-	}
-	  while (dp--);
-	  buffer[ix + 1] = '.';
-	}
+  /* Round up to 1% if there's a small non-zero value.  */
+  if (ratio > 0.0f && ratio < 0.5f && decimal_places == 0)
+	ratio = 1.0f;
+  sprintf (buffer, "%.*f%%", decimal_places, ratio);
 }
   else
 return format_count (top);
diff --git a/gcc/testsuite/gcc.misc-tests/gcov-pr86536.c b/gcc/testsuite/gcc.misc-tests/gcov-pr86536.c
new file mode 100644
index 000..48177735999
--- /dev/null
+++ b/gcc/testsuite/gcc.misc-tests/gcov-pr86536.c
@@ -0,0 +1,25 @@
+// PR gcov-profile/86536
+// { dg-options "-fprofile-arcs -ftest-coverage" }
+// { dg-do run { target native } }
+// { dg-require-fork "" }
+
+#include 
+#include 
+#include 
+#include 
+
+int
+main (void)
+{
+
+  int j = 22;		  /* count(1) */
+
+			  /* returns(200) */
+  fork ();		  /* count(1)  */
+			  /* returns(end) */
+
+  int i = 7;		  /* count(2) */
+  return 0;		  /* count(2) */
+}
+
+// { dg-final { run-gcov branches calls { -b gcov-pr86536.c } } }



Avoid &LOOP_VINFO_MASKS for bb vectorisation (PR 86618)

2018-07-24 Thread Richard Sandiford
r262589 introduced another instance of the bug fixed in r258131.

Tested on aarch64-linux-gnu and applied as obvious.

Richard


2018-07-24  Richard Sandiford  

gcc/
PR tree-optimization/86618
* tree-vect-stmts.c (vectorizable_call): Don't take the address
of LOOP_VINFO_MASKS (loop_vinfo) when loop_vinfo is null.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 19:03:04.0 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 19:03:23.521825230 +0100
@@ -3337,7 +3337,7 @@ vectorizable_call (gimple *gs, gimple_st
  needs to be generated.  */
   gcc_assert (ncopies >= 1);
 
-  vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
+  vec_loop_masks *masks = (loop_vinfo ? &LOOP_VINFO_MASKS (loop_vinfo) : NULL);
   if (!vec_stmt) /* transformation not required.  */
 {
   STMT_VINFO_TYPE (stmt_info) = call_vec_info_type;


Re: [RFC 2/3, debug] Add fkeep-vars-live

2018-07-24 Thread Alexandre Oliva
On Jul 24, 2018, Tom de Vries  wrote:

> This patch adds fake uses of user variables at the point where they go out of
> scope, to keep user variables inspectable throughout the application.

I suggest also adding such uses before sets, so that variables aren't
regarded as dead and get optimized out in ranges between the end of a
live range and a subsequent assignment.

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


[Patch] [Aarch64] PR 86538 - Define __ARM_FEATURE_LSE if LSE is available

2018-07-24 Thread Steve Ellcey
This is a patch for PR 86538, to define an __ARM_FEATURE_LSE macro
when LSE is available.  Richard Earnshaw closed PR 86538 as WONTFIX
because the ACLE (Arm C Language Extension) does not require this
macro and because he is concerned that it might encourage people to
use inline assembly instead of the __sync and atomic intrinsics.
(See actual comments in the defect report.)

While I agree that we want people to use the intrinsics, I still think
there are use cases where people may want to know if LSE is available
or not, and there is currently no (simple) way to determine whether this
feature is available, since it can be turned on and off independently of
the architecture used.  Also, as a general principle, I think any feature
that can be toggled on or off by the compiler should provide a way for
users to determine what its state is.

So what do other ARM maintainers and users think?  Is this a useful
feature to have in GCC?

Steve Ellcey
sell...@cavium.com


2018-07-24  Steve Ellcey  

PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins):
Add define of __ARM_FEATURE_LSE.


diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 40c738c..e057ba9 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -154,6 +154,9 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
   aarch64_def_or_undef (TARGET_SM4, "__ARM_FEATURE_SM4", pfile);
   aarch64_def_or_undef (TARGET_F16FML, "__ARM_FEATURE_FP16_FML", pfile);
 
+  /* This is not required by ACLE, but it is useful.  */
+  aarch64_def_or_undef (TARGET_LSE, "__ARM_FEATURE_LSE", pfile);
+
   /* Not for ACLE, but required to keep "float.h" correct if we switch
  target between implementations that do or do not support ARMv8.2-A
  16-bit floating-point extensions.  */


Re: [5/5] C-SKY port: libgcc

2018-07-24 Thread Segher Boessenkool
On Mon, Jul 23, 2018 at 10:26:35PM -0600, Sandra Loosemore wrote:
> diff --git a/libgcc/config.host b/libgcc/config.host
> index 18cabaf..b2ee0c9 100644
> --- a/libgcc/config.host
> +++ b/libgcc/config.host
> @@ -94,6 +94,9 @@ am33_2.0-*-linux*)
>  arc*-*-*)
>   cpu_type=arc
>   ;;
> +csky*-*-*)
> + cpu_type=csky
> + ;;
>  arm*-*-*)
>   cpu_type=arm
>   ;;

This long list was alphabetic before (except x86_64 and tic6x, alas);
let's not make things worse?


Segher


Re: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-24 Thread Alexandre Oliva
Hello, Christina,

On Jul 24, 2018, Tamar Christina  wrote:

> gcc/
> 2018-07-24  Tamar Christina  

>   PR target/86486
>   * configure.ac: Add stack-clash-protection-guard-size.
>   * doc/install.texi: Document it.
>   * config.in (DEFAULT_STK_CLASH_GUARD_SIZE): New.
>   * params.def: Update comment for guard-size.
>   * configure: Regenerate.

The configury bits look almost good to me.

I wish the help message, comments and docs expressed somehow that the
given power of two expresses a size in bytes, rather than in kilobytes,
bits or any other unit that might be reasonably assumed to express stack
sizes.  I'm afraid I don't know the best way to accomplish that in a few
words.

> +stk_clash_default=12

This seems to be left-over from an earlier patch, as it is now unused
AFAICT.

Thanks,

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist


Fix ceil_log2(0) (PR 86644)

2018-07-24 Thread Richard Sandiford
This PR shows a pathological case in which we try SLP vectorisation on
dead code.  We record that 0 bits of the result are enough to satisfy
all users (which is true), and that led to precision being 0 in:

static unsigned int
vect_element_precision (unsigned int precision)
{
  precision = 1 << ceil_log2 (precision);
  return MAX (precision, BITS_PER_UNIT);
}

ceil_log2 (0) returned 64 rather than 0, leading to 1 << 64, which is UB.

Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
OK to install?

Richard


2018-07-24  Richard Sandiford  

gcc/
* hwint.c (ceil_log2): Fix comment.  Return 0 for 0.

Index: gcc/hwint.c
===
--- gcc/hwint.c 2018-05-02 08:38:14.433364094 +0100
+++ gcc/hwint.c 2018-07-24 19:09:03.522774662 +0100
@@ -60,12 +60,12 @@ floor_log2 (unsigned HOST_WIDE_INT x)
   return t;
 }
 
-/* Given X, an unsigned number, return the largest Y such that 2**Y >= X.  */
+/* Given X, an unsigned number, return the least Y such that 2**Y >= X.  */
 
 int
 ceil_log2 (unsigned HOST_WIDE_INT x)
 {
-  return floor_log2 (x - 1) + 1;
+  return x == 0 ? 0 : floor_log2 (x - 1) + 1;
 }
 
 /* Return the logarithm of X, base 2, considering X unsigned,


Re: [PATCH] Introduce instance discriminators

2018-07-24 Thread Alexandre Oliva
On Jul 19, 2018, Richard Biener  wrote:

> Oh, that probably wasn't omitted on purpose.  Cary said it was used
> for profiling but I can't see any such use.

> Is the instance discriminator stuff also used for profiling?

Not that I know, but...  I probably wouldn't know yet ;-)

Anyway, it was easy enough to implement this:

>> I suspect there might be a way to assign instance discriminator numbers
>> to individual function DECLs, and then walk up the lexical block tree to
>> identify the DECL containing the block so as to obtain the discriminator
>> number.

and then, in a subsequent patch, I went ahead and added support for LTO,
saving and recovering discriminator info for instances and, while at
that, for basic blocks too.

Besides successfully regstrapping the first two patches on
x86_64-linux-gnu, I have tested this patchset with an additional
bootstrap with the third (throw-away) patch, which adds -gnateS to Ada
compilations in gcc/ada, libada and gnattools.  I also tested the saving
and restoring of discriminators for LTO by manually inspecting the line
number tables in LTO-recompiled executables, to check that they retained
the instance or BB discriminator numbers that went into the non-LTO
object files.

Ok to install the first two patches?  (the third is just for reference)


Introduce instance discriminators

From: Alexandre Oliva 

With -gnateS, the Ada compiler sets itself up to output discriminators
for different instantiations of generics, but the middle and back ends
have lacked support for that.  This patch introduces the missing bits,
translating the GNAT-internal representation of the per-file instance
map to an instance_table that maps decls to instance discriminators.


From: Alexandre Oliva  , Olivier Hainque  

for  gcc/ChangeLog

* debug.h (decl_to_instance_map_t): New type.
(decl_to_instance_map): Declare.
(maybe_create_decl_to_instance_map): New inline function.
* final.c (bb_discriminator, last_bb_discriminator): New statics,
to track basic block discriminators.
(final_start_function_1): Initialize them.
(final_scan_insn_1): On NOTE_INSN_BASIC_BLOCK, track
bb_discriminator.
(decl_to_instance_map): New variable.
(map_decl_to_instance, maybe_set_discriminator): New functions.
(notice_source_line): Set discriminator.

for  gcc/ada

* trans.c: Include debug.h.
(file_map): New static variable.
(gigi): Set it.  Create decl_to_instance_map when needed.
(Subprogram_Body_to_gnu): Pass gnu_subprog_decl to...
(Sloc_to_locus): ... this.  Add decl parm, map it to instance.
* gigi.h (Sloc_to_locus): Adjust declaration.

for  gcc/testsuite/ChangeLog

* gnat.dg/dinst.adb: New.
* gnat.dg/dinst_pkg.ads, gnat.dg/dinst_pkg.adb: New.
---
 gcc/ada/gcc-interface/gigi.h|2 +
 gcc/ada/gcc-interface/trans.c   |   29 ---
 gcc/debug.h |   15 
 gcc/final.c |   70 +--
 gcc/testsuite/gnat.dg/dinst.adb |   20 ++
 gcc/testsuite/gnat.dg/dinst_pkg.adb |7 
 gcc/testsuite/gnat.dg/dinst_pkg.ads |4 ++
 7 files changed, 137 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/dinst.adb
 create mode 100644 gcc/testsuite/gnat.dg/dinst_pkg.adb
 create mode 100644 gcc/testsuite/gnat.dg/dinst_pkg.ads

diff --git a/gcc/ada/gcc-interface/gigi.h b/gcc/ada/gcc-interface/gigi.h
index a75cb9094491..b890195cefc3 100644
--- a/gcc/ada/gcc-interface/gigi.h
+++ b/gcc/ada/gcc-interface/gigi.h
@@ -285,7 +285,7 @@ extern void process_type (Entity_Id gnat_entity);
location and false if it doesn't.  If CLEAR_COLUMN is true, set the column
information to 0.  */
 extern bool Sloc_to_locus (Source_Ptr Sloc, location_t *locus,
-  bool clear_column = false);
+  bool clear_column = false, const_tree decl = 0);
 
 /* Post an error message.  MSG is the error message, properly annotated.
NODE is the node at which to post the error and the node to use for the
diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index 31e098a0c707..0371d00fce18 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -41,6 +41,7 @@
 #include "stmt.h"
 #include "varasm.h"
 #include "output.h"
+#include "debug.h"
 #include "libfuncs.h"  /* For set_stack_check_libfunc.  */
 #include "tree-iterator.h"
 #include "gimplify.h"
@@ -255,6 +256,12 @@ static tree create_init_temporary (const char *, tree, 
tree *, Node_Id);
 static const char *extract_encoding (const char *) ATTRIBUTE_UNUSED;
 static const char *decode_name (const char *) ATTRIBUTE_UNUSED;
 
+/* This makes gigi's file_info_ptr visible in this translation unit,
+   so that Sloc_to_locus can look it up when deciding whether to map
+   decls to instances.  */
+
+static struct File_Info_Type *file_map;
+

Re: [RFC 1/3, debug] Add fdebug-nops

2018-07-24 Thread Alexandre Oliva
On Jul 24, 2018, Tom de Vries  wrote:

> There's a design principle in GCC that code generation and debug generation
> are independent.  This guarantees that if you're encountering a problem in an
> application without debug info, you can recompile it with -g and be certain
> that you can reproduce the same problem, and use the debug info to debug the
> problem.  This invariant is enforced by bootstrap-debug.  The fdebug-nops
> breaks this invariant

I thought of a way to not break it: enable the debug info generation
machinery, including VTA and SFN, but discard those only at the very end
if -g is not enabled.  The downside is that it would likely slow -Og
down significantly, but who uses it without -g anyway?

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain Engineer   Free Software Evangelist


Re: committed: remove redundant -Wall from -Warray-bounds (PR 82063)

2018-07-24 Thread Franz Sirl

On 2018-07-24 at 17:35, Martin Sebor wrote:

On 07/24/2018 03:24 AM, Franz Sirl wrote:

On 2018-07-20 at 23:22, Martin Sebor wrote:

As the last observation in PR 82063 Jim points out that

   Both -Warray-bounds and -Warray-bounds= are listed in the c.opt
   file as being enabled by -Wall, but they are the same option,
   and it causes this one option to be processed twice in the
   C_handle_option_auto function in the generated options.c file.
   It gets set to the same value twice, so it does work as intended,
   but this is wasteful.

I have removed the redundant -Wall from the first option and
committed the change as obvious in r262912.


Hi Martin,

this looks related to PR 68845 and my patch in there. I never posted it
to gcc-patches because I couldn't find a definitive answer on how
options duplicated between common.opt and c-family/c.opt are supposed to
be handled.
For example, Warray-bounds in common.opt is a separate option (not an
alias to Warray-bounds=), leading to separate enums for them. Is this
intended? Warray-bounds seemed to be the only option with an equal sign
doing it like that at that time. Now Wcast-align is doing the same...

Can you shed some light on this?


-Warray-bounds= (the form that takes an argument) was added in
r219577.  Before then, only the plain form existed.  If I had
to guess, the interplay between the two options (as opposed to
making the latter an alias for the new option) wasn't considered.
I didn't think of it until now either.  Your patch seems like
the right solution to me.  Let me know if you will submit it.
If not, I posted the patch below that touches this area and
that will likely need updating so I can roll your change into
it:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01286.html


I'll post a patch tomorrow, since I already have all the changes 
available and tested here.


Note that one minor change with this patch is that with 
-fdiagnostics-show-option the message will show -Warray-bounds= (equal 
sign added) instead of -Warray-bounds.


Franz


Re: committed: remove redundant -Wall from -Warray-bounds (PR 82063)

2018-07-24 Thread Martin Sebor

On 07/24/2018 01:48 PM, Franz Sirl wrote:

On 2018-07-24 at 17:35, Martin Sebor wrote:

On 07/24/2018 03:24 AM, Franz Sirl wrote:

On 2018-07-20 at 23:22, Martin Sebor wrote:

As the last observation in PR 82063 Jim points out that

   Both -Warray-bounds and -Warray-bounds= are listed in the c.opt
   file as being enabled by -Wall, but they are the same option,
   and it causes this one option to be processed twice in the
   C_handle_option_auto function in the generated options.c file.
   It gets set to the same value twice, so it does work as intended,
   but this is wasteful.

I have removed the redundant -Wall from the first option and
committed the change as obvious in r262912.


Hi Martin,

this looks related to PR 68845 and my patch in there. I never posted it
to gcc-patches because I couldn't find a definitive answer on how
options duplicated between common.opt and c-family/c.opt are supposed to
be handled.
For example, Warray-bounds in common.opt is a separate option (not an
alias to Warray-bounds=), leading to separate enums for them. Is this
intended? Warray-bounds seemed to be the only option with an equal sign
doing it like that at that time. Now Wcast-align is doing the same...

Can you shed some light on this?


-Warray-bounds= (the form that takes an argument) was added in
r219577.  Before then, only the plain form existed.  If I had
to guess, the interplay between the two options (as opposed to
making the latter an alias for the new option) wasn't considered.
I didn't think of it until now either.  Your patch seems like
the right solution to me.  Let me know if you will submit it.
If not, I posted the patch below that touches this area and
that will likely need updating so I can roll your change into
it:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01286.html


I'll post a patch tomorrow, since I already have all the changes
available and tested here.

Note that one minor change with this patch is that with
-fdiagnostics-show-option the message will show -Warray-bounds= (equal
sign added) instead of -Warray-bounds.


That should be fine.

Martin



Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-24 Thread Martin Sebor

On 07/20/2018 04:20 AM, Richard Biener wrote:

On Thu, 19 Jul 2018, Martin Sebor wrote:


Here's one more update with tweaks addressing a couple more
of Bernd's comments:

1) correct the use of TREE_STRING_LENGTH() where a number of
array elements is expected and not bytes
2) set CHARTYPE as soon as it's first determined rather than
trying to extract it again later


Please look at Bernds followup comments.  One additional note:

I see you are ultimatively using CHARTYPE to get at the size
of the access.  That is wrong.

   if (TREE_CODE (arg) == ADDR_EXPR)
 {
+  tree argtype = TREE_TYPE (arg);
+  chartype = argtype;
+
   arg = TREE_OPERAND (arg, 0);
   tree ref = arg;
   if (TREE_CODE (arg) == ARRAY_REF)
{

so the "access" is of size array_ref_element_size (arg) here.  You
may not simply use TYPE_SIZE_UNIT of sth.

That is, lookign at current trunk,

  if (TREE_CODE (arg) == ADDR_EXPR)
{
  arg = TREE_OPERAND (arg, 0);
  tree ref = arg;
  if (TREE_CODE (arg) == ARRAY_REF)
{
  tree idx = TREE_OPERAND (arg, 1);
  if (TREE_CODE (idx) != INTEGER_CST)
{
  /* Extract the variable index to prevent
 get_addr_base_and_unit_offset() from failing due to
 it.  Use it later to compute the non-constant offset
 into the string and return it to the caller.  */
  varidx = idx;
  ref = TREE_OPERAND (arg, 0);

you should scale the index here by array_ref_element_size (arg).
Or simply rewrite this to instead of using get_addr_base_and_unit_offset,
use get_inner_reference which does all that magic for you.

That is, you shouldn't need chartype.


I've made use of size array_ref_element_size() here as you suggest
and eliminated the type.  For the purposes of testing though,
I haven't been able to come up with a test case that would have
the function return something other than TYPE_SIZE_UNIT().  IIUC,
the function is used to compute the size of elements of overaligned
types and there is no way that I know of to create an array of
overaligned characters.  (If there is a way to exercise this I'd
appreciate a test case so I can add it to the test suite).

I've also fixed the other bug Bernd pointed with pointers to arrays.
The fix seems small enough that it makes sense to handle at the same
time as this bug.

Attached is an update with these changes.

Martin



Richard.



On 07/19/2018 01:49 PM, Martin Sebor wrote:

On 07/19/2018 01:17 AM, Richard Biener wrote:

On Wed, 18 Jul 2018, Martin Sebor wrote:


+  while (TREE_CODE (chartype) != INTEGER_TYPE)
+chartype = TREE_TYPE (chartype);

This is a bit concerning.  First under what conditions is chartype
not
going to be an INTEGER_TYPE?  And under what conditions will
extracting
its type ultimately lead to something that is an INTEGER_TYPE?


chartype is usually (maybe even always) pointer type here:

  const char a[] = "123";
  extern int i;
  n = strlen (&a[i]);


But your hunch was correct that the loop isn't safe because
the element type need not be an integer (I didn't know/forgot
that the function is called for non-strings too).  The loop
should be replaced by:

  while (TREE_CODE (chartype) == ARRAY_TYPE
 || TREE_CODE (chartype) == POINTER_TYPE)
chartype = TREE_TYPE (chartype);


As this function may be called "late" you need to cope with
the middle-end ignoring type changes and thus happily
passing int *** directly rather than (char *) of that.

Also doesn't the above yield int for int *[]?


I don't think it ever gets this far for either a pointer to
an array of int, or for an array of pointers to int.  So for
something like the following the function fails earlier:

  const int* const a[2] = { ... };
  const char* (const *p)[2] = &a;

  int f (void)
  {
return __builtin_memcmp (*p, "12345678", 8);
  }

(Assuming this is what you were asking about.)


I guess you really want

   if (POINTER_TYPE_P (chartype))
 chartype = TREE_TYPE (chartype);
   while (TREE_CODE (chartype) == ARRAY_TYPE)
 chartype = TREE_TYPE (chartype);

?


That seems to work too.  Attached is an update with this tweak.
The update also addresses some of Bernd's comments: it removes
the pointless second test in:

if (TREE_CODE (type) == ARRAY_TYPE
&& TREE_CODE (type) != INTEGER_TYPE)

the unused assignment to chartype in:

   else if (DECL_P (arg))
 {
   array = arg;
   chartype = TREE_TYPE (arg);
 }

and calls string_constant() instead of strnlen() to compute
the length of a generic string.

Other improvements  are possible in this area but they are
orthogonal to the bug I'm trying to fix so I'll post separate
patches for some of those.

Martin







PR tree-optimization/86622 - incorrect strlen of array of array plus variable offset
PR tree-optimization/86532 - Wrong code due to a wrong strlen folding starting with r262522

gcc/ChangeLog:

	PR tree-optimization/86622
	PR 

Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Sandra Loosemore

On 07/24/2018 09:45 AM, Jeff Law wrote:

On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

2018-07-23  Jojo  
     Huibin Wang  
     Sandra Loosemore  
     Chung-Lin Tang  

     C-SKY port: Backend implementation

     gcc/
     * config/csky/*: New.
     * common/config/csky/*: New.


Let's avoid gratuitous whitespace that attempts to line up conditionals.
   As an example, look at the predicate csky_load_multiple_operation.  I
think just doing a quick pass over the .c, .h and main .md files should
be sufficient here.


OK, will do.


I'm not a big fan of more awk code, but I'm not going to object to it :-)

Why does the port have its own little pass for condition code
optimization (cse_cc)?  What is it doing that can't be done with our
generic optimizers?


This pass was included in the initial patch set we got from C-SKY, and 
as it didn't seem to break anything I left it in.  Perhaps C-SKY can 
provide a testcase that demonstrates why it's still useful in the 
current version of GCC; otherwise we can remove this from the initial 
port submission and restore it later if some performance analysis shows 
it is still worthwhile.



Any thoughts on using the newer function descriptor bits rather than old
style stack trampolines?


Has that been committed?  I vaguely remembered discussion of a new way 
to handle nested functions without using the trampoline interface, but I 
couldn't find any documentation in the internals manual.



I don't see anything terribly concerning in the core of the port.  The
amount of support code for minipool is huge and I wonder if some sharing
across the various ports would be possible, but I don't think that
should be a blocking issue for this port.


Yes, that code was clearly copied almost verbatim from the ARM backend. 
I left it alone as much as possible to simplify any future attempts at 
genericizing it.



Can you update the backends.html web page here appropriately for the
c-sky target?


Sure, I can take care of updating that when the port is committed.  I 
believe the right entry is


"csky  b   ia"


I'd like to take a closer look, but those are the high-level comments
I've got this morning :-)


Thanks.  I'll wait a bit for more comments to come in before preparing a 
revised patch.


-Sandra


Re: [5/5] C-SKY port: libgcc

2018-07-24 Thread Sandra Loosemore

On 07/24/2018 12:10 PM, Segher Boessenkool wrote:

On Mon, Jul 23, 2018 at 10:26:35PM -0600, Sandra Loosemore wrote:

diff --git a/libgcc/config.host b/libgcc/config.host
index 18cabaf..b2ee0c9 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -94,6 +94,9 @@ am33_2.0-*-linux*)
  arc*-*-*)
cpu_type=arc
;;
+csky*-*-*)
+   cpu_type=csky
+   ;;
  arm*-*-*)
cpu_type=arm
;;


This long list was alphabetic before (except x86_64 and tic6x, alas);
let's not make things worse?


Oops!  Good catch on that.  I'll take care of it.

-Sandra


Re: [PATCH] include more detail in -Warray-bounds (PR 86650)

2018-07-24 Thread Martin Sebor

On 07/24/2018 11:05 AM, David Malcolm wrote:

On Mon, 2018-07-23 at 20:56 -0600, Martin Sebor wrote:

On 07/23/2018 07:20 PM, David Malcolm wrote:

On Mon, 2018-07-23 at 17:49 -0600, Martin Sebor wrote:

(David, I'm hoping for your help here.  Please see the end.)

While looking into a recent -Warray-bounds instance in Glibc
involving inlining of large functions it became apparent that
GCC could do a better job of pinpointing the source of
the problem.

The attached patch makes a few adjustments to both
the pretty printer infrastructure and to VRP to make this
possible.  The diagnostic pretty printer already has directives
to print the inlining context for both tree and gcall* arguments,
so most of the changes just adjust things to be able to pass in
a gimple* argument instead.

The only slightly interesting change is to print the declaration
to which the out-of-bounds array refers if one is known.

Tested on x86_64-linux with one regression.

The regression is in the gcc.dg/Warray-bounds.c test: the column
numbers of the warnings are off.  Adding the %G specifier to
the array bounds warnings in VRP has the unexpected effect of
expanding the extent of the underlining.  For instance, for a test
case like this:

   int a[10];

   void f (void)
   {
 a[-1] = 0;
   }

from the expected:

a[-1] = 0;
~^~~~

to this:

   a[-1] = 0;
~~^~~

David, do you have any idea how to avoid this?


Are you referring to the the various places in your patch (in e.g.
  vrp_prop::check_array_ref
  vrp_prop::check_mem_ref
  vrp_prop::search_for_addr_array
) where the patch changed things from this form:

  warning_at (location, OPT_Warray_bounds,
  "[...format string...]", ARGS...);

to this form:

  warning_at (location, OPT_Warray_bounds,
  "%G[...format string...]", stmt, ARGS...);


Yes.



If so, there are two location_t values of interest here:
(a) the "location" value, and
(b) gimple_location (stmt)

My recollection is that %G and %K override the "location" value
passed
in as the first param to the diagnostic call, overwriting it within
the
diagnostic_info's text_info with the location value from the %K/%G
(which also set up the pp_ti_abstract_origin of the text_info from
the
block information stashed in the ad-hoc data part of the location,
so
that the pretty-printer prints the inlining chain).


Would having the pretty printer restore the location and
the block after it's done printing the context and before
processing the rest of the format string fix it?  (I have
only a vague idea how this all works so I'm not sure if
this even makes sense.)


Structurally, it looks like this:

Temporaries during the emission of   |  Long-lived stuff:
the diagnostic:  |
 |+-+
++   ||global_dc|
|diagnostic_info |   |+-+
|++  |   |
||text_info:  |  |   |
||  m_richloc-+--+---> rich_location |
||  x_data+--+---+--> block (via pp_ti_abstract_origin)
|++  |   |
++   |
 |

The location_t of the diagnostic is stored in the rich_location.

Calling:
  warning_at (location)
creates a rich_location wrapping "location" and uses it as above.

During formatting, the %K/%G codes set text_info.x_data via
pp_ti_abstract_origin and overwrite the location_t in the
rich_location.

So in theory we could have a format code that sets the block and
doesn't touch the rich_location.  But that seems like overkill to me.


I wasn't thinking of a new format.  Rather, I thought the %K
would save the current block and location (set by the location
argument to warning_at), then after printing the inlining stack
but before printing the rest of the diagnostic the printer would
restore the saved block and location.  I still don't know enough
to tell if it would work.

In any event, if it's easier to always print the inlining stack
and get rid of %K and %G then that would be preferable.  I don't
think they are used for any other purpose (i.e., they are always
used as the first directive in a format string).


[aside, why don't we always just print the inlining chain?  IIRC,
%K
and %G feel too much like having to jump through hoops to me, given
that gimple_block is looking at gimple_location anyway, why not
just
use the location in the location_t's ad-hoc data; I have a feeling
there's a PR open about this, but I don't have it to hand right
now].


That would make sense to me.  I think that's also what we
agreed would be the way forward the last time we discussed
this.


(nods)


So how do we go about making this happen?  Somewhat selfishly
I was sort of waiting for you to take the lead on it since
you're much more familiar with the code than I am :)  But
that doesn't mean I can't try to tackle it myself.  If it seems
like something you can fit 

Re: [5/5] C-SKY port: libgcc

2018-07-24 Thread Segher Boessenkool
On Tue, Jul 24, 2018 at 12:19:30PM -0600, Sandra Loosemore wrote:
> On 07/24/2018 12:10 PM, Segher Boessenkool wrote:
> >On Mon, Jul 23, 2018 at 10:26:35PM -0600, Sandra Loosemore wrote:
> >>diff --git a/libgcc/config.host b/libgcc/config.host
> >>index 18cabaf..b2ee0c9 100644
> >>--- a/libgcc/config.host
> >>+++ b/libgcc/config.host
> >>@@ -94,6 +94,9 @@ am33_2.0-*-linux*)
> >>  arc*-*-*)
> >>cpu_type=arc
> >>;;
> >>+csky*-*-*)
> >>+   cpu_type=csky
> >>+   ;;
> >>  arm*-*-*)
> >>cpu_type=arm
> >>;;
> >
> >This long list was alphabetic before (except x86_64 and tic6x, alas);
> >let's not make things worse?
> 
> Oops!  Good catch on that.  I'll take care of it.

Thanks!

Rest looks fine fwiw (I just skimmed it, and I cannot read .gz files
without some effort, so take it for what it's worth: not too much ;-) ).


Segher


[PATCH] Fix a missing case of PR 21458 similar to fc6141f097056f830a412afebed8d81a9d72b696.

2018-07-24 Thread Robert Schiele
The original fix for PR 21458 was causing some issues, which were
addressed to be fixed with a follow-up fix
fc6141f097056f830a412afebed8d81a9d72b696.  Unfortunately that follow-up
fix missed one case, which is handled by this fix.

Change-Id: Ie32e3f2514b3e4b6b35c0a693de6b65ef010bb9d
---
 gas/config/tc-arm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
index feb725d..c92b6ef 100644
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -10836,11 +10836,12 @@ do_t_adr (void)
   inst.instruction |= Rd << 4;
 }
 
-  if (inst.reloc.exp.X_op == O_symbol
+  if (support_interwork
+  && inst.reloc.exp.X_op == O_symbol
   && inst.reloc.exp.X_add_symbol != NULL
   && S_IS_DEFINED (inst.reloc.exp.X_add_symbol)
   && THUMB_IS_FUNC (inst.reloc.exp.X_add_symbol))
-inst.reloc.exp.X_add_number += 1;
+inst.reloc.exp.X_add_number |= 1;
 }
 
 /* Arithmetic instructions for which there is just one 16-bit
-- 
2.4.6


[01/46] Move special cases out of get_initial_def_for_reduction

2018-07-24 Thread Richard Sandiford
This minor clean-up avoids repeating the test for double reductions
and also moves the vect_get_vec_def_for_operand call to the same
function as the corresponding vect_get_vec_def_for_stmt_copy.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (get_initial_def_for_reduction): Move special
cases for nested loops from here to ...
(vect_create_epilog_for_reduction): ...here.  Only call
vect_is_simple_use for inner-loop reductions.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-13 10:11:14.429843575 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:02.965552667 +0100
@@ -4113,10 +4113,8 @@ get_initial_def_for_reduction (gimple *s
   enum tree_code code = gimple_assign_rhs_code (stmt);
   tree def_for_init;
   tree init_def;
-  bool nested_in_vect_loop = false;
   REAL_VALUE_TYPE real_init_val = dconst0;
   int int_init_val = 0;
-  gimple *def_stmt = NULL;
   gimple_seq stmts = NULL;
 
   gcc_assert (vectype);
@@ -4124,39 +4122,12 @@ get_initial_def_for_reduction (gimple *s
   gcc_assert (POINTER_TYPE_P (scalar_type) || INTEGRAL_TYPE_P (scalar_type)
  || SCALAR_FLOAT_TYPE_P (scalar_type));
 
-  if (nested_in_vect_loop_p (loop, stmt))
-nested_in_vect_loop = true;
-  else
-gcc_assert (loop == (gimple_bb (stmt))->loop_father);
-
-  /* In case of double reduction we only create a vector variable to be put
- in the reduction phi node.  The actual statement creation is done in
- vect_create_epilog_for_reduction.  */
-  if (adjustment_def && nested_in_vect_loop
-  && TREE_CODE (init_val) == SSA_NAME
-  && (def_stmt = SSA_NAME_DEF_STMT (init_val))
-  && gimple_code (def_stmt) == GIMPLE_PHI
-  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))
-  && vinfo_for_stmt (def_stmt)
-  && STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
-  == vect_double_reduction_def)
-{
-  *adjustment_def = NULL;
-  return vect_create_destination_var (init_val, vectype);
-}
+  gcc_assert (nested_in_vect_loop_p (loop, stmt)
+ || loop == (gimple_bb (stmt))->loop_father);
 
   vect_reduction_type reduction_type
 = STMT_VINFO_VEC_REDUCTION_TYPE (stmt_vinfo);
 
-  /* In case of a nested reduction do not use an adjustment def as
- that case is not supported by the epilogue generation correctly
- if ncopies is not one.  */
-  if (adjustment_def && nested_in_vect_loop)
-{
-  *adjustment_def = NULL;
-  return vect_get_vec_def_for_operand (init_val, stmt);
-}
-
   switch (code)
 {
 case WIDEN_SUM_EXPR:
@@ -4586,9 +4557,22 @@ vect_create_epilog_for_reduction (vec

[00/46] Remove vinfo_for_stmt etc.

2018-07-24 Thread Richard Sandiford
The aim of this series is to:

(a) make the vectoriser refer to statements using its own expanded
stmt_vec_info rather than the underlying gimple stmt.  This reduces
the number of stmt lookups from 480 in current sources to under 100.

(b) make the remaining lookups relative to the owning vec_info rather than
to global state.

The original motivation was to make it more natural to have multiple
vec_infos live at once.

The series is a clean-up only in a data structure sense.  It certainly
doesn't make the code prettier, and in the end it only shaves 120 LOC
in total.  But I think it should make it easier to do follow-on clean-ups.

The series was pretty tedious to write and will be pretty tedious
to review, sorry.

I tested each individual patch on aarch64-linux-gnu and the series as a
whole on aarch64-linux-gnu with SVE, aarch64_be-elf and x86_64-linux-gnu.
I also built and tested at least one target per CPU directory, made sure
that there were no new warnings, and checked for differences in assembly
output for gcc.dg, g++.dg and gcc.c-torture.  There were a couple of
cases in vect-alias-check-* of equality comparisons using the opposite
operand order, which is an unrelated problem.  There were no other
differences.

OK to install?

Thanks,
Richard


[06/46] Add vec_info::add_stmt

2018-07-24 Thread Richard Sandiford
This patch adds a vec_info function for allocating and setting
stmt_vec_infos.  It's the start of a long process of removing
the global stmt_vec_info array.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (stmt_vec_info): Move typedef earlier in file.
(vec_info::add_stmt): Declare.
* tree-vectorizer.c (vec_info::add_stmt): New function.
* tree-vect-data-refs.c (vect_create_data_ref_ptr): Use it.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Likewise.
(vect_create_epilog_for_reduction, vectorizable_reduction): Likewise.
(vectorizable_induction): Likewise.
* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise.
* tree-vect-stmts.c (vect_finish_stmt_generation_1): Likewise.
(vectorizable_simd_clone_call, vectorizable_store): Likewise.
(vectorizable_load): Likewise.
* tree-vect-patterns.c (vect_init_pattern_stmt): Likewise.
(vect_recog_bool_pattern, vect_recog_mask_conversion_pattern)
(vect_recog_gather_scatter_pattern): Likewise.
(append_pattern_def_seq): Likewise.  Remove a check that is
performed by add_stmt itself.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:09.237496975 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:19.809403100 +0100
@@ -25,6 +25,8 @@ #define GCC_TREE_VECTORIZER_H
 #include "tree-hash-traits.h"
 #include "target.h"
 
+typedef struct _stmt_vec_info *stmt_vec_info;
+
 /* Used for naming of new temporaries.  */
 enum vect_var_kind {
   vect_simple_var,
@@ -215,6 +217,8 @@ struct vec_info {
   vec_info (vec_kind, void *, vec_info_shared *);
   ~vec_info ();
 
+  stmt_vec_info add_stmt (gimple *);
+
   /* The type of vectorization.  */
   vec_kind kind;
 
@@ -761,7 +765,7 @@ struct dataref_aux {
 
 typedef struct data_reference *dr_p;
 
-typedef struct _stmt_vec_info {
+struct _stmt_vec_info {
 
   enum stmt_vec_info_type type;
 
@@ -914,7 +918,7 @@ typedef struct _stmt_vec_info {
  and OPERATION_BITS without changing the result.  */
   unsigned int operation_precision;
   signop operation_sign;
-} *stmt_vec_info;
+};
 
 /* Information about a gather/scatter call.  */
 struct gather_scatter_info {
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:22:09.237496975 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:22:19.809403100 +0100
@@ -507,6 +507,17 @@ vec_info_shared::check_datarefs ()
   gcc_unreachable ();
 }
 
+/* Record that STMT belongs to the vectorizable region.  Create and return
+   an associated stmt_vec_info.  */
+
+stmt_vec_info
+vec_info::add_stmt (gimple *stmt)
+{
+  stmt_vec_info res = new_stmt_vec_info (stmt, this);
+  set_vinfo_for_stmt (stmt, res);
+  return res;
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-23 15:56:47.0 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:22:19.801403171 +0100
@@ -4850,7 +4850,7 @@ vect_create_data_ref_ptr (gimple *stmt,
 aggr_ptr, loop, &incr_gsi, insert_after,
 &indx_before_incr, &indx_after_incr);
   incr = gsi_stmt (incr_gsi);
-  set_vinfo_for_stmt (incr, new_stmt_vec_info (incr, loop_vinfo));
+  loop_vinfo->add_stmt (incr);
 
   /* Copy the points-to information if it exists. */
   if (DR_PTR_INFO (dr))
@@ -4880,7 +4880,7 @@ vect_create_data_ref_ptr (gimple *stmt,
 containing_loop, &incr_gsi, insert_after, &indx_before_incr,
 &indx_after_incr);
   incr = gsi_stmt (incr_gsi);
-  set_vinfo_for_stmt (incr, new_stmt_vec_info (incr, loop_vinfo));
+  loop_vinfo->add_stmt (incr);
 
   /* Copy the points-to information if it exists. */
   if (DR_PTR_INFO (dr))
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:22:16.421433184 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:19.801403171 +0100
@@ -845,14 +845,14 @@ _loop_vec_info::_loop_vec_info (struct l
{
  gimple *phi = gsi_stmt (si);
  gimple_set_uid (phi, 0);
- set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, this));
+ add_stmt (phi);
}
 
   for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
{
  gimple *stmt = gsi_stmt (si);
  gimple_set_uid (stmt, 0);
- set_vinfo_for_stmt (stmt, new_stmt_vec_info (stmt, this));
+ add_stmt (stmt);
}
 }
   free (body);
@@ -4665,8 +4665,7 @@ vect_create_epilog_for_reduction (vec
-  set_vinfo_for_stmt (new_phi,
- new_stmt_vec_info (new_phi, loop_vinfo));
+  loop_vinfo->add_stmt (new_phi);

[07/46] Add vec_info::lookup_stmt

2018-07-24 Thread Richard Sandiford
This patch adds a vec_info replacement for vinfo_for_stmt.  The main
difference is that the new routine can cope with arbitrary statements,
so there's no need to call vect_stmt_in_region_p first.

The patch only converts calls that are still needed at the end of the
series.  Later patches get rid of most other calls to vinfo_for_stmt.
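The uid scheme behind add_stmt/lookup_stmt can be sketched in a standalone way.  The following is not GCC code — Stmt, StmtInfo and Region are invented stand-ins for gimple, stmt_vec_info and vec_info — but it shows the invariant: a statement's uid is its info index plus 1 (so 0 means "no info"), and lookup re-checks res->stmt == stmt so that stale or foreign uids are rejected, which is what makes the routine safe on arbitrary statements.

```cpp
#include <cassert>
#include <vector>

struct Stmt { unsigned uid = 0; };      // stands in for gimple + gimple_uid
struct StmtInfo { Stmt *stmt; };        // stands in for stmt_vec_info

struct Region                           // stands in for vec_info
{
  std::vector<StmtInfo *> infos;        // stands in for stmt_vec_infos

  StmtInfo *add_stmt (Stmt *s)
  {
    StmtInfo *res = new StmtInfo{s};
    infos.push_back (res);
    s->uid = infos.size ();             // index + 1, so uid 0 stays "none"
    return res;
  }

  StmtInfo *lookup_stmt (Stmt *s)
  {
    unsigned uid = s->uid;
    if (uid > 0 && uid - 1 < infos.size ())
      {
        StmtInfo *res = infos[uid - 1];
        if (res && res->stmt == s)      // reject stale/foreign uids
          return res;
      }
    return nullptr;                     // safe for statements outside the region
  }
};
```

A statement created outside the region keeps uid 0 and looks up as null; even if a statement carries a leftover uid that happens to index a live slot, the res->stmt check catches the mismatch.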


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::lookup_stmt): Declare.
* tree-vectorizer.c (vec_info::lookup_stmt): New function.
* tree-vect-loop.c (vect_determine_vf_for_stmt): Use it instead
of vinfo_for_stmt.
(vect_determine_vectorization_factor, vect_analyze_scalar_cycles_1)
(vect_compute_single_scalar_iteration_cost, vect_analyze_loop_form)
(vect_update_vf_for_slp, vect_analyze_loop_operations)
(vect_is_slp_reduction, vectorizable_induction)
(vect_transform_loop_stmt, vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_init_pattern_stmt):
(vect_determine_min_output_precision_1, vect_determine_precisions)
(vect_pattern_recog): Likewise.
* tree-vect-stmts.c (vect_analyze_stmt, vect_transform_stmt): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_density_test): Likewise.
* config/rs6000/rs6000.c (rs6000_density_test): Likewise.
* tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Likewise.
(vect_detect_hybrid_slp_1, vect_detect_hybrid_slp_2)
(vect_detect_hybrid_slp): Likewise.  Change the walk_stmt_info
info field from a loop to a loop_vec_info.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:19.809403100 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:23.797367688 +0100
@@ -218,6 +218,7 @@ struct vec_info {
   ~vec_info ();
 
   stmt_vec_info add_stmt (gimple *);
+  stmt_vec_info lookup_stmt (gimple *);
 
   /* The type of vectorization.  */
   vec_kind kind;
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:22:19.809403100 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:22:23.797367688 +0100
@@ -518,6 +518,23 @@ vec_info::add_stmt (gimple *stmt)
   return res;
 }
 
+/* If STMT has an associated stmt_vec_info, return that vec_info, otherwise
+   return null.  It is safe to call this function on any statement, even if
+   it might not be part of the vectorizable region.  */
+
+stmt_vec_info
+vec_info::lookup_stmt (gimple *stmt)
+{
+  unsigned int uid = gimple_uid (stmt);
+  if (uid > 0 && uid - 1 < stmt_vec_infos.length ())
+{
+  stmt_vec_info res = stmt_vec_infos[uid - 1];
+  if (res && res->stmt == stmt)
+   return res;
+}
+  return NULL;
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:22:19.801403171 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:23.793367723 +0100
@@ -213,6 +213,7 @@ vect_determine_vf_for_stmt_1 (stmt_vec_i
 vect_determine_vf_for_stmt (stmt_vec_info stmt_info, poly_uint64 *vf,
vec<stmt_vec_info> *mask_producers)
 {
+  vec_info *vinfo = stmt_info->vinfo;
   if (dump_enabled_p ())
 {
   dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: ");
@@ -231,7 +232,7 @@ vect_determine_vf_for_stmt (stmt_vec_inf
   for (gimple_stmt_iterator si = gsi_start (pattern_def_seq);
   !gsi_end_p (si); gsi_next (&si))
{
- stmt_vec_info def_stmt_info = vinfo_for_stmt (gsi_stmt (si));
+ stmt_vec_info def_stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
@@ -306,7 +307,7 @@ vect_determine_vectorization_factor (loo
   gsi_next (&si))
{
  phi = si.phi ();
- stmt_info = vinfo_for_stmt (phi);
+ stmt_info = loop_vinfo->lookup_stmt (phi);
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location, "==> examining phi: ");
@@ -366,7 +367,7 @@ vect_determine_vectorization_factor (loo
   for (gimple_stmt_iterator si = gsi_start_bb (bb); !gsi_end_p (si);
   gsi_next (&si))
{
- stmt_info = vinfo_for_stmt (gsi_stmt (si));
+ stmt_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
  if (!vect_determine_vf_for_stmt (stmt_info, &vectorization_factor,
   &mask_producers))
return false;
@@ -487,7 +488,7 @@ vect_analyze_scalar_cycles_1 (loop_vec_i
   gphi *phi = gsi.phi ();
   tree access_fn = NULL;
   tree def = PHI_RESULT (phi);
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (phi);
+  stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (phi);

[02/46] Remove dead vectorizable_reduction code

2018-07-24 Thread Richard Sandiford
vectorizable_reduction has old code to cope with cases in which the
given statement belongs to a reduction group but isn't the first statement.
That can no longer happen, since all statements in the group go into the
same SLP node, and we only check the first statement in each node.

The point is to remove the only path through vectorizable_reduction
in which stmt and stmt_info refer to different statements.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (vectorizable_reduction): Assert that the
function is not called for second and subsequent members of
a reduction group.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:22:02.965552667 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:06.269523330 +0100
@@ -6162,7 +6162,6 @@ vectorizable_reduction (gimple *stmt, gi
   auto_vec<gimple *> phis;
   int vec_num;
   tree def0, tem;
-  bool first_p = true;
   tree cr_index_scalar_type = NULL_TREE, cr_index_vector_type = NULL_TREE;
   tree cond_reduc_val = NULL_TREE;
 
@@ -6178,15 +6177,8 @@ vectorizable_reduction (gimple *stmt, gi
   nested_cycle = true;
 }
 
-  /* In case of reduction chain we switch to the first stmt in the chain, but
- we don't update STMT_INFO, since only the last stmt is marked as reduction
- and has reduction properties.  */
-  if (REDUC_GROUP_FIRST_ELEMENT (stmt_info)
-  && REDUC_GROUP_FIRST_ELEMENT (stmt_info) != stmt)
-{
-  stmt = REDUC_GROUP_FIRST_ELEMENT (stmt_info);
-  first_p = false;
-}
+  if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
+gcc_assert (slp_node && REDUC_GROUP_FIRST_ELEMENT (stmt_info) == stmt);
 
   if (gimple_code (stmt) == GIMPLE_PHI)
 {
@@ -7050,8 +7042,7 @@ vectorizable_reduction (gimple *stmt, gi
 
   if (!vec_stmt) /* transformation not required.  */
 {
-  if (first_p)
-   vect_model_reduction_cost (stmt_info, reduc_fn, ncopies, cost_vec);
+  vect_model_reduction_cost (stmt_info, reduc_fn, ncopies, cost_vec);
   if (loop_vinfo && LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo))
{
  if (reduction_type != FOLD_LEFT_REDUCTION


[03/46] Remove unnecessary update of NUM_SLP_USES

2018-07-24 Thread Richard Sandiford
vect_free_slp_tree had:

  gimple *stmt;
  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
/* After transform some stmts are removed and thus their vinfo is gone.  */
if (vinfo_for_stmt (stmt))
  {
gcc_assert (STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt)) > 0);
STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt))--;
  }

But after transform this update is redundant even for statements that do
exist, so it seems better to skip this loop for the final teardown.
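The shape of the change can be sketched with invented stand-in types (SlpNode and StmtInfo below are not GCC code): teardown recursively frees the tree, and the use-count update is performed only when the instance might still be analysed again, i.e. when final_p is false.

```cpp
#include <cassert>
#include <vector>

struct StmtInfo { int num_slp_uses = 0; };  // stands in for STMT_VINFO_NUM_SLP_USES

struct SlpNode
{
  std::vector<StmtInfo *> stmts;            // scalar statements of this node
  std::vector<SlpNode *> children;
};

void
free_slp_tree (SlpNode *node, bool final_p)
{
  for (SlpNode *child : node->children)
    free_slp_tree (child, final_p);
  // On the final teardown the counts no longer matter (and some
  // statements may already be gone), so the update loop is skipped.
  if (!final_p)
    for (StmtInfo *info : node->stmts)
      {
        assert (info->num_slp_uses > 0);
        info->num_slp_uses--;
      }
  delete node;
}
```

Intermediate frees (e.g. abandoning a candidate SLP build) pass false and keep the counts accurate; the final free after transform passes true and leaves them untouched.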


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_free_slp_instance): Add a final_p parameter.
* tree-vect-slp.c (vect_free_slp_tree): Likewise.  Don't update
STMT_VINFO_NUM_SLP_USES when it's true.
(vect_free_slp_instance): Add a final_p parameter and pass it to
vect_free_slp_tree.
(vect_build_slp_tree_2): Update call to vect_free_slp_instance.
(vect_analyze_slp_instance): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_slp_analyze_bb_1): Likewise.
* tree-vectorizer.c (vec_info): Likewise.
* tree-vect-loop.c (vect_transform_loop): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-03 10:59:30.480481417 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:09.237496975 +0100
@@ -1634,7 +1634,7 @@ extern int vect_get_known_peeling_cost (
 extern tree cse_and_gimplify_to_preheader (loop_vec_info, tree);
 
 /* In tree-vect-slp.c.  */
-extern void vect_free_slp_instance (slp_instance);
+extern void vect_free_slp_instance (slp_instance, bool);
extern bool vect_transform_slp_perm_load (slp_tree, vec<tree>,
  gimple_stmt_iterator *, poly_uint64,
  slp_instance, bool, unsigned *);
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-23 16:58:06.0 +0100
+++ gcc/tree-vect-slp.c 2018-07-24 10:22:09.237496975 +0100
@@ -47,25 +47,32 @@ Software Foundation; either version 3, o
 #include "internal-fn.h"
 
 
-/* Recursively free the memory allocated for the SLP tree rooted at NODE.  */
+/* Recursively free the memory allocated for the SLP tree rooted at NODE.
+   FINAL_P is true if we have vectorized the instance or if we have
+   made a final decision not to vectorize the statements in any way.  */
 
 static void
-vect_free_slp_tree (slp_tree node)
+vect_free_slp_tree (slp_tree node, bool final_p)
 {
   int i;
   slp_tree child;
 
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-vect_free_slp_tree (child);
+vect_free_slp_tree (child, final_p);
 
-  gimple *stmt;
-  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
-/* After transform some stmts are removed and thus their vinfo is gone.  */
-if (vinfo_for_stmt (stmt))
-  {
-   gcc_assert (STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt)) > 0);
-   STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt))--;
-  }
+  /* Don't update STMT_VINFO_NUM_SLP_USES if it isn't relevant.
+ Some statements might no longer exist, after having been
+ removed by vect_transform_stmt.  Updating the remaining
+ statements would be redundant.  */
+  if (!final_p)
+{
+  gimple *stmt;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
+   {
+ gcc_assert (STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt)) > 0);
+ STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt))--;
+   }
+}
 
   SLP_TREE_CHILDREN (node).release ();
   SLP_TREE_SCALAR_STMTS (node).release ();
@@ -76,12 +83,14 @@ vect_free_slp_tree (slp_tree node)
 }
 
 
-/* Free the memory allocated for the SLP instance.  */
+/* Free the memory allocated for the SLP instance.  FINAL_P is true if we
+   have vectorized the instance or if we have made a final decision not
+   to vectorize the statements in any way.  */
 
 void
-vect_free_slp_instance (slp_instance instance)
+vect_free_slp_instance (slp_instance instance, bool final_p)
 {
-  vect_free_slp_tree (SLP_INSTANCE_TREE (instance));
+  vect_free_slp_tree (SLP_INSTANCE_TREE (instance), final_p);
   SLP_INSTANCE_LOADS (instance).release ();
   free (instance);
 }
@@ -1284,7 +1293,7 @@ vect_build_slp_tree_2 (vec_info *vinfo,
   if (++this_tree_size > max_tree_size)
{
  FOR_EACH_VEC_ELT (children, j, child)
-   vect_free_slp_tree (child);
+   vect_free_slp_tree (child, false);
  vect_free_oprnd_info (oprnds_info);
  return NULL;
}
@@ -1315,7 +1324,7 @@ vect_build_slp_tree_2 (vec_info *vinfo,
  this_loads.truncate (old_nloads);
  this_tree_size = old_tree_size;
  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (child), j, grandchild)
-   vect_free_slp_tree (grandchild);
+   vect_free_slp_tree (grandchild, false);
   

[04/46] Factor out the test for a valid reduction input

2018-07-24 Thread Richard Sandiford
vect_is_slp_reduction and vect_is_simple_reduction had two instances
each of:

  && (is_gimple_assign (def_stmt)
  || is_gimple_call (def_stmt)
  || STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
   == vect_induction_def
  || (gimple_code (def_stmt) == GIMPLE_PHI
  && STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
  == vect_internal_def
  && !is_loop_header_bb_p (gimple_bb (def_stmt))))

This patch splits it out in a subroutine.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (vect_valid_reduction_input_p): New function,
split out from...
(vect_is_slp_reduction): ...here...
(vect_is_simple_reduction): ...and here.  Remove repetition of tests
that are already known to be false.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:22:09.237496975 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:12.737465897 +0100
@@ -2501,6 +2501,21 @@ report_vect_op (dump_flags_t msg_type, g
   dump_gimple_stmt (msg_type, TDF_SLIM, stmt, 0);
 }
 
+/* DEF_STMT occurs in a loop that contains a potential reduction operation.
+   Return true if the results of DEF_STMT are something that can be
+   accumulated by such a reduction.  */
+
+static bool
+vect_valid_reduction_input_p (gimple *def_stmt)
+{
+  stmt_vec_info def_stmt_info = vinfo_for_stmt (def_stmt);
+  return (is_gimple_assign (def_stmt)
+ || is_gimple_call (def_stmt)
+ || STMT_VINFO_DEF_TYPE (def_stmt_info) == vect_induction_def
+ || (gimple_code (def_stmt) == GIMPLE_PHI
+ && STMT_VINFO_DEF_TYPE (def_stmt_info) == vect_internal_def
+ && !is_loop_header_bb_p (gimple_bb (def_stmt))));
+}
 
 /* Detect SLP reduction of the form:
 
@@ -2624,16 +2639,9 @@ vect_is_slp_reduction (loop_vec_info loo
 ("vect_internal_def"), or it's an induction (defined by a
 loop-header phi-node).  */
   if (def_stmt
-  && gimple_bb (def_stmt)
+ && gimple_bb (def_stmt)
  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))
-  && (is_gimple_assign (def_stmt)
-  || is_gimple_call (def_stmt)
-  || STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
-   == vect_induction_def
-  || (gimple_code (def_stmt) == GIMPLE_PHI
-  && STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
-  == vect_internal_def
-  && !is_loop_header_bb_p (gimple_bb (def_stmt)
+ && vect_valid_reduction_input_p (def_stmt))
{
  lhs = gimple_assign_lhs (next_stmt);
  next_stmt = REDUC_GROUP_NEXT_ELEMENT (vinfo_for_stmt (next_stmt));
@@ -2654,16 +2662,9 @@ vect_is_slp_reduction (loop_vec_info loo
 ("vect_internal_def"), or it's an induction (defined by a
 loop-header phi-node).  */
   if (def_stmt
-  && gimple_bb (def_stmt)
+ && gimple_bb (def_stmt)
  && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))
-  && (is_gimple_assign (def_stmt)
-  || is_gimple_call (def_stmt)
-  || STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
-  == vect_induction_def
-  || (gimple_code (def_stmt) == GIMPLE_PHI
-  && STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def_stmt))
-  == vect_internal_def
-  && !is_loop_header_bb_p (gimple_bb (def_stmt)
+ && vect_valid_reduction_input_p (def_stmt))
{
  if (dump_enabled_p ())
{
@@ -3196,15 +3197,7 @@ vect_is_simple_reduction (loop_vec_info
   && (code == COND_EXPR
  || !def1 || gimple_nop_p (def1)
  || !flow_bb_inside_loop_p (loop, gimple_bb (def1))
-  || (def1 && flow_bb_inside_loop_p (loop, gimple_bb (def1))
-  && (is_gimple_assign (def1)
- || is_gimple_call (def1)
- || STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def1))
-  == vect_induction_def
- || (gimple_code (def1) == GIMPLE_PHI
- && STMT_VINFO_DEF_TYPE (vinfo_for_stmt (def1))
-  == vect_internal_def
- && !is_loop_header_bb_p (gimple_bb (def1)))
+ || vect_valid_reduction_input_p (def1)))
 {
   if (dump_enabled_p ())
report_vect_op (MSG_NOTE, def_stmt, "detected reduction: ");
@@ -3215,15 +3208,7 @@ vect_is_simple_reduction (loop_vec_info
   && (code == COND_EXPR
  || !def2 || gimple_nop_p (def2)
  || !flow_bb_inside_loop_p (loop, gimple_bb (def2))
- || (def2 

[05/46] Fix make_ssa_name call in vectorizable_reduction

2018-07-24 Thread Richard Sandiford
The usual vectoriser dance to create new assignments is:

new_stmt = gimple_build_assign (vec_dest, ...);
new_temp = make_ssa_name (vec_dest, new_stmt);
gimple_assign_set_lhs (new_stmt, new_temp);

but one site in vectorizable_reduction used:

new_temp = make_ssa_name (vec_dest, new_stmt);

before creating new_stmt.

This method of creating statements probably needs cleaning up, but
that's for another day...
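The bug is easy to model outside GCC.  In the sketch below, Stmt, Name, make_name and build_assign are invented analogues (not the GCC API) of gimple statements, SSA names, make_ssa_name and the fixed code path: the name records its defining statement, so it must be created *after* the statement it belongs to, not against whatever the variable happened to hold from a previous iteration.

```cpp
#include <cassert>

struct Stmt;
struct Name { Stmt *def_stmt; };        // an SSA name records its definition
struct Stmt { Name *lhs = nullptr; };

// make_ssa_name analogue: links the new name to its defining statement.
Name *make_name (Stmt *def) { return new Name{def}; }

// The corrected dance: build the statement first, then create the name
// against it and install it as the lhs.
Stmt *
build_assign ()
{
  Stmt *new_stmt = new Stmt;
  Name *new_temp = make_name (new_stmt); // name points at the right definition
  new_stmt->lhs = new_temp;
  return new_stmt;
}
```

Calling make_name before new_stmt is (re)assigned would link the name to a stale statement, which is exactly the mistake the patch removes.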


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (vectorizable_reduction): Fix an instance in
which make_ssa_name was called with new_stmt before new_stmt
had been created.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:22:12.737465897 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:16.421433184 +0100
@@ -7210,9 +7210,10 @@ vectorizable_reduction (gimple *stmt, gi
  if (op_type == ternary_op)
vop[2] = vec_oprnds2[i];
 
- new_temp = make_ssa_name (vec_dest, new_stmt);
- new_stmt = gimple_build_assign (new_temp, code,
+ new_stmt = gimple_build_assign (vec_dest, code,
  vop[0], vop[1], vop[2]);
+ new_temp = make_ssa_name (vec_dest, new_stmt);
+ gimple_assign_set_lhs (new_stmt, new_temp);
}
  vect_finish_stmt_generation (stmt, new_stmt, gsi);
 


[20/46] Make *FIRST_ELEMENT and *NEXT_ELEMENT stmt_vec_infos

2018-07-24 Thread Richard Sandiford
This patch changes {REDUC,DR}_GROUP_{FIRST,NEXT}_ELEMENT from a
gimple stmt to a stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::first_element): Change from
a gimple stmt to a stmt_vec_info.
(_stmt_vec_info::next_element): Likewise.
* tree-vect-data-refs.c (vect_update_misalignment_for_peel)
(vect_slp_analyze_and_verify_node_alignment)
(vect_analyze_group_access_1, vect_analyze_group_access)
(vect_small_gap_p, vect_prune_runtime_alias_test_list)
(vect_create_data_ref_ptr, vect_record_grouped_load_vectors)
(vect_supportable_dr_alignment): Update accordingly.
* tree-vect-loop.c (vect_fixup_reduc_chain): Likewise.
(vect_fixup_scalar_cycles_with_patterns, vect_is_slp_reduction)
(vect_is_simple_reduction, vectorizable_reduction): Likewise.
* tree-vect-patterns.c (vect_reassociating_reduction_p): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_1)
(vect_attempt_slp_rearrange_stmts, vect_supported_load_permutation_p)
(vect_split_slp_store_group, vect_analyze_slp_instance)
(vect_analyze_slp, vect_transform_slp_perm_load): Likewise.
* tree-vect-stmts.c (vect_model_store_cost, vect_model_load_cost)
(get_group_load_store_type, get_load_store_type)
(get_group_alias_ptr_type, vectorizable_store, vectorizable_load)
(vect_transform_stmt, vect_remove_stores): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:04.033010396 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:08.536970400 +0100
@@ -871,9 +871,9 @@ struct _stmt_vec_info {
 
   /* Interleaving and reduction chains info.  */
   /* First element in the group.  */
-  gimple *first_element;
+  stmt_vec_info first_element;
   /* Pointer to the next element in the group.  */
-  gimple *next_element;
+  stmt_vec_info next_element;
   /* For data-refs, in case that two or more stmts share data-ref, this is the
  pointer to the previously detected stmt with the same dr.  */
   gimple *same_dr_stmt;
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:04.029010432 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:23:08.532970436 +0100
@@ -1077,7 +1077,7 @@ vect_update_misalignment_for_peel (struc
  /* For interleaved data accesses the step in the loop must be multiplied by
  the size of the interleaving group.  */
   if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
-dr_size *= DR_GROUP_SIZE (vinfo_for_stmt (DR_GROUP_FIRST_ELEMENT (stmt_info)));
+dr_size *= DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
   if (STMT_VINFO_GROUPED_ACCESS (peel_stmt_info))
 dr_peel_size *= DR_GROUP_SIZE (peel_stmt_info);
 
@@ -2370,12 +2370,11 @@ vect_slp_analyze_and_verify_node_alignme
  the node is permuted in which case we start from the first
  element in the group.  */
   stmt_vec_info first_stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
-  gimple *first_stmt = first_stmt_info->stmt;
   data_reference_p first_dr = STMT_VINFO_DATA_REF (first_stmt_info);
   if (SLP_TREE_LOAD_PERMUTATION (node).exists ())
-first_stmt = DR_GROUP_FIRST_ELEMENT (first_stmt_info);
+first_stmt_info = DR_GROUP_FIRST_ELEMENT (first_stmt_info);
 
-  data_reference_p dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt));
+  data_reference_p dr = STMT_VINFO_DATA_REF (first_stmt_info);
   vect_compute_data_ref_alignment (dr);
   /* For creating the data-ref pointer we need alignment of the
  first element anyway.  */
@@ -2520,11 +2519,11 @@ vect_analyze_group_access_1 (struct data
   if (DR_GROUP_FIRST_ELEMENT (stmt_info) == stmt_info)
 {
   /* First stmt in the interleaving chain. Check the chain.  */
-  gimple *next = DR_GROUP_NEXT_ELEMENT (stmt_info);
+  stmt_vec_info next = DR_GROUP_NEXT_ELEMENT (stmt_info);
   struct data_reference *data_ref = dr;
   unsigned int count = 1;
   tree prev_init = DR_INIT (data_ref);
-  gimple *prev = stmt_info;
+  stmt_vec_info prev = stmt_info;
   HOST_WIDE_INT diff, gaps = 0;
 
   /* By construction, all group members have INTEGER_CST DR_INITs.  */
@@ -2535,8 +2534,7 @@ vect_analyze_group_access_1 (struct data
  stmt, and the rest get their vectorized loads from the first
  one.  */
   if (!tree_int_cst_compare (DR_INIT (data_ref),
- DR_INIT (STMT_VINFO_DATA_REF (
-  vinfo_for_stmt (next)))))
+DR_INIT (STMT_VINFO_DATA_REF (next))))
 {
   if (DR_IS_WRITE (data_ref))
 {
@@ -2550,16 +2548,16 @@ vect_analyze_group_access_1 (struct data
dump_printf_loc (MSG_MISSED_OPTIMIZATION, 

[18/46] Make SLP_TREE_SCALAR_STMTS a vec<stmt_vec_info>

2018-07-24 Thread Richard Sandiford
This patch changes SLP_TREE_SCALAR_STMTS from a vec<gimple *> to
a vec<stmt_vec_info>.  It's longer than the previous conversions
but mostly mechanical.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_slp_tree::stmts): Change from a vec<gimple *>
to a vec<stmt_vec_info>.
* tree-vect-slp.c (vect_free_slp_tree): Update accordingly.
(vect_create_new_slp_node): Take a vec<stmt_vec_info> instead of a
vec<gimple *>.
(_slp_oprnd_info::def_stmts): Change from a vec<gimple *>
to a vec<stmt_vec_info>.
(bst_traits::value_type, bst_traits::value_type): Likewise.
(bst_traits::hash): Update accordingly.
(vect_get_and_check_slp_defs): Change the stmts parameter from
a vec<gimple *> to a vec<stmt_vec_info>.
(vect_two_operations_perm_ok_p, vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree): Likewise.
(vect_build_slp_tree_2): Likewise.  Update uses of
SLP_TREE_SCALAR_STMTS.
(vect_print_slp_tree): Update uses of SLP_TREE_SCALAR_STMTS.
(vect_mark_slp_stmts, vect_mark_slp_stmts_relevant)
(vect_slp_rearrange_stmts, vect_attempt_slp_rearrange_stmts)
(vect_supported_load_permutation_p, vect_find_last_scalar_stmt_in_slp)
(vect_detect_hybrid_slp_stmts, vect_slp_analyze_node_operations_1)
(vect_slp_analyze_node_operations, vect_slp_analyze_operations)
(vect_bb_slp_scalar_cost, vect_slp_analyze_bb_1)
(vect_get_constant_vectors, vect_get_slp_defs)
(vect_transform_slp_perm_load, vect_schedule_slp_instance)
(vect_remove_slp_scalar_calls, vect_schedule_slp): Likewise.
(vect_analyze_slp_instance): Build up a vec of stmt_vec_infos
instead of gimple stmts.
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences): Change
the stores parameter from a vec<gimple *> to a vec<stmt_vec_info>.
(vect_slp_analyze_instance_dependence): Update uses of
SLP_TREE_SCALAR_STMTS.
(vect_slp_analyze_and_verify_node_alignment): Likewise.
(vect_slp_analyze_and_verify_instance_alignment): Likewise.
* tree-vect-loop.c (neutral_op_for_slp_reduction): Likewise.
(get_initial_defs_for_reduction): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vectorize_fold_left_reduction): Likewise.
* tree-vect-stmts.c (vect_prologue_cost_for_slp_op): Likewise.
(vect_model_simple_cost, vectorizable_shift, vectorizable_load)
(can_vectorize_live_stmts): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:57.277070390 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:00.401042649 +0100
@@ -138,7 +138,7 @@ struct _slp_tree {
   /* Nodes that contain def-stmts of this node statements operands.  */
   vec children;
   /* A group of scalar stmts to be vectorized together.  */
-  vec<gimple *> stmts;
+  vec<stmt_vec_info> stmts;
   /* Load permutation relative to the stores, NULL if there is no
  permutation.  */
   vec load_permutation;
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-24 10:22:57.277070390 +0100
+++ gcc/tree-vect-slp.c 2018-07-24 10:23:00.401042649 +0100
@@ -66,11 +66,11 @@ vect_free_slp_tree (slp_tree node, bool
  statements would be redundant.  */
   if (!final_p)
 {
-  gimple *stmt;
-  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
+  stmt_vec_info stmt_info;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info)
{
- gcc_assert (STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt)) > 0);
- STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt))--;
+ gcc_assert (STMT_VINFO_NUM_SLP_USES (stmt_info) > 0);
+ STMT_VINFO_NUM_SLP_USES (stmt_info)--;
}
 }
 
@@ -99,21 +99,21 @@ vect_free_slp_instance (slp_instance ins
 /* Create an SLP node for SCALAR_STMTS.  */
 
 static slp_tree
-vect_create_new_slp_node (vec<gimple *> scalar_stmts)
+vect_create_new_slp_node (vec<stmt_vec_info> scalar_stmts)
 {
   slp_tree node;
-  gimple *stmt = scalar_stmts[0];
+  stmt_vec_info stmt_info = scalar_stmts[0];
   unsigned int nops;
 
-  if (is_gimple_call (stmt))
+  if (gcall *stmt = dyn_cast <gcall *> (stmt_info->stmt))
 nops = gimple_call_num_args (stmt);
-  else if (is_gimple_assign (stmt))
+  else if (gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt))
 {
   nops = gimple_num_ops (stmt) - 1;
   if (gimple_assign_rhs_code (stmt) == COND_EXPR)
nops++;
 }
-  else if (gimple_code (stmt) == GIMPLE_PHI)
+  else if (is_a <gphi *> (stmt_info->stmt))
 nops = 0;
   else
 return NULL;
@@ -128,8 +128,8 @@ vect_create_new_slp_node (vec
   SLP_TREE_DEF_TYPE (node) = vect_internal_def;
 
   unsigned i;
-  FOR_EACH_VEC_ELT (scalar_stmts, i, stmt)
-STMT_VINFO_NUM_SLP_USES (vinfo_for_stmt (stmt))++;
+  FOR_EACH_VEC_ELT (scalar_stmts, i, stmt_info)
+STMT_VINFO_NUM_SLP_USES (stmt_info)++;
 
   return node;
 }
@@ -141,7 +141,7 @@ vect_create_new_slp_node (vec
 typedef struct 

[19/46] Make vect_dr_stmt return a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch makes vect_dr_stmt return a stmt_vec_info instead of a
gimple stmt.  Rather than retain a separate gimple stmt variable
in cases where both existed, the patch replaces uses of the gimple
variable with uses of the stmt_vec_info.  Later patches do this
more generally.

Many things that are keyed off a data_reference would these days
be better keyed off a stmt_vec_info, but it's more convenient
to do that later in the series.  The vect_dr_size calls that are
left over do still benefit from this patch.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_dr_stmt): Return a stmt_vec_info rather
than a gimple stmt.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence)
(vect_slp_analyze_data_ref_dependence, vect_record_base_alignments)
(vect_calculate_target_alignment, vect_compute_data_ref_alignment)
(vect_update_misalignment_for_peel, vect_verify_datarefs_alignment)
(vector_alignment_reachable_p, vect_get_data_access_cost)
(vect_get_peeling_costs_all_drs, vect_peeling_hash_get_lowest_cost)
(vect_peeling_supportable, vect_enhance_data_refs_alignment)
(vect_find_same_alignment_drs, vect_analyze_data_refs_alignment)
(vect_analyze_group_access_1, vect_analyze_group_access)
(vect_analyze_data_ref_access, vect_analyze_data_ref_accesses)
(vect_vfa_access_size, vect_small_gap_p, vect_analyze_data_refs)
(vect_supportable_dr_alignment): Remove vinfo_for_stmt from the
result of vect_dr_stmt and use the stmt_vec_info instead of
the associated gimple stmt.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-loop.c (vect_analyze_loop_2): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:00.401042649 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:04.033010396 +0100
@@ -1370,7 +1370,7 @@ vect_dr_behavior (data_reference *dr)
a pattern this returns the corresponding pattern stmt.  Otherwise
DR_STMT is returned.  */
 
-inline gimple *
+inline stmt_vec_info
 vect_dr_stmt (data_reference *dr)
 {
   gimple *stmt = DR_STMT (dr);
@@ -1379,7 +1379,7 @@ vect_dr_stmt (data_reference *dr)
 return STMT_VINFO_RELATED_STMT (stmt_info);
   /* DR_STMT should never refer to a stmt in a pattern replacement.  */
   gcc_checking_assert (!STMT_VINFO_RELATED_STMT (stmt_info));
-  return stmt;
+  return stmt_info;
 }
 
 /* Return true if the vect cost model is unlimited.  */
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:00.397042684 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:23:04.029010432 +0100
@@ -294,8 +294,8 @@ vect_analyze_data_ref_dependence (struct
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   struct data_reference *dra = DDR_A (ddr);
   struct data_reference *drb = DDR_B (ddr);
-  stmt_vec_info stmtinfo_a = vinfo_for_stmt (vect_dr_stmt (dra));
-  stmt_vec_info stmtinfo_b = vinfo_for_stmt (vect_dr_stmt (drb));
+  stmt_vec_info stmtinfo_a = vect_dr_stmt (dra);
+  stmt_vec_info stmtinfo_b = vect_dr_stmt (drb);
   lambda_vector dist_v;
   unsigned int loop_depth;
 
@@ -627,9 +627,9 @@ vect_slp_analyze_data_ref_dependence (st
 
   /* If dra and drb are part of the same interleaving chain consider
  them independent.  */
-  if (STMT_VINFO_GROUPED_ACCESS (vinfo_for_stmt (vect_dr_stmt (dra)))
-  && (DR_GROUP_FIRST_ELEMENT (vinfo_for_stmt (vect_dr_stmt (dra)))
- == DR_GROUP_FIRST_ELEMENT (vinfo_for_stmt (vect_dr_stmt (drb)))))
+  if (STMT_VINFO_GROUPED_ACCESS (vect_dr_stmt (dra))
+  && (DR_GROUP_FIRST_ELEMENT (vect_dr_stmt (dra))
+ == DR_GROUP_FIRST_ELEMENT (vect_dr_stmt (drb))))
 return false;
 
   /* Unknown data dependence.  */
@@ -841,19 +841,18 @@ vect_record_base_alignments (vec_info *v
   unsigned int i;
   FOR_EACH_VEC_ELT (vinfo->shared->datarefs, i, dr)
 {
-  gimple *stmt = vect_dr_stmt (dr);
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  stmt_vec_info stmt_info = vect_dr_stmt (dr);
   if (!DR_IS_CONDITIONAL_IN_STMT (dr)
  && STMT_VINFO_VECTORIZABLE (stmt_info)
  && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
- vect_record_base_alignment (vinfo, stmt, _INNERMOST (dr));
+ vect_record_base_alignment (vinfo, stmt_info, _INNERMOST (dr));
 
  /* If DR is nested in the loop that is being vectorized, we can also
 record the alignment of the base wrt the outer loop.  */
- if (loop && nested_in_vect_loop_p (loop, stmt))
+ if (loop && nested_in_vect_loop_p (loop, stmt_info))
vect_record_base_alignment
-   (vinfo, stmt, _VINFO_DR_WRT_VEC_LOOP (stmt_info));
+   (vinfo, stmt_info, 

[22/46] Make DR_GROUP_SAME_DR_STMT a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch changes DR_GROUP_SAME_DR_STMT from a gimple stmt to a
stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::same_dr_stmt): Change from
a gimple stmt to a stmt_vec_info.
* tree-vect-stmts.c (vectorizable_load): Update accordingly.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:12.060939107 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:15.756906285 +0100
@@ -876,7 +876,7 @@ struct _stmt_vec_info {
   stmt_vec_info next_element;
   /* For data-refs, in case that two or more stmts share data-ref, this is the
  pointer to the previously detected stmt with the same dr.  */
-  gimple *same_dr_stmt;
+  stmt_vec_info same_dr_stmt;
   /* The size of the group.  */
   unsigned int size;
   /* For stores, number of stores from this group seen. We vectorize the last
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:23:08.536970400 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:23:15.756906285 +0100
@@ -7590,8 +7590,7 @@ vectorizable_load (gimple *stmt, gimple_
 we have to give up.  */
   if (DR_GROUP_SAME_DR_STMT (stmt_info)
  && (STMT_SLP_TYPE (stmt_info)
- != STMT_SLP_TYPE (vinfo_for_stmt
-(DR_GROUP_SAME_DR_STMT (stmt_info)))))
+ != STMT_SLP_TYPE (DR_GROUP_SAME_DR_STMT (stmt_info))))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,


[21/46] Make grouped_stores and reduction_chains use stmt_vec_infos

2018-07-24 Thread Richard Sandiford
This patch changes the SLP lists grouped_stores and reduction_chains
from auto_vec<gimple *> to auto_vec<stmt_vec_info>.  It was easier
to do them together due to the way vect_analyze_slp is structured.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::grouped_stores): Change from
an auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
(_loop_vec_info::reduction_chains): Likewise.
* tree-vect-loop.c (vect_fixup_scalar_cycles_with_patterns): Update
accordingly.
* tree-vect-slp.c (vect_analyze_slp): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:08.536970400 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:12.060939107 +0100
@@ -259,7 +259,7 @@ struct vec_info {
 
   /* All interleaving chains of stores, represented by the first
  stmt in the chain.  */
-  auto_vec<gimple *> grouped_stores;
+  auto_vec<stmt_vec_info> grouped_stores;
 
   /* Cost data used by the target cost model.  */
   void *target_cost_data;
@@ -479,7 +479,7 @@ typedef struct _loop_vec_info : public v
 
   /* All reduction chains in the loop, represented by the first
  stmt in the chain.  */
-  auto_vec<gimple *> reduction_chains;
+  auto_vec<stmt_vec_info> reduction_chains;
 
   /* Cost vector for a single scalar iteration.  */
   auto_vec scalar_cost_vec;
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:23:08.532970436 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:23:12.060939107 +0100
@@ -677,13 +677,13 @@ vect_fixup_reduc_chain (gimple *stmt)
 static void
 vect_fixup_scalar_cycles_with_patterns (loop_vec_info loop_vinfo)
 {
-  gimple *first;
+  stmt_vec_info first;
   unsigned i;
 
   FOR_EACH_VEC_ELT (LOOP_VINFO_REDUCTION_CHAINS (loop_vinfo), i, first)
-if (STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (first)))
+if (STMT_VINFO_IN_PATTERN_P (first))
   {
-   stmt_vec_info next = REDUC_GROUP_NEXT_ELEMENT (vinfo_for_stmt (first));
+   stmt_vec_info next = REDUC_GROUP_NEXT_ELEMENT (first);
while (next)
  {
if (! STMT_VINFO_IN_PATTERN_P (next))
@@ -696,7 +696,7 @@ vect_fixup_scalar_cycles_with_patterns (
  {
vect_fixup_reduc_chain (first);
LOOP_VINFO_REDUCTION_CHAINS (loop_vinfo)[i]
- = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (first));
+ = STMT_VINFO_RELATED_STMT (first);
  }
   }
 }
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-24 10:23:08.536970400 +0100
+++ gcc/tree-vect-slp.c 2018-07-24 10:23:12.060939107 +0100
@@ -2202,7 +2202,7 @@ vect_analyze_slp_instance (vec_info *vin
 vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size)
 {
   unsigned int i;
-  gimple *first_element;
+  stmt_vec_info first_element;
 
   DUMP_VECT_SCOPE ("vect_analyze_slp");
 
@@ -2220,17 +2220,15 @@ vect_analyze_slp (vec_info *vinfo, unsig
 max_tree_size))
  {
/* Dissolve reduction chain group.  */
-   gimple *stmt = first_element;
-   while (stmt)
+   stmt_vec_info vinfo = first_element;
+   while (vinfo)
  {
-   stmt_vec_info vinfo = vinfo_for_stmt (stmt);
stmt_vec_info next = REDUC_GROUP_NEXT_ELEMENT (vinfo);
REDUC_GROUP_FIRST_ELEMENT (vinfo) = NULL;
REDUC_GROUP_NEXT_ELEMENT (vinfo) = NULL;
-   stmt = next;
+   vinfo = next;
  }
-   STMT_VINFO_DEF_TYPE (vinfo_for_stmt (first_element))
- = vect_internal_def;
+   STMT_VINFO_DEF_TYPE (first_element) = vect_internal_def;
  }
}
 


[23/46] Make LOOP_VINFO_MAY_MISALIGN_STMTS use stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch changes LOOP_VINFO_MAY_MISALIGN_STMTS from an
auto_vec<gimple *> to an auto_vec<stmt_vec_info>.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_loop_vec_info::may_misalign_stmts): Change
from an auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Update
accordingly.
* tree-vect-loop-manip.c (vect_create_cond_for_align_checks): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:15.756906285 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:18.856878757 +0100
@@ -472,7 +472,7 @@ typedef struct _loop_vec_info : public v
 
   /* Statements in the loop that have data references that are candidates for a
  runtime (loop versioning) misalignment check.  */
-  auto_vec<gimple *> may_misalign_stmts;
+  auto_vec<stmt_vec_info> may_misalign_stmts;
 
   /* Reduction cycles detected in the loop. Used in loop-aware SLP.  */
   auto_vec reductions;
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:08.532970436 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:23:18.856878757 +0100
@@ -2231,16 +2231,15 @@ vect_enhance_data_refs_alignment (loop_v
 
   if (do_versioning)
 {
-  vec<gimple *> may_misalign_stmts
+  vec<stmt_vec_info> may_misalign_stmts
 = LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo);
-  gimple *stmt;
+  stmt_vec_info stmt_info;
 
   /* It can now be assumed that the data references in the statements
  in LOOP_VINFO_MAY_MISALIGN_STMTS will be aligned in the version
  of the loop being vectorized.  */
-  FOR_EACH_VEC_ELT (may_misalign_stmts, i, stmt)
+  FOR_EACH_VEC_ELT (may_misalign_stmts, i, stmt_info)
 {
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   dr = STMT_VINFO_DATA_REF (stmt_info);
  SET_DR_MISALIGNMENT (dr, 0);
  if (dump_enabled_p ())
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  2018-07-24 10:23:04.029010432 +0100
+++ gcc/tree-vect-loop-manip.c  2018-07-24 10:23:18.856878757 +0100
@@ -2772,9 +2772,9 @@ vect_create_cond_for_align_checks (loop_
tree *cond_expr,
   gimple_seq *cond_expr_stmt_list)
 {
-  vec<gimple *> may_misalign_stmts
+  vec<stmt_vec_info> may_misalign_stmts
 = LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo);
-  gimple *ref_stmt;
+  stmt_vec_info stmt_info;
   int mask = LOOP_VINFO_PTR_MASK (loop_vinfo);
   tree mask_cst;
   unsigned int i;
@@ -2795,23 +2795,22 @@ vect_create_cond_for_align_checks (loop_
   /* Create expression (mask & (dr_1 || ... || dr_n)) where dr_i is the address
  of the first vector of the i'th data reference. */
 
-  FOR_EACH_VEC_ELT (may_misalign_stmts, i, ref_stmt)
+  FOR_EACH_VEC_ELT (may_misalign_stmts, i, stmt_info)
 {
   gimple_seq new_stmt_list = NULL;
   tree addr_base;
   tree addr_tmp_name;
   tree new_or_tmp_name;
   gimple *addr_stmt, *or_stmt;
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (ref_stmt);
-  tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   bool negative = tree_int_cst_compare
-   (DR_STEP (STMT_VINFO_DATA_REF (stmt_vinfo)), size_zero_node) < 0;
+   (DR_STEP (STMT_VINFO_DATA_REF (stmt_info)), size_zero_node) < 0;
   tree offset = negative
? size_int (-TYPE_VECTOR_SUBPARTS (vectype) + 1) : size_zero_node;
 
   /* create: addr_tmp = (int)(address_of_first_vector) */
   addr_base =
-   vect_create_addr_base_for_vector_ref (ref_stmt, &new_stmt_list,
+   vect_create_addr_base_for_vector_ref (stmt_info, &new_stmt_list,
  offset);
   if (new_stmt_list != NULL)
gimple_seq_add_seq (cond_expr_stmt_list, new_stmt_list);


[40/46] Add vec_info::lookup_dr

2018-07-24 Thread Richard Sandiford
Previous patches got rid of a lot of calls to vect_dr_stmt.
This patch replaces the remaining ones with calls to a new
vec_info::lookup_dr function, so that the lookup is relative
to a particular vec_info rather than to global state.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::lookup_dr): New member function.
(vect_dr_stmt): Delete.
* tree-vectorizer.c (vec_info::lookup_dr): New function.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Use it instead
of vect_dr_stmt.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
(vect_analyze_data_ref_dependence, vect_record_base_alignments)
(vect_verify_datarefs_alignment, vect_peeling_supportable)
(vect_analyze_data_ref_accesses, vect_prune_runtime_alias_test_list)
(vect_analyze_data_refs): Likewise.
(vect_slp_analyze_data_ref_dependence): Likewise.  Take a vec_info
argument.
(vect_find_same_alignment_drs): Likewise.
(vect_slp_analyze_node_dependences): Update calls accordingly.
(vect_analyze_data_refs_alignment): Likewise.  Use vec_info::lookup_dr
instead of vect_dr_stmt.
(vect_get_peeling_costs_all_drs): Take a loop_vec_info instead
of a vector data references.  Use vec_info::lookup_dr instead of
vect_dr_stmt.
(vect_peeling_hash_get_lowest_cost): Update calls accordingly.
(vect_enhance_data_refs_alignment): Likewise.  Use vec_info::lookup_dr
instead of vect_dr_stmt.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:12.252404574 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:16.552366384 +0100
@@ -240,6 +240,7 @@ struct vec_info {
   stmt_vec_info lookup_stmt (gimple *);
   stmt_vec_info lookup_def (tree);
   stmt_vec_info lookup_single_use (tree);
+  stmt_vec_info lookup_dr (data_reference *);
 
   /* The type of vectorization.  */
   vec_kind kind;
@@ -1327,22 +1328,6 @@ vect_dr_behavior (stmt_vec_info stmt_inf
return &STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info);
 }
 
-/* Return the stmt DR is in.  For DR_STMT that have been replaced by
-   a pattern this returns the corresponding pattern stmt.  Otherwise
-   DR_STMT is returned.  */
-
-inline stmt_vec_info
-vect_dr_stmt (data_reference *dr)
-{
-  gimple *stmt = DR_STMT (dr);
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-return STMT_VINFO_RELATED_STMT (stmt_info);
-  /* DR_STMT should never refer to a stmt in a pattern replacement.  */
-  gcc_checking_assert (!STMT_VINFO_RELATED_STMT (stmt_info));
-  return stmt_info;
-}
-
 /* Return true if the vect cost model is unlimited.  */
 static inline bool
 unlimited_cost_model (loop_p loop)
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:22:30.401309046 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:24:16.552366384 +0100
@@ -562,6 +562,21 @@ vec_info::lookup_single_use (tree lhs)
   return NULL;
 }
 
+/* Return the stmt DR is in.  For DR_STMT that have been replaced by
+   a pattern this returns the corresponding pattern stmt.  Otherwise
+   it returns the information for DR_STMT itself.  */
+
+stmt_vec_info
+vec_info::lookup_dr (data_reference *dr)
+{
+  stmt_vec_info stmt_info = lookup_stmt (DR_STMT (dr));
+  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+return STMT_VINFO_RELATED_STMT (stmt_info);
+  /* DR_STMT should never refer to a stmt in a pattern replacement.  */
+  gcc_checking_assert (!STMT_VINFO_RELATED_STMT (stmt_info));
+  return stmt_info;
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  2018-07-24 10:24:12.248404609 +0100
+++ gcc/tree-vect-loop-manip.c  2018-07-24 10:24:16.552366384 +0100
@@ -1752,8 +1752,8 @@ vect_update_inits_of_drs (loop_vec_info
 
   FOR_EACH_VEC_ELT (datarefs, i, dr)
 {
-  gimple *stmt = DR_STMT (dr);
-  if (!STMT_VINFO_GATHER_SCATTER_P (vinfo_for_stmt (stmt)))
+  stmt_vec_info stmt_info = loop_vinfo->lookup_dr (dr);
+  if (!STMT_VINFO_GATHER_SCATTER_P (stmt_info))
vect_update_init_of_dr (dr, niters, code);
 }
 }
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:24:12.248404609 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:24:16.552366384 +0100
@@ -267,10 +267,10 @@ vect_analyze_possibly_independent_ddr (d
 
 Note that the alias checks will be removed if the VF ends up
 being small enough.  */
- return (!STMT_VINFO_GATHER_SCATTER_P
-(vinfo_for_stmt (DR_STMT (DDR_A (ddr
-  

[41/46] Add vec_info::remove_stmt

2018-07-24 Thread Richard Sandiford
This patch adds a new helper function for permanently removing a
statement and its associated stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::remove_stmt): Declare.
* tree-vectorizer.c (vec_info::remove_stmt): New function.
* tree-vect-loop-manip.c (vect_set_loop_condition): Use it.
* tree-vect-loop.c (vect_transform_loop): Likewise.
* tree-vect-slp.c (vect_schedule_slp): Likewise.
* tree-vect-stmts.c (vect_remove_stores): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:16.552366384 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:19.544339803 +0100
@@ -241,6 +241,7 @@ struct vec_info {
   stmt_vec_info lookup_def (tree);
   stmt_vec_info lookup_single_use (tree);
   stmt_vec_info lookup_dr (data_reference *);
+  void remove_stmt (stmt_vec_info);
 
   /* The type of vectorization.  */
   vec_kind kind;
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:24:16.552366384 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:24:19.544339803 +0100
@@ -577,6 +577,20 @@ vec_info::lookup_dr (data_reference *dr)
   return stmt_info;
 }
 
+/* Permanently remove the statement described by STMT_INFO from the
+   function.  */
+
+void
+vec_info::remove_stmt (stmt_vec_info stmt_info)
+{
+  gcc_assert (!stmt_info->pattern_stmt_p);
+  gimple_stmt_iterator si = gsi_for_stmt (stmt_info->stmt);
+  unlink_stmt_vdef (stmt_info->stmt);
  gsi_remove (&si, true);
+  release_defs (stmt_info->stmt);
+  free_stmt_vec_info (stmt_info);
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  2018-07-24 10:24:16.552366384 +0100
+++ gcc/tree-vect-loop-manip.c  2018-07-24 10:24:19.540339838 +0100
@@ -935,8 +935,12 @@ vect_set_loop_condition (struct loop *lo
  loop_cond_gsi);
 
   /* Remove old loop exit test.  */
-  gsi_remove (&loop_cond_gsi, true);
-  free_stmt_vec_info (orig_cond);
+  stmt_vec_info orig_cond_info;
+  if (loop_vinfo
+  && (orig_cond_info = loop_vinfo->lookup_stmt (orig_cond)))
+loop_vinfo->remove_stmt (orig_cond_info);
+  else
+    gsi_remove (&loop_cond_gsi, true);
 
   if (dump_enabled_p ())
 {
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:24:12.252404574 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:24:19.540339838 +0100
@@ -8487,28 +8487,18 @@ vect_transform_loop (loop_vec_info loop_
 vect_transform_loop_stmt (loop_vinfo, stmt_info, &si,
 &seen_store, &slp_scheduled);
}
+ gsi_next (&si);
  if (seen_store)
{
  if (STMT_VINFO_GROUPED_ACCESS (seen_store))
-   {
- /* Interleaving.  If IS_STORE is TRUE, the
-vectorization of the interleaving chain was
-completed - free all the stores in the chain.  */
- gsi_next (&si);
- vect_remove_stores (DR_GROUP_FIRST_ELEMENT (seen_store));
-   }
+   /* Interleaving.  If IS_STORE is TRUE, the
+  vectorization of the interleaving chain was
+  completed - free all the stores in the chain.  */
+   vect_remove_stores (DR_GROUP_FIRST_ELEMENT (seen_store));
  else
-   {
- /* Free the attached stmt_vec_info and remove the
-stmt.  */
- free_stmt_vec_info (stmt);
- unlink_stmt_vdef (stmt);
- gsi_remove (, true);
- release_defs (stmt);
-   }
+   /* Free the attached stmt_vec_info and remove the stmt.  */
+   loop_vinfo->remove_stmt (stmt_info);
}
- else
-   gsi_next (&si);
}
}
 
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-24 10:24:02.360492422 +0100
+++ gcc/tree-vect-slp.c 2018-07-24 10:24:19.540339838 +0100
@@ -4087,7 +4087,6 @@ vect_schedule_slp (vec_info *vinfo)
   slp_tree root = SLP_INSTANCE_TREE (instance);
   stmt_vec_info store_info;
   unsigned int j;
-  gimple_stmt_iterator gsi;
 
   /* Remove scalar call stmts.  Do not do this for basic-block
 vectorization as not all uses may be vectorized.
@@ -4108,11 +4107,7 @@ vect_schedule_slp (vec_info *vinfo)
  if 

[43/46] Make free_stmt_vec_info take a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch makes free_stmt_vec_info take the stmt_vec_info that
it's supposed to free and makes it free only that stmt_vec_info.
Callers need to update the statement mapping where necessary
(but now there are only a couple of callers).

This in turns means that we can leave ~vec_info to do the actual
freeing, since there's no longer a need to do it before resetting
the gimple_uids.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (free_stmt_vec_info): Take a stmt_vec_info
rather than a gimple stmt.
* tree-vect-stmts.c (free_stmt_vec_info): Likewise.  Don't free
information for pattern statements when passed the original
statement; instead wait to be passed the pattern statement itself.
Don't call set_vinfo_for_stmt here.
(free_stmt_vec_infos): Update call to free_stmt_vec_info.
* tree-vect-loop.c (_loop_vec_info::~loop_vec_info): Don't free
stmt_vec_infos here.
* tree-vect-slp.c (_bb_vec_info::~bb_vec_info): Likewise.
* tree-vectorizer.c (vec_info::remove_stmt): Nullify the statement's
stmt_vec_infos entry.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:22.684311906 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:26.084281700 +0100
@@ -1484,7 +1484,7 @@ extern bool supportable_narrowing_operat
 enum tree_code *,
 int *, vec<tree> *);
 extern stmt_vec_info new_stmt_vec_info (gimple *stmt, vec_info *);
-extern void free_stmt_vec_info (gimple *stmt);
+extern void free_stmt_vec_info (stmt_vec_info);
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
  enum vect_cost_for_stmt, stmt_vec_info,
  int, enum vect_cost_model_location);
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:24:22.684311906 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:24:26.084281700 +0100
@@ -9916,7 +9916,7 @@ free_stmt_vec_infos (vec<stmt_vec_info, va_heap> *v)
   stmt_vec_info info;
   FOR_EACH_VEC_ELT (*v, i, info)
 if (info != NULL_STMT_VEC_INFO)
-  free_stmt_vec_info (STMT_VINFO_STMT (info));
+  free_stmt_vec_info (info);
   if (v == stmt_vec_info_vec)
 stmt_vec_info_vec = NULL;
   v->release ();
@@ -9926,44 +9926,18 @@ free_stmt_vec_infos (vec<stmt_vec_info, va_heap> *v)
 /* Free stmt vectorization related info.  */
 
 void
-free_stmt_vec_info (gimple *stmt)
+free_stmt_vec_info (stmt_vec_info stmt_info)
 {
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-
-  if (!stmt_info)
-return;
-
-  /* Check if this statement has a related "pattern stmt"
- (introduced by the vectorizer during the pattern recognition
- pass).  Free pattern's stmt_vec_info and def stmt's stmt_vec_info
- too.  */
-  if (STMT_VINFO_IN_PATTERN_P (stmt_info))
+  if (stmt_info->pattern_stmt_p)
 {
-  if (gimple_seq seq = STMT_VINFO_PATTERN_DEF_SEQ (stmt_info))
-   for (gimple_stmt_iterator si = gsi_start (seq);
-!gsi_end_p (si); gsi_next (&si))
- {
-   gimple *seq_stmt = gsi_stmt (si);
-   gimple_set_bb (seq_stmt, NULL);
-   tree lhs = gimple_get_lhs (seq_stmt);
-   if (lhs && TREE_CODE (lhs) == SSA_NAME)
- release_ssa_name (lhs);
-   free_stmt_vec_info (seq_stmt);
- }
-  stmt_vec_info patt_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-  if (patt_stmt_info)
-   {
- gimple_set_bb (patt_stmt_info->stmt, NULL);
- tree lhs = gimple_get_lhs (patt_stmt_info->stmt);
- if (lhs && TREE_CODE (lhs) == SSA_NAME)
-   release_ssa_name (lhs);
- free_stmt_vec_info (patt_stmt_info);
-   }
+  gimple_set_bb (stmt_info->stmt, NULL);
+  tree lhs = gimple_get_lhs (stmt_info->stmt);
+  if (lhs && TREE_CODE (lhs) == SSA_NAME)
+   release_ssa_name (lhs);
 }
 
   STMT_VINFO_SAME_ALIGN_REFS (stmt_info).release ();
   STMT_VINFO_SIMD_CLONE_INFO (stmt_info).release ();
-  set_vinfo_for_stmt (stmt, NULL);
   free (stmt_info);
 }
 
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:24:19.540339838 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:24:26.080281735 +0100
@@ -894,9 +894,6 @@ _loop_vec_info::~_loop_vec_info ()
   for (j = 0; j < nbbs; j++)
 {
   basic_block bb = bbs[j];
-  for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
-free_stmt_vec_info (gsi_stmt (si));
-
   for (si = gsi_start_bb (bb); !gsi_end_p (si); )
 {
  gimple *stmt = gsi_stmt (si);
@@ -936,9 +933,6 @@ _loop_vec_info::~_loop_vec_info ()
}
}
}
-
- /* Free stmt_vec_info.  */
- free_stmt_vec_info (stmt);
   

[42/46] Add vec_info::replace_stmt

2018-07-24 Thread Richard Sandiford
This patch adds a helper for replacing a stmt_vec_info's statement with
a new statement.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::replace_stmt): Declare.
* tree-vectorizer.c (vec_info::replace_stmt): New function.
* tree-vect-slp.c (vect_remove_slp_scalar_calls): Use it.
* tree-vect-stmts.c (vectorizable_call): Likewise.
(vectorizable_simd_clone_call): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:19.544339803 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:22.684311906 +0100
@@ -242,6 +242,7 @@ struct vec_info {
   stmt_vec_info lookup_single_use (tree);
   stmt_vec_info lookup_dr (data_reference *);
   void remove_stmt (stmt_vec_info);
+  void replace_stmt (gimple_stmt_iterator *, stmt_vec_info, gimple *);
 
   /* The type of vectorization.  */
   vec_kind kind;
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:24:19.544339803 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:24:22.684311906 +0100
@@ -591,6 +591,22 @@ vec_info::remove_stmt (stmt_vec_info stm
   free_stmt_vec_info (stmt_info);
 }
 
+/* Replace the statement at GSI by NEW_STMT, both the vectorization
+   information and the function itself.  STMT_INFO describes the statement
+   at GSI.  */
+
+void
+vec_info::replace_stmt (gimple_stmt_iterator *gsi, stmt_vec_info stmt_info,
+   gimple *new_stmt)
+{
+  gimple *old_stmt = stmt_info->stmt;
+  gcc_assert (!stmt_info->pattern_stmt_p && old_stmt == gsi_stmt (*gsi));
+  set_vinfo_for_stmt (old_stmt, NULL);
+  set_vinfo_for_stmt (new_stmt, stmt_info);
+  stmt_info->stmt = new_stmt;
+  gsi_replace (gsi, new_stmt, true);
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c 2018-07-24 10:24:19.540339838 +0100
+++ gcc/tree-vect-slp.c 2018-07-24 10:24:22.680311942 +0100
@@ -4048,11 +4048,8 @@ vect_remove_slp_scalar_calls (slp_tree n
continue;
   lhs = gimple_call_lhs (stmt);
   new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
-  set_vinfo_for_stmt (new_stmt, stmt_info);
-  set_vinfo_for_stmt (stmt, NULL);
-  STMT_VINFO_STMT (stmt_info) = new_stmt;
   gsi = gsi_for_stmt (stmt);
-  gsi_replace (&gsi, new_stmt, false);
+  stmt_info->vinfo->replace_stmt (&gsi, stmt_info, new_stmt);
   SSA_NAME_DEF_STMT (gimple_assign_lhs (new_stmt)) = new_stmt;
 }
 }
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:24:19.544339803 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:24:22.684311906 +0100
@@ -3629,10 +3629,7 @@ vectorizable_call (stmt_vec_info stmt_in
 
   gassign *new_stmt
 = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
-  set_vinfo_for_stmt (new_stmt, stmt_info);
-  set_vinfo_for_stmt (stmt_info->stmt, NULL);
-  STMT_VINFO_STMT (stmt_info) = new_stmt;
-  gsi_replace (gsi, new_stmt, false);
+  vinfo->replace_stmt (gsi, stmt_info, new_stmt);
 
   return true;
 }
@@ -4370,10 +4367,7 @@ vectorizable_simd_clone_call (stmt_vec_i
 }
   else
 new_stmt = gimple_build_nop ();
-  set_vinfo_for_stmt (new_stmt, stmt_info);
-  set_vinfo_for_stmt (stmt, NULL);
-  STMT_VINFO_STMT (stmt_info) = new_stmt;
-  gsi_replace (gsi, new_stmt, true);
+  vinfo->replace_stmt (gsi, stmt_info, new_stmt);
   unlink_stmt_vdef (stmt);
 
   return true;


[46/46] Turn stmt_vec_info back into a typedef

2018-07-24 Thread Richard Sandiford
This patch removes the stmt_vec_info wrapper class added near the
beginning of the series and turns stmt_vec_info back into a typedef.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (stmt_vec_info): Turn back into a typedef.
(NULL_STMT_VEC_INFO): Delete.
(stmt_vec_info::operator*): Likewise.
(stmt_vec_info::operator gimple *): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Use NULL instead
of NULL_STMT_VEC_INFO.
* tree-vect-patterns.c (vect_init_pattern_stmt): Likewise.
(vect_reassociating_reduction_p): Likewise.
* tree-vect-stmts.c (vect_build_gather_load_calls): Likewise.
(vectorizable_store): Likewise.
* tree-vectorizer.c (vec_info::set_vinfo_for_stmt): Likewise.
(vec_info::free_stmt_vec_infos): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:32.472224947 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:35.888194598 +0100
@@ -21,26 +21,7 @@ Software Foundation; either version 3, o
 #ifndef GCC_TREE_VECTORIZER_H
 #define GCC_TREE_VECTORIZER_H
 
-class stmt_vec_info {
-public:
-  stmt_vec_info () {}
-  stmt_vec_info (struct _stmt_vec_info *ptr) : m_ptr (ptr) {}
-  struct _stmt_vec_info *operator-> () const { return m_ptr; }
-  struct _stmt_vec_info &operator* () const;
-  operator struct _stmt_vec_info * () const { return m_ptr; }
-  operator gimple * () const;
-  operator void * () const { return m_ptr; }
-  operator bool () const { return m_ptr; }
-  bool operator == (const stmt_vec_info &x) { return x.m_ptr == m_ptr; }
-  bool operator == (_stmt_vec_info *x) { return x == m_ptr; }
-  bool operator != (const stmt_vec_info &x) { return x.m_ptr != m_ptr; }
-  bool operator != (_stmt_vec_info *x) { return x != m_ptr; }
-
-private:
-  struct _stmt_vec_info *m_ptr;
-};
-
-#define NULL_STMT_VEC_INFO (stmt_vec_info (NULL))
+typedef struct _stmt_vec_info *stmt_vec_info;
 
 #include "tree-data-ref.h"
 #include "tree-hash-traits.h"
@@ -1080,17 +1061,6 @@ #define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE)
&& TYPE_PRECISION (TYPE) == 1   \
&& TYPE_UNSIGNED (TYPE)))
 
-inline _stmt_vec_info &
-stmt_vec_info::operator* () const
-{
-  return *m_ptr;
-}
-
-inline stmt_vec_info::operator gimple * () const
-{
-  return m_ptr ? m_ptr->stmt : NULL;
-}
-
 static inline bool
 nested_in_vect_loop_p (struct loop *loop, stmt_vec_info stmt_info)
 {
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:24:29.296253164 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:24:35.884194634 +0100
@@ -6755,7 +6755,7 @@ vectorizable_reduction (stmt_vec_info st
   if (slp_node)
 neutral_op = neutral_op_for_slp_reduction
   (slp_node_instance->reduc_phis, code,
-   REDUC_GROUP_FIRST_ELEMENT (stmt_info) != NULL_STMT_VEC_INFO);
+   REDUC_GROUP_FIRST_ELEMENT (stmt_info) != NULL);
 
   if (double_reduc && reduction_type == FOLD_LEFT_REDUCTION)
 {
Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-24 10:24:02.360492422 +0100
+++ gcc/tree-vect-patterns.c2018-07-24 10:24:35.884194634 +0100
@@ -104,7 +104,7 @@ vect_init_pattern_stmt (gimple *pattern_
 {
   vec_info *vinfo = orig_stmt_info->vinfo;
   stmt_vec_info pattern_stmt_info = vinfo->lookup_stmt (pattern_stmt);
-  if (pattern_stmt_info == NULL_STMT_VEC_INFO)
+  if (pattern_stmt_info == NULL)
 pattern_stmt_info = orig_stmt_info->vinfo->add_stmt (pattern_stmt);
   gimple_set_bb (pattern_stmt, gimple_bb (orig_stmt_info->stmt));
 
@@ -819,7 +819,7 @@ vect_reassociating_reduction_p (stmt_vec
 {
   return (STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def
  ? STMT_VINFO_REDUC_TYPE (stmt_vinfo) != FOLD_LEFT_REDUCTION
- : REDUC_GROUP_FIRST_ELEMENT (stmt_vinfo) != NULL_STMT_VEC_INFO);
+ : REDUC_GROUP_FIRST_ELEMENT (stmt_vinfo) != NULL);
 }
 
 /* As above, but also require it to have code CODE and to be a reduction
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:24:29.300253129 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:24:35.888194598 +0100
@@ -2842,7 +2842,7 @@ vect_build_gather_load_calls (stmt_vec_i
  new_stmt_info = loop_vinfo->lookup_def (var);
}
 
-  if (prev_stmt_info == NULL_STMT_VEC_INFO)
+  if (prev_stmt_info == NULL)
STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt_info;
   else
STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt_info;
@@ -6574,7 +6574,7 @@ vectorizable_store (stmt_vec_info stmt_i
  stmt_vec_info new_stmt_info
= vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
 
- if (prev_stmt_info == NULL_STMT_VEC_INFO)
+ if 

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-24 Thread tamar . christina
Hi Jeff,

This patch defines a configure option to allow the setting of the default
guard size via configure flags when building the target.

The new flag is:

 * --with-stack-clash-protection-guard-size=

The patch defines a new macro DEFAULT_STK_CLASH_GUARD_SIZE which targets need
to use explicitly is they want to support this configure flag and values that
users may have set.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.
Both targets were tested with stack clash on and off by default.

Ok for trunk?

Thanks,
Tamar

gcc/
2018-07-24  Tamar Christina  

PR target/86486
* configure.ac: Add stack-clash-protection-guard-size.
* config.in (DEFAULT_STK_CLASH_GUARD_SIZE): New.
* params.def: Update comment for guard-size.
* configure: Regenerate.

> -Original Message-
> From: Jeff Law 
> Sent: Monday, July 23, 2018 23:19
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ;
> jos...@codesourcery.com; bonz...@gnu.org; d...@redhat.com;
> nero...@gcc.gnu.org; aol...@redhat.com; ralf.wildenh...@gmx.de
> Subject: Re: [PATCH][GCC][front-end][build-machinery][opt-framework]
> Allow setting of stack-clash via configure options. [Patch (4/6)]
> 
> On 07/20/2018 09:39 AM, Tamar Christina wrote:
> >>
> >> On 07/20/2018 05:03 AM, Tamar Christina wrote:
>  Understood.  Thanks for verifying.  I wonder if we could just bury
>  this entirely in the aarch64 config files and not expose the
>  default into
> >> params.def?
> 
> >>>
> >>> Burying it in config.gcc isn't ideal because if your C runtime is
> >>> configurable (like uClibc) it means you have to go and modify this
> >>> file every time you change something. If the argument is against
> >>> having these defines in the params and not the configure flag itself
> >>> then I
> >> can just have an aarch64 specific configure flag and just use the
> >> created define directly in the AArch64 back-end.
> >> Not config.gcc, but in a .h/.c file for the target.
> >>
> >> If we leave the generic option, but override the default in the target 
> >> files.
> >> Would that work?
> >
> > So leaving the generic configure option? Yes that would work too. The
> > only downside is that if we have want to do any validation on the
> > value at configure time it would need to be manually kept in sync with
> > those in params.def. Or we'd just have to not do any checking at
> > configure time.  This would mean you can get to the end of your build and
> only when you try to use the compiler would it complain.
> >
> > Both aren't a real deal breaker to me.
> >
> > Shall I then just leave the configure flag but remove the params plumbing?
> Yea, I think any sanity check essentially has to move to when the compiler
> runs.  We can always return to param removal at a later point.
> 
> Can you post an updated patch?  Note I'm on PTO starting Wed for a week.
>  If you post it tomorrow I'll try to take a look before I disappear.
> 
> jeff
diff --git a/gcc/config.in b/gcc/config.in
index 2856e72d627df537a301a6c7ab6b5bbb75f6b43f..f3b301ef5afdaf0db8865e11601980f19ea0b3dd 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -55,6 +55,12 @@
 #endif
 
 
+/* Define to larger than zero set the default stack clash protector size. */
+#ifndef USED_FOR_TARGET
+#undef DEFAULT_STK_CLASH_GUARD_SIZE
+#endif
+
+
 /* Define if you want to use __cxa_atexit, rather than atexit, to register C++
destructors for local statics and global objects. This is essential for
fully standards-compliant handling of destructors, but requires
diff --git a/gcc/configure b/gcc/configure
index 60d373982fd38fe51c285e2b02941754d1b833d6..42ec5b536bee90adb319d172eb7cca1a363a87b6 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -905,6 +905,7 @@ enable_valgrind_annotations
 with_stabs
 enable_multilib
 enable_multiarch
+with_stack_clash_protection_guard_size
 enable___cxa_atexit
 enable_decimal_float
 enable_fixed_point
@@ -1724,6 +1725,9 @@ Optional Packages:
   --with-gnu-as   arrange to work with GNU as
   --with-as   arrange to use the specified as (full pathname)
  --with-stabs            arrange to use stabs instead of host debug format
+  --with-stack-clash-protection-guard-size=size
+  Set the default stack clash protection guard size
+  for specific targets.
   --with-dwarf2   force the default debug format to be DWARF 2
   --with-specs=SPECS  add SPECS to driver command-line processing
   --with-pkgversion=PKG   Use PKG in the version string in place of "GCC"
@@ -7436,6 +7440,35 @@ $as_echo "$enable_multiarch$ma_msg_suffix" >&6; }
 
 
 
+# default stack clash protection guard size
+# Please keep these in sync with params.def.
+stk_clash_min=12
+stk_clash_max=30
+stk_clash_default=12
+
+# Keep the default value when the option is not used to 0, this allows us to
+# distinguish between the cases where the user specifically set a value via
+# 

Re: [PATCH] Fix up pr19476-{1,5}.C (PR testsuite/86649)

2018-07-24 Thread Jakub Jelinek
On Tue, Jul 24, 2018 at 12:08:35PM +0200, Richard Biener wrote:
> OK - can you add a variant with -O2 that tests it at EVRP time then?

Here is what I've committed to trunk then:

2018-07-24  Jakub Jelinek  

PR testsuite/86649
* g++.dg/tree-ssa/pr19476-1.C: Check dom2 dump instead of ccp1.
* g++.dg/tree-ssa/pr19476-5.C: Likewise.
* g++.dg/tree-ssa/pr19476-6.C: New test.
* g++.dg/tree-ssa/pr19476-7.C: New test.

--- gcc/testsuite/g++.dg/tree-ssa/pr19476-1.C.jj	2015-05-29 15:04:33.037803445 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr19476-1.C	2018-07-24 11:39:10.108897097 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp1 -fdelete-null-pointer-checks" } */
+/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
 /* { dg-skip-if "" keeps_null_pointer_checks } */
 
 // See pr19476-5.C for a version without including <new>.
@@ -12,5 +12,5 @@ int g(){
   return 42 + (0 == new int[50]);
 }
 
-/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
-/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
+/* { dg-final { scan-tree-dump "return 42" "dom2" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "dom2" } } */
--- gcc/testsuite/g++.dg/tree-ssa/pr19476-5.C.jj	2015-05-29 15:04:33.038803430 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr19476-5.C	2018-07-24 11:39:26.190913802 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp1 -fdelete-null-pointer-checks" } */
+/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
 /* { dg-skip-if "" keeps_null_pointer_checks } */
 
 // See pr19476-1.C for a version that includes <new>.
@@ -8,4 +8,4 @@ int g(){
   return 42 + (0 == new int[50]);
 }
 
-/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
+/* { dg-final { scan-tree-dump "return 42" "dom2" } } */
--- gcc/testsuite/g++.dg/tree-ssa/pr19476-6.C.jj	2018-07-24 12:09:30.321890628 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr19476-6.C	2018-07-24 12:10:49.812987922 +0200
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdelete-null-pointer-checks" } */
+/* { dg-skip-if "" keeps_null_pointer_checks } */
+
+// See pr19476-7.C for a version without including <new>.
+#include <new>
+
+int f(){
+  return 33 + (0 == new(std::nothrow) int);
+}
+int g(){
+  return 42 + (0 == new int[50]);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "evrp" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "evrp" } } */
--- gcc/testsuite/g++.dg/tree-ssa/pr19476-7.C.jj	2018-07-24 12:09:33.034893945 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr19476-7.C	2018-07-24 12:11:03.657004866 +0200
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp -fdelete-null-pointer-checks" } */
+/* { dg-skip-if "" keeps_null_pointer_checks } */
+
+// See pr19476-6.C for a version that includes <new>.
+
+int g(){
+  return 42 + (0 == new int[50]);
+}
+
+/* { dg-final { scan-tree-dump "return 42" "evrp" } } */


Jakub


Re: [RFC][debug] Add -fadd-debug-nops

2018-07-24 Thread Tom de Vries
On 07/16/2018 05:10 PM, Tom de Vries wrote:
> On 07/16/2018 03:50 PM, Richard Biener wrote:
>> On Mon, 16 Jul 2018, Tom de Vries wrote:
>>> Any comments?
>>
>> Interesting idea.  I wonder if that should be generalized
>> to other places
> 
> I kept the option name general, to allow for that.
> 
> And indeed, this is a point-fix patch. I've been playing around with a
> more generic patch that adds nops such that each is_stmt .loc is
> associated with a unique insn, but that was embedded in an
> fkeep-vars-live branch, so this patch is minimally addressing the first
> problem I managed to reproduce on trunk.
> 
>> and how we can avoid compare-debug failures
>> (var-tracking usually doesn't change code-generation).
>>
> 

I'll post this patch series (the current state of my fkeep-vars-live
branch) in reply to this email:

 1  [debug] Add fdebug-nops
 2  [debug] Add fkeep-vars-live
 3  [debug] Add fdebug-nops and fkeep-vars-live to Og only

Bootstrapped and reg-tested on x86_64. ChangeLog entries and function
header comments missing.

Comments welcome.

Thanks,
- Tom


[09/46] Add vec_info::lookup_single_use

2018-07-24 Thread Richard Sandiford
This patch adds a helper function for seeing whether there is a single
user of an SSA name, and whether that user has a stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::lookup_single_use): Declare.
* tree-vectorizer.c (vec_info::lookup_single_use): New function.
* tree-vect-loop.c (vectorizable_reduction): Use it instead of
a single_imm_use-based sequence.
* tree-vect-stmts.c (supportable_widening_operation): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:27.285336715 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:30.401309046 +0100
@@ -220,6 +220,7 @@ struct vec_info {
   stmt_vec_info add_stmt (gimple *);
   stmt_vec_info lookup_stmt (gimple *);
   stmt_vec_info lookup_def (tree);
+  stmt_vec_info lookup_single_use (tree);
 
   /* The type of vectorization.  */
   vec_kind kind;
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:22:27.285336715 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:22:30.401309046 +0100
@@ -548,6 +548,20 @@ vec_info::lookup_def (tree name)
   return NULL;
 }
 
+/* See whether there is a single non-debug statement that uses LHS and
+   whether that statement has an associated stmt_vec_info.  Return the
+   stmt_vec_info if so, otherwise return null.  */
+
+stmt_vec_info
+vec_info::lookup_single_use (tree lhs)
+{
+  use_operand_p dummy;
+  gimple *use_stmt;
+  if (single_imm_use (lhs, &dummy, &use_stmt))
+    return lookup_stmt (use_stmt);
+  return NULL;
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c	2018-07-24 10:22:27.277336786 +0100
+++ gcc/tree-vect-loop.c	2018-07-24 10:22:30.401309046 +0100
@@ -6138,6 +6138,7 @@ vectorizable_reduction (gimple *stmt, gi
 
   if (gimple_code (stmt) == GIMPLE_PHI)
 {
+  tree phi_result = gimple_phi_result (stmt);
   /* Analysis is fully done on the reduction stmt invocation.  */
   if (! vec_stmt)
{
@@ -6158,7 +6159,8 @@ vectorizable_reduction (gimple *stmt, gi
   if (STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (reduc_stmt)))
reduc_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (reduc_stmt));
 
-  if (STMT_VINFO_VEC_REDUCTION_TYPE (vinfo_for_stmt (reduc_stmt))
+  stmt_vec_info reduc_stmt_info = vinfo_for_stmt (reduc_stmt);
+  if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
  == EXTRACT_LAST_REDUCTION)
/* Leave the scalar phi in place.  */
return true;
@@ -6185,15 +6187,12 @@ vectorizable_reduction (gimple *stmt, gi
   else
ncopies = vect_get_num_copies (loop_vinfo, vectype_in);
 
-  use_operand_p use_p;
-  gimple *use_stmt;
+  stmt_vec_info use_stmt_info;
   if (ncopies > 1
- && (STMT_VINFO_RELEVANT (vinfo_for_stmt (reduc_stmt))
- <= vect_used_only_live)
- && single_imm_use (gimple_phi_result (stmt), &use_p, &use_stmt)
- && (use_stmt == reduc_stmt
- || (STMT_VINFO_RELATED_STMT (vinfo_for_stmt (use_stmt))
- == reduc_stmt)))
+ && STMT_VINFO_RELEVANT (reduc_stmt_info) <= vect_used_only_live
+ && (use_stmt_info = loop_vinfo->lookup_single_use (phi_result))
+ && (use_stmt_info == reduc_stmt_info
+ || STMT_VINFO_RELATED_STMT (use_stmt_info) == reduc_stmt))
single_defuse_cycle = true;
 
   /* Create the destination vector  */
@@ -6955,13 +6954,13 @@ vectorizable_reduction (gimple *stmt, gi
This only works when we see both the reduction PHI and its only consumer
in vectorizable_reduction and there are no intermediate stmts
participating.  */
-  use_operand_p use_p;
-  gimple *use_stmt;
+  stmt_vec_info use_stmt_info;
+  tree reduc_phi_result = gimple_phi_result (reduc_def_stmt);
   if (ncopies > 1
   && (STMT_VINFO_RELEVANT (stmt_info) <= vect_used_only_live)
-  && single_imm_use (gimple_phi_result (reduc_def_stmt), &use_p, &use_stmt)
-  && (use_stmt == stmt
- || STMT_VINFO_RELATED_STMT (vinfo_for_stmt (use_stmt)) == stmt))
+  && (use_stmt_info = loop_vinfo->lookup_single_use (reduc_phi_result))
+  && (use_stmt_info == stmt_info
+ || STMT_VINFO_RELATED_STMT (use_stmt_info) == stmt))
 {
   single_defuse_cycle = true;
   epilog_copies = 1;
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:22:27.281336751 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:22:30.401309046 +0100
@@ -10310,14 +10310,11 @@ supportable_widening_operation (enum tre
  same operation.  One such an example is s += a * b, where elements
   

[08/46] Add vec_info::lookup_def

2018-07-24 Thread Richard Sandiford
This patch adds a vec_info helper for checking whether an operand is an
SSA_NAME that is defined in the vectorisable region.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vec_info::lookup_def): Declare.
* tree-vectorizer.c (vec_info::lookup_def): New function.
* tree-vect-patterns.c (vect_get_internal_def): Use it.
(vect_widened_op_tree): Likewise.
* tree-vect-stmts.c (vect_is_simple_use): Likewise.
* tree-vect-loop.c (vect_analyze_loop_operations): Likewise.
(vectorizable_reduction): Likewise.
(vect_valid_reduction_input_p): Take a stmt_vec_info instead
of a gimple *.
(vect_is_slp_reduction): Update calls accordingly.  Use
vec_info::lookup_def.
(vect_is_simple_reduction): Likewise
* tree-vect-slp.c (vect_detect_hybrid_slp_1): Use vec_info::lookup_def.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:23.797367688 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:27.285336715 +0100
@@ -219,6 +219,7 @@ struct vec_info {
 
   stmt_vec_info add_stmt (gimple *);
   stmt_vec_info lookup_stmt (gimple *);
+  stmt_vec_info lookup_def (tree);
 
   /* The type of vectorization.  */
   vec_kind kind;
Index: gcc/tree-vectorizer.c
===
--- gcc/tree-vectorizer.c   2018-07-24 10:22:23.797367688 +0100
+++ gcc/tree-vectorizer.c   2018-07-24 10:22:27.285336715 +0100
@@ -535,6 +535,19 @@ vec_info::lookup_stmt (gimple *stmt)
   return NULL;
 }
 
+/* If NAME is an SSA_NAME and its definition has an associated stmt_vec_info,
+   return that stmt_vec_info, otherwise return null.  It is safe to call
+   this on arbitrary operands.  */
+
+stmt_vec_info
+vec_info::lookup_def (tree name)
+{
+  if (TREE_CODE (name) == SSA_NAME
+  && !SSA_NAME_IS_DEFAULT_DEF (name))
+return lookup_stmt (SSA_NAME_DEF_STMT (name));
+  return NULL;
+}
+
 /* A helper function to free scev and LOOP niter information, as well as
clear loop constraint LOOP_C_FINITE.  */
 
Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c	2018-07-24 10:22:23.793367723 +0100
+++ gcc/tree-vect-patterns.c	2018-07-24 10:22:27.281336751 +0100
@@ -227,14 +227,11 @@ vect_element_precision (unsigned int pre
 static stmt_vec_info
 vect_get_internal_def (vec_info *vinfo, tree op)
 {
-  vect_def_type dt;
-  gimple *def_stmt;
-  if (TREE_CODE (op) != SSA_NAME
-  || !vect_is_simple_use (op, vinfo, &dt, &def_stmt)
-  || dt != vect_internal_def)
-return NULL;
-
-  return vinfo_for_stmt (def_stmt);
+  stmt_vec_info def_stmt_info = vinfo->lookup_def (op);
+  if (def_stmt_info
+  && STMT_VINFO_DEF_TYPE (def_stmt_info) == vect_internal_def)
+return def_stmt_info;
+  return NULL;
 }
 
 /* Check whether NAME, an ssa-name used in USE_STMT,
@@ -528,6 +525,7 @@ vect_widened_op_tree (stmt_vec_info stmt
  vect_unpromoted_value *unprom, tree *common_type)
 {
   /* Check for an integer operation with the right code.  */
+  vec_info *vinfo = stmt_info->vinfo;
  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
   if (!assign)
 return 0;
@@ -584,7 +582,7 @@ vect_widened_op_tree (stmt_vec_info stmt
 
  /* Recursively process the definition of the operand.  */
  stmt_vec_info def_stmt_info
-   = vinfo_for_stmt (SSA_NAME_DEF_STMT (this_unprom->op));
+   = vinfo->lookup_def (this_unprom->op);
  nops = vect_widened_op_tree (def_stmt_info, code, widened_code,
   shift_p, max_nops, this_unprom,
   common_type);
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:22:23.797367688 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:22:27.281336751 +0100
@@ -10092,11 +10092,11 @@ vect_is_simple_use (tree operand, vec_in
   else
 {
   gimple *def_stmt = SSA_NAME_DEF_STMT (operand);
-  if (! vect_stmt_in_region_p (vinfo, def_stmt))
+  stmt_vec_info stmt_vinfo = vinfo->lookup_def (operand);
+  if (!stmt_vinfo)
*dt = vect_external_def;
   else
{
- stmt_vec_info stmt_vinfo = vinfo_for_stmt (def_stmt);
  if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
{
  def_stmt = STMT_VINFO_RELATED_STMT (stmt_vinfo);
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c	2018-07-24 10:22:23.793367723 +0100
+++ gcc/tree-vect-loop.c	2018-07-24 10:22:27.277336786 +0100
@@ -1569,26 +1569,19 @@ vect_analyze_loop_operations (loop_vec_i
   if (STMT_VINFO_RELEVANT_P (stmt_info))
 {
   tree phi_op;
-

[16/46] Make STMT_VINFO_REDUC_DEF a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch changes STMT_VINFO_REDUC_DEF from a gimple stmt to a
stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::reduc_def): Change from
a gimple stmt to a stmt_vec_info.
* tree-vect-loop.c (vect_active_double_reduction_p)
(vect_force_simple_reduction, vectorizable_reduction): Update
accordingly.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:50.777128110 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:53.909100298 +0100
@@ -921,7 +921,7 @@ struct _stmt_vec_info {
   /* On a reduction PHI the def returned by vect_force_simple_reduction.
  On the def returned by vect_force_simple_reduction the
  corresponding PHI.  */
-  gimple *reduc_def;
+  stmt_vec_info reduc_def;
 
   /* The number of scalar stmt references from active SLP instances.  */
   unsigned int num_slp_uses;
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c	2018-07-24 10:22:50.777128110 +0100
+++ gcc/tree-vect-loop.c	2018-07-24 10:22:53.909100298 +0100
@@ -1499,8 +1499,7 @@ vect_active_double_reduction_p (stmt_vec
   if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def)
 return false;
 
-  gimple *other_phi = STMT_VINFO_REDUC_DEF (stmt_info);
-  return STMT_VINFO_RELEVANT_P (vinfo_for_stmt (other_phi));
+  return STMT_VINFO_RELEVANT_P (STMT_VINFO_REDUC_DEF (stmt_info));
 }
 
 /* Function vect_analyze_loop_operations.
@@ -3293,12 +3292,12 @@ vect_force_simple_reduction (loop_vec_in
   &v_reduc_type);
   if (def)
 {
-  stmt_vec_info reduc_def_info = vinfo_for_stmt (phi);
-  STMT_VINFO_REDUC_TYPE (reduc_def_info) = v_reduc_type;
-  STMT_VINFO_REDUC_DEF (reduc_def_info) = def;
-  reduc_def_info = vinfo_for_stmt (def);
-  STMT_VINFO_REDUC_TYPE (reduc_def_info) = v_reduc_type;
-  STMT_VINFO_REDUC_DEF (reduc_def_info) = phi;
+  stmt_vec_info phi_info = vinfo_for_stmt (phi);
+  stmt_vec_info def_info = vinfo_for_stmt (def);
+  STMT_VINFO_REDUC_TYPE (phi_info) = v_reduc_type;
+  STMT_VINFO_REDUC_DEF (phi_info) = def_info;
+  STMT_VINFO_REDUC_TYPE (def_info) = v_reduc_type;
+  STMT_VINFO_REDUC_DEF (def_info) = phi_info;
 }
   return def;
 }
@@ -6153,17 +6152,16 @@ vectorizable_reduction (gimple *stmt, gi
   for reductions involving a single statement.  */
return true;
 
-  gimple *reduc_stmt = STMT_VINFO_REDUC_DEF (stmt_info);
-  if (STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (reduc_stmt)))
-   reduc_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (reduc_stmt));
+  stmt_vec_info reduc_stmt_info = STMT_VINFO_REDUC_DEF (stmt_info);
+  if (STMT_VINFO_IN_PATTERN_P (reduc_stmt_info))
+   reduc_stmt_info = STMT_VINFO_RELATED_STMT (reduc_stmt_info);
 
-  stmt_vec_info reduc_stmt_info = vinfo_for_stmt (reduc_stmt);
   if (STMT_VINFO_VEC_REDUCTION_TYPE (reduc_stmt_info)
  == EXTRACT_LAST_REDUCTION)
/* Leave the scalar phi in place.  */
return true;
 
-  gcc_assert (is_gimple_assign (reduc_stmt));
+  gassign *reduc_stmt = as_a <gassign *> (reduc_stmt_info->stmt);
   for (unsigned k = 1; k < gimple_num_ops (reduc_stmt); ++k)
{
  tree op = gimple_op (reduc_stmt, k);
@@ -6314,7 +6312,7 @@ vectorizable_reduction (gimple *stmt, gi
  The last use is the reduction variable.  In case of nested cycle this
  assumption is not true: we use reduc_index to record the index of the
  reduction variable.  */
-  gimple *reduc_def_stmt = NULL;
+  stmt_vec_info reduc_def_info = NULL;
   int reduc_index = -1;
   for (i = 0; i < op_type; i++)
 {
@@ -6329,7 +6327,7 @@ vectorizable_reduction (gimple *stmt, gi
   gcc_assert (is_simple_use);
   if (dt == vect_reduction_def)
{
- reduc_def_stmt = def_stmt_info;
+ reduc_def_info = def_stmt_info;
  reduc_index = i;
  continue;
}
@@ -6353,7 +6351,7 @@ vectorizable_reduction (gimple *stmt, gi
   if (dt == vect_nested_cycle)
{
  found_nested_cycle_def = true;
- reduc_def_stmt = def_stmt_info;
+ reduc_def_info = def_stmt_info;
  reduc_index = i;
}
 
@@ -6391,12 +6389,16 @@ vectorizable_reduction (gimple *stmt, gi
}
 
   if (orig_stmt_info)
-   reduc_def_stmt = STMT_VINFO_REDUC_DEF (orig_stmt_info);
+   reduc_def_info = STMT_VINFO_REDUC_DEF (orig_stmt_info);
   else
-   reduc_def_stmt = STMT_VINFO_REDUC_DEF (stmt_info);
+   reduc_def_info = STMT_VINFO_REDUC_DEF (stmt_info);
 }
 
-  if (! reduc_def_stmt || gimple_code (reduc_def_stmt) != GIMPLE_PHI)
+  if (! reduc_def_info)
+return false;
+
+  gphi *reduc_def_phi = dyn_cast <gphi *> (reduc_def_info->stmt);
+  if (!reduc_def_phi)
 return false;
 
   if (!(reduc_index == -1

[17/46] Make LOOP_VINFO_REDUCTIONS an auto_vec

2018-07-24 Thread Richard Sandiford
This patch changes LOOP_VINFO_REDUCTIONS from an auto_vec<gimple *>
to an auto_vec<stmt_vec_info>.  It also changes the associated
vect_force_simple_reduction so that it takes and returns stmt_vec_infos
instead of gimple stmts.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_loop_vec_info::reductions): Change from an
auto_vec<gimple *> to an auto_vec<stmt_vec_info>.
(vect_force_simple_reduction): Take and return stmt_vec_infos rather
than gimple stmts.
* tree-parloops.c (valid_reduction_p): Take a stmt_vec_info instead
of a gimple stmt.
(gather_scalar_reductions): Update after above interface changes.
* tree-vect-loop.c (vect_analyze_scalar_cycles_1): Likewise.
(vect_is_simple_reduction): Take and return stmt_vec_infos rather
than gimple stmts.
(vect_force_simple_reduction): Likewise.
* tree-vect-patterns.c (vect_pattern_recog_1): Update use of
LOOP_VINFO_REDUCTIONS.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:53.909100298 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:57.277070390 +0100
@@ -475,7 +475,7 @@ typedef struct _loop_vec_info : public v
   auto_vec may_misalign_stmts;
 
   /* Reduction cycles detected in the loop. Used in loop-aware SLP.  */
-  auto_vec<gimple *> reductions;
+  auto_vec<stmt_vec_info> reductions;
 
   /* All reduction chains in the loop, represented by the first
  stmt in the chain.  */
@@ -1627,8 +1627,8 @@ extern tree vect_create_addr_base_for_ve
 
 /* In tree-vect-loop.c.  */
 /* FORNOW: Used in tree-parloops.c.  */
-extern gimple *vect_force_simple_reduction (loop_vec_info, gimple *,
-   bool *, bool);
+extern stmt_vec_info vect_force_simple_reduction (loop_vec_info, stmt_vec_info,
+ bool *, bool);
 /* Used in gimple-loop-interchange.c.  */
 extern bool check_reduction_path (dump_user_location_t, loop_p, gphi *, tree,
  enum tree_code);
Index: gcc/tree-parloops.c
===
--- gcc/tree-parloops.c 2018-06-27 10:27:09.778650686 +0100
+++ gcc/tree-parloops.c 2018-07-24 10:22:57.273070426 +0100
@@ -2570,15 +2570,14 @@ set_reduc_phi_uids (reduction_info **slo
   return 1;
 }
 
-/* Return true if the type of reduction performed by STMT is suitable
+/* Return true if the type of reduction performed by STMT_INFO is suitable
for this pass.  */
 
 static bool
-valid_reduction_p (gimple *stmt)
+valid_reduction_p (stmt_vec_info stmt_info)
 {
   /* Parallelization would reassociate the operation, which isn't
  allowed for in-order reductions.  */
-  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   vect_reduction_type reduc_type = STMT_VINFO_REDUC_TYPE (stmt_info);
   return reduc_type != FOLD_LEFT_REDUCTION;
 }
@@ -2615,10 +2614,11 @@ gather_scalar_reductions (loop_p loop, r
    if (simple_iv (loop, loop, res, &iv, true))
continue;
 
-  gimple *reduc_stmt
-   = vect_force_simple_reduction (simple_loop_info, phi,
+  stmt_vec_info reduc_stmt_info
+   = vect_force_simple_reduction (simple_loop_info,
+  simple_loop_info->lookup_stmt (phi),
   &double_reduc, true);
-  if (!reduc_stmt || !valid_reduction_p (reduc_stmt))
+  if (!reduc_stmt_info || !valid_reduction_p (reduc_stmt_info))
continue;
 
   if (double_reduc)
@@ -2627,11 +2627,11 @@ gather_scalar_reductions (loop_p loop, r
continue;
 
  double_reduc_phis.safe_push (phi);
- double_reduc_stmts.safe_push (reduc_stmt);
+ double_reduc_stmts.safe_push (reduc_stmt_info->stmt);
  continue;
}
 
-  build_new_reduction (reduction_list, reduc_stmt, phi);
+  build_new_reduction (reduction_list, reduc_stmt_info->stmt, phi);
 }
   delete simple_loop_info;
 
@@ -2661,12 +2661,15 @@ gather_scalar_reductions (loop_p loop, r
  &iv, true))
continue;
 
- gimple *inner_reduc_stmt
-   = vect_force_simple_reduction (simple_loop_info, inner_phi,
+ stmt_vec_info inner_phi_info
+   = simple_loop_info->lookup_stmt (inner_phi);
+ stmt_vec_info inner_reduc_stmt_info
+   = vect_force_simple_reduction (simple_loop_info,
+  inner_phi_info,
   &double_reduc, true);
  gcc_assert (!double_reduc);
- if (inner_reduc_stmt == NULL
- || !valid_reduction_p (inner_reduc_stmt))
+ if (!inner_reduc_stmt_info
+ || !valid_reduction_p (inner_reduc_stmt_info))
continue;
 
  build_new_reduction (reduction_list, 

[15/46] Make SLP_TREE_VEC_STMTS a vec

2018-07-24 Thread Richard Sandiford
This patch changes SLP_TREE_VEC_STMTS from a vec<gimple *> to a
vec<stmt_vec_info>.  This involved making the same change to the
phis vector in vectorizable_reduction, since SLP_TREE_VEC_STMTS is
spliced into it here:

  phis.splice (SLP_TREE_VEC_STMTS (slp_node_instance->reduc_phis));


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_slp_tree::vec_stmts): Change from a
vec<gimple *> to a vec<stmt_vec_info>.
* tree-vect-loop.c (vect_create_epilog_for_reduction): Change
the reduction_phis argument from a vec<gimple *> to a
vec<stmt_vec_info>.
(vectorizable_reduction): Likewise the phis local variable that
is passed to vect_create_epilog_for_reduction.  Update for new type
of SLP_TREE_VEC_STMTS.
(vectorizable_induction): Update for new type of SLP_TREE_VEC_STMTS.
(vectorizable_live_operation): Likewise.
* tree-vect-slp.c (vect_get_slp_vect_defs): Likewise.
(vect_transform_slp_perm_load, vect_schedule_slp_instance): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:47.489157307 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:50.777128110 +0100
@@ -143,7 +143,7 @@ struct _slp_tree {
  permutation.  */
   vec load_permutation;
   /* Vectorized stmt/s.  */
-  vec<gimple *> vec_stmts;
+  vec<stmt_vec_info> vec_stmts;
   /* Number of vector stmts that are created to replace the group of scalar
  stmts. It is calculated during the transformation phase as the number of
  scalar elements in one scalar iteration (GROUP_SIZE) multiplied by VF
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c	2018-07-24 10:22:47.489157307 +0100
+++ gcc/tree-vect-loop.c	2018-07-24 10:22:50.777128110 +0100
@@ -4412,7 +4412,7 @@ get_initial_defs_for_reduction (slp_tree
 vect_create_epilog_for_reduction (vec<tree> vect_defs, gimple *stmt,
  gimple *reduc_def_stmt,
  int ncopies, internal_fn reduc_fn,
- vec<gimple *> reduction_phis,
+ vec<stmt_vec_info> reduction_phis,
   bool double_reduc, 
  slp_tree slp_node,
  slp_instance slp_node_instance,
@@ -4429,6 +4429,7 @@ vect_create_epilog_for_reduction (vec new_phis;
   auto_vec inner_phis;
@@ -4540,7 +4542,7 @@ vect_create_epilog_for_reduction (vec (phi_info->stmt);
  if (STMT_VINFO_VEC_REDUCTION_TYPE (stmt_info)
  == INTEGER_INDUC_COND_REDUCTION)
{
@@ -4569,19 +4572,18 @@ vect_create_epilog_for_reduction (vec (phi), induc_val_vec,
-  loop_preheader_edge (loop), UNKNOWN_LOCATION);
+ add_phi_arg (phi, induc_val_vec, loop_preheader_edge (loop),
+  UNKNOWN_LOCATION);
}
  else
-   add_phi_arg (as_a <gphi *> (phi), vec_init_def,
-loop_preheader_edge (loop), UNKNOWN_LOCATION);
+   add_phi_arg (phi, vec_init_def, loop_preheader_edge (loop),
+UNKNOWN_LOCATION);
 
   /* Set the loop-latch arg for the reduction-phi.  */
   if (j > 0)
 def = vect_get_vec_def_for_stmt_copy (vect_unknown_def_type, def);
 
-  add_phi_arg (as_a <gphi *> (phi), def, loop_latch_edge (loop),
-  UNKNOWN_LOCATION);
+ add_phi_arg (phi, def, loop_latch_edge (loop), UNKNOWN_LOCATION);
 
   if (dump_enabled_p ())
 {
@@ -5599,7 +5601,7 @@ vect_create_epilog_for_reduction (vecdest_idx, vect_phi_res);
-  use = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (use));
-}
+ stmt_vec_info use_info = reduction_phi_info;
+ for (j = 0; j < ncopies; j++)
+   {
+ edge pr_edge = loop_preheader_edge (loop);
+ SET_PHI_ARG_DEF (as_a <gphi *> (use_info->stmt),
+  pr_edge->dest_idx, vect_phi_res);
+ use_info = STMT_VINFO_RELATED_STMT (use_info);
+   }
 }
 }
 }
@@ -6112,7 +6114,7 @@ vectorizable_reduction (gimple *stmt, gi
   auto_vec<tree> vec_oprnds1;
   auto_vec<tree> vec_oprnds2;
   auto_vec<tree> vect_defs;
-  auto_vec<gimple *> phis;
+  auto_vec<stmt_vec_info> phis;
   int vec_num;
   tree def0, tem;
   tree cr_index_scalar_type = NULL_TREE, cr_index_vector_type = NULL_TREE;
@@ -6218,7 +6220,7 @@ vectorizable_reduction (gimple *stmt, gi
  stmt_vec_info new_phi_info = loop_vinfo->add_stmt (new_phi);
 
  if (slp_node)
-   SLP_TREE_VEC_STMTS (slp_node).quick_push (new_phi);
+   SLP_TREE_VEC_STMTS (slp_node).quick_push (new_phi_info);
  else
{
  if (j == 0)
@@ -7075,9 +7077,9 @@ vectorizable_reduction (gimple *stmt, 

[38/46] Pass stmt_vec_infos instead of data_references where relevant

2018-07-24 Thread Richard Sandiford
This patch makes various routines (mostly in tree-vect-data-refs.c)
take stmt_vec_infos rather than data_references.  The affected routines
are really dealing with the way that an access is going to be vectorised
for a particular stmt_vec_info, rather than with the original scalar
access described by the data_reference.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_supportable_dr_alignment): Take
a stmt_vec_info rather than a data_reference.
* tree-vect-data-refs.c (vect_calculate_target_alignment)
(vect_compute_data_ref_alignment, vect_update_misalignment_for_peel)
(verify_data_ref_alignment, vector_alignment_reachable_p)
(vect_get_data_access_cost, vect_get_peeling_costs_all_drs)
(vect_peeling_supportable, vect_analyze_group_access_1)
(vect_analyze_group_access, vect_analyze_data_ref_access)
(vect_vfa_segment_size, vect_vfa_access_size, vect_small_gap_p)
(vectorizable_with_step_bound_p, vect_duplicate_ssa_name_ptr_info)
(vect_supportable_dr_alignment): Likewise.  Update calls to other
functions for which the same change is being made.
(vect_verify_datarefs_alignment, vect_find_same_alignment_drs)
(vect_analyze_data_refs_alignment): Update calls accordingly.
(vect_slp_analyze_and_verify_node_alignment): Likewise.
(vect_analyze_data_ref_accesses): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_create_addr_base_for_vector_ref): Likewise.
(vect_create_data_ref_ptr): Likewise.
(_vect_peel_info::dr): Replace with...
(_vect_peel_info::stmt_info): ...this new field.
(vect_peeling_hash_get_most_frequent): Update _vect_peel_info uses
accordingly, and update after above interface changes.
(vect_peeling_hash_get_lowest_cost): Likewise
(vect_peeling_hash_choose_best_peeling): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_peeling_hash_insert): Likewise.  Take a stmt_vec_info
rather than a data_reference.
* tree-vect-stmts.c (vect_get_store_cost, vect_get_load_cost)
(get_negative_load_store_type): Update calls to
vect_supportable_dr_alignment.
(vect_get_data_ptr_increment, ensure_base_align): Take a
stmt_vec_info instead of a data_reference.
(vectorizable_store, vectorizable_load): Update calls after
above interface changes.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:05.744462369 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:08.924434128 +0100
@@ -1541,7 +1541,7 @@ extern tree vect_get_mask_type_for_stmt
 /* In tree-vect-data-refs.c.  */
 extern bool vect_can_force_dr_alignment_p (const_tree, unsigned int);
 extern enum dr_alignment_support vect_supportable_dr_alignment
-   (struct data_reference *, bool);
+  (stmt_vec_info, bool);
 extern tree vect_get_smallest_scalar_type (stmt_vec_info, HOST_WIDE_INT *,
HOST_WIDE_INT *);
 extern bool vect_analyze_data_ref_dependences (loop_vec_info, unsigned int *);
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:24:05.740462405 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:24:08.924434128 +0100
@@ -858,19 +858,19 @@ vect_record_base_alignments (vec_info *v
 }
 }
 
-/* Return the target alignment for the vectorized form of DR.  */
+/* Return the target alignment for the vectorized form of the load or store
+   in STMT_INFO.  */
 
 static unsigned int
-vect_calculate_target_alignment (struct data_reference *dr)
+vect_calculate_target_alignment (stmt_vec_info stmt_info)
 {
-  stmt_vec_info stmt_info = vect_dr_stmt (dr);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   return targetm.vectorize.preferred_vector_alignment (vectype);
 }
 
 /* Function vect_compute_data_ref_alignment
 
-   Compute the misalignment of the data reference DR.
+   Compute the misalignment of the load or store in STMT_INFO.
 
Output:
1. dr_misalignment (STMT_INFO) is defined.
@@ -879,9 +879,9 @@ vect_calculate_target_alignment (struct
only for trivial cases. TODO.  */
 
 static void
-vect_compute_data_ref_alignment (struct data_reference *dr)
+vect_compute_data_ref_alignment (stmt_vec_info stmt_info)
 {
-  stmt_vec_info stmt_info = vect_dr_stmt (dr);
+  data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   vec_base_alignments *base_alignments = &stmt_info->vinfo->base_alignments;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   struct loop *loop = NULL;
@@ -905,7 +905,7 @@ vect_compute_data_ref_alignment (struct
   bool step_preserves_misalignment_p;
 
   unsigned HOST_WIDE_INT vector_alignment
-= vect_calculate_target_alignment (dr) / BITS_PER_UNIT;

Re: [PATCH] Fix up pr19476-{1,5}.C (PR testsuite/86649)

2018-07-24 Thread Richard Biener
On Tue, 24 Jul 2018, Jakub Jelinek wrote:

> Hi!
> 
> When looking at PR86569 testresults, I must have missed these two tests
> (but looking at test_summary outputs, I see it now).
> When we no longer fold this during cp_fold (to avoid code generation
> changes between -Wnonnull-compare and -Wno-nonnull-compare), it isn't
> folded from the first pass; with -O2 it is folded during evrp and with
> -O1 during dom2.
> 
> Note, the test would fail before with -Wnonnull-compare, e.g. on 8
> branch (which doesn't have the PR86569 changes), I see:
> make check-c++-all RUNTESTFLAGS='--target_board=unix\{,-Wnonnull-compare\} dg.exp=pr19476*'
>   === g++ Summary for unix ===
> 
> # of expected passes  72
> Running target unix/-Wnonnull-compare
> Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
> Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
> Using /usr/src/gcc-8/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.
> Running /usr/src/gcc-8/gcc/testsuite/g++.dg/dg.exp ...
> FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++98  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++11  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++14  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++17  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++2a  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++17 -fconcepts  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++98  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++11  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++14  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++17  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++2a  scan-tree-dump ccp1 "return 42"
> FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++17 -fconcepts  scan-tree-dump ccp1 "return 42"
> 
>   === g++ Summary for unix/-Wnonnull-compare ===
> 
> # of expected passes  60
> # of unexpected failures  12
> 
> Especially for -O2 that people use most, folding it at evrp time seems to be
> early enough for me.
> Fixed by testing this only in dom2, tested on x86_64-linux, ok for trunk?

OK - can you add a variant with -O2 that tests it at EVRP time then?

Thanks,
Richard.

> 2018-07-24  Jakub Jelinek  
> 
>   PR testsuite/86649
>   * g++.dg/tree-ssa/pr19476-1.C: Check dom2 dump instead of ccp1.
>   * g++.dg/tree-ssa/pr19476-5.C: Likewise.
> 
> --- gcc/testsuite/g++.dg/tree-ssa/pr19476-1.C.jj  2015-05-29 15:04:33.037803445 +0200
> +++ gcc/testsuite/g++.dg/tree-ssa/pr19476-1.C 2018-07-24 11:39:10.108897097 +0200
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-ccp1 -fdelete-null-pointer-checks" } */
> +/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
>  /* { dg-skip-if "" keeps_null_pointer_checks } */
>  
>  // See pr19476-5.C for a version without including .
> @@ -12,5 +12,5 @@ int g(){
>return 42 + (0 == new int[50]);
>  }
>  
> -/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
> -/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
> +/* { dg-final { scan-tree-dump "return 42" "dom2" } } */
> +/* { dg-final { scan-tree-dump-not "return 33" "dom2" } } */
> --- gcc/testsuite/g++.dg/tree-ssa/pr19476-5.C.jj  2015-05-29 15:04:33.038803430 +0200
> +++ gcc/testsuite/g++.dg/tree-ssa/pr19476-5.C 2018-07-24 11:39:26.190913802 +0200
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-ccp1 -fdelete-null-pointer-checks" } */
> +/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
>  /* { dg-skip-if "" keeps_null_pointer_checks } */
>  
>  // See pr19476-1.C for a version that includes .
> @@ -8,4 +8,4 @@ int g(){
>return 42 + (0 == new int[50]);
>  }
>  
> -/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
> +/* { dg-final { scan-tree-dump "return 42" "dom2" } } */
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)


[37/46] Associate alignment information with stmt_vec_infos

2018-07-24 Thread Richard Sandiford
Alignment information is really a property of a stmt_vec_info
(and the way we want to vectorise it) rather than the original scalar dr.
I think that was true even before the recent dr sharing.

This patch therefore makes the alignment-related interfaces take
stmt_vec_infos rather than data_references.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (STMT_VINFO_TARGET_ALIGNMENT): New macro.
(DR_VECT_AUX, DR_MISALIGNMENT, SET_DR_MISALIGNMENT)
(DR_TARGET_ALIGNMENT): Delete.
(set_dr_misalignment, dr_misalignment, aligned_access_p)
(known_alignment_for_access_p, vect_known_alignment_in_bytes)
(vect_dr_behavior): Take a stmt_vec_info rather than a data_reference.
* tree-vect-data-refs.c (vect_calculate_target_alignment)
(vect_compute_data_ref_alignment, vect_update_misalignment_for_peel)
(vector_alignment_reachable_p, vect_get_peeling_costs_all_drs)
(vect_peeling_supportable, vect_enhance_data_refs_alignment)
(vect_duplicate_ssa_name_ptr_info): Update after above changes.
(vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr)
(vect_setup_realignment, vect_supportable_dr_alignment): Likewise.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-stmts.c (vect_get_store_cost, vect_get_load_cost)
(compare_step_with_zero, get_group_load_store_type): Likewise.
(vect_get_data_ptr_increment, ensure_base_align, vectorizable_store)
(vectorizable_load): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:02.364492386 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:05.744462369 +0100
@@ -1031,6 +1031,9 @@ #define STMT_VINFO_NUM_SLP_USES(S)(S)->
 #define STMT_VINFO_REDUC_TYPE(S)   (S)->reduc_type
 #define STMT_VINFO_REDUC_DEF(S)(S)->reduc_def
 
+/* Only defined once dr_misalignment is defined.  */
+#define STMT_VINFO_TARGET_ALIGNMENT(S) (S)->dr_aux.target_alignment
+
 #define DR_GROUP_FIRST_ELEMENT(S)  (gcc_checking_assert ((S)->data_ref_info), (S)->first_element)
 #define DR_GROUP_NEXT_ELEMENT(S)   (gcc_checking_assert ((S)->data_ref_info), (S)->next_element)
 #define DR_GROUP_SIZE(S)   (gcc_checking_assert ((S)->data_ref_info), (S)->size)
@@ -1048,8 +1051,6 @@ #define HYBRID_SLP_STMT(S)
 #define PURE_SLP_STMT(S)  ((S)->slp_type == pure_slp)
 #define STMT_SLP_TYPE(S)   (S)->slp_type
 
-#define DR_VECT_AUX(dr) (&vinfo_for_stmt (DR_STMT (dr))->dr_aux)
-
 #define VECT_MAX_COST 1000
 
 /* The maximum number of intermediate steps required in multi-step type
@@ -1256,73 +1257,72 @@ add_stmt_costs (void *data, stmt_vector_
 #define DR_MISALIGNMENT_UNKNOWN (-1)
 #define DR_MISALIGNMENT_UNINITIALIZED (-2)
 
+/* Record that the vectorized form of the data access in STMT_INFO
+   will be misaligned by VAL bytes wrt its target alignment.
+   Negative values have the meanings above.  */
+
 inline void
-set_dr_misalignment (struct data_reference *dr, int val)
+set_dr_misalignment (stmt_vec_info stmt_info, int val)
 {
-  dataref_aux *data_aux = DR_VECT_AUX (dr);
-  data_aux->misalignment = val;
+  stmt_info->dr_aux.misalignment = val;
 }
 
+/* Return the misalignment in bytes of the vectorized form of the data
+   access in STMT_INFO, relative to its target alignment.  Negative
+   values have the meanings above.  */
+
 inline int
-dr_misalignment (struct data_reference *dr)
+dr_misalignment (stmt_vec_info stmt_info)
 {
-  int misalign = DR_VECT_AUX (dr)->misalignment;
+  int misalign = stmt_info->dr_aux.misalignment;
   gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
   return misalign;
 }
 
-/* Reflects actual alignment of first access in the vectorized loop,
-   taking into account peeling/versioning if applied.  */
-#define DR_MISALIGNMENT(DR) dr_misalignment (DR)
-#define SET_DR_MISALIGNMENT(DR, VAL) set_dr_misalignment (DR, VAL)
-
-/* Only defined once DR_MISALIGNMENT is defined.  */
-#define DR_TARGET_ALIGNMENT(DR) DR_VECT_AUX (DR)->target_alignment
-
-/* Return true if data access DR is aligned to its target alignment
-   (which may be less than a full vector).  */
+/* Return true if the vectorized form of the data access in STMT_INFO is
+   aligned to its target alignment (which may be less than a full vector).  */
 
 static inline bool
-aligned_access_p (struct data_reference *data_ref_info)
+aligned_access_p (stmt_vec_info stmt_info)
 {
-  return (DR_MISALIGNMENT (data_ref_info) == 0);
+  return (dr_misalignment (stmt_info) == 0);
 }
 
-/* Return TRUE if the alignment of the data access is known, and FALSE
-   otherwise.  */
+/* Return true if the alignment of the vectorized form of the data
+   access in STMT_INFO is known at compile time.  */
 
 static inline bool
-known_alignment_for_access_p (struct 

[39/46] Replace STMT_VINFO_UNALIGNED_DR with the associated statement

2018-07-24 Thread Richard Sandiford
After previous changes, it makes more sense to record which stmt's
access is going to be aligned via peeling, rather than the associated
scalar data reference.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_loop_vec_info::unaligned_dr): Replace with...
(_loop_vec_info::unaligned_stmt): ...this new field.
(LOOP_VINFO_UNALIGNED_DR): Delete.
(LOOP_VINFO_UNALIGNED_STMT): New macro.
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use
LOOP_VINFO_UNALIGNED_STMT instead of LOOP_VINFO_UNALIGNED_DR.
* tree-vect-loop-manip.c (get_misalign_in_elems): Likewise.
(vect_gen_prolog_loop_niters): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update
after above change to _loop_vec_info.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:24:08.924434128 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:12.252404574 +0100
@@ -436,7 +436,7 @@ typedef struct _loop_vec_info : public v
   tree mask_compare_type;
 
   /* Unaligned DRs according to which loop was peeled.  */
-  struct data_reference *unaligned_dr;
+  stmt_vec_info unaligned_stmt;
 
   /* peeling_for_alignment indicates whether peeling for alignment will take
  place, and what the peeling factor should be:
@@ -445,7 +445,7 @@ typedef struct _loop_vec_info : public v
 If X>0: Peel first X iterations.
 If X=-1: Generate a runtime test to calculate the number of iterations
  to be peeled, using the dataref recorded in the field
- unaligned_dr.  */
+ unaligned_stmt.  */
   int peeling_for_alignment;
 
   /* The mask used to check the alignment of pointers or arrays.  */
@@ -576,7 +576,7 @@ #define LOOP_VINFO_DATAREFS(L)
 #define LOOP_VINFO_DDRS(L) (L)->shared->ddrs
 #define LOOP_VINFO_INT_NITERS(L)   (TREE_INT_CST_LOW ((L)->num_iters))
 #define LOOP_VINFO_PEELING_FOR_ALIGNMENT(L) (L)->peeling_for_alignment
-#define LOOP_VINFO_UNALIGNED_DR(L) (L)->unaligned_dr
+#define LOOP_VINFO_UNALIGNED_STMT(L)   (L)->unaligned_stmt
 #define LOOP_VINFO_MAY_MISALIGN_STMTS(L)   (L)->may_misalign_stmts
 #define LOOP_VINFO_MAY_ALIAS_DDRS(L)   (L)->may_alias_ddrs
 #define LOOP_VINFO_COMP_ALIAS_DDRS(L)  (L)->comp_alias_ddrs
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:24:08.924434128 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:24:12.248404609 +0100
@@ -2134,7 +2134,7 @@ vect_enhance_data_refs_alignment (loop_v
   peel_stmt_info, npeel);
  }
 
-  LOOP_VINFO_UNALIGNED_DR (loop_vinfo) = dr0;
+  LOOP_VINFO_UNALIGNED_STMT (loop_vinfo) = peel_stmt_info;
   if (npeel)
 LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = npeel;
   else
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  2018-07-24 10:24:05.740462405 +0100
+++ gcc/tree-vect-loop-manip.c  2018-07-24 10:24:12.248404609 +0100
@@ -1560,8 +1560,8 @@ vect_update_ivs_after_vectorizer (loop_v
 static tree
 get_misalign_in_elems (gimple **seq, loop_vec_info loop_vinfo)
 {
-  struct data_reference *dr = LOOP_VINFO_UNALIGNED_DR (loop_vinfo);
-  stmt_vec_info stmt_info = vect_dr_stmt (dr);
+  stmt_vec_info stmt_info = LOOP_VINFO_UNALIGNED_STMT (loop_vinfo);
+  struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
 
   unsigned int target_align = STMT_VINFO_TARGET_ALIGNMENT (stmt_info);
@@ -1594,8 +1594,8 @@ get_misalign_in_elems (gimple **seq, loo
 /* Function vect_gen_prolog_loop_niters
 
Generate the number of iterations which should be peeled as prolog for the
-   loop represented by LOOP_VINFO.  It is calculated as the misalignment of
-   DR - the data reference recorded in LOOP_VINFO_UNALIGNED_DR (LOOP_VINFO).
+   loop represented by LOOP_VINFO.  It is calculated as the misalignment of DR
+   - the data reference recorded in LOOP_VINFO_UNALIGNED_STMT (LOOP_VINFO).
As a result, after the execution of this loop, the data reference DR will
refer to an aligned location.  The following computation is generated:
 
@@ -1626,12 +1626,12 @@ get_misalign_in_elems (gimple **seq, loo
 vect_gen_prolog_loop_niters (loop_vec_info loop_vinfo,
 basic_block bb, int *bound)
 {
-  struct data_reference *dr = LOOP_VINFO_UNALIGNED_DR (loop_vinfo);
+  stmt_vec_info stmt_info = LOOP_VINFO_UNALIGNED_STMT (loop_vinfo);
+  data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   tree var;
   tree niters_type = TREE_TYPE (LOOP_VINFO_NITERS (loop_vinfo));
   gimple_seq stmts = NULL, new_stmts = NULL;
   tree iters, iters_name;
-  stmt_vec_info stmt_info = 

[36/46] Add a pattern_stmt_p field to stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch adds a pattern_stmt_p field to stmt_vec_info, so that it's
possible to tell whether the statement is a pattern statement without
referring to other statements.  The new field goes in what was
previously a hole in the structure, so the size is the same as before.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::pattern_stmt_p): New field.
(is_pattern_stmt_p): Delete.
* tree-vect-patterns.c (vect_init_pattern_stmt): Set pattern_stmt_p
on pattern statements.
(vect_split_statement, vect_mark_pattern_stmts): Use the new
pattern_stmt_p field instead of is_pattern_stmt_p.
* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Likewise.
* tree-vect-loop.c (vectorizable_live_operation): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_2): Likewise.
(vect_find_last_scalar_stmt_in_slp, vect_remove_slp_scalar_calls)
(vect_schedule_slp): Likewise.
* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Likewise.
(vectorizable_call, vectorizable_simd_clone_call, vectorizable_shift)
(vectorizable_store, vect_remove_stores): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:56.440544995 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:24:02.364492386 +0100
@@ -791,6 +791,12 @@ struct _stmt_vec_info {
   /* Stmt is part of some pattern (computation idiom)  */
   bool in_pattern_p;
 
+  /* True if the statement was created during pattern recognition as
+ part of the replacement for RELATED_STMT.  This implies that the
+ statement isn't part of any basic block, although for convenience
+ its gimple_bb is the same as for RELATED_STMT.  */
+  bool pattern_stmt_p;
+
   /* Is this statement vectorizable or should it be skipped in (partial)
  vectorization.  */
   bool vectorizable;
@@ -1151,16 +1157,6 @@ get_later_stmt (stmt_vec_info stmt1_info
 return stmt2_info;
 }
 
-/* Return TRUE if a statement represented by STMT_INFO is a part of a
-   pattern.  */
-
-static inline bool
-is_pattern_stmt_p (stmt_vec_info stmt_info)
-{
-  stmt_vec_info related_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
-  return related_stmt_info && STMT_VINFO_IN_PATTERN_P (related_stmt_info);
-}
-
 /* Return true if BB is a loop header.  */
 
 static inline bool
Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-24 10:23:59.408518638 +0100
+++ gcc/tree-vect-patterns.c2018-07-24 10:24:02.360492422 +0100
@@ -108,6 +108,7 @@ vect_init_pattern_stmt (gimple *pattern_
 pattern_stmt_info = orig_stmt_info->vinfo->add_stmt (pattern_stmt);
   gimple_set_bb (pattern_stmt, gimple_bb (orig_stmt_info->stmt));
 
+  pattern_stmt_info->pattern_stmt_p = true;
   STMT_VINFO_RELATED_STMT (pattern_stmt_info) = orig_stmt_info;
   STMT_VINFO_DEF_TYPE (pattern_stmt_info)
 = STMT_VINFO_DEF_TYPE (orig_stmt_info);
@@ -630,7 +631,7 @@ vect_recog_temp_ssa_var (tree type, gimp
 vect_split_statement (stmt_vec_info stmt2_info, tree new_rhs,
  gimple *stmt1, tree vectype)
 {
-  if (is_pattern_stmt_p (stmt2_info))
+  if (stmt2_info->pattern_stmt_p)
 {
   /* STMT2_INFO is part of a pattern.  Get the statement to which
 the pattern is attached.  */
@@ -4726,7 +4727,7 @@ vect_mark_pattern_stmts (stmt_vec_info o
   gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
 
   gimple *orig_pattern_stmt = NULL;
-  if (is_pattern_stmt_p (orig_stmt_info))
+  if (orig_stmt_info->pattern_stmt_p)
 {
   /* We're replacing a statement in an existing pattern definition
 sequence.  */
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:53.204573732 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:24:02.356492457 +0100
@@ -212,9 +212,9 @@ vect_preserves_scalar_order_p (stmt_vec_
  (but could happen later) while reads will happen no later than their
  current position (but could happen earlier).  Reordering is therefore
  only possible if the first access is a write.  */
-  if (is_pattern_stmt_p (stmtinfo_a))
+  if (stmtinfo_a->pattern_stmt_p)
 stmtinfo_a = STMT_VINFO_RELATED_STMT (stmtinfo_a);
-  if (is_pattern_stmt_p (stmtinfo_b))
+  if (stmtinfo_b->pattern_stmt_p)
 stmtinfo_b = STMT_VINFO_RELATED_STMT (stmtinfo_b);
   stmt_vec_info earlier_stmt_info = get_earlier_stmt (stmtinfo_a, stmtinfo_b);
   return !DR_IS_WRITE (STMT_VINFO_DATA_REF (earlier_stmt_info));
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:23:56.436545030 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:24:02.360492422 +0100
@@ -7907,7 +7907,7 @@ vectorizable_live_operation 

Re: [PATCH][debug] Handle references to skipped params in remap_ssa_name

2018-07-24 Thread Tom de Vries
On 07/19/2018 10:30 AM, Richard Biener wrote:
> On Wed, Jul 18, 2018 at 3:42 PM Tom de Vries  wrote:
>>
>> On 07/06/2018 12:28 PM, Richard Biener wrote:
>>> On Thu, Jul 5, 2018 at 4:12 PM Tom de Vries  wrote:

 On 07/05/2018 01:39 PM, Richard Biener wrote:
> On Thu, Jul 5, 2018 at 1:25 PM Tom de Vries  wrote:
>>
>> [ was: Re: [testsuite/guality, committed] Prevent optimization of local 
>> in
>> vla-1.c ]
>>
>> On Wed, Jul 04, 2018 at 02:32:27PM +0200, Tom de Vries wrote:
>>> On 07/03/2018 11:05 AM, Tom de Vries wrote:
 On 07/02/2018 10:16 AM, Jakub Jelinek wrote:
> On Mon, Jul 02, 2018 at 09:44:04AM +0200, Richard Biener wrote:
>> Given the array has size i + 1 it's upper bound should be 'i' and 'i'
>> should be available via DW_OP_[GNU_]entry_value.
>>
>> I see it is
>>
>> <175>   DW_AT_upper_bound : 10 byte block: 75 1 8 20 24 8 20 26 31 1c
>>  (DW_OP_breg5 (rdi): 1; DW_OP_const1u: 32; DW_OP_shl;
>>   DW_OP_const1u: 32; DW_OP_shra; DW_OP_lit1; DW_OP_minus)
>>
>> and %rdi is 1.  Not sure why gdb fails to print its length.  Yes, the
>> storage itself doesn't have a location but the type specifies the size.
>>
>> (gdb) ptype a
>> type = char [variable length]
>> (gdb) p sizeof(a)
>> $3 = 0
>>
>> this looks like a gdb bug to me?
>>

 With gdb patch:
 ...
 diff --git a/gdb/findvar.c b/gdb/findvar.c
 index 8ad5e25cb2..ebaff923a1 100644
 --- a/gdb/findvar.c
 +++ b/gdb/findvar.c
 @@ -789,6 +789,8 @@ default_read_var_value
break;

  case LOC_OPTIMIZED_OUT:
 +  if (is_dynamic_type (type))
 +   type = resolve_dynamic_type (type, NULL,
 +/* Unused address.  */ 0);
return allocate_optimized_out_value (type);

  default:
 ...

 I get:
 ...
 $ ./gdb -batch -ex "b f1" -ex "r" -ex "p sizeof (a)" vla-1.exe
 Breakpoint 1 at 0x4004a8: file vla-1.c, line 17.

 Breakpoint 1, f1 (i=i@entry=5) at vla-1.c:17
 17return a[0];
 $1 = 6
 ...

>>>
>>> Well, for -O1 and -O2.
>>>
>>> For O3, I get instead:
>>> ...
>>> $ ./gdb vla-1.exe -q -batch -ex "b f1" -ex "run" -ex "p sizeof (a)"
>>> Breakpoint 1 at 0x4004b0: f1. (2 locations)
>>>
>>> Breakpoint 1, f1 (i=5) at vla-1.c:17
>>> 17return a[0];
>>> $1 = 0
>>> ...
>>>
>>
>> Hi,
>>
>> When compiling guality/vla-1.c with -O3 -g, vla 'a[i + 1]' in f1 is
>> optimized away, but f1 still contains a debug expression describing the
>> upper bound of the vla (D.1914):
>> ...
>>  __attribute__((noinline))
>>  f1 (intD.6 iD.1900)
>>  {
>>
>>saved_stack.1_2 = __builtin_stack_save ();
>># DEBUG BEGIN_STMT
>># DEBUG D#3 => i_1(D) + 1
>># DEBUG D#2 => (long intD.8) D#3
>># DEBUG D#1 => D#2 + -1
>># DEBUG D.1914 => (sizetype) D#1
>> ...
>>
>> Then f1 is cloned to a version f1.constprop with no parameters,
>> eliminating parameter i, and 'DEBUG D#3 => i_1(D) + 1' turns into
>> 'D#3 => NULL'.  Consequently, 'print sizeof (a)' yields '0' in gdb.
>
> So does gdb correctly recognize there isn't any size available or do we
> somehow generate invalid debug info, not recognizing that D#3 => NULL means
> "optimized out" and thus all dependent expressions are "optimized out" as
> well?
>
> That is, shouldn't gdb do
>
> (gdb) print sizeof (a)
> <optimized out>
>
> ?

 The type gcc is emitting for the vla is a DW_TAG_array_type with
 DW_TAG_subrange_type without DW_AT_upper_bound or DW_AT_count, which
 makes the upper bound value 'unknown'.  So I'd say the debug info is valid.
>>>
>>> OK, that sounds reasonable.  I wonder if languages like Ada have a way
>>> to declare an array type with unknown upper bound but known lower bound.
>>> For
>>>
>>> typedef int arr[];
>>> arr *x;
>>>
>>> we generate just
>>>
>>>  <1><2d>: Abbrev Number: 2 (DW_TAG_typedef)
>>> <2e>   DW_AT_name: arr
>>> <32>   DW_AT_decl_file   : 1
>>> <33>   DW_AT_decl_line   : 1
>>> <34>   DW_AT_decl_column : 13
>>> <35>   DW_AT_type: <0x39>
>>>  <1><39>: Abbrev Number: 3 (DW_TAG_array_type)
>>> <3a>   DW_AT_type: <0x44>
>>> <3e>   DW_AT_sibling : <0x44>
>>>  <2><42>: Abbrev Number: 4 (DW_TAG_subrange_type)
>>>  <2><43>: Abbrev Number: 0
>>>
>>> which does
>>>
>>> (gdb) ptype arr
>>> type = int []
>>> (gdb) ptype x
>>> type = int (*)[]
>>> (gdb) p sizeof 

RE: [PATCH][GCC][front-end][opt-framework] Update options framework for parameters to properly handle and validate configure time params. [Patch (2/3)]

2018-07-24 Thread tamar . christina
Hi All,

This patch is re-spun to handle the configure changes in patch 4 / 6 of the 
previous series.

This patch now changes it so that default parameters are validated during
initialization.  This change is needed to ensure that parameters set by the
target-specific common initialization routines still stay within the valid
range.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no 
issues.
Both targets were tested with stack clash on and off by default.

Ok for trunk?

Thanks,
Tamar

gcc/
2018-07-24  Tamar Christina  

* params.c (validate_param): New.
(add_params): Use it.
(set_param_value): Refactor param validation into validate_param.
(diagnostic.h): Include.
* diagnostic.h (diagnostic_ready_p): New.

> -Original Message-
> From: Jeff Law 
> Sent: Wednesday, July 11, 2018 20:24
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; jos...@codesourcery.com
> Subject: Re: [PATCH][GCC][front-end][opt-framework] Update options
> framework for parameters to properly handle and validate configure time
> params. [Patch (2/3)]
> 
> On 07/11/2018 05:24 AM, Tamar Christina wrote:
> > Hi All,
> >
> > This patch builds on a previous patch to pass param options down from
> > configure by adding more expansive validation and correctness checks.
> >
> > These are set very early on and allow the target to validate or reject
> > the values as they see fit.
> >
> > To do this compiler_param has been extended to hold a value set at
> > configure time, this value is used to be able to distinguish between
> >
> > 1) default value
> > 2) configure value
> > 3) back-end default
> > 4) user specific value.
> >
> > The priority of the values should be 4 > 2 > 3 > 1.  The compiler will
> > now also validate the values in params.def after setting them.  This
> > means invalid values will no longer be accepted.
> >
> > This also changes it so that default parameters are validated during
> > initialization. This change is needed to ensure parameters set via
> > configure or by the target specific common initialization routines
> > still keep the parameters within the valid range.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
> > Both targets were tested with stack clash on and off by default.
> >
> > Ok for trunk?
> >
> > Thanks,
> > Tamar
> >
> > gcc/
> > 2018-07-11  Tamar Christina  
> >
> > * params.h (struct param_info): Add configure_value.
> > * params.c (DEFPARAMCONF): New.
> > (DEFPARAM, DEFPARAMENUM5): Set configure_value.
> > (validate_param): New.
> > (add_params): Use it.
> > (set_param_value): Refactor param validation into validate_param.
> > (maybe_set_param_value): Don't override value from configure.
> > (diagnostic.h): Include.
> > * params-enum.h (DEFPARAMCONF): New.
> > * params-list.h: Likewise.
> > * params-options.h: Likewise.
> > * params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE):
> Use it.
> > * diagnostic.h (diagnostic_ready_p): New.
> Generally OK, though probably should depend on what we decide WRT
> configurability.  ie, I'm not convinced we need to be able to set the default
> via a configure time option.  And if we don't support that this patch gets
> somewhat simpler.
> 
> jeff
> >

diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index cf3a610f3d945f2dbbfde7d9cf7a66f46ad6f0b1..584b5877b489d3cce5c18da2db5f73b7b41a72a4 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -250,6 +250,10 @@ diagnostic_inhibit_notes (diagnostic_context * context)
and similar functions.  */
 extern diagnostic_context *global_dc;
 
+/* Returns whether the diagnostic framework has been initialized already and is
+   ready for use.  */
+#define diagnostic_ready_p() (global_dc->printer != NULL)
+
 /* The total count of a KIND of diagnostics emitted so far.  */
 #define diagnostic_kind_count(DC, DK) (DC)->diagnostic_count[(int) (DK)]
 
diff --git a/gcc/params.c b/gcc/params.c
index eb663be880a91dc0adce2a84c6bad7e06b4c72c3..b6a33dfd6bf8c4df43fdac91e30ac6d082f39071 100644
--- a/gcc/params.c
+++ b/gcc/params.c
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "params-enum.h"
 #include "diagnostic-core.h"
+#include "diagnostic.h"
 #include "spellcheck.h"
 
 /* An array containing the compiler parameters and their current
@@ -58,6 +59,10 @@ static const param_info lang_independent_params[] = {
   { NULL, 0, 0, 0, NULL, NULL }
 };
 
+static bool
+validate_param (const int value, const param_info param, const int index);
+
+
 /* Add the N PARAMS to the current list of compiler parameters.  */
 
 void
@@ -68,12 +73,26 @@ add_params (const param_info params[], size_t n)
   /* Allocate enough space for the new parameters.  */
   compiler_params = XRESIZEVEC (param_info, compiler_params,
 num_compiler_params + n);
+  param_info *dst_params = compiler_params + num_compiler_params;

[C PATCH] Fix endless loop in the C FE initializer handling (PR c/85704)

2018-07-24 Thread Jakub Jelinek
Hi!

Starting with r258497, the fix for PR46921, the C FE can loop forever
in initializers where a zero-length field's initializer has side-effects
(in this testcase merely because it is a compound literal) and that
zero-length field is followed by some other fields.

Previously, we'd throw initializers of such zero length fields away,
but after the above mentioned commit we do it only if they have
side-effects.

The problem is that for FIELD_DECLs, output_pending_init_elements
uses just their bit_position to compare which field precedes or follows the
other one (or if it is the same).  With zero sized FIELD_DECLs that is
ambiguous (as we can have many consecutive FIELD_DECLs that have zero
size and have the same bit_position (plus at most one non-zero sized field
after them) and previously it worked exactly because we'd throw away
all initializers for those fields.  The infinite loop is because we
output_init_element, which sees the initializer has side-effects and doesn't
throw it away, but adds to the pending tree, then returns to
output_pending_init_elements, which looks at the pending elt which is for
the same field again, but constructor_unfilled_fields at that point is
already the next field.  They have the same bit_position though, so next
time it calls output_init_element for it again rather than advancing to
something next, and loops this way forever.

The following patch fixes it by not relying solely on bit_position;
instead a field comparator is introduced, which determines from two
FIELD_DECLs (required to be from the same structure) which one comes first.
If they have different bit_position, the answer is clear, if one of them has
non-zero size, then the non-zero size must be the last one, if they are
pointer equal, they are the same, otherwise it walks the DECL_CHAIN of those
fields (starting from both of them simultaneously) and if it finds the other
field, or walks at the end of DECL_CHAIN (to last field), or to a field with
non-zero size, it stops.  I think walking both is better for the common case
that the two compared fields are close to each other, but we really don't
know which one is first (and hopefully people don't have thousands of
consecutive zero sized FIELD_DECLs, I'd hope they would use an array of them
in those cases).  Another option would be during structure layout attach
some extra ids to FIELD_DECLs that would allow finding out the FIELD_DECL
chain ordering instantly.  Not all FIELD_DECLs are ordered though (e.g. C++
ones I think) and we don't have bits to spare in tree_field_decl.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
release branches?

2018-07-24  Jakub Jelinek  

PR c/85704
* c-typeck.c (field_decl_cmp): New function.
(output_pending_init_elements): Use it for field comparisons
instead of pure bit_position comparisons.

* gcc.c-torture/compile/pr85704.c: New test.

--- gcc/c/c-typeck.c.jj 2018-06-22 19:17:10.308436152 +0200
+++ gcc/c/c-typeck.c2018-07-23 15:36:56.455204401 +0200
@@ -9330,6 +9330,65 @@ output_init_element (location_t loc, tre
 output_pending_init_elements (0, braced_init_obstack);
 }
 
+/* For two FIELD_DECLs in the same chain, return -1 if field1
+   comes before field2, 1 if field1 comes after field2 and
+   0 if field1 == field2.  */
+
+static int
+field_decl_cmp (tree field1, tree field2)
+{
+  if (field1 == field2)
+return 0;
+
+  tree bitpos1 = bit_position (field1);
+  tree bitpos2 = bit_position (field2);
+  if (tree_int_cst_equal (bitpos1, bitpos2))
+{
+  /* If one of the fields has non-zero bitsize, then that
+field must be the last one in a sequence of zero
+sized fields, fields after it will have bigger
+bit_position.  */
+  if (TREE_TYPE (field1) != error_mark_node
+ && COMPLETE_TYPE_P (TREE_TYPE (field1))
+ && integer_nonzerop (TYPE_SIZE (TREE_TYPE (field1))))
+   return 1;
+  if (TREE_TYPE (field2) != error_mark_node
+ && COMPLETE_TYPE_P (TREE_TYPE (field2))
+ && integer_nonzerop (TYPE_SIZE (TREE_TYPE (field2))))
+   return -1;
+  /* Otherwise, fallback to DECL_CHAIN walk to find out
+which field comes earlier.  Walk chains of both
+fields, so that if field1 and field2 are close to each
+other in either order, it is found soon even for large
+sequences of zero sized fields.  */
+  tree f1 = field1, f2 = field2;
+  while (1)
+   {
+ f1 = DECL_CHAIN (f1);
+ f2 = DECL_CHAIN (f2);
+ if (f1 == NULL_TREE)
+   {
+ gcc_assert (f2);
+ return 1;
+   }
+ if (f2 == NULL_TREE)
+   return -1;
+ if (f1 == field2)
+   return -1;
+ if (f2 == field1)
+   return 1;
+ if (!tree_int_cst_equal (bit_position (f1), bitpos1))
+   return 1;
+ if (!tree_int_cst_equal (bit_position (f2), bitpos1))
+   return -1;
+   }
+}
+  else if 

Re: [Patch, Fortran] PR 57160: short-circuit IF only with -ffrontend-optimize

2018-07-24 Thread Dominique d'Humières
Hi Janus,

> gfortran currently does short-circuiting, and after my patch for PR
> 85599 warns about cases where this might remove an impure function
> call (which potentially can change results).
>
> Now, this PR (57160) is about code which relies on the
> short-circuiting behavior. Since short-circuiting is not guaranteed by
> the standard, such code is invalid. Generating a warning or an error
> at compile-time is a bit harder here, though, since there are multiple
> variations of such a situation, e.g.:
> * ASSOCIATED(p) .AND. p%T
> * ALLOCATED(a) .AND. a%T
> * i * …
>

Aren’t you confusing portability with validity?
The above codes are indeed invalid without short-circuit evaluation,
but I did not find anything in the standard saying such codes are
invalid with short-circuit evaluation.
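For comparison, C and C++ do guarantee left-to-right evaluation of `&&` with short-circuiting, so the guarded-access idiom the quoted Fortran code relies on is well-defined in those languages. A minimal sketch (hypothetical struct, not from the patch):

```cpp
#include <cassert>

struct payload { int flag; };

// Well-defined in C++ (and C): when p is null, p->flag is never
// evaluated, because && is guaranteed to short-circuit left to
// right.  Fortran's .AND. carries no such guarantee, which is why
// ASSOCIATED(p) .AND. p%t is only conditionally safe there.
static int guarded_flag (const payload *p)
{
  return p != nullptr && p->flag;
}
```

The difference is purely one of language guarantees: a compiler may emit identical code for both, but only the C family promises the evaluation order.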

> The suggestion in the PR was to do short-circuiting only with
> optimization flags, but inhibit it with -O0, so that the faulty code
> will run into a segfault (or runtime error) at least when
> optimizations are disabled, and the problem can be identified.

This PR has nothing to do with optimization and I think
it is a very bad idea to (ab)use any optimization option.

Please leave the default behavior (and test) as they are now.
If you want non-short-circuit evaluation, introduce an option for it.

Note that the warning introduced for PR 85599 should be disabled
for non-short-circuit evaluation.

TIA

Dominique



Re: committed: remove redundant -Wall from -Warray-bounds (PR 82063)

2018-07-24 Thread Franz Sirl

On 2018-07-20 at 23:22, Martin Sebor wrote:

As the last observation in PR 82063 Jim points out that

   Both -Warray-bounds and -Warray-bounds= are listed in the c.opt
   file as being enabled by -Wall, but they are the same option,
   and it causes this one option to be processed twice in the
   C_handle_option_auto function in the generated options.c file.
   It gets set to the same value twice, so it does work as intended,
   but this is wasteful.

I have removed the redundant -Wall from the first option and
committed the change as obvious in r262912.


Hi Martin,

this looks related to PR 68845 and my patch in there. I never posted it 
to gcc-patches because I couldn't find a definitive answer on how 
options duplicated between common.opt and c-family/c.opt are supposed to 
be handled.
For example, Warray-bounds in common.opt is a separate option (not an 
alias to Warray-bounds=), leading to separate enums for them. Is this 
intended? Warray-bounds seemed to be the only option with an equal sign 
doing it like that at that time. Now Wcast-align is doing the same...


Can you shed some light on this?

Franz


[13/46] Make STMT_VINFO_RELATED_STMT a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch changes STMT_VINFO_RELATED_STMT from a gimple stmt to a
stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::related_stmt): Change from
a gimple stmt to a stmt_vec_info.
(is_pattern_stmt_p): Update accordingly.
* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Likewise.
(vect_record_grouped_load_vectors): Likewise.
* tree-vect-loop.c (vect_determine_vf_for_stmt): Likewise.
(vect_fixup_reduc_chain, vect_update_vf_for_slp): Likewise.
(vect_model_reduction_cost): Likewise.
(vect_create_epilog_for_reduction): Likewise.
(vectorizable_reduction, vectorizable_induction): Likewise.
* tree-vect-patterns.c (vect_init_pattern_stmt): Likewise.
Return the stmt_vec_info for the pattern statement.
(vect_set_pattern_stmt): Update use of STMT_VINFO_RELATED_STMT.
(vect_split_statement, vect_mark_pattern_stmts): Likewise.
* tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Likewise.
(vect_detect_hybrid_slp, vect_get_slp_defs): Likewise.
* tree-vect-stmts.c (vect_mark_relevant): Likewise.
(vect_get_vec_def_for_operand_1, vectorizable_call): Likewise.
(vectorizable_simd_clone_call, vect_analyze_stmt, new_stmt_vec_info)
(free_stmt_vec_info, vect_is_simple_use): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:40.725217371 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:44.297185652 +0100
@@ -847,7 +847,7 @@ struct _stmt_vec_info {
 related_stmt of the "pattern stmt" points back to this stmt (which is
 the last stmt in the original sequence of stmts that constitutes the
 pattern).  */
-  gimple *related_stmt;
+  stmt_vec_info related_stmt;
 
   /* Used to keep a sequence of def stmts of a pattern stmt if such exists.
  The sequence is attached to the original statement rather than the
@@ -1189,16 +1189,8 @@ get_later_stmt (gimple *stmt1, gimple *s
 static inline bool
 is_pattern_stmt_p (stmt_vec_info stmt_info)
 {
-  gimple *related_stmt;
-  stmt_vec_info related_stmt_info;
-
-  related_stmt = STMT_VINFO_RELATED_STMT (stmt_info);
-  if (related_stmt
-  && (related_stmt_info = vinfo_for_stmt (related_stmt))
-  && STMT_VINFO_IN_PATTERN_P (related_stmt_info))
-return true;
-
-  return false;
+  stmt_vec_info related_stmt_info = STMT_VINFO_RELATED_STMT (stmt_info);
+  return related_stmt_info && STMT_VINFO_IN_PATTERN_P (related_stmt_info);
 }
 
 /* Return true if BB is a loop header.  */
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:22:19.801403171 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:22:44.285185759 +0100
@@ -213,10 +213,10 @@ vect_preserves_scalar_order_p (gimple *s
  current position (but could happen earlier).  Reordering is therefore
  only possible if the first access is a write.  */
   if (is_pattern_stmt_p (stmtinfo_a))
-stmt_a = STMT_VINFO_RELATED_STMT (stmtinfo_a);
+stmtinfo_a = STMT_VINFO_RELATED_STMT (stmtinfo_a);
   if (is_pattern_stmt_p (stmtinfo_b))
-stmt_b = STMT_VINFO_RELATED_STMT (stmtinfo_b);
-  gimple *earlier_stmt = get_earlier_stmt (stmt_a, stmt_b);
+stmtinfo_b = STMT_VINFO_RELATED_STMT (stmtinfo_b);
+  gimple *earlier_stmt = get_earlier_stmt (stmtinfo_a, stmtinfo_b);
   return !DR_IS_WRITE (STMT_VINFO_DATA_REF (vinfo_for_stmt (earlier_stmt)));
 }
 
@@ -6359,8 +6359,10 @@ vect_transform_grouped_load (gimple *stm
 void
 vect_record_grouped_load_vectors (gimple *stmt, vec<tree> result_chain)
 {
-  gimple *first_stmt = DR_GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt));
-  gimple *next_stmt, *new_stmt;
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  vec_info *vinfo = stmt_info->vinfo;
+  gimple *first_stmt = DR_GROUP_FIRST_ELEMENT (stmt_info);
+  gimple *next_stmt;
   unsigned int i, gap_count;
   tree tmp_data_ref;
 
@@ -6389,29 +6391,28 @@ vect_record_grouped_load_vectors (gimple
 
   while (next_stmt)
 {
- new_stmt = SSA_NAME_DEF_STMT (tmp_data_ref);
+ stmt_vec_info new_stmt_info = vinfo->lookup_def (tmp_data_ref);
  /* We assume that if VEC_STMT is not NULL, this is a case of multiple
 copies, and we put the new vector statement in the first available
 RELATED_STMT.  */
  if (!STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt)))
-   STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt)) = new_stmt;
+   STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt)) = new_stmt_info;
  else
 {
   if (!DR_GROUP_SAME_DR_STMT (vinfo_for_stmt (next_stmt)))
 {
  gimple *prev_stmt =
STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt));
- gimple *rel_stmt =
-  

[12/46] Make vect_finish_stmt_generation return a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch makes vect_finish_replace_stmt and vect_finish_stmt_generation
return the stmt_vec_info for the vectorised statement, so that the caller
doesn't need a separate vinfo_for_stmt to get at it.

This involved changing the structure of the statement-generating loops
so that they use narrow scopes for the vectorised gimple statements
and use the existing (wider) scopes for the associated stmt_vec_infos.
This helps with gimple stmt->stmt_vec_info changes further down the line.

The way we do this generation is another area ripe for clean-up,
but that's too much of a rabbit-hole for this series.
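
The shape of the change can be modeled with a toy registry (hypothetical names; not the real vectorizer API): when the function that records a statement returns the new record, callers no longer need a follow-up vinfo_for_stmt-style lookup.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins for gimple statements and their infos.
struct stmt { int id; };
struct stmt_info { stmt *s; int order; };

struct registry
{
  std::unordered_map<stmt *, stmt_info> infos;

  // Returning the new record (instead of void) lets callers use it
  // directly rather than doing a second hash lookup afterwards.
  stmt_info *add_stmt (stmt *s)
  {
    stmt_info info{s, (int) infos.size ()};
    return &infos.emplace (s, info).first->second;
  }

  stmt_info *lookup (stmt *s)
  {
    auto it = infos.find (s);
    return it == infos.end () ? nullptr : &it->second;
  }
};
```

Returning the record is also what enables the narrower scopes described above: the gimple statement can stay local to the block that builds it, while only the returned info escapes.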


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_finish_replace_stmt): Return a stmt_vec_info
(vect_finish_stmt_generation): Likewise.
* tree-vect-stmts.c (vect_finish_stmt_generation_1): Likewise.
(vect_finish_replace_stmt, vect_finish_stmt_generation): Likewise.
(vect_build_gather_load_calls): Use the return value of the above
functions instead of a separate call to vinfo_for_stmt.  Use narrow
scopes for the input gimple stmt and wider scopes for the associated
stmt_vec_info.  Use vec_info::lookup_def when setting these
stmt_vec_infos from an SSA_NAME definition.
(vectorizable_bswap, vectorizable_call, vectorizable_simd_clone_call)
(vect_create_vectorized_demotion_stmts, vectorizable_conversion)
(vectorizable_assignment, vectorizable_shift, vectorizable_operation)
(vectorizable_store, vectorizable_load, vectorizable_condition)
(vectorizable_comparison): Likewise.
* tree-vect-loop.c (vectorize_fold_left_reduction): Likewise.
(vectorizable_reduction): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:37.257248166 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:40.725217371 +0100
@@ -1548,9 +1548,9 @@ extern void free_stmt_vec_info (gimple *
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
  enum vect_cost_for_stmt, stmt_vec_info,
  int, enum vect_cost_model_location);
-extern void vect_finish_replace_stmt (gimple *, gimple *);
-extern void vect_finish_stmt_generation (gimple *, gimple *,
- gimple_stmt_iterator *);
+extern stmt_vec_info vect_finish_replace_stmt (gimple *, gimple *);
+extern stmt_vec_info vect_finish_stmt_generation (gimple *, gimple *,
+ gimple_stmt_iterator *);
 extern bool vect_mark_stmts_to_be_vectorized (loop_vec_info);
 extern tree vect_get_store_rhs (gimple *);
 extern tree vect_get_vec_def_for_operand_1 (gimple *, enum vect_def_type);
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:22:37.257248166 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:22:40.725217371 +0100
@@ -1729,15 +1729,15 @@ vect_get_vec_defs (tree op0, tree op1, g
 
 /* Helper function called by vect_finish_replace_stmt and
vect_finish_stmt_generation.  Set the location of the new
-   statement and create a stmt_vec_info for it.  */
+   statement and create and return a stmt_vec_info for it.  */
 
-static void
+static stmt_vec_info
 vect_finish_stmt_generation_1 (gimple *stmt, gimple *vec_stmt)
 {
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   vec_info *vinfo = stmt_info->vinfo;
 
-  vinfo->add_stmt (vec_stmt);
+  stmt_vec_info vec_stmt_info = vinfo->add_stmt (vec_stmt);
 
   if (dump_enabled_p ())
 {
@@ -1753,12 +1753,15 @@ vect_finish_stmt_generation_1 (gimple *s
   int lp_nr = lookup_stmt_eh_lp (stmt);
   if (lp_nr != 0 && stmt_could_throw_p (vec_stmt))
 add_stmt_to_eh_lp (vec_stmt, lp_nr);
+
+  return vec_stmt_info;
 }
 
 /* Replace the scalar statement STMT with a new vector statement VEC_STMT,
-   which sets the same scalar result as STMT did.  */
+   which sets the same scalar result as STMT did.  Create and return a
+   stmt_vec_info for VEC_STMT.  */
 
-void
+stmt_vec_info
 vect_finish_replace_stmt (gimple *stmt, gimple *vec_stmt)
 {
   gcc_assert (gimple_get_lhs (stmt) == gimple_get_lhs (vec_stmt));
@@ -1766,14 +1769,13 @@ vect_finish_replace_stmt (gimple *stmt,
   gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
   gsi_replace (&gsi, vec_stmt, false);
 
-  vect_finish_stmt_generation_1 (stmt, vec_stmt);
+  return vect_finish_stmt_generation_1 (stmt, vec_stmt);
 }
 
-/* Function vect_finish_stmt_generation.
-
-   Insert a new stmt.  */
+/* Add VEC_STMT to the vectorized implementation of STMT and insert it
+   before *GSI.  Create and return a stmt_vec_info for VEC_STMT.  */
 
-void
+stmt_vec_info
 vect_finish_stmt_generation (gimple *stmt, gimple *vec_stmt,
 gimple_stmt_iterator *gsi)
 {
@@ -1806,7 +1808,7 @@ vect_finish_stmt_generation (gimple *stm
  

[14/46] Make STMT_VINFO_VEC_STMT a stmt_vec_info

2018-07-24 Thread Richard Sandiford
This patch changes STMT_VINFO_VEC_STMT from a gimple stmt to a
stmt_vec_info and makes the vectorizable_* routines pass back
a stmt_vec_info to vect_transform_stmt.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (_stmt_vec_info::vectorized_stmt): Change from
a gimple stmt to a stmt_vec_info.
(vectorizable_condition, vectorizable_live_operation)
(vectorizable_reduction, vectorizable_induction): Pass back the
vectorized statement as a stmt_vec_info.
* tree-vect-data-refs.c (vect_record_grouped_load_vectors): Update
use of STMT_VINFO_VEC_STMT.
* tree-vect-loop.c (vect_create_epilog_for_reduction): Likewise,
accumulating the inner phis that feed the STMT_VINFO_VEC_STMT
as stmt_vec_infos rather than gimple stmts.
(vectorize_fold_left_reduction): Change vec_stmt from a gimple stmt
to a stmt_vec_info.
(vectorizable_live_operation): Likewise.
(vectorizable_reduction, vectorizable_induction): Likewise,
updating use of STMT_VINFO_VEC_STMT.
* tree-vect-stmts.c (vect_get_vec_def_for_operand_1): Update use
of STMT_VINFO_VEC_STMT.
(vect_build_gather_load_calls, vectorizable_bswap, vectorizable_call)
(vectorizable_simd_clone_call, vectorizable_conversion)
(vectorizable_assignment, vectorizable_shift, vectorizable_operation)
(vectorizable_store, vectorizable_load, vectorizable_condition)
(vectorizable_comparison, can_vectorize_live_stmts): Change vec_stmt
from a gimple stmt to a stmt_vec_info.
(vect_transform_stmt): Update use of STMT_VINFO_VEC_STMT.  Pass a
pointer to a stmt_vec_info to the vectorizable_* routines.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:22:44.297185652 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:47.489157307 +0100
@@ -812,7 +812,7 @@ struct _stmt_vec_info {
   tree vectype;
 
   /* The vectorized version of the stmt.  */
-  gimple *vectorized_stmt;
+  stmt_vec_info vectorized_stmt;
 
 
   /* The following is relevant only for stmts that contain a non-scalar
@@ -1560,7 +1560,7 @@ extern void vect_remove_stores (gimple *
 extern bool vect_analyze_stmt (gimple *, bool *, slp_tree, slp_instance,
   stmt_vector_for_cost *);
 extern bool vectorizable_condition (gimple *, gimple_stmt_iterator *,
-   gimple **, tree, int, slp_tree,
+   stmt_vec_info *, tree, int, slp_tree,
stmt_vector_for_cost *);
 extern void vect_get_load_cost (stmt_vec_info, int, bool,
unsigned int *, unsigned int *,
@@ -1649,13 +1649,13 @@ extern tree vect_get_loop_mask (gimple_s
 extern struct loop *vect_transform_loop (loop_vec_info);
 extern loop_vec_info vect_analyze_loop_form (struct loop *, vec_info_shared *);
 extern bool vectorizable_live_operation (gimple *, gimple_stmt_iterator *,
-slp_tree, int, gimple **,
+slp_tree, int, stmt_vec_info *,
 stmt_vector_for_cost *);
 extern bool vectorizable_reduction (gimple *, gimple_stmt_iterator *,
-   gimple **, slp_tree, slp_instance,
+   stmt_vec_info *, slp_tree, slp_instance,
stmt_vector_for_cost *);
 extern bool vectorizable_induction (gimple *, gimple_stmt_iterator *,
-   gimple **, slp_tree,
+   stmt_vec_info *, slp_tree,
stmt_vector_for_cost *);
 extern tree get_initial_def_for_reduction (gimple *, tree, tree *);
 extern bool vect_worthwhile_without_simd_p (vec_info *, tree_code);
Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:22:44.285185759 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:22:47.485157343 +0100
@@ -6401,18 +6401,17 @@ vect_record_grouped_load_vectors (gimple
 {
   if (!DR_GROUP_SAME_DR_STMT (vinfo_for_stmt (next_stmt)))
 {
- gimple *prev_stmt =
-   STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt));
+ stmt_vec_info prev_stmt_info
+   = STMT_VINFO_VEC_STMT (vinfo_for_stmt (next_stmt));
  stmt_vec_info rel_stmt_info
-   = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (prev_stmt));
+   = STMT_VINFO_RELATED_STMT (prev_stmt_info);
  while (rel_stmt_info)
{
- prev_stmt = rel_stmt_info;
+ prev_stmt_info = rel_stmt_info;
  rel_stmt_info = 

Re: [PATCH] Fix a missing case of PR 21458 similar to fc6141f097056f830a412afebed8d81a9d72b696.

2018-07-24 Thread Robert Schiele
On Tue, Jul 24, 2018 at 11:05 AM Kyrill Tkachov
 wrote:
> Patches to gas should be sent to the binutils list: binut...@sourceware.org
> rather than gcc-patches.

That indeed is a very good point and I'd like to express my apologies
for that. Obviously I did too many things at one point in time again.

Robert


[34/46] Alter interface to vect_get_vec_def_for_stmt_copy

2018-07-24 Thread Richard Sandiford
This patch makes vect_get_vec_def_for_stmt_copy take a vec_info
rather than a vect_def_type.  If the vector operand passed in is
defined in the vectorised region, we should look for copies in
the normal way.  If it's defined in an external statement
(such as by vect_init_vector_1) we should just use the original value.
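
The new behavior can be pictured with a toy definition table (hypothetical names, not the real vec_info interface): a hit in the table yields the def produced by the next statement copy, while a miss means the operand was defined outside the vectorised region and is simply reused.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Map from a vector def to the def produced by the "next copy" of
// its defining statement; external defs are simply absent.
using def_map = std::unordered_map<std::string, std::string>;

// Mirror of the new lookup-based logic: a failed lookup means the
// operand was defined externally, so it is returned unchanged.
static std::string get_def_for_copy (const def_map &m,
				     const std::string &opnd)
{
  auto it = m.find (opnd);
  return it == m.end () ? opnd : it->second;
}
```

This replaces the old scheme of threading a vect_def_type through every caller just to decide whether a copy exists.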


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_get_vec_defs_for_stmt_copy)
(vect_get_vec_def_for_stmt_copy): Take a vec_info rather than
a vect_def_type for the first argument.
* tree-vect-stmts.c (vect_get_vec_defs_for_stmt_copy): Likewise.
(vect_get_vec_def_for_stmt_copy): Likewise.  Return the original
operand if it isn't defined by a vectorized statement.
(vect_build_gather_load_calls): Remove the mask_dt argument and
update calls to vect_get_vec_def_for_stmt_copy.
(vectorizable_bswap): Likewise the dt argument.
(vectorizable_call): Update calls to vectorizable_bswap and
vect_get_vec_def_for_stmt_copy.
(vectorizable_simd_clone_call, vectorizable_assignment)
(vectorizable_shift, vectorizable_operation, vectorizable_condition)
(vectorizable_comparison): Update calls to
vect_get_vec_def_for_stmt_copy.
(vectorizable_store): Likewise.  Remove now-unnecessary calls to
vect_is_simple_use.
(vect_get_loop_based_defs): Remove dt argument and update call
to vect_get_vec_def_for_stmt_copy.
(vectorizable_conversion): Update calls to vect_get_loop_based_defs
and vect_get_vec_def_for_stmt_copy.
(vectorizable_load): Update calls to vect_build_gather_load_calls
and vect_get_vec_def_for_stmt_copy.
* tree-vect-loop.c (vect_create_epilog_for_reduction)
(vectorizable_reduction, vectorizable_live_operation): Update calls
to vect_get_vec_def_for_stmt_copy.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:50.008602115 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:56.440544995 +0100
@@ -1514,11 +1514,11 @@ extern tree vect_get_vec_def_for_operand
 extern tree vect_get_vec_def_for_operand (tree, stmt_vec_info, tree = NULL);
 extern void vect_get_vec_defs (tree, tree, stmt_vec_info, vec<tree> *,
   vec<tree> *, slp_tree);
-extern void vect_get_vec_defs_for_stmt_copy (enum vect_def_type *,
+extern void vect_get_vec_defs_for_stmt_copy (vec_info *,
 vec<tree> *, vec<tree> *);
 extern tree vect_init_vector (stmt_vec_info, tree, tree,
   gimple_stmt_iterator *);
-extern tree vect_get_vec_def_for_stmt_copy (enum vect_def_type, tree);
+extern tree vect_get_vec_def_for_stmt_copy (vec_info *, tree);
 extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *,
  bool *, slp_tree, slp_instance);
 extern void vect_remove_stores (stmt_vec_info);
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:23:50.008602115 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:23:56.440544995 +0100
@@ -1580,8 +1580,7 @@ vect_get_vec_def_for_operand (tree op, s
created in case the vectorized result cannot fit in one vector, and several
copies of the vector-stmt are required.  In this case the vector-def is
retrieved from the vector stmt recorded in the STMT_VINFO_RELATED_STMT field
-   of the stmt that defines VEC_OPRND.
-   DT is the type of the vector def VEC_OPRND.
+   of the stmt that defines VEC_OPRND.  VINFO describes the vectorization.
 
Context:
 In case the vectorization factor (VF) is bigger than the number
@@ -1625,29 +1624,24 @@ vect_get_vec_def_for_operand (tree op, s
STMT_VINFO_RELATED_STMT field of 'VS1.0' we obtain the next copy - 'VS1.1',
and return its def ('vx.1').
Overall, to create the above sequence this function will be called 3 times:
-vx.1 = vect_get_vec_def_for_stmt_copy (dt, vx.0);
-vx.2 = vect_get_vec_def_for_stmt_copy (dt, vx.1);
-vx.3 = vect_get_vec_def_for_stmt_copy (dt, vx.2);  */
+   vx.1 = vect_get_vec_def_for_stmt_copy (vinfo, vx.0);
+   vx.2 = vect_get_vec_def_for_stmt_copy (vinfo, vx.1);
+   vx.3 = vect_get_vec_def_for_stmt_copy (vinfo, vx.2);  */
 
 tree
-vect_get_vec_def_for_stmt_copy (enum vect_def_type dt, tree vec_oprnd)
+vect_get_vec_def_for_stmt_copy (vec_info *vinfo, tree vec_oprnd)
 {
-  gimple *vec_stmt_for_operand;
-  stmt_vec_info def_stmt_info;
-
-  /* Do nothing; can reuse same def.  */
-  if (dt == vect_external_def || dt == vect_constant_def )
+  stmt_vec_info def_stmt_info = vinfo->lookup_def (vec_oprnd);
+  if (!def_stmt_info)
+/* Do nothing; can reuse same def.  */
 return vec_oprnd;
 
-  vec_stmt_for_operand = SSA_NAME_DEF_STMT (vec_oprnd);
-  def_stmt_info = 

[32/46] Use stmt_vec_info in function interfaces (part 2)

2018-07-24 Thread Richard Sandiford
This second part handles the mechanical change from a gimple stmt
argument to a stmt_vec_info argument.  It updates the function
comments if they referred to the argument by name, but it doesn't
try to retrofit mentions to other functions.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (nested_in_vect_loop_p): Move further down
file and take a stmt_vec_info instead of a gimple stmt.
(supportable_widening_operation, vect_finish_replace_stmt)
(vect_finish_stmt_generation, vect_get_store_rhs)
(vect_get_vec_def_for_operand_1, vect_get_vec_def_for_operand)
(vect_get_vec_defs, vect_init_vector, vect_transform_stmt)
(vect_remove_stores, vect_analyze_stmt, vectorizable_condition)
(vect_get_smallest_scalar_type, vect_check_gather_scatter)
(vect_create_data_ref_ptr, bump_vector_ptr)
(vect_permute_store_chain, vect_setup_realignment)
(vect_transform_grouped_load, vect_record_grouped_load_vectors)
(vect_create_addr_base_for_vector_ref, vectorizable_live_operation)
(vectorizable_reduction, vectorizable_induction)
(get_initial_def_for_reduction, is_simple_and_all_uses_invariant)
(vect_get_place_in_interleaving_chain): Take stmt_vec_infos rather
than gimple stmts as arguments.
* tree-vect-data-refs.c (vect_get_smallest_scalar_type)
(vect_preserves_scalar_order_p, vect_slp_analyze_node_dependences)
(can_group_stmts_p, vect_check_gather_scatter)
(vect_create_addr_base_for_vector_ref, vect_create_data_ref_ptr)
(bump_vector_ptr, vect_permute_store_chain, vect_setup_realignment)
(vect_permute_load_chain, vect_shift_permute_load_chain)
(vect_transform_grouped_load)
(vect_record_grouped_load_vectors): Likewise.
* tree-vect-loop.c (vect_fixup_reduc_chain)
(get_initial_def_for_reduction, vect_create_epilog_for_reduction)
(vectorize_fold_left_reduction, is_nonwrapping_integer_induction)
(vectorizable_reduction, vectorizable_induction)
(vectorizable_live_operation, vect_loop_kill_debug_uses): Likewise.
* tree-vect-patterns.c (type_conversion_p, adjust_bool_stmts)
(vect_get_load_store_mask): Likewise.
* tree-vect-slp.c (vect_get_place_in_interleaving_chain)
(vect_analyze_slp_instance, vect_mask_constant_operand_p): Likewise.
* tree-vect-stmts.c (vect_mark_relevant)
(is_simple_and_all_uses_invariant)
(exist_non_indexing_operands_for_use_p, process_use)
(vect_init_vector_1, vect_init_vector, vect_get_vec_def_for_operand_1)
(vect_get_vec_def_for_operand, vect_get_vec_defs)
(vect_finish_stmt_generation_1, vect_finish_replace_stmt)
(vect_finish_stmt_generation, vect_truncate_gather_scatter_offset)
(compare_step_with_zero, vect_get_store_rhs, get_group_load_store_type)
(get_negative_load_store_type, get_load_store_type)
(vect_check_load_store_mask, vect_check_store_rhs)
(vect_build_gather_load_calls, vect_get_strided_load_store_ops)
(vectorizable_bswap, vectorizable_call, vectorizable_simd_clone_call)
(vect_create_vectorized_demotion_stmts, vectorizable_conversion)
(vectorizable_assignment, vectorizable_shift, vectorizable_operation)
(get_group_alias_ptr_type, vectorizable_store, hoist_defs_of_uses)
(vectorizable_load, vectorizable_condition, vectorizable_comparison)
(vect_analyze_stmt, vect_transform_stmt, vect_remove_stores)
(supportable_widening_operation): Likewise.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   2018-07-24 10:23:35.384731983 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:23:50.008602115 +0100
@@ -627,13 +627,6 @@ loop_vec_info_for_loop (struct loop *loo
   return (loop_vec_info) loop->aux;
 }
 
-static inline bool
-nested_in_vect_loop_p (struct loop *loop, gimple *stmt)
-{
-  return (loop->inner
-  && (loop->inner == (gimple_bb (stmt))->loop_father));
-}
-
 typedef struct _bb_vec_info : public vec_info
 {
   _bb_vec_info (gimple_stmt_iterator, gimple_stmt_iterator, vec_info_shared *);
@@ -1119,6 +1112,13 @@ set_vinfo_for_stmt (gimple *stmt, stmt_v
 }
 }
 
+static inline bool
+nested_in_vect_loop_p (struct loop *loop, stmt_vec_info stmt_info)
+{
+  return (loop->inner
+ && (loop->inner == (gimple_bb (stmt_info->stmt))->loop_father));
+}
+
 /* Return the earlier statement between STMT1_INFO and STMT2_INFO.  */
 
 static inline stmt_vec_info
@@ -1493,8 +1493,8 @@ extern bool vect_is_simple_use (tree, ve
 extern bool vect_is_simple_use (tree, vec_info *, enum vect_def_type *,
tree *, stmt_vec_info * = NULL,
gimple ** = NULL);
-extern bool supportable_widening_operation (enum tree_code, gimple *, tree,
-  

[31/46] Use stmt_vec_info in function interfaces (part 1)

2018-07-24 Thread Richard Sandiford
This first (less mechanical) part handles cases that involve changes in
the callers or non-trivial changes in the functions themselves.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-data-refs.c (vect_describe_gather_scatter_call): Take
a stmt_vec_info instead of a gcall.
(vect_check_gather_scatter): Update call accordingly.
* tree-vect-loop-manip.c (iv_phi_p): Take a stmt_vec_info instead
of a gphi.
(vect_can_advance_ivs_p, vect_update_ivs_after_vectorizer)
(slpeel_update_phi_nodes_for_loops): Update calls accordingly.
* tree-vect-loop.c (vect_transform_loop_stmt): Take a stmt_vec_info
instead of a gimple stmt.
(vect_transform_loop): Update calls accordingly.
* tree-vect-slp.c (vect_split_slp_store_group): Take and return
stmt_vec_infos instead of gimple stmts.
(vect_analyze_slp_instance): Update use accordingly.
* tree-vect-stmts.c (read_vector_array, write_vector_array)
(vect_clobber_variable, vect_stmt_relevant_p, permute_vec_elements)
(vect_use_strided_gather_scatters_p, vect_build_all_ones_mask)
(vect_build_zero_merge_argument, vect_get_gather_scatter_ops)
(vect_gen_widened_results_half, vect_get_loop_based_defs)
(vect_create_vectorized_promotion_stmts, can_vectorize_live_stmts):
Take a stmt_vec_info instead of a gimple stmt and pass stmt_vec_infos
down to subroutines.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:35.376732054 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:23:46.108636749 +0100
@@ -3621,13 +3621,14 @@ vect_gather_scatter_fn_p (bool read_p, b
   return true;
 }
 
-/* CALL is a call to an internal gather load or scatter store function.
+/* STMT_INFO is a call to an internal gather load or scatter store function.
Describe the operation in INFO.  */
 
 static void
-vect_describe_gather_scatter_call (gcall *call, gather_scatter_info *info)
+vect_describe_gather_scatter_call (stmt_vec_info stmt_info,
+  gather_scatter_info *info)
 {
-  stmt_vec_info stmt_info = vinfo_for_stmt (call);
+  gcall *call = as_a <gcall *> (stmt_info->stmt);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
 
@@ -3672,7 +3673,7 @@ vect_check_gather_scatter (gimple *stmt,
   ifn = gimple_call_internal_fn (call);
   if (internal_gather_scatter_fn_p (ifn))
{
- vect_describe_gather_scatter_call (call, info);
+ vect_describe_gather_scatter_call (stmt_info, info);
  return true;
}
   masked_p = (ifn == IFN_MASK_LOAD || ifn == IFN_MASK_STORE);
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  2018-07-24 10:23:35.376732054 +0100
+++ gcc/tree-vect-loop-manip.c  2018-07-24 10:23:46.112636713 +0100
@@ -1335,16 +1335,16 @@ find_loop_location (struct loop *loop)
   return dump_user_location_t ();
 }
 
-/* Return true if PHI defines an IV of the loop to be vectorized.  */
+/* Return true if the phi described by STMT_INFO defines an IV of the
+   loop to be vectorized.  */
 
 static bool
-iv_phi_p (gphi *phi)
+iv_phi_p (stmt_vec_info stmt_info)
 {
+  gphi *phi = as_a <gphi *> (stmt_info->stmt);
   if (virtual_operand_p (PHI_RESULT (phi)))
 return false;
 
-  stmt_vec_info stmt_info = vinfo_for_stmt (phi);
-  gcc_assert (stmt_info != NULL_STMT_VEC_INFO);
   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
 return false;
@@ -1388,7 +1388,7 @@ vect_can_advance_ivs_p (loop_vec_info lo
 virtual defs/uses (i.e., memory accesses) are analyzed elsewhere.
 
 Skip reduction phis.  */
-  if (!iv_phi_p (phi))
+  if (!iv_phi_p (phi_info))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
@@ -1509,7 +1509,7 @@ vect_update_ivs_after_vectorizer (loop_v
}
 
   /* Skip reduction and virtual phis.  */
-  if (!iv_phi_p (phi))
+  if (!iv_phi_p (phi_info))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
@@ -2088,7 +2088,8 @@ slpeel_update_phi_nodes_for_loops (loop_
   tree arg = PHI_ARG_DEF_FROM_EDGE (orig_phi, first_latch_e);
   /* Generate lcssa PHI node for the first loop.  */
   gphi *vect_phi = (loop == first) ? orig_phi : update_phi;
-  if (create_lcssa_for_iv_phis || !iv_phi_p (vect_phi))
+  stmt_vec_info vect_phi_info = loop_vinfo->lookup_stmt (vect_phi);
+  if (create_lcssa_for_iv_phis || !iv_phi_p (vect_phi_info))
{
  tree new_res = copy_ssa_name (PHI_RESULT (orig_phi));
  gphi *lcssa_phi = create_phi_node (new_res, between_bb);
Index: gcc/tree-vect-loop.c

[35/46] Alter interfaces within vect_pattern_recog

2018-07-24 Thread Richard Sandiford
vect_pattern_recog_1 took a gimple_stmt_iterator as argument, but was
only interested in the gsi_stmt, not anything else.  This patch makes
the associated routines operate directly on stmt_vec_infos.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-patterns.c (vect_mark_pattern_stmts): Take the
original stmt as a stmt_vec_info rather than a gimple stmt.
(vect_pattern_recog_1): Take the statement directly as a
stmt_vec_info, rather than via a gimple_stmt_iterator.
Update call to vect_mark_pattern_stmts.
(vect_pattern_recog): Update calls accordingly.

Index: gcc/tree-vect-patterns.c
===
--- gcc/tree-vect-patterns.c2018-07-24 10:23:50.004602150 +0100
+++ gcc/tree-vect-patterns.c2018-07-24 10:23:59.408518638 +0100
@@ -4720,29 +4720,29 @@ const unsigned int NUM_PATTERNS = ARRAY_
 /* Mark statements that are involved in a pattern.  */
 
 static inline void
-vect_mark_pattern_stmts (gimple *orig_stmt, gimple *pattern_stmt,
+vect_mark_pattern_stmts (stmt_vec_info orig_stmt_info, gimple *pattern_stmt,
  tree pattern_vectype)
 {
-  stmt_vec_info orig_stmt_info = vinfo_for_stmt (orig_stmt);
   gimple *def_seq = STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
 
-  bool old_pattern_p = is_pattern_stmt_p (orig_stmt_info);
-  if (old_pattern_p)
+  gimple *orig_pattern_stmt = NULL;
+  if (is_pattern_stmt_p (orig_stmt_info))
 {
   /* We're replacing a statement in an existing pattern definition
 sequence.  */
+  orig_pattern_stmt = orig_stmt_info->stmt;
   if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
   "replacing earlier pattern ");
- dump_gimple_stmt (MSG_NOTE, TDF_SLIM, orig_stmt, 0);
+ dump_gimple_stmt (MSG_NOTE, TDF_SLIM, orig_pattern_stmt, 0);
}
 
   /* To keep the book-keeping simple, just swap the lhs of the
 old and new statements, so that the old one has a valid but
 unused lhs.  */
-  tree old_lhs = gimple_get_lhs (orig_stmt);
-  gimple_set_lhs (orig_stmt, gimple_get_lhs (pattern_stmt));
+  tree old_lhs = gimple_get_lhs (orig_pattern_stmt);
+  gimple_set_lhs (orig_pattern_stmt, gimple_get_lhs (pattern_stmt));
   gimple_set_lhs (pattern_stmt, old_lhs);
 
   if (dump_enabled_p ())
@@ -4755,7 +4755,8 @@ vect_mark_pattern_stmts (gimple *orig_st
   orig_stmt_info = STMT_VINFO_RELATED_STMT (orig_stmt_info);
 
   /* We shouldn't be replacing the main pattern statement.  */
-  gcc_assert (STMT_VINFO_RELATED_STMT (orig_stmt_info) != orig_stmt);
+  gcc_assert (STMT_VINFO_RELATED_STMT (orig_stmt_info)->stmt
+ != orig_pattern_stmt);
 }
 
   if (def_seq)
@@ -4763,13 +4764,14 @@ vect_mark_pattern_stmts (gimple *orig_st
	 !gsi_end_p (si); gsi_next (&si))
   vect_init_pattern_stmt (gsi_stmt (si), orig_stmt_info, pattern_vectype);
 
-  if (old_pattern_p)
+  if (orig_pattern_stmt)
 {
   vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, pattern_vectype);
 
   /* Insert all the new pattern statements before the original one.  */
   gimple_seq *orig_def_seq = &STMT_VINFO_PATTERN_DEF_SEQ (orig_stmt_info);
-  gimple_stmt_iterator gsi = gsi_for_stmt (orig_stmt, orig_def_seq);
+  gimple_stmt_iterator gsi = gsi_for_stmt (orig_pattern_stmt,
+  orig_def_seq);
   gsi_insert_seq_before_without_update (&gsi, def_seq, GSI_SAME_STMT);
   gsi_insert_before_without_update (&gsi, pattern_stmt, GSI_SAME_STMT);
 
@@ -4785,12 +4787,12 @@ vect_mark_pattern_stmts (gimple *orig_st
Input:
PATTERN_RECOG_FUNC: A pointer to a function that detects a certain
 computation pattern.
-   STMT: A stmt from which the pattern search should start.
+   STMT_INFO: A stmt from which the pattern search should start.
 
If PATTERN_RECOG_FUNC successfully detected the pattern, it creates
a sequence of statements that has the same functionality and can be
-   used to replace STMT.  It returns the last statement in the sequence
-   and adds any earlier statements to STMT's STMT_VINFO_PATTERN_DEF_SEQ.
+   used to replace STMT_INFO.  It returns the last statement in the sequence
+   and adds any earlier statements to STMT_INFO's STMT_VINFO_PATTERN_DEF_SEQ.
PATTERN_RECOG_FUNC also sets *TYPE_OUT to the vector type of the final
statement, having first checked that the target supports the new operation
in that type.
@@ -4799,10 +4801,10 @@ vect_mark_pattern_stmts (gimple *orig_st
for vect_recog_pattern.  */
 
 static void
-vect_pattern_recog_1 (vect_recog_func *recog_func, gimple_stmt_iterator si)
+vect_pattern_recog_1 (vect_recog_func *recog_func, stmt_vec_info stmt_info)
 {
-  gimple *stmt = gsi_stmt (si), *pattern_stmt;
-  stmt_vec_info stmt_info;
+  vec_info *vinfo = stmt_info->vinfo;
+  gimple *pattern_stmt;
   loop_vec_info 

[33/46] Use stmt_vec_infos instead of vec_info/gimple stmt pairs

2018-07-24 Thread Richard Sandiford
This patch makes vect_record_max_nunits and vect_record_base_alignment
take a stmt_vec_info instead of a vec_info/gimple pair.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-data-refs.c (vect_record_base_alignment): Replace vec_info
and gimple stmt arguments with a stmt_vec_info.
(vect_record_base_alignments): Update calls accordingly.
* tree-vect-slp.c (vect_record_max_nunits): Replace vec_info
and gimple stmt arguments with a stmt_vec_info.
(vect_build_slp_tree_1): Remove vinfo argument and update call
to vect_record_max_nunits.
(vect_build_slp_tree_2): Update calls to vect_build_slp_tree_1
and vect_record_max_nunits.

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:50.000602186 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:23:53.204573732 +0100
@@ -794,14 +794,14 @@ vect_slp_analyze_instance_dependence (sl
   return res;
 }
 
-/* Record in VINFO the base alignment guarantee given by DRB.  STMT is
-   the statement that contains DRB, which is useful for recording in the
-   dump file.  */
+/* Record the base alignment guarantee given by DRB, which occurs
+   in STMT_INFO.  */
 
 static void
-vect_record_base_alignment (vec_info *vinfo, gimple *stmt,
+vect_record_base_alignment (stmt_vec_info stmt_info,
innermost_loop_behavior *drb)
 {
+  vec_info *vinfo = stmt_info->vinfo;
   bool existed;
   innermost_loop_behavior *&entry
 = vinfo->base_alignments.get_or_insert (drb->base_address, &existed);
@@ -820,7 +820,7 @@ vect_record_base_alignment (vec_info *vi
   "  misalignment: %d\n", drb->base_misalignment);
  dump_printf_loc (MSG_NOTE, vect_location,
   "  based on: ");
- dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+ dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_info->stmt, 0);
}
 }
 }
@@ -847,13 +847,13 @@ vect_record_base_alignments (vec_info *v
  && STMT_VINFO_VECTORIZABLE (stmt_info)
  && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
- vect_record_base_alignment (vinfo, stmt_info, &DR_INNERMOST (dr));
+ vect_record_base_alignment (stmt_info, &DR_INNERMOST (dr));
 
  /* If DR is nested in the loop that is being vectorized, we can also
 record the alignment of the base wrt the outer loop.  */
  if (loop && nested_in_vect_loop_p (loop, stmt_info))
vect_record_base_alignment
-   (vinfo, stmt_info, &STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info));
+ (stmt_info, &STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info));
}
 }
 }
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c 2018-07-24 10:23:50.004602150 +0100
+++ gcc/tree-vect-slp.c 2018-07-24 10:23:53.204573732 +0100
@@ -609,14 +609,14 @@ compatible_calls_p (gcall *call1, gcall
 }
 
 /* A subroutine of vect_build_slp_tree for checking VECTYPE, which is the
-   caller's attempt to find the vector type in STMT with the narrowest
+   caller's attempt to find the vector type in STMT_INFO with the narrowest
element type.  Return true if VECTYPE is nonnull and if it is valid
-   for VINFO.  When returning true, update MAX_NUNITS to reflect the
-   number of units in VECTYPE.  VINFO, GORUP_SIZE and MAX_NUNITS are
-   as for vect_build_slp_tree.  */
+   for STMT_INFO.  When returning true, update MAX_NUNITS to reflect the
+   number of units in VECTYPE.  GROUP_SIZE and MAX_NUNITS are as for
+   vect_build_slp_tree.  */
 
 static bool
-vect_record_max_nunits (vec_info *vinfo, gimple *stmt, unsigned int group_size,
+vect_record_max_nunits (stmt_vec_info stmt_info, unsigned int group_size,
tree vectype, poly_uint64 *max_nunits)
 {
   if (!vectype)
@@ -625,7 +625,8 @@ vect_record_max_nunits (vec_info *vinfo,
{
  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
   "Build SLP failed: unsupported data-type in ");
- dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
+ dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
+   stmt_info->stmt, 0);
  dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
}
   /* Fatal mismatch.  */
@@ -636,7 +637,7 @@ vect_record_max_nunits (vec_info *vinfo,
  before adjusting *max_nunits for basic-block vectorization.  */
   poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
   unsigned HOST_WIDE_INT const_nunits;
-  if (is_a <bb_vec_info> (vinfo)
+  if (STMT_VINFO_BB_VINFO (stmt_info)
   && (!nunits.is_constant (_nunits)
  || const_nunits > group_size))
 {
@@ -696,7 +697,7 @@ vect_two_operations_perm_ok_p (vec stmts, unsigned int group_size,
   poly_uint64 *max_nunits, bool *matches,
   bool *two_operators)
@@ -763,7 +764,7 @@ 

Re: Improve std::rotate usages

2018-07-24 Thread François Dumont

Ping.

On 08/06/2018 07:54, François Dumont wrote:

Gentle reminder.

On 27/05/2018 19:25, François Dumont wrote:

Still no chance to review it ?

I'd like this one to go in before submitting other algo related patches.

    * include/bits/stl_algo.h
    (__rotate(_Ite, _Ite, _Ite, forward_iterator_tag))
    (__rotate(_Ite, _Ite, _Ite, bidirectional_iterator_tag))
    (__rotate(_Ite, _Ite, _Ite, random_access_iterator_tag)): Move
    code duplication...
    (rotate(_Ite, _Ite, _Ite)): ...here.
    (__stable_partition_adaptive(_FIt, _FIt, _Pred, _Dist, _Pointer, _Dist)):
    Simplify rotate call.
    (__rotate_adaptive(_BIt1, _BIt1, _BIt1, _Dist, _Dist, _Bit2, _Dist)):
    Likewise.
    (__merge_without_buffer(_BIt, _BIt, _BIt, _Dist, _Dist, _Comp)):
    Likewise.

François

On 14/05/2018 22:14, François Dumont wrote:

Any feedback regarding this patch ?


On 02/05/2018 07:26, François Dumont wrote:

Hi

    std::rotate already returns the expected iterator so there is 
no need for calls to std::advance/std::distance.


Tested under Linux x86_64, ok to commit ?

François











RE: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack-clash is on [Patch (6/6)]

2018-07-24 Thread tamar . christina
Hi All,

This patch cleans up the testsuite when a run is done with stack clash
protection turned on.

Concretely this switches off -fstack-clash-protection for a couple of tests:

* sve: We don't yet support stack-clash-protection and sve, so for now turn
   these off.
* assembler scan: some tests are quite fragile in that they check for exact
   assembly output, e.g. check for exact amount of sub etc.  These won't
   match now.
* vla: Some of the ubsan tests use negative array indices.  Because the arrays
   weren't used before, the incorrect $sp wouldn't have been used.  The
   correct value is restored on ret.  Now however we probe the $sp, which
   causes a segfault.
* params: When testing the parameters we have to skip these on AArch64
   because of our custom constraints on them.  We already test them
   separately so this isn't a loss.

Note that the testsuite is not entirely clean due to a gdb failure caused by
alloca with stack clash. On AArch64 we output an incorrect .loc directive, but
this is already the case with the current implementation in GCC and is a bug
unrelated to this patch series.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no 
issues.
Both targets were tested with stack clash on and off by default.

Ok for trunk?

Thanks,
Tamar

gcc/testsuite/
2018-07-24  Tamar Christina  

PR target/86486
* gcc.dg/pr82788.c: Skip for AArch64.
* gcc.dg/guality/vla-1.c: Turn off stack-clash.
* gcc.target/aarch64/subsp.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise.
* gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise.
* gcc.dg/params/blocksort-part.c: Skip stack-clash checks
on AArch64.
* gcc.dg/stack-check-10.c: Add AArch64 specific checks.
* gcc.dg/stack-check-5.c: Add AArch64 specific checks.
* gcc.dg/stack-check-6a.c: Skip on AArch64, we don't support this.
* lib/target-supports.exp
(check_effective_target_frame_pointer_for_non_leaf): AArch64 does not
require frame pointer for non-leaf functions.

> -Original Message-
> From: Tamar Christina 
> Sent: Wednesday, July 11, 2018 12:23
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; James Greenhalgh ;
> Richard Earnshaw ; Marcus Shawcroft
> 
> Subject: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack-
> clash is on [Patch (6/6)]
> 
> Hi All,
> 
> This patch cleans up the testsuite when a run is done with stack clash
> protection turned on.
> 
> Concretely this switches off -fstack-clash-protection for a couple of tests:
> 
> * sve: We don't yet support stack-clash-protection and sve, so for now turn
> these off.
> * assembler scan: some tests are quite fragile in that they check for exact
>assembly output, e.g. check for exact amount of sub etc.  These won't
>match now.
> * vla: Some of the ubsan tests negative array indices. Because the arrays
> weren't
>used before the incorrect $sp wouldn't have been used. The correct
> value is
>restored on ret.  Now however we probe the $sp which causes a segfault.
> * params: When testing the parameters we have to skip these on AArch64
> because of our
>   custom constraints on them.  We already test them separately so this
> isn't a
>   loss.
> 
> Note that the testsuite is not entire clean due to gdb failure caused by 
> alloca
> with stack clash. On AArch64 we output an incorrect .loc directive, but this 
> is
> already the case with the current implementation in GCC and is a bug
> unrelated to this patch series.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
> Both targets were tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/testsuite/
> 2018-07-11  Tamar Christina  
> 
>   PR target/86486
>   gcc.dg/pr82788.c: Skip for AArch64.
>   gcc.dg/guality/vla-1.c: Turn off stack-clash.
>   gcc.target/aarch64/subsp.c: Likewise.
>   gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
>   gcc.target/aarch64/sve/mask_struct_store_3.c: Likewise.
>   gcc.target/aarch64/sve/mask_struct_store_4.c: Likewise.
>   gcc.dg/params/blocksort-part.c: Skip stack-clash checks
>   on AArch64.
> 
> --
diff --git a/gcc/testsuite/c-c++-common/ubsan/vla-1.c b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
index 52ade3aab7566dce3ca7ef931ac65895005d5e13..c97465edae195442a71ee66ab25015a2ac4fc8fc 100644
--- a/gcc/testsuite/c-c++-common/ubsan/vla-1.c
+++ b/gcc/testsuite/c-c++-common/ubsan/vla-1.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-fsanitize=vla-bound -Wall -Wno-unused-variable" } */
+/* { dg-options "-fsanitize=vla-bound -Wall -Wno-unused-variable -fno-stack-clash-protection" } */
 
 typedef long int V;
 int x = -1;
diff --git a/gcc/testsuite/gcc.dg/params/blocksort-part.c 

RE: [PATCH][GCC][AArch64] Set default values for stack-clash and do basic validation in back-end. [Patch (5/6)]

2018-07-24 Thread tamar . christina
Hi All,

This patch is a cascade update from having to re-spin the configure patch
(no. 4 in the series).

This patch enforces that the default guard size for stack-clash protection for
AArch64 be 64KB unless the user has overridden it via configure, in which case
the user value is used as long as that value is within the valid range.

It also does some basic validation to ensure that the guard size is only 4KB or
64KB and also enforces that for aarch64 the stack-clash probing interval is
equal to the guard size.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Target was tested with stack clash on and off by default.

Ok for trunk?

Thanks,
Tamar

gcc/
2018-07-24  Tamar Christina  

PR target/86486
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Add validation for stack-clash parameters and set defaults.

> -Original Message-
> From: Tamar Christina 
> Sent: Wednesday, July 11, 2018 12:23
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; James Greenhalgh ;
> Richard Earnshaw ; Marcus Shawcroft
> 
> Subject: [PATCH][GCC][AArch64] Set default values for stack-clash and do
> basic validation in back-end. [Patch (5/6)]
> 
> Hi All,
> 
> This patch enforces that the default guard size for stack-clash protection for
> AArch64 be 64KB unless the user has overriden it via configure in which case
> the user value is used as long as that value is within the valid range.
> 
> It also does some basic validation to ensure that the guard size is only 4KB 
> or
> 64KB and also enforces that for aarch64 the stack-clash probing interval is
> equal to the guard size.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> Target was tested with stack clash on and off by default.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2018-07-11  Tamar Christina  
> 
>   PR target/86486
>   * config/aarch64/aarch64.c (aarch64_override_options_internal):
>   Add validation for stack-clash parameters.
> 
> --
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e2c34cdfc96a1d3f99f7e4834c66a7551464a518..30c62c406e10793fe041d54c73316a6c8d7c229f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10916,6 +10916,37 @@ aarch64_override_options_internal (struct gcc_options *opts)
 			 opts->x_param_values,
 			 global_options_set.x_param_values);
 
+  /* If the user hasn't changed it via configure then set the default to 64 KB
+ for the backend.  */
+  maybe_set_param_value (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE,
+			 DEFAULT_STK_CLASH_GUARD_SIZE == 0
+			   ? 16 : DEFAULT_STK_CLASH_GUARD_SIZE,
+			 opts->x_param_values,
+			 global_options_set.x_param_values);
+
+  /* Validate the guard size.  */
+  int guard_size = PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE);
+  if (guard_size != 12 && guard_size != 16)
+  error ("only values 12 (4 KB) and 16 (64 KB) are supported for guard "
+	 "size.  Given value %d (%llu KB) is out of range.\n",
+	 guard_size, (1ULL << guard_size) / 1024ULL);
+
+  /* Enforce that interval is the same size as size so the mid-end does the
+ right thing.  */
+  maybe_set_param_value (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL,
+			 guard_size,
+			 opts->x_param_values,
+			 global_options_set.x_param_values);
+
+  /* The maybe_set calls won't update the value if the user has explicitly set
+ one.  Which means we need to validate that probing interval and guard size
+ are equal.  */
+  int probe_interval
+= PARAM_VALUE (PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL);
+  if (guard_size != probe_interval)
+error ("stack clash guard size '%d' must be equal to probing interval "
+	   "'%d'\n", guard_size, probe_interval);
+
   /* Enable sw prefetching at specified optimization level for
  CPUS that have prefetch.  Lower optimization level threshold by 1
  when profiling is enabled.  */



RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-24 Thread Joseph Myers
On Tue, 24 Jul 2018, tamar.christ...@arm.com wrote:

> This patch defines a configure option to allow the setting of the default
> guard size via configure flags when building the target.

If you add a configure option, you must also add documentation for it in 
install.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix a missing case of PR 21458 similar to fc6141f097056f830a412afebed8d81a9d72b696.

2018-07-24 Thread Kyrill Tkachov

Hi Robert,

On 24/07/18 09:48, Robert Schiele wrote:

The original fix for PR 21458 caused some issues, which were addressed by a
follow-up fix, fc6141f097056f830a412afebed8d81a9d72b696.  Unfortunately that
follow-up fix missed one case, which this fix handles.

Change-Id: Ie32e3f2514b3e4b6b35c0a693de6b65ef010bb9d
---
 gas/config/tc-arm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)



Patches to gas should be sent to the binutils list: binut...@sourceware.org
rather than gcc-patches.

Cheers,
Kyrill


diff --git a/gas/config/tc-arm.c b/gas/config/tc-arm.c
index feb725d..c92b6ef 100644
--- a/gas/config/tc-arm.c
+++ b/gas/config/tc-arm.c
@@ -10836,11 +10836,12 @@ do_t_adr (void)
   inst.instruction |= Rd << 4;
 }

-  if (inst.reloc.exp.X_op == O_symbol
+  if (support_interwork
+  && inst.reloc.exp.X_op == O_symbol
   && inst.reloc.exp.X_add_symbol != NULL
   && S_IS_DEFINED (inst.reloc.exp.X_add_symbol)
   && THUMB_IS_FUNC (inst.reloc.exp.X_add_symbol))
-inst.reloc.exp.X_add_number += 1;
+inst.reloc.exp.X_add_number |= 1;
 }

 /* Arithmetic instructions for which there is just one 16-bit
--
2.4.6




[PATCH] Fix up pr19476-{1,5}.C (PR testsuite/86649)

2018-07-24 Thread Jakub Jelinek
Hi!

When looking at PR86569 testresults, I must have missed these two tests
(but looking at test_summary outputs, I see it now).
When we no longer fold this during cp_fold (to avoid code generation
changes between -Wnonnull-compare and -Wno-nonnull-compare), it isn't
folded from the first pass; with -O2 it is folded during evrp and with
-O1 during dom2.

Note, the test would fail before with -Wnonnull-compare, e.g. on 8
branch (which doesn't have the PR86569 changes), I see:
make check-c++-all RUNTESTFLAGS='--target_board=unix\{,-Wnonnull-compare\} 
dg.exp=pr19476*'
=== g++ Summary for unix ===

# of expected passes72
Running target unix/-Wnonnull-compare
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for 
target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /usr/src/gcc-8/gcc/testsuite/config/default.exp as 
tool-and-target-specific interface file.
Running /usr/src/gcc-8/gcc/testsuite/g++.dg/dg.exp ...
FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++98  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++11  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++14  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++17  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++2a  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-1.C  -std=gnu++17 -fconcepts  scan-tree-dump ccp1 
"return 42"
FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++98  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++11  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++14  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++17  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++2a  scan-tree-dump ccp1 "return 42"
FAIL: g++.dg/tree-ssa/pr19476-5.C  -std=gnu++17 -fconcepts  scan-tree-dump ccp1 
"return 42"

=== g++ Summary for unix/-Wnonnull-compare ===

# of expected passes60
# of unexpected failures12

Especially for -O2 that people use most, folding it at evrp time seems to be
early enough for me.
Fixed by testing this only in dom2, tested on x86_64-linux, ok for trunk?

2018-07-24  Jakub Jelinek  

PR testsuite/86649
* g++.dg/tree-ssa/pr19476-1.C: Check dom2 dump instead of ccp1.
* g++.dg/tree-ssa/pr19476-5.C: Likewise.

--- gcc/testsuite/g++.dg/tree-ssa/pr19476-1.C.jj2015-05-29 
15:04:33.037803445 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr19476-1.C   2018-07-24 11:39:10.108897097 
+0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp1 -fdelete-null-pointer-checks" } */
+/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
 /* { dg-skip-if "" keeps_null_pointer_checks } */
 
 // See pr19476-5.C for a version without including .
@@ -12,5 +12,5 @@ int g(){
   return 42 + (0 == new int[50]);
 }
 
-/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
-/* { dg-final { scan-tree-dump-not "return 33" "ccp1" } } */
+/* { dg-final { scan-tree-dump "return 42" "dom2" } } */
+/* { dg-final { scan-tree-dump-not "return 33" "dom2" } } */
--- gcc/testsuite/g++.dg/tree-ssa/pr19476-5.C.jj2015-05-29 
15:04:33.038803430 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr19476-5.C   2018-07-24 11:39:26.190913802 
+0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-ccp1 -fdelete-null-pointer-checks" } */
+/* { dg-options "-O -fdump-tree-dom2 -fdelete-null-pointer-checks" } */
 /* { dg-skip-if "" keeps_null_pointer_checks } */
 
 // See pr19476-1.C for a version that includes .
@@ -8,4 +8,4 @@ int g(){
   return 42 + (0 == new int[50]);
 }
 
-/* { dg-final { scan-tree-dump "return 42" "ccp1" } } */
+/* { dg-final { scan-tree-dump "return 42" "dom2" } } */

Jakub


[11/46] Pass back a stmt_vec_info from vect_is_simple_use

2018-07-24 Thread Richard Sandiford
This patch makes vect_is_simple_use pass back a stmt_vec_info to
those callers that want it.  Most users only need the stmt_vec_info
but some need the gimple stmt too.

It's probably high time we added a class to represent "simple operands"
instead, but I have a separate series that tries to clean up how
operands are handled (with a view to allowing mixed vector sizes).


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (vect_is_simple_use): Add an optional
stmt_vec_info * parameter before the optional gimple **.
* tree-vect-stmts.c (vect_is_simple_use): Likewise.
(process_use, vect_get_vec_def_for_operand_1): Update callers.
(vect_get_vec_def_for_operand, vectorizable_shift): Likewise.
* tree-vect-loop.c (vectorizable_reduction): Likewise.
(vectorizable_live_operation): Likewise.
* tree-vect-patterns.c (type_conversion_p): Likewise.
(vect_look_through_possible_promotion): Likewise.
(vect_recog_rotate_pattern): Likewise.
* tree-vect-slp.c (vect_get_and_check_slp_defs): Likewise.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h   2018-07-24 10:22:33.829278607 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:37.257248166 +0100
@@ -1532,9 +1532,10 @@ extern tree get_mask_type_for_scalar_typ
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_get_loop_mask_type (loop_vec_info);
 extern bool vect_is_simple_use (tree, vec_info *, enum vect_def_type *,
-   gimple ** = NULL);
+   stmt_vec_info * = NULL, gimple ** = NULL);
 extern bool vect_is_simple_use (tree, vec_info *, enum vect_def_type *,
-   tree *, gimple ** = NULL);
+   tree *, stmt_vec_info * = NULL,
+   gimple ** = NULL);
 extern bool supportable_widening_operation (enum tree_code, gimple *, tree,
tree, enum tree_code *,
enum tree_code *, int *,
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c   2018-07-24 10:22:33.829278607 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:22:37.257248166 +0100
@@ -459,11 +459,9 @@ process_use (gimple *stmt, tree use, loo
 enum vect_relevant relevant, vec *worklist,
 bool force)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
   stmt_vec_info dstmt_vinfo;
   basic_block bb, def_bb;
-  gimple *def_stmt;
   enum vect_def_type dt;
 
   /* case 1: we are only interested in uses that need to be vectorized.  Uses
@@ -471,7 +469,7 @@ process_use (gimple *stmt, tree use, loo
   if (!force && !exist_non_indexing_operands_for_use_p (use, stmt))
  return true;
 
-  if (!vect_is_simple_use (use, loop_vinfo, &dt, &def_stmt))
+  if (!vect_is_simple_use (use, loop_vinfo, &dt, &dstmt_vinfo))
 {
   if (dump_enabled_p ())
 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -479,27 +477,20 @@ process_use (gimple *stmt, tree use, loo
   return false;
 }
 
-  if (!def_stmt || gimple_nop_p (def_stmt))
+  if (!dstmt_vinfo)
 return true;
 
-  def_bb = gimple_bb (def_stmt);
-  if (!flow_bb_inside_loop_p (loop, def_bb))
-{
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_NOTE, vect_location, "def_stmt is out of loop.\n");
-  return true;
-}
+  def_bb = gimple_bb (dstmt_vinfo->stmt);
 
-  /* case 2: A reduction phi (STMT) defined by a reduction stmt (DEF_STMT).
- DEF_STMT must have already been processed, because this should be the
+  /* case 2: A reduction phi (STMT) defined by a reduction stmt (DSTMT_VINFO).
+ DSTMT_VINFO must have already been processed, because this should be the
  only way that STMT, which is a reduction-phi, was put in the worklist,
- as there should be no other uses for DEF_STMT in the loop.  So we just
+ as there should be no other uses for DSTMT_VINFO in the loop.  So we just
  check that everything is as expected, and we are done.  */
-  dstmt_vinfo = vinfo_for_stmt (def_stmt);
   bb = gimple_bb (stmt);
   if (gimple_code (stmt) == GIMPLE_PHI
   && STMT_VINFO_DEF_TYPE (stmt_vinfo) == vect_reduction_def
-  && gimple_code (def_stmt) != GIMPLE_PHI
+  && gimple_code (dstmt_vinfo->stmt) != GIMPLE_PHI
   && STMT_VINFO_DEF_TYPE (dstmt_vinfo) == vect_reduction_def
   && bb->loop_father == def_bb->loop_father)
 {
@@ -514,7 +505,7 @@ process_use (gimple *stmt, tree use, loo
 
   /* case 3a: outer-loop stmt defining an inner-loop stmt:
outer-loop-header-bb:
-   d = def_stmt
+   d = dstmt_vinfo
inner-loop:
stmt # use (d)
outer-loop-tail-bb:
@@ -554,7 +545,7 @@ process_use (gimple *stmt, tree 

[10/46] Temporarily make stmt_vec_info a class

2018-07-24 Thread Richard Sandiford
This patch turns stmt_vec_info into an unspeakably bad wrapper class
and adds an implicit conversion to the associated gimple stmt.
Having this conversion makes the rest of the series easier to write,
but since the class goes away again at the end of the series, I've
not bothered adding any comments or tried to make it pretty.


2018-07-24  Richard Sandiford  

gcc/
* tree-vectorizer.h (stmt_vec_info): Temporarily change from
a typedef to a wrapper class.
(NULL_STMT_VEC_INFO): New macro.
(vec_info::stmt_infos): Change to vec.
(stmt_vec_info::operator*): New function.
(stmt_vec_info::operator gimple *): Likewise.
(set_vinfo_for_stmt): Use NULL_STMT_VEC_INFO.
(add_stmt_costs): Likewise.
* tree-vect-loop-manip.c (iv_phi_p): Likewise.
* tree-vect-loop.c (vect_compute_single_scalar_iteration_cost)
(vect_get_known_peeling_cost): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vect-patterns.c (vect_init_pattern_stmt): Likewise.
* tree-vect-slp.c (vect_remove_slp_scalar_calls): Likewise.
* tree-vect-stmts.c (vect_build_gather_load_calls): Likewise.
(vectorizable_store, free_stmt_vec_infos): Likewise.
(new_stmt_vec_info): Change return type of xcalloc to
_stmt_vec_info *.

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h   2018-07-24 10:22:30.401309046 +0100
+++ gcc/tree-vectorizer.h   2018-07-24 10:22:33.829278607 +0100
@@ -21,12 +21,31 @@ Software Foundation; either version 3, o
 #ifndef GCC_TREE_VECTORIZER_H
 #define GCC_TREE_VECTORIZER_H
 
+class stmt_vec_info {
+public:
+  stmt_vec_info () {}
+  stmt_vec_info (struct _stmt_vec_info *ptr) : m_ptr (ptr) {}
+  struct _stmt_vec_info *operator-> () const { return m_ptr; }
+  struct _stmt_vec_info &operator* () const;
+  operator struct _stmt_vec_info * () const { return m_ptr; }
+  operator gimple * () const;
+  operator void * () const { return m_ptr; }
+  operator bool () const { return m_ptr; }
+  bool operator == (const stmt_vec_info &x) { return x.m_ptr == m_ptr; }
+  bool operator == (_stmt_vec_info *x) { return x == m_ptr; }
+  bool operator != (const stmt_vec_info &x) { return x.m_ptr != m_ptr; }
+  bool operator != (_stmt_vec_info *x) { return x != m_ptr; }
+
+private:
+  struct _stmt_vec_info *m_ptr;
+};
+
+#define NULL_STMT_VEC_INFO (stmt_vec_info (NULL))
+
 #include "tree-data-ref.h"
 #include "tree-hash-traits.h"
 #include "target.h"
 
-typedef struct _stmt_vec_info *stmt_vec_info;
-
 /* Used for naming of new temporaries.  */
 enum vect_var_kind {
   vect_simple_var,
@@ -229,7 +248,7 @@ struct vec_info {
   vec_info_shared *shared;
 
   /* The mapping of GIMPLE UID to stmt_vec_info.  */
-  vec stmt_vec_infos;
+  vec stmt_vec_infos;
 
   /* All SLP instances.  */
   auto_vec slp_instances;
@@ -1052,6 +1071,17 @@ #define VECT_SCALAR_BOOLEAN_TYPE_P(TYPE)
&& TYPE_PRECISION (TYPE) == 1   \
&& TYPE_UNSIGNED (TYPE)))
 
+inline _stmt_vec_info &
+stmt_vec_info::operator* () const
+{
+  return *m_ptr;
+}
+
+inline stmt_vec_info::operator gimple * () const
+{
+  return m_ptr ? m_ptr->stmt : NULL;
+}
+
 extern vec *stmt_vec_info_vec;
 
 void set_stmt_vec_info_vec (vec *);
@@ -1084,7 +1114,7 @@ set_vinfo_for_stmt (gimple *stmt, stmt_v
 }
   else
 {
-  gcc_checking_assert (info == NULL);
+  gcc_checking_assert (info == NULL_STMT_VEC_INFO);
   (*stmt_vec_info_vec)[uid - 1] = info;
 }
 }
@@ -1261,7 +1291,9 @@ add_stmt_costs (void *data, stmt_vector_
   unsigned i;
   FOR_EACH_VEC_ELT (*cost_vec, i, cost)
 add_stmt_cost (data, cost->count, cost->kind,
-  cost->stmt ? vinfo_for_stmt (cost->stmt) : NULL,
+  (cost->stmt
+   ? vinfo_for_stmt (cost->stmt)
+   : NULL_STMT_VEC_INFO),
   cost->misalign, cost->where);
 }
 
Index: gcc/tree-vect-loop-manip.c
===================================================================
--- gcc/tree-vect-loop-manip.c  2018-06-30 14:56:22.022893750 +0100
+++ gcc/tree-vect-loop-manip.c  2018-07-24 10:22:33.821278677 +0100
@@ -1344,7 +1344,7 @@ iv_phi_p (gphi *phi)
 return false;
 
   stmt_vec_info stmt_info = vinfo_for_stmt (phi);
-  gcc_assert (stmt_info != NULL);
+  gcc_assert (stmt_info != NULL_STMT_VEC_INFO);
   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
   || STMT_VINFO_DEF_TYPE (stmt_info) == vect_double_reduction_def)
 return false;
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c2018-07-24 10:22:30.401309046 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:22:33.821278677 +0100
@@ -1139,7 +1139,7 @@ vect_compute_single_scalar_iteration_cos
j, si)
 {
   struct _stmt_vec_info *stmt_info
-   = si->stmt ? vinfo_for_stmt (si->stmt) : NULL;
+

[29/46] Use stmt_vec_info instead of gimple stmts internally (part 2)

2018-07-24 Thread Richard Sandiford
This second part handles the less mechanical cases, i.e. those that don't
just involve swapping a gimple stmt for an existing stmt_vec_info.


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (vect_analyze_loop_operations): Look up the
statement before passing it to vect_analyze_stmt.
(vect_create_epilog_for_reduction): Use a stmt_vec_info to walk
the chain of phi vector definitions.  Track the exit phi via its
stmt_vec_info.
(vectorizable_reduction): Set cond_stmt_vinfo directly from the
STMT_VINFO_REDUC_DEF.
* tree-vect-slp.c (vect_get_place_in_interleaving_chain): Use
stmt_vec_infos to handle the statement chains.
(vect_get_slp_defs): Record the first statement in the node
using a stmt_vec_info.
* tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Look up
statements here and pass their stmt_vec_info down to subroutines.
(vect_init_vector_1): Hoist call to vinfo_for_stmt and pass it
down to vect_finish_stmt_generation.
(vect_init_vector, vect_get_vec_defs, vect_finish_replace_stmt)
(vect_finish_stmt_generation): Call vinfo_for_stmt and pass
stmt_vec_infos to subroutines.
(vect_remove_stores): Use stmt_vec_infos to handle the statement
chains.

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:23:35.376732054 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:23:38.964700191 +0100
@@ -1629,8 +1629,9 @@ vect_analyze_loop_operations (loop_vec_i
 {
  gimple *stmt = gsi_stmt (si);
  if (!gimple_clobber_p (stmt)
- && !vect_analyze_stmt (stmt, &need_to_vectorize, NULL, NULL,
-&cost_vec))
+ && !vect_analyze_stmt (loop_vinfo->lookup_stmt (stmt),
+&need_to_vectorize,
+NULL, NULL, &cost_vec))
return false;
 }
 } /* bbs */
@@ -4832,11 +4833,11 @@ vect_create_epilog_for_reduction (vec<tree>
+  stmt_vec_info next_phi_info = loop_vinfo->lookup_stmt (new_phis[0]);
   for (int k = 1; k < ncopies; ++k)
{
- next_phi = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (next_phi));
- tree second_vect = PHI_RESULT (next_phi);
+ next_phi_info = STMT_VINFO_RELATED_STMT (next_phi_info);
+ tree second_vect = PHI_RESULT (next_phi_info->stmt);
   tree tem = make_ssa_name (vec_dest, new_vec_stmt);
   new_vec_stmt = gimple_build_assign (tem, code,
  first_vect, second_vect);
@@ -5573,11 +5574,12 @@ vect_create_epilog_for_reduction (vec<tree>
+ phi_info = loop_vinfo->lookup_stmt (new_phis[k / ratio]);
  reduction_phi_info = reduction_phis[k / ratio];
  if (double_reduc)
inner_phi = inner_phis[k / ratio];
@@ -5623,8 +5625,7 @@ vect_create_epilog_for_reduction (vec<tree>
 vect_get_slp_defs (vec<tree> ops, slp_tree slp_node,
    vec<vec<tree> > *vec_oprnds)
 {
-  gimple *first_stmt;
   int number_of_vects = 0, i;
   unsigned int child_index = 0;
   HOST_WIDE_INT lhs_size_unit, rhs_size_unit;
@@ -3586,7 +3587,7 @@ vect_get_slp_defs (vec<tree> ops, slp_tr
   tree oprnd;
   bool vectorized_defs;
 
-  first_stmt = SLP_TREE_SCALAR_STMTS (slp_node)[0];
+  stmt_vec_info first_stmt_info = SLP_TREE_SCALAR_STMTS (slp_node)[0];
   FOR_EACH_VEC_ELT (ops, i, oprnd)
 {
   /* For each operand we check if it has vectorized definitions in a child
@@ -3637,8 +3638,8 @@ vect_get_slp_defs (vec<tree> ops, slp_tr
  vect_schedule_slp_instance (), fix it by replacing LHS with
  RHS, if necessary.  See vect_get_smallest_scalar_type () for
  details.  */
-  vect_get_smallest_scalar_type (first_stmt, _size_unit,
- _size_unit);
+ vect_get_smallest_scalar_type (first_stmt_info, _size_unit,
+_size_unit);
   if (rhs_size_unit != lhs_size_unit)
 {
   number_of_vects *= rhs_size_unit;
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:23:35.384731983 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:23:38.968700155 +0100
@@ -622,7 +622,6 @@ vect_mark_stmts_to_be_vectorized (loop_v
   unsigned int i;
   stmt_vec_info stmt_vinfo;
   basic_block bb;
-  gimple *phi;
   bool live_p;
   enum vect_relevant relevant;
 
@@ -636,27 +635,27 @@ vect_mark_stmts_to_be_vectorized (loop_v
   bb = bbs[i];
   for (si = gsi_start_phis (bb); !gsi_end_p (si); gsi_next (&si))
{
- phi = gsi_stmt (si);
+ stmt_vec_info phi_info = loop_vinfo->lookup_stmt (gsi_stmt (si));
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location, "init: phi relevant? ");
- dump_gimple_stmt (MSG_NOTE, TDF_SLIM, phi, 0);

[28/46] Use stmt_vec_info instead of gimple stmts internally (part 1)

2018-07-24 Thread Richard Sandiford
This first part makes functions use stmt_vec_infos instead of
gimple stmts in cases where the stmt_vec_info was already available
and where the change is mechanical.  Most of it is just replacing
"stmt" with "stmt_info".


2018-07-24  Richard Sandiford  

gcc/
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences):
(vect_check_gather_scatter, vect_create_data_ref_ptr, bump_vector_ptr)
(vect_permute_store_chain, vect_setup_realignment)
(vect_permute_load_chain, vect_shift_permute_load_chain)
(vect_transform_grouped_load): Use stmt_vec_info rather than gimple
stmts internally, and when passing values to other vectorizer routines.
* tree-vect-loop-manip.c (vect_can_advance_ivs_p): Likewise.
* tree-vect-loop.c (vect_analyze_scalar_cycles_1)
(vect_analyze_loop_operations, get_initial_def_for_reduction)
(vect_create_epilog_for_reduction, vectorize_fold_left_reduction)
(vectorizable_reduction, vectorizable_induction)
(vectorizable_live_operation, vect_transform_loop_stmt)
(vect_transform_loop): Likewise.
* tree-vect-patterns.c (vect_reassociating_reduction_p)
(vect_recog_widen_op_pattern, vect_recog_mixed_size_cond_pattern)
(vect_recog_bool_pattern, vect_recog_gather_scatter_pattern): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
(vect_slp_analyze_node_operations_1): Likewise.
* tree-vect-stmts.c (vect_mark_relevant, process_use)
(exist_non_indexing_operands_for_use_p, vect_init_vector_1)
(vect_mark_stmts_to_be_vectorized, vect_get_vec_def_for_operand)
(vect_finish_stmt_generation_1, get_group_load_store_type)
(get_load_store_type, vect_build_gather_load_calls)
(vectorizable_bswap, vectorizable_call, vectorizable_simd_clone_call)
(vect_create_vectorized_demotion_stmts, vectorizable_conversion)
(vectorizable_assignment, vectorizable_shift, vectorizable_operation)
(vectorizable_store, vectorizable_load, vectorizable_condition)
(vectorizable_comparison, vect_analyze_stmt, vect_transform_stmt)
(supportable_widening_operation): Likewise.
(vect_get_vector_types_for_stmt): Likewise.
* tree-vectorizer.h (vect_dr_behavior): Likewise.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   2018-07-24 10:23:31.736764378 +0100
+++ gcc/tree-vect-data-refs.c   2018-07-24 10:23:35.376732054 +0100
@@ -712,7 +712,7 @@ vect_slp_analyze_node_dependences (slp_i
 been sunk to (and we verify if we can do that as well).  */
  if (gimple_visited_p (stmt))
{
- if (stmt != last_store)
+ if (stmt_info != last_store)
continue;
  unsigned i;
  stmt_vec_info store_info;
@@ -3666,7 +3666,7 @@ vect_check_gather_scatter (gimple *stmt,
 
   /* See whether this is already a call to a gather/scatter internal function.
  If not, see whether it's a masked load or store.  */
-  gcall *call = dyn_cast <gcall *> (stmt);
+  gcall *call = dyn_cast <gcall *> (stmt_info->stmt);
   if (call && gimple_call_internal_p (call))
 {
   ifn = gimple_call_internal_fn (call);
@@ -4677,8 +4677,8 @@ vect_create_data_ref_ptr (gimple *stmt,
   if (loop_vinfo)
 {
   loop = LOOP_VINFO_LOOP (loop_vinfo);
-  nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt);
-  containing_loop = (gimple_bb (stmt))->loop_father;
+  nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt_info);
+  containing_loop = (gimple_bb (stmt_info->stmt))->loop_father;
   pe = loop_preheader_edge (loop);
 }
   else
@@ -4786,7 +4786,7 @@ vect_create_data_ref_ptr (gimple *stmt,
 
   /* Create: (&(base[init_val+offset]+byte_offset) in the loop preheader.  */
 
-  new_temp = vect_create_addr_base_for_vector_ref (stmt, &new_stmt_list,
+  new_temp = vect_create_addr_base_for_vector_ref (stmt_info, &new_stmt_list,
   offset, byte_offset);
   if (new_stmt_list)
 {
@@ -4934,7 +4934,7 @@ bump_vector_ptr (tree dataref_ptr, gimpl
 new_dataref_ptr = make_ssa_name (TREE_TYPE (dataref_ptr));
   incr_stmt = gimple_build_assign (new_dataref_ptr, POINTER_PLUS_EXPR,
   dataref_ptr, update);
-  vect_finish_stmt_generation (stmt, incr_stmt, gsi);
+  vect_finish_stmt_generation (stmt_info, incr_stmt, gsi);
 
   /* Copy the points-to information if it exists. */
   if (DR_PTR_INFO (dr))
@@ -5282,7 +5282,7 @@ vect_permute_store_chain (vec<tree> dr_c
  data_ref = make_temp_ssa_name (vectype, NULL, "vect_shuffle3_low");
  perm_stmt = gimple_build_assign (data_ref, VEC_PERM_EXPR, vect1,
   vect2, perm3_mask_low);
- vect_finish_stmt_generation (stmt, perm_stmt, gsi);
+ vect_finish_stmt_generation (stmt_info, 

[30/46] Use stmt_vec_infos rather than gimple stmts for worklists

2018-07-24 Thread Richard Sandiford
2018-07-24  Richard Sandiford  

gcc/
* tree-vect-loop.c (vect_analyze_scalar_cycles_1): Change the type
of the worklist from a vector of gimple stmts to a vector of
stmt_vec_infos.
* tree-vect-stmts.c (vect_mark_relevant, process_use)
(vect_mark_stmts_to_be_vectorized): Likewise

Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c2018-07-24 10:23:38.964700191 +0100
+++ gcc/tree-vect-loop.c2018-07-24 10:23:42.472669038 +0100
@@ -474,7 +474,7 @@ vect_analyze_scalar_cycles_1 (loop_vec_i
 {
   basic_block bb = loop->header;
   tree init, step;
-  auto_vec<gimple *, 64> worklist;
+  auto_vec<stmt_vec_info, 64> worklist;
   gphi_iterator gsi;
   bool double_reduc;
 
@@ -543,9 +543,9 @@ vect_analyze_scalar_cycles_1 (loop_vec_i
   /* Second - identify all reductions and nested cycles.  */
   while (worklist.length () > 0)
 {
-  gimple *phi = worklist.pop ();
+  stmt_vec_info stmt_vinfo = worklist.pop ();
+  gphi *phi = as_a <gphi *> (stmt_vinfo->stmt);
   tree def = PHI_RESULT (phi);
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (phi);
 
   if (dump_enabled_p ())
 {
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   2018-07-24 10:23:38.968700155 +0100
+++ gcc/tree-vect-stmts.c   2018-07-24 10:23:42.472669038 +0100
@@ -194,7 +194,7 @@ vect_clobber_variable (gimple *stmt, gim
Mark STMT as "relevant for vectorization" and add it to WORKLIST.  */
 
 static void
-vect_mark_relevant (vec<gimple *> *worklist, gimple *stmt,
+vect_mark_relevant (vec<stmt_vec_info> *worklist, gimple *stmt,
enum vect_relevant relevant, bool live_p)
 {
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
@@ -453,7 +453,7 @@ exist_non_indexing_operands_for_use_p (t
 
 static bool
 process_use (gimple *stmt, tree use, loop_vec_info loop_vinfo,
-enum vect_relevant relevant, vec<gimple *> *worklist,
+enum vect_relevant relevant, vec<stmt_vec_info> *worklist,
 bool force)
 {
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
@@ -618,16 +618,14 @@ vect_mark_stmts_to_be_vectorized (loop_v
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
   unsigned int nbbs = loop->num_nodes;
   gimple_stmt_iterator si;
-  gimple *stmt;
   unsigned int i;
-  stmt_vec_info stmt_vinfo;
   basic_block bb;
   bool live_p;
   enum vect_relevant relevant;
 
   DUMP_VECT_SCOPE ("vect_mark_stmts_to_be_vectorized");
 
-  auto_vec<gimple *, 64> worklist;
+  auto_vec<stmt_vec_info, 64> worklist;
 
   /* 1. Init worklist.  */
   for (i = 0; i < nbbs; i++)
@@ -665,17 +663,17 @@ vect_mark_stmts_to_be_vectorized (loop_v
   use_operand_p use_p;
   ssa_op_iter iter;
 
-  stmt = worklist.pop ();
+  stmt_vec_info stmt_vinfo = worklist.pop ();
   if (dump_enabled_p ())
{
-  dump_printf_loc (MSG_NOTE, vect_location, "worklist: examine stmt: ");
-  dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt, 0);
+ dump_printf_loc (MSG_NOTE, vect_location,
+  "worklist: examine stmt: ");
+ dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt_vinfo->stmt, 0);
}
 
   /* Examine the USEs of STMT. For each USE, mark the stmt that defines it
 (DEF_STMT) as relevant/irrelevant according to the relevance property
 of STMT.  */
-  stmt_vinfo = vinfo_for_stmt (stmt);
   relevant = STMT_VINFO_RELEVANT (stmt_vinfo);
 
   /* Generally, the relevance property of STMT (in STMT_VINFO_RELEVANT) is


[PATCH 02/11] [nvptx] Rename worker_bcast variables to oacc_bcast.

2018-07-24 Thread cesar
From: Cesar Philippidis 

Eventually, we want the nvptx BE to use a common shared memory buffer
for both worker and vector state propagation (albeit using different
partitions of shared memory for each logical thread). This patch
renames the worker_bcast variables into a more generic oacc_bcast.

2018-XX-YY  Cesar Philippidis  

gcc/
* config/nvptx/nvptx.c (worker_bcast_size): Rename as
oacc_bcast_size.
(worker_bcast_align): Rename as oacc_bcast_align.
(worker_bcast_sym): Rename as oacc_bcast_sym.
(nvptx_option_override): Update usage of oacc_bcast_*.
(struct wcast_data_t): Rename as broadcast_data_t.
(nvptx_gen_wcast): Update type of data argument and usage of
oacc_bcast_align.
(wprop_gen): Update type of data_ and usage of oacc_bcast_align.
(nvptx_wpropagate): Update type of data and usage of
oacc_bcast_{sym,size}.
(nvptx_single): Update type of data and usage of oacc_bcast_size.
(nvptx_file_end): Update usage of oacc_bcast_{sym,align,size}.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 521f83e..fb3e0c7 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -129,14 +129,15 @@ struct tree_hasher : ggc_cache_ptr_hash
 static GTY((cache)) hash_table *declared_fndecls_htab;
 static GTY((cache)) hash_table *needed_fndecls_htab;
 
-/* Buffer needed to broadcast across workers.  This is used for both
-   worker-neutering and worker broadcasting.  It is shared by all
-   functions emitted.  The buffer is placed in shared memory.  It'd be
-   nice if PTX supported common blocks, because then this could be
-   shared across TUs (taking the largest size).  */
-static unsigned worker_bcast_size;
-static unsigned worker_bcast_align;
-static GTY(()) rtx worker_bcast_sym;
+/* Buffer needed to broadcast across workers and vectors.  This is
+   used for both worker-neutering and worker broadcasting, and
+   vector-neutering and broadcasting when vector_length > 32.  It is
+   shared by all functions emitted.  The buffer is placed in shared
+   memory.  It'd be nice if PTX supported common blocks, because then
+   this could be shared across TUs (taking the largest size).  */
+static unsigned oacc_bcast_size;
+static unsigned oacc_bcast_align;
+static GTY(()) rtx oacc_bcast_sym;
 
 /* Buffer needed for worker reductions.  This has to be distinct from
the worker broadcast array, as both may be live concurrently.  */
@@ -209,9 +210,9 @@ nvptx_option_override (void)
   declared_libfuncs_htab
 = hash_table::create_ggc (17);
 
-  worker_bcast_sym = gen_rtx_SYMBOL_REF (Pmode, "__worker_bcast");
-  SET_SYMBOL_DATA_AREA (worker_bcast_sym, DATA_AREA_SHARED);
-  worker_bcast_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+  oacc_bcast_sym = gen_rtx_SYMBOL_REF (Pmode, "__oacc_bcast");
+  SET_SYMBOL_DATA_AREA (oacc_bcast_sym, DATA_AREA_SHARED);
+  oacc_bcast_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
 
   worker_red_sym = gen_rtx_SYMBOL_REF (Pmode, "__worker_red");
   SET_SYMBOL_DATA_AREA (worker_red_sym, DATA_AREA_SHARED);
@@ -1756,7 +1757,7 @@ nvptx_gen_vcast (rtx reg)
 
 /* Structure used when generating a worker-level spill or fill.  */
 
-struct wcast_data_t
+struct broadcast_data_t
 {
   rtx base;  /* Register holding base addr of buffer.  */
   rtx ptr;  /* Iteration var,  if needed.  */
@@ -1780,7 +1781,7 @@ enum propagate_mask
how many loop iterations will be executed (0 for not a loop).  */

 static rtx
-nvptx_gen_wcast (rtx reg, propagate_mask pm, unsigned rep, wcast_data_t *data)
+nvptx_gen_wcast (rtx reg, propagate_mask pm, unsigned rep, broadcast_data_t *data)
 {
   rtx  res;
   machine_mode mode = GET_MODE (reg);
@@ -1810,8 +1811,8 @@ nvptx_gen_wcast (rtx reg, propagate_mask pm, unsigned 
rep, wcast_data_t *data)
  {
unsigned align = GET_MODE_ALIGNMENT (mode) / BITS_PER_UNIT;
 
-   if (align > worker_bcast_align)
- worker_bcast_align = align;
+   if (align > oacc_bcast_align)
+ oacc_bcast_align = align;
data->offset = (data->offset + align - 1) & ~(align - 1);
addr = data->base;
if (data->offset)
@@ -3916,15 +3917,15 @@ nvptx_vpropagate (bool is_call, basic_block block, 
rtx_insn *insn)
 static rtx
 wprop_gen (rtx reg, propagate_mask pm, unsigned rep, void *data_)
 {
-  wcast_data_t *data = (wcast_data_t *)data_;
+  broadcast_data_t *data = (broadcast_data_t *)data_;
 
   if (pm & PM_loop_begin)
 {
   /* Starting a loop, initialize pointer.  */
   unsigned align = GET_MODE_ALIGNMENT (GET_MODE (reg)) / BITS_PER_UNIT;
 
-  if (align > worker_bcast_align)
-   worker_bcast_align = align;
+  if (align > oacc_bcast_align)
+   oacc_bcast_align = align;
   data->offset = (data->offset + align - 1) & ~(align - 1);
 
   data->ptr = gen_reg_rtx (Pmode);
@@ -3949,7 +3950,7 @@ wprop_gen (rtx reg, propagate_mask pm, 

[PATCH 00/11] [nvptx] Initial vector length changes

2018-07-24 Thread cesar
From: Cesar Philippidis 

This patch series contains various cleanups and structural
reorganizations to the NVPTX BE in preparation for the forthcoming
variable vector length enhancements. Tom, in order to make
these changes easier for you to review, I broke these patches into
logical components. If approved for trunk, would you like to see these
patches committed individually, or all together in a single huge
commit?

One notable change in this patch set is the partial inclusion of the
PTX_DEFAULT_RUNTIME_DIM change that I previously placed with the
libgomp default geometry update patch that I posted a couple of weeks
ago. I don't want to block this patch series so I included the nvptx
changes in patch 01.

Is this OK for trunk? I regtested both standalone and offloading
compilers. I'm seeing some inconsistencies in the standalone compiler
results, so I might rerun those just to be safe. But the results using
nvptx as an offloading compiler came back clean.

Thanks,
Cesar


[PATCH 01/11] [nvptx] Update openacc dim macros

2018-07-24 Thread cesar
From: Cesar Philippidis 

Besides for updating the macros for the NVPTX OpenACC dims, this patch
also renames PTX_GANG_DEFAULT to PTX_DEFAULT_RUNTIME_DIM. I had
originally included the PTX_GANG_DEFAULT hunk in an earlier libgomp
patch, but going forward it makes sense to isolate the nvptx and
libgomp changes when possible, which this patch series does.

2018-XX-YY  Cesar Philippidis  

gcc/
* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Rename to
PTX_DEFAULT_RUNTIME_DIM.
(PTX_VECTOR_LENGTH, PTX_WORKER_LENGTH,
PTX_DEFAULT_RUNTIME_DIM): Move to the top of the file.
(PTX_WARP_SIZE): Define.
(PTX_CTA_SIZE): Define.
(nvptx_simt_vf): Return PTX_WARP_SIZE instead of PTX_VECTOR_LENGTH.
(nvptx_goacc_validate_dims): Use PTX_DEFAULT_RUNTIME_DIM.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 5608bee..521f83e 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -81,6 +81,13 @@
 #define WORKAROUND_PTXJIT_BUG_2 1
 #define WORKAROUND_PTXJIT_BUG_3 1
 
+/* Define dimension sizes for known hardware.  */
+#define PTX_VECTOR_LENGTH 32
+#define PTX_WORKER_LENGTH 32
+#define PTX_DEFAULT_RUNTIME_DIM 0 /* Defer to runtime.  */
+#define PTX_WARP_SIZE 32
+#define PTX_CTA_SIZE 1024
+
 /* The various PTX memory areas an object might reside in.  */
 enum nvptx_data_area
 {
@@ -5161,18 +5168,13 @@ nvptx_expand_builtin (tree exp, rtx target, rtx 
ARG_UNUSED (subtarget),
 default: gcc_unreachable ();
 }
 }
-
-/* Define dimension sizes for known hardware.  */
-#define PTX_VECTOR_LENGTH 32
-#define PTX_WORKER_LENGTH 32
-#define PTX_GANG_DEFAULT  0 /* Defer to runtime.  */
 
 /* Implement TARGET_SIMT_VF target hook: number of threads in a warp.  */
 
 static int
 nvptx_simt_vf ()
 {
-  return PTX_VECTOR_LENGTH;
+  return PTX_WARP_SIZE;
 }
 
 /* Validate compute dimensions of an OpenACC offload or routine, fill
@@ -5216,7 +5218,7 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int 
fn_level)
   if (dims[GOMP_DIM_WORKER] < 0)
dims[GOMP_DIM_WORKER] = PTX_WORKER_LENGTH;
   if (dims[GOMP_DIM_GANG] < 0)
-   dims[GOMP_DIM_GANG] = PTX_GANG_DEFAULT;
+   dims[GOMP_DIM_GANG] = PTX_DEFAULT_RUNTIME_DIM;
   changed = true;
 }
 
-- 
2.7.4



[PATCH 03/11] [nvptx] Consolidate offloaded function attributes into struct offload_attrs

2018-07-24 Thread cesar
From: Cesar Philippidis 

This patch introduces a new struct offload_attrs, which contains the
details regarding the offload function launch geometry. In addition to
its current usage to neuter worker and vector threads, it will
eventually be used to validate the compile-time launch geometry
requested by the user.

2018-XX-YY  Cesar Philippidis  

gcc/
* config/nvptx/nvptx.c (struct offload_attrs): New.
(populate_offload_attrs): New function.
(nvptx_reorg): Use it to extract partitioning mask.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index fb3e0c7..1b83b3c 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -2871,6 +2871,17 @@ nvptx_reorg_uniform_simt ()
 }
 }
 
+/* Offloading function attributes.  */
+
+struct offload_attrs
+{
+  unsigned mask;
+  int num_gangs;
+  int num_workers;
+  int vector_length;
+  int max_workers;
+};
+
 /* Loop structure of the function.  The entire function is described as
a NULL loop.  */
 
@@ -4568,6 +4579,56 @@ nvptx_neuter_pars (parallel *par, unsigned modes, 
unsigned outer)
 nvptx_neuter_pars (par->next, modes, outer);
 }
 
+static void
+populate_offload_attrs (offload_attrs *oa)
+{
+  tree attr = oacc_get_fn_attrib (current_function_decl);
+  tree dims = TREE_VALUE (attr);
+  unsigned ix;
+
+  oa->mask = 0;
+
+  for (ix = 0; ix != GOMP_DIM_MAX; ix++, dims = TREE_CHAIN (dims))
+{
+  tree t = TREE_VALUE (dims);
+  int size = (t == NULL_TREE) ? 0 : TREE_INT_CST_LOW (t);
+  tree allowed = TREE_PURPOSE (dims);
+
+  if (size != 1 && !(allowed && integer_zerop (allowed)))
+   oa->mask |= GOMP_DIM_MASK (ix);
+
+  switch (ix)
+   {
+   case GOMP_DIM_GANG:
+ oa->num_gangs = size;
+ break;
+
+   case GOMP_DIM_WORKER:
+ oa->num_workers = size;
+ break;
+
+   case GOMP_DIM_VECTOR:
+ oa->vector_length = size;
+ break;
+   }
+}
+
+  if (oa->vector_length == 0)
+{
+  /* FIXME: Need a more graceful way to handle large vector
+lengths in OpenACC routines.  */
+  if (!lookup_attribute ("omp target entrypoint",
+DECL_ATTRIBUTES (current_function_decl)))
+   oa->vector_length = PTX_WARP_SIZE;
+  else
+   oa->vector_length = PTX_VECTOR_LENGTH;
+}
+  if (oa->num_workers == 0)
+oa->max_workers = PTX_CTA_SIZE / oa->vector_length;
+  else
+oa->max_workers = oa->num_workers;
+}
+
 #if WORKAROUND_PTXJIT_BUG_2
 /* Variant of pc_set that only requires JUMP_P (INSN) if STRICT.  This variant
is needed in the nvptx target because the branches generated for
@@ -4749,27 +4810,19 @@ nvptx_reorg (void)
 {
   /* If we determined this mask before RTL expansion, we could
 elide emission of some levels of forks and joins.  */
-  unsigned mask = 0;
-  tree dims = TREE_VALUE (attr);
-  unsigned ix;
+  offload_attrs oa;
 
-  for (ix = 0; ix != GOMP_DIM_MAX; ix++, dims = TREE_CHAIN (dims))
-   {
- int size = TREE_INT_CST_LOW (TREE_VALUE (dims));
- tree allowed = TREE_PURPOSE (dims);
+  populate_offload_attrs (&oa);
 
- if (size != 1 && !(allowed && integer_zerop (allowed)))
-   mask |= GOMP_DIM_MASK (ix);
-   }
   /* If there is worker neutering, there must be vector
 neutering.  Otherwise the hardware will fail.  */
-  gcc_assert (!(mask & GOMP_DIM_MASK (GOMP_DIM_WORKER))
- || (mask & GOMP_DIM_MASK (GOMP_DIM_VECTOR)));
+  gcc_assert (!(oa.mask & GOMP_DIM_MASK (GOMP_DIM_WORKER))
+ || (oa.mask & GOMP_DIM_MASK (GOMP_DIM_VECTOR)));
 
   /* Discover & process partitioned regions.  */
   parallel *pars = nvptx_discover_pars (&bb_insn_map);
   nvptx_process_pars (pars);
-  nvptx_neuter_pars (pars, mask, 0);
+  nvptx_neuter_pars (pars, oa.mask, 0);
   delete pars;
 }
 
-- 
2.7.4



[PATCH 04/11] [nvptx] Make nvptx state propagation function names more generic

2018-07-24 Thread cesar
From: Cesar Philippidis 

This patch renames various state propagation functions into something
that reflects their usage in generic worker and vector contexts. E.g.,
whereas before nvptx_wpropagate used to be used exclusively for worker
state propagation, it will eventually be used for any state
propagation using multiple warps. Because variable length vectors will
be able to use both shared memory and warp shuffles for broadcasting, the
old vector-specific functions now contain 'warp' in their name.

2018-XX-YY  Cesar Philippidis  

gcc/
* config/nvptx/nvptx.c (nvptx_gen_vcast): Rename as
nvptx_gen_warp_bcast.
(nvptx_gen_wcast): Rename to nvptx_gen_shared_bcast, add bool
vector argument, and update call to nvptx_gen_shared_bcast.
(propagator_fn): Add bool argument.
(nvptx_propagate): New bool argument, pass bool argument to fn.
(vprop_gen): Rename to warp_prop_gen, update call to
nvptx_gen_warp_bcast.
(nvptx_vpropagate): Rename to nvptx_warp_propagate, update call to
nvptx_propagate.
(wprop_gen): Rename to shared_prop_gen, update call to
nvptx_gen_shared_bcast.
(nvptx_wpropagate): Rename to nvptx_shared_propagate, update call
to nvptx_propagate.
(nvptx_wsync): Rename to nvptx_cta_sync.
(nvptx_single): Update calls to nvptx_gen_warp_bcast,
nvptx_gen_shared_bcast and nvptx_cta_sync.
(nvptx_process_pars): Likewise.
(write_worker_buffer): Rename as write_shared_buffer.
(nvptx_file_end): Update calls to write_shared_buffer.
(nvptx_expand_worker_addr): Rename as nvptx_expand_shared_addr.
(nvptx_expand_builtin): Update call to nvptx_expand_shared_addr.
(nvptx_get_worker_red_addr): Rename as nvptx_get_shared_red_addr.
(nvptx_goacc_reduction_setup): Update call to
nvptx_get_shared_red_addr.
(nvptx_goacc_reduction_fini): Likewise.
(nvptx_goacc_reduction_teardown): Likewise.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 1b83b3c..447425f 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -1750,7 +1750,7 @@ nvptx_gen_shuffle (rtx dst, rtx src, rtx idx, 
nvptx_shuffle_kind kind)
across the vectors of a single warp.  */
 
 static rtx
-nvptx_gen_vcast (rtx reg)
+nvptx_gen_warp_bcast (rtx reg)
 {
   return nvptx_gen_shuffle (reg, reg, const0_rtx, SHUFFLE_IDX);
 }
@@ -1781,7 +1781,8 @@ enum propagate_mask
how many loop iterations will be executed (0 for not a loop).  */

 static rtx
-nvptx_gen_wcast (rtx reg, propagate_mask pm, unsigned rep, broadcast_data_t 
*data)
+nvptx_gen_shared_bcast (rtx reg, propagate_mask pm, unsigned rep,
+   broadcast_data_t *data, bool vector)
 {
   rtx  res;
   machine_mode mode = GET_MODE (reg);
@@ -1795,7 +1796,7 @@ nvptx_gen_wcast (rtx reg, propagate_mask pm, unsigned 
rep, broadcast_data_t *dat
start_sequence ();
if (pm & PM_read)
  emit_insn (gen_sel_truesi (tmp, reg, GEN_INT (1), const0_rtx));
-   emit_insn (nvptx_gen_wcast (tmp, pm, rep, data));
+   emit_insn (nvptx_gen_shared_bcast (tmp, pm, rep, data, vector));
if (pm & PM_write)
  emit_insn (gen_rtx_SET (reg, gen_rtx_NE (BImode, tmp, const0_rtx)));
res = get_insns ();
@@ -1815,6 +1816,7 @@ nvptx_gen_wcast (rtx reg, propagate_mask pm, unsigned 
rep, broadcast_data_t *dat
  oacc_bcast_align = align;
data->offset = (data->offset + align - 1) & ~(align - 1);
addr = data->base;
+   gcc_assert (data->base != NULL);
if (data->offset)
  addr = gen_rtx_PLUS (Pmode, addr, GEN_INT (data->offset));
  }
@@ -3816,11 +3818,11 @@ nvptx_find_sese (auto_vec<basic_block> &blocks, bb_pair_vec_t &map)
regions and (b) only propagating stack entries that are used.  The
latter might be quite hard to determine.  */
 
-typedef rtx (*propagator_fn) (rtx, propagate_mask, unsigned, void *);
+typedef rtx (*propagator_fn) (rtx, propagate_mask, unsigned, void *, bool);
 
 static bool
 nvptx_propagate (bool is_call, basic_block block, rtx_insn *insn,
-propagate_mask rw, propagator_fn fn, void *data)
+propagate_mask rw, propagator_fn fn, void *data, bool vector)
 {
   bitmap live = DF_LIVE_IN (block);
   bitmap_iterator iterator;
@@ -3855,7 +3857,7 @@ nvptx_propagate (bool is_call, basic_block block, 
rtx_insn *insn,
  
  emit_insn (gen_rtx_SET (idx, GEN_INT (fs)));
  /* Allow worker function to initialize anything needed.  */
- rtx init = fn (tmp, PM_loop_begin, fs, data);
+ rtx init = fn (tmp, PM_loop_begin, fs, data, vector);
  if (init)
emit_insn (init);
  emit_label (label);
@@ -3864,7 +3866,7 @@ nvptx_propagate (bool is_call, basic_block block, 
rtx_insn *insn,
}
   if (rw & PM_read)
emit_insn (gen_rtx_SET (tmp, gen_rtx_MEM (DImode, 

Re: [RFC 1/3, debug] Add fdebug-nops

2018-07-24 Thread Tom de Vries
On 07/24/2018 09:06 PM, Alexandre Oliva wrote:
> On Jul 24, 2018, Tom de Vries  wrote:
> 
>> There's a design principle in GCC that code generation and debug generation
>> are independent.  This guarantees that if you're encountering a problem in an
>> application without debug info, you can recompile it with -g and be certain
>> that you can reproduce the same problem, and use the debug info to debug the
>> problem.  This invariant is enforced by bootstrap-debug.  The fdebug-nops
>> breaks this invariant
> 
> I thought of a way to not break it: enable the debug info generation
> machinery, including VTA and SFN, but discard those only at the very end
> if -g is not enabled.  The downside is that it would likely slow -Og
> down significantly, but who uses it without -g anyway?

I thought of the same.  I've submitted a patch here that uses SFN:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01391.html . VTA is not
needed AFAIU.

Thanks,
- Tom


[PATCH] Add initial version of C++17 header

2018-07-24 Thread Jonathan Wakely

This is missing the synchronized_pool_resource and
unsynchronized_pool_resource classes but is otherwise complete.

This is a new implementation, not based on the existing code in
, but memory_resource and
polymorphic_allocator ended up looking almost the same anyway.

The constant_init kluge in src/c++17/memory_resource.cc is apparently
due to Richard Smith and ensures that the objects are constructed during
constant initialization phase and not destroyed (because the
constant_init destructor doesn't destroy the union member and the
storage is not reused).

* config/abi/pre/gnu.ver: Export new symbols.
* configure: Regenerate.
* include/Makefile.am: Add new  header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include  for C++17.
* include/std/memory_resource: New header.
(memory_resource, polymorphic_allocator, new_delete_resource)
(null_memory_resource, set_default_resource, get_default_resource)
(pool_options, monotonic_buffer_resource): Define.
* src/Makefile.am: Add c++17 directory.
* src/Makefile.in: Regenerate.
* src/c++11/Makefile.am: Fix comment.
* src/c++17/Makefile.am: Add makefile for new sub-directory.
* src/c++17/Makefile.in: Generate.
* src/c++17/memory_resource.cc: New.
(newdel_res_t, null_res_t, constant_init, newdel_res, null_res)
(default_res, new_delete_resource, null_memory_resource)
(set_default_resource, get_default_resource): Define.
* testsuite/20_util/memory_resource/1.cc: New test.
* testsuite/20_util/memory_resource/2.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/1.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/allocate.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/deallocate.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/release.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/upstream_resource.cc:
New test.
* testsuite/20_util/polymorphic_allocator/1.cc: New test.
* testsuite/20_util/polymorphic_allocator/resource.cc: New test.
* testsuite/20_util/polymorphic_allocator/select.cc: New test.
* testsuite/util/testsuite_allocator.h (__gnu_test::memory_resource):
Define concrete memory resource for testing.
(__gnu_test::default_resource_mgr): Define RAII helper for changing
default resource.

Tested powerpc64le-linux, committed to trunk.



commit 05c7ae80dbd59fcef5d583eac15181afbc07a116
Author: Jonathan Wakely 
Date:   Tue Jul 24 14:19:19 2018 +0100

Add initial version of C++17  header

This is missing the synchronized_pool_resource and
unsynchronized_pool_resource classes but is otherwise complete.

This is a new implementation, not based on the existing code in
, but memory_resource and
polymorphic_allocator ended up looking almost the same anyway.

The constant_init kluge in src/c++17/memory_resource.cc is apparently
due to Richard Smith and ensures that the objects are constructed during
constant initialization phase and not destroyed (because the
constant_init destructor doesn't destroy the union member and the
storage is not reused).

* config/abi/pre/gnu.ver: Export new symbols.
* configure: Regenerate.
* include/Makefile.am: Add new  header.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Include  for C++17.
* include/std/memory_resource: New header.
(memory_resource, polymorphic_allocator, new_delete_resource)
(null_memory_resource, set_default_resource, get_default_resource)
(pool_options, monotonic_buffer_resource): Define.
* src/Makefile.am: Add c++17 directory.
* src/Makefile.in: Regenerate.
* src/c++11/Makefile.am: Fix comment.
* src/c++17/Makefile.am: Add makefile for new sub-directory.
* src/c++17/Makefile.in: Generate.
* src/c++17/memory_resource.cc: New.
(newdel_res_t, null_res_t, constant_init, newdel_res, null_res)
(default_res, new_delete_resource, null_memory_resource)
(set_default_resource, get_default_resource): Define.
* testsuite/20_util/memory_resource/1.cc: New test.
* testsuite/20_util/memory_resource/2.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/1.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/allocate.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/deallocate.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/release.cc: New test.
* testsuite/20_util/monotonic_buffer_resource/upstream_resource.cc:
New test.
* testsuite/20_util/polymorphic_allocator/1.cc: New 

Re: [PATCH] combine: Allow combining two insns to two insns

2018-07-24 Thread Jeff Law
On 07/24/2018 11:18 AM, Segher Boessenkool wrote:
> This patch allows combine to combine two insns into two.  This helps
> in many cases, by reducing instruction path length, and also allowing
> further combinations to happen.  PR85160 is a typical example of code
> that it can improve.
> 
> This patch does not allow such combinations if either of the original
> instructions was a simple move instruction.  In those cases combining
> the two instructions increases register pressure without improving the
> code.  With this move test, register pressure no longer increases
> noticeably as far as I can tell.
> 
> (At first I also didn't allow either of the resulting insns to be a
> move instruction.  But that is actually a very good thing to have, as
> should have been obvious).
> 
> Tested for many months; tested on about 30 targets.
> 
> I'll commit this later this week if there are no objections.
> 
> 
> Segher
> 
> 
> 2018-07-24  Segher Boessenkool  
> 
>   PR rtl-optimization/85160
>   * combine.c (is_just_move): New function.
>   (try_combine): Allow combining two instructions into two if neither of
>   the original instructions was a move.
I've had several instances where a 2->2 combination would be useful
through the years.  I didn't save any of those examples though...  Good
to see the limitation being addressed.

jeff


Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Jeff Law
On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
> On 07/24/2018 09:45 AM, Jeff Law wrote:
>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
>>> 2018-07-23  Jojo  
>>>  Huibin Wang  
>>>  Sandra Loosemore  
>>>  Chung-Lin Tang  
>>>
>>>  C-SKY port: Backend implementation
>>>
>>>  gcc/
>>>  * config/csky/*: New.
>>>  * common/config/csky/*: New.
>>
>> Let's avoid gratuitous whitespace that attempts to line up conditionals.
>> As an example, look at the predicate csky_load_multiple_operation.  I
>> think just doing a quick pass over the .c, .h and main .md files should
>> be sufficient here.
> 
> OK, will do.
> 
>> I'm not a big fan of more awk code, but I'm not going to object to it :-)
>>
>> Why does the port have its own little pass for condition code
>> optimization (cse_cc)?  What is it doing that can't be done with our
>> generic optimizers?
> 
> This pass was included in the initial patch set we got from C-SKY, and
> as it didn't seem to break anything I left it in.  Perhaps C-SKY can
> provide a testcase that demonstrates why it's still useful in the
> current version of GCC; otherwise we can remove this from the initial
> port submission and restore it later if some performance analysis shows
> it is still worthwhile.
FWIW it looks like we model CC setting on just a few insns, (add,
subtract) so I'd be surprised if this little mini pass found much.  I'd
definitely like to hear from the csky authors here.

Alternatively, you could add some instrumentation to flag when it
triggers, take a test or two that trigger it, reduce them, and we can then
look at the key RTL sequences and see what the pass is really doing.

> 
>> Any thoughts on using the newer function descriptor bits rather than old
>> style stack trampolines?
> 
> Has that been committed?  I vaguely remembered discussion of a new way
> to handle nested functions without using the trampoline interface, but I
> couldn't find any documentation in the internals manual.
It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)
ports that define it.



> 
>> I don't see anything terribly concerning in the core of the port.  The
>> amount of support code for minipool is huge and I wonder if some sharing
>> across the various ports would be possible, but I don't think that
>> should be a blocking issue for this port.
> 
> Yes, that code was clearly copied almost verbatim from the ARM backend.
> I left it alone as much as possible to simplify any future attempts at
> genericizing it.
Understood -- I'd assumed it was largely copied from ARM, but hadn't
gone back to the ARM bits to verify.

> 
>> Can you update the backends.html web page here appropriately for the
>> c-sky target?
> 
> Sure, I can take care of updating that when the port is committed.  I
> believe the right entry is
> 
> "csky  b   ia"
Yea, that seems right to me.

Jeff


Re: [PATCH] Make strlen range computations more conservative

2018-07-24 Thread Jeff Law
On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
> Hi!
> 
> This patch makes strlen range computations more conservative.
> 
> Firstly, if there is a visible type cast from type A to B before passing
> the value to strlen, don't expect the type layout of B to restrict the
> possible return value range of strlen.
Why do you think this is the right thing to do?  ie, is there language
in the standards that makes you think the code as it stands today is
incorrect from a conformance standpoint?  Is there a significant body of
code that is affected in an adverse way by the current code?  If so,
what code?



> 
> Furthermore use the outermost enclosing array instead of the
> innermost one, because too aggressive optimization will likely
> convert harmless errors into security-relevant errors, because
> as the existing test cases demonstrate, this optimization is actively
> attacking string length checks in user code, while not giving
> any warnings.
Same questions here.

I'll also note that Martin is *very* aware of the desire to avoid
introducing security relevent errors.  In fact his main focus is to help
identify coding errors that have a security impact.  So please don't
characterize his work as "actively attacking string length checks in
user code".

Ultimately we want highly accurate string lengths to help improve the
quality of the warnings we generate for potentially dangerous code.
These changes seem to take us in the opposite direction.

So ISTM that you really need a stronger justification using the
standards compliance and/or real world code that is made less safe by
keeping string lengths as accurate as possible.


> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
I'd like to ask we hold on this until I return from PTO (Aug 1) so that
we can discuss the best thing to do here for each class of change.

I think you, Martin, Richi and myself should hash through the technical
issues raised by the patch.  Obviously others can chime in, but I think
the 4 of us probably need to drive the discussion.

Thanks,
Jeff


Re: [Patch] [Aarch64] PR 86538 - Define __ARM_FEATURE_LSE if LSE is available

2018-07-24 Thread Steve Ellcey
On Tue, 2018-07-24 at 22:04 +0100, James Greenhalgh wrote:
> 
> 
> I'd say this patch isn't desirable for trunk. I'd be interested in use cases
> that need a static decision on presence of LSE that are not better expressed
> using higher level language features.
> 
> Thanks,
> James

How about when building the higher level features?  Right now,
in sysdeps/aarch64/atomic-machine.h, we
hardcode ATOMIC_EXCHANGE_USES_CAS to 0.  If we had __ARM_FEATURE_LSE we
could use that to determine if we wanted to set
ATOMIC_EXCHANGE_USES_CAS to 0 or 1, which would affect the call
generated in nptl/pthread_spin_lock.c.  That would be useful if we
built a libpthread specifically for a platform that had LSE.

Steve Ellcey
sell...@cavium.com



Re: [PATCH] fix a couple of bugs in const string folding (PR 86532)

2018-07-24 Thread Jeff Law
On 07/24/2018 02:16 PM, Martin Sebor wrote:
> On 07/20/2018 04:20 AM, Richard Biener wrote:
>> On Thu, 19 Jul 2018, Martin Sebor wrote:
>>
>>> Here's one more update with tweaks addressing a couple more
>>> of Bernd's comments:
>>>
>>> 1) correct the use of TREE_STRING_LENGTH() where a number of
>>> array elements is expected and not bytes
>>> 2) set CHARTYPE as soon as it's first determined rather than
>>> trying to extract it again later
TREE_STRING_LENGTH is *really* poorly named.  It practically invites misuse.

[ Snip ]


> 
> 
> gcc-86532.diff
> 
> 
> PR tree-optimization/86622 - incorrect strlen of array of array plus variable 
> offset
> PR tree-optimization/86532 - Wrong code due to a wrong strlen folding 
> starting with r262522
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/86622
>   PR tree-optimization/86532
>   * builtins.h (string_length): Declare.
>   * builtins.c (c_strlen): Correct handling of non-constant offsets.  
>   (check_access): Be prepared for non-constant length ranges.
>   (string_length): Make extern.
>   * expr.c (string_constant): Only handle the minor non-constant
>   array index.  Use string_constant to compute the length of
>   a generic string constant.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/86622
>   PR tree-optimization/86532
>   * gcc.c-torture/execute/strlen-2.c: New test.
>   * gcc.c-torture/execute/strlen-3.c: New test.
>   * gcc.c-torture/execute/strlen-4.c: New test.
> 
OK
jeff


Re: [PATCH] Make strlen range computations more conservative

2018-07-24 Thread Bernd Edlinger
On 07/24/18 23:46, Jeff Law wrote:
> On 07/24/2018 01:59 AM, Bernd Edlinger wrote:
>> Hi!
>>
>> This patch makes strlen range computations more conservative.
>>
>> Firstly, if there is a visible type cast from type A to B before passing
>> the value to strlen, don't expect the type layout of B to restrict the
>> possible return value range of strlen.
> Why do you think this is the right thing to do?  ie, is there language
> in the standards that makes you think the code as it stands today is
> incorrect from a conformance standpoint?  Is there a significant body of
> code that is affected in an adverse way by the current code?  If so,
> what code?
> 
> 

I think if you have an object of effective type A, say char[100], then
you can cast its address to B, say typedef char (*B)[2] for instance,
and then to const char *, say for use in strlen.  I may be wrong, but I
think we should at least try not to pick up char[2] from B, and instead
use A for strlen ranges, or leave the range open.  Currently the range
info for strlen is [0..1] in this case, even though we see the type cast
in the generic tree.

One other example I have found in one of the test cases:

char c;

if (strlen(&c) != 0) abort();

this is now completely elided, but why?  Is there a code base where
that is used?  I doubt it, but why do we care to eliminate something
stupid like that?  If we emitted a warning for it I would be fine with
that, but if we silently remove code like that I don't think it
will improve anything.  So I ask, where is the code base that
gets an improvement from that optimization?



> 
>>
>> Furthermore use the outermost enclosing array instead of the
>> innermost one, because too aggressive optimization will likely
>> convert harmless errors into security-relevant errors, because
>> as the existing test cases demonstrate, this optimization is actively
>> attacking string length checks in user code, while not giving
>> any warnings.
> Same questions here.
> 
> I'll also note that Martin is *very* aware of the desire to avoid
> introducing security relevent errors.  In fact his main focus is to help
> identify coding errors that have a security impact.  So please don't
> characterize his work as "actively attacking string length checks in
> user code".
> 

I do fully respect Martin's valuable contributions over the years,
and I did not intend to say anything about the quality of his work
for GCC; it is just breathtaking!

What I meant is just, what this particular optimization can do.

> Ultimately we want highly accurate string lengths to help improve the
> quality of the warnings we generate for potentially dangerous code.
> These changes seem to take us in the opposite direction.
> 

No, I don't think so; we have full control over the direction.  When
I do what Richi requested in his response, we will have one function
that the string length estimation is based upon, instead of several
open-coded tree walks.

> So ISTM that you really need a stronger justification using the
> standards compliance and/or real world code that is made less safe by
> keeping string lengths as accurate as possible.
> 
> 

This work concentrates mostly on avoiding interference with code that
actually deserves warnings but is not being warned about.

>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> I'd like to ask we hold on this until I return from PTO (Aug 1) so that
> we can discuss the best thing to do here for each class of change.
> 

Okay.

> I think you, Martin, Richi and myself should hash through the technical
> issues raised by the patch.  Obviously others can chime in, but I think
> the 4 of us probably need to drive the discussion.
> 

Yes, sure.  I will try to help when I can.

Currently, I thought Martin was working on the string constant folding
(therefore I assumed this range patch would not collide with his patch),
and there are plenty of change requests, plus I think he has some more
patches on hold.  I would like to see the review comments resolved,
and maybe also see the follow-up patches, perhaps as a patch
series, so we can get a clearer picture.


Thanks
Bernd.

> Thanks,
> Jeff
> 


[PATCH 11/11] [nvptx] Generalize state propagation and synchronization

2018-07-24 Thread cesar
From: Tom de Vries 

As the title mentions, this patch generalizes the state propagation and
synchronization code. Note that while the patch makes reference to
large_vectors, they are not enabled in nvptx_goacc_validate_dims.
Therefore, only the worker case is exercised in this patch.

2018-XX-YY  Tom de Vries  
Cesar Philippidis  

gcc/
* config/nvptx/nvptx.c (oacc_bcast_partition): Declare.
(nvptx_option_override): Init oacc_bcast_partition.
(nvptx_init_oacc_workers): New function.
(nvptx_declare_function_name): Call nvptx_init_oacc_workers.
(nvptx_needs_shared_bcast): New function.
(nvptx_find_par): Generalize to enable vectors to use shared-memory
to propagate state.
(nvptx_shared_propagate): Initialize vector bcast partition and
synchronization state.
(nvptx_single):  Generalize to enable vectors to use shared-memory
to propagate state.
(nvptx_process_pars): Likewise.
(nvptx_set_current_function): Initialize oacc_broadcast_partition.
* config/nvptx/nvptx.h (struct machine_function): Add
bcast_partition and sync_bar members.

(cherry picked from openacc-gcc-7-branch commit
628f439f33ed6f689656a1ed8ff74db97e7ec3ed, and commit
293e415e04d6b407e59118253e5fdfe539000cfe)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 7d49b4f..abd47ac 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -136,6 +136,7 @@ static GTY((cache)) hash_table 
*needed_fndecls_htab;
memory.  It'd be nice if PTX supported common blocks, because then
this could be shared across TUs (taking the largest size).  */
 static unsigned oacc_bcast_size;
+static unsigned oacc_bcast_partition;
 static unsigned oacc_bcast_align;
 static GTY(()) rtx oacc_bcast_sym;
 
@@ -154,6 +155,8 @@ static bool need_softstack_decl;
 /* True if any function references __nvptx_uni.  */
 static bool need_unisimt_decl;
 
+static int nvptx_mach_max_workers ();
+
 /* Allocate a new, cleared machine_function structure.  */
 
 static struct machine_function *
@@ -213,6 +216,7 @@ nvptx_option_override (void)
   oacc_bcast_sym = gen_rtx_SYMBOL_REF (Pmode, "__oacc_bcast");
   SET_SYMBOL_DATA_AREA (oacc_bcast_sym, DATA_AREA_SHARED);
   oacc_bcast_align = GET_MODE_ALIGNMENT (SImode) / BITS_PER_UNIT;
+  oacc_bcast_partition = 0;
 
   worker_red_sym = gen_rtx_SYMBOL_REF (Pmode, "__worker_red");
   SET_SYMBOL_DATA_AREA (worker_red_sym, DATA_AREA_SHARED);
@@ -1101,6 +1105,40 @@ nvptx_init_axis_predicate (FILE *file, int regno, const 
char *name)
   fprintf (file, "\t}\n");
 }
 
+/* Emit code to initialize OpenACC worker broadcast and synchronization
+   registers.  */
+
+static void
+nvptx_init_oacc_workers (FILE *file)
+{
+  fprintf (file, "\t{\n");
+  fprintf (file, "\t\t.reg.u32\t%%tidy;\n");
+  if (cfun->machine->bcast_partition)
+{
+  fprintf (file, "\t\t.reg.u64\t%%t_bcast;\n");
+  fprintf (file, "\t\t.reg.u64\t%%y64;\n");
+}
+  fprintf (file, "\t\tmov.u32\t\t%%tidy, %%tid.y;\n");
+  if (cfun->machine->bcast_partition)
+{
+  fprintf (file, "\t\tcvt.u64.u32\t%%y64, %%tidy;\n");
+  fprintf (file, "\t\tadd.u64\t\t%%y64, %%y64, 1; // vector ID\n");
+  fprintf (file, "\t\tcvta.shared.u64\t%%t_bcast, __oacc_bcast;\n");
+  fprintf (file, "\t\tmad.lo.u64\t%%r%d, %%y64, %d, %%t_bcast; "
+  "// vector broadcast offset\n",
+  REGNO (cfun->machine->bcast_partition),
+  oacc_bcast_partition);
+}
+  /* Verify oacc_bcast_size.  */
+  gcc_assert (oacc_bcast_partition * (nvptx_mach_max_workers () + 1)
+ <= oacc_bcast_size);
+  if (cfun->machine->sync_bar)
+fprintf (file, "\t\tadd.u32\t\t%%r%d, %%tidy, 1; "
+"// vector synchronization barrier\n",
+REGNO (cfun->machine->sync_bar));
+  fprintf (file, "\t}\n");
+}
+
 /* Emit code to initialize predicate and master lane index registers for
-muniform-simt code generation variant.  */
 
@@ -1327,6 +1365,8 @@ nvptx_declare_function_name (FILE *file, const char 
*name, const_tree decl)
   if (cfun->machine->unisimt_predicate
   || (cfun->machine->has_simtreg && !crtl->is_leaf))
 nvptx_init_unisimt_predicate (file);
+  if (cfun->machine->bcast_partition || cfun->machine->sync_bar)
+nvptx_init_oacc_workers (file);
 }
 
 /* Output code for switching uniform-simt state.  ENTERING indicates whether
@@ -3045,6 +3085,19 @@ nvptx_split_blocks (bb_insn_map_t *map)
 }
 }
 
+/* Return true if MASK contains parallelism that requires shared
+   memory to broadcast.  */
+
+static bool
+nvptx_needs_shared_bcast (unsigned mask)
+{
+  bool worker = mask & GOMP_DIM_MASK (GOMP_DIM_WORKER);
+  bool large_vector = (mask & GOMP_DIM_MASK (GOMP_DIM_VECTOR))
+&& nvptx_mach_vector_length () != PTX_WARP_SIZE;
+
+  return worker || large_vector;
+}
+
 /* BLOCK is a basic block containing a head or tail instruction.
Locate the associated prehead 

[PATCH] PR libstdc++/86658 fix __niter_wrap to not copy invalid iterators

2018-07-24 Thread Jonathan Wakely

An output iterator passed as the unused first argument to __niter_wrap
might have already been invalidated, so don't copy it.

PR libstdc++/86658
* include/bits/stl_algobase.h (__niter_wrap<_Iterator>): Pass unused
parameter by reference, to avoid copying invalid iterators.
* testsuite/25_algorithms/copy/86658.cc: New test.

Tested powerpc64le-linux, committed to trunk.

commit e2dcaf432a8fa6e8e9d96b03003ece28fa3f1ca6
Author: Jonathan Wakely 
Date:   Tue Jul 24 21:33:54 2018 +0100

PR libstdc++/86658 fix __niter_wrap to not copy invalid iterators

An output iterator passed as the unused first argument to __niter_wrap
might have already been invalidated, so don't copy it.

PR libstdc++/86658
* include/bits/stl_algobase.h (__niter_wrap<_Iterator>): Pass unused
parameter by reference, to avoid copying invalid iterators.
* testsuite/25_algorithms/copy/86658.cc: New test.

diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index f0130bc4123..b1ecd83ddb7 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -288,7 +288,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // No need to wrap, iterator already has the right type.
   template
 inline _Iterator
-__niter_wrap(_Iterator, _Iterator __res)
+__niter_wrap(const _Iterator&, _Iterator __res)
 { return __res; }
 
   // All of these auxiliary structs serve two purposes.  (1) Replace
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/86658.cc 
b/libstdc++-v3/testsuite/25_algorithms/copy/86658.cc
new file mode 100644
index 000..600747a823d
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/86658.cc
@@ -0,0 +1,36 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run }
+
+#define _GLIBCXX_DEBUG
+#include 
+#include 
+
+void
+test01()
+{
+  int i[1] = { 1 };
+  std::vector<int> v(1);
+  std::copy(i, i+1, std::inserter(v, v.end()));
+}
+
+int
+main()
+{
+  test01();
+}


Re: [Patch] [Aarch64] PR 86538 - Define __ARM_FEATURE_LSE if LSE is available

2018-07-24 Thread James Greenhalgh
On Tue, Jul 24, 2018 at 03:22:02PM -0500, Steve Ellcey wrote:
> This is a patch for PR 86538, to define an __ARM_FEATURE_LSE macro
> when LSE is available.  Richard Earnshaw closed PR 86538 as WONTFIX
> because the ACLE (Arm C Language Extension) does not require this
> macro and because he is concerned that it might encourage people to
> use inline assembly instead of the __sync and atomic intrinsics.
> (See actual comments in the defect report.)
> 
> While I agree that we want people to use the intrinsics I still think
> there are use cases where people may want to know if LSE is available
> or not and there is currently no (simple) way to determine if this feature
> is available since it can be turned on and off independently of the
> architecture used.  Also, as a general principle, I think any feature
> that can be toggled on or off by the compiler should provide a way for
> users to determine what its state is.

Well, we blow that design principle all over the place (find me a macro
which tells you whether AARCH64_EXTRA_TUNE_SLOW_UNALIGNED_LDPW is on for
example :-) )

A better design principle would be that if we think language programmers
may want to compile in different C code depending on a compiler option, we
should consider adding a feature macro.

> So what do other ARM maintainers and users think?  Is this a useful
> feature to have in GCC?

I'm with Richard on this one.

Whether LSE is available or not at compile time, the best user strategy is
to use the C11/C++11 atomic extensions. That's where the memory model is
well defined, well reasoned about, and well implemented.

Purely in ACLE we're not keen on providing macros that don't provide choice
to a C level programmer (i.e. change the presence of intrinsics).

You could well imagine an inline asm programmer wanting to choose between an
LSE path and an Armv8.0-A path; but I can't imagine what they would want to
do on that path that couldn't be expressed better in the C language. You
might say they want to validate presence of the instruction; but that will
need to be a dynamic check outside of ACLE anyway.

All of which is to say, I don't think that this is a necessary macro.  Each
time I've seen it requested by a user, we've told them the same thing: what
do you want to express here that isn't better expressed by C atomic
primitives?

I'd say this patch isn't desirable for trunk. I'd be interested in use cases
that need a static decision on presence of LSE that are not better expressed
using higher level language features.

Thanks,
James



Re: Fix ceil_log2(0) (PR 86644)

2018-07-24 Thread Jeff Law
On 07/24/2018 12:11 PM, Richard Sandiford wrote:
> This PR shows a pathological case in which we try SLP vectorisation on
> dead code.  We record that 0 bits of the result are enough to satisfy
> all users (which is true), and that led to precision being 0 in:
> 
> static unsigned int
> vect_element_precision (unsigned int precision)
> {
>   precision = 1 << ceil_log2 (precision);
>   return MAX (precision, BITS_PER_UNIT);
> }
> 
> ceil_log2 (0) returned 64 rather than 0, leading to 1 << 64, which is UB.
> 
> Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
> OK to install?
> 
> Richard
> 
> 
> 2018-07-24  Richard Sandiford  
> 
> gcc/
>   * hwint.c (ceil_log2): Fix comment.  Return 0 for 0.
OK
Jeff


Re: [PATCH 6/7] AArch64 - new pass to add conditional-branch speculation tracking

2018-07-24 Thread Jeff Law
On 07/23/2018 08:33 AM, Richard Earnshaw (lists) wrote:
> [sorry, missed this mail somehow]
> 
> On 11/07/18 22:01, Jeff Law wrote:
>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote:
>>> This patch is the main part of the speculation tracking code.  It adds
>>> a new target-specific pass that is run just before the final branch
>>> reorg pass (so that it can clean up any new edge insertions we make).
>>> The pass is only run with -mtrack-speculation is passed on the command
>>> line.
>>>
>>> One thing that did come to light as part of this was that the stack pointer
>>> register was not being permitted in comparison instructions.  We rely on
>>> that for moving the tracking state between SP and the scratch register at
>>> function call boundaries.
>> Note that the sp in comparison instructions issue came up with the
>> improvements to stack-clash that Tamar, Richard S. and you worked on.
>>
> 
> I can certainly lift that part into a separate patch.
Your call.  It was mostly an observation that the change was clearly
needed elsewhere.  I'm certainly comfortable letting that hunk go in
with whichever kit is approved first :-)

> 
>>
>>>
>>> * config/aarch64/aarch64-speculation.cc: New file.
>>> * config/aarch64/aarch64-passes.def (pass_track_speculation): Add before
>>> pass_reorder_blocks.
>>> * config/aarch64/aarch64-protos.h (make_pass_track_speculation): Add
>>> prototype.
>>> * config/aarch64/aarch64.c (aarch64_conditional_register_usage): Fix
>>> X14 and X15 when tracking speculation.
>>> * config/aarch64/aarch64.md (register name constants): Add
>>> SPECULATION_TRACKER_REGNUM and SPECULATION_SCRATCH_REGNUM.
>>> (unspec): Add UNSPEC_SPECULATION_TRACKER.
>>> (speculation_barrier): New insn attribute.
>>> (cmp): Allow SP in comparisons.
>>> (speculation_tracker): New insn.
>>> (speculation_barrier): Add speculation_barrier attribute.
>>> * config/aarch64/t-aarch64: Add make rule for aarch64-speculation.o.
>>> * config.gcc (aarch64*-*-*): Add aarch64-speculation.o to extra_objs.
>>> * doc/invoke.texi (AArch64 Options): Document -mtrack-speculation.
>>> ---
>>>  gcc/config.gcc|   2 +-
>>>  gcc/config/aarch64/aarch64-passes.def |   1 +
>>>  gcc/config/aarch64/aarch64-protos.h   |   3 +-
>>>  gcc/config/aarch64/aarch64-speculation.cc | 494 
>>> ++
>>>  gcc/config/aarch64/aarch64.c  |  13 +
>>>  gcc/config/aarch64/aarch64.md |  30 +-
>>>  gcc/config/aarch64/t-aarch64  |  10 +
>>>  gcc/doc/invoke.texi   |  10 +-
>>>  8 files changed, 558 insertions(+), 5 deletions(-)
>>>  create mode 100644 gcc/config/aarch64/aarch64-speculation.cc
>> Given the consensus forming about using these kind of masking
>> instructions being the preferred way to mitigate (as opposed to lfence
>> barriers and the like) I have to ask your opinions about making the bulk
>> of this a general pass rather than one specific to the aarch backend.
>> I'd hate to end up duplicating all this stuff across multiple architectures.
>>
>> I think it all looks pretty reasonable though.
>>
>> jeff
>>
> 
> 
> It would be nice to make this more generic, but I'm not sure how easy
> that would be.  Some of the analysis is surely the same, but deployment
> of the mitigation itself is perhaps more complex.  At this point in
> time, I think I'd prefer to go with the target-specific implementation
> and then look to generalize it as a follow-up.  There may be some more
> optimizations to add later as well.
ACK.  I suspect it's mostly the analysis side that we'll want to share.
I don't mind giving you the advantage of going first and letting it live
in the aarch64 backend.  Second implementation can extract the analysis
bits :-)

So IMHO, this can go forward whenever you want to push it.

Jeff



Re: [PATCH] Explain asan parameters in params.def (PR sanitizer/79635).

2018-07-24 Thread Jeff Law
On 07/24/2018 06:18 AM, Martin Liška wrote:
> Hi.
> 
> That's simple patch that improves documentation as requested
> in the PR.
> 
> Ready for trunk?
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-07-24  Martin Liska  
> 
> PR sanitizer/79635
>   * params.def: Explain ASan abbreviation and provide
> a documentation link.
> ---
>  gcc/params.def | 2 ++
>  1 file changed, 2 insertions(+)
> 
> 
OK
jeff


[PATCH 09/11] [nvptx] Use TARGET_SET_CURRENT_FUNCTION

2018-07-24 Thread cesar
From: Cesar Philippidis 

Chung-Lin had originally defined TARGET_SET_CURRENT_FUNCTION as part
of his gang-local variable patch. But I intend to introduce those
changes at a later time. Eventually the state propagation code will
utilize nvptx_set_current_function to reset the reduction buffer
offset. However, for the time being, this patch only introduces
it as a placeholder.

2018-XX-YY  Chung-Lin Tang  
Cesar Philippidis  
