[PATCH] improve ifcvt optimization (PR rtl-optimization/89430)

2019-03-14 Thread JiangNing OS
This patch is to fix a missing ifcvt opportunity in back-end. For the simple 
case below,

if (...)
x = a;  /* x is memory */
/* no else */

We can generate conditional move and remove the branch as below if the target 
cost is acceptable. 

r1 = x
r2 = a
cmp ...
csel r3, r1, r2, cond
x = r3

This could be safe if x is a stack variable, and there isn't any address taken 
in current function, so the store speculation can be avoided. 

In practice, this optimization can improve a real application performance by %4 
on aarch64.

Thanks,
-Jiangning


csel3.patch
Description: csel3.patch


Go patch committed: Eliminate bounds checks in append expression

2019-03-14 Thread Ian Lance Taylor
This patch by Cherry Zhang fixes the Go frontend to eliminate bound
checks in an append expression.  The compiler generates two array
index expressions when lowering an append expression.  Currently they
generate bound checks.  Bound checks are not necessary in this case,
as we know the slice has, or will grow to, enough length and capacity.
Eliminate them.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 269634)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-946aa5ab2e82d045a2a3b2f18ba2c5b00e957c4b
+80a7f6dae0861a06407a44c501b6346a4abd119c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 269619)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -8008,8 +8008,8 @@ Builtin_call_expression::flatten_append(
   ref = Expression::make_temporary_reference(s1tmp, loc);
   Expression* zero = Expression::make_integer_ul(0, int_type, loc);
   Expression* ref2 = Expression::make_temporary_reference(ntmp, loc);
-  // FIXME: Mark this index as not requiring bounds checks.
-  ref = Expression::make_index(ref, zero, ref2, NULL, loc);
+  ref = Expression::make_array_index(ref, zero, ref2, NULL, loc);
+  ref->array_index_expression()->set_needs_bounds_check(false);
 
   if (assign_lhs == NULL)
 {
@@ -8058,8 +8058,8 @@ Builtin_call_expression::flatten_append(
   a1 = Expression::make_temporary_reference(s1tmp, loc);
   ref = Expression::make_temporary_reference(l1tmp, loc);
   Expression* nil = Expression::make_nil(loc);
-  // FIXME: Mark this index as not requiring bounds checks.
-  a1 = Expression::make_index(a1, ref, nil, NULL, loc);
+  a1 = Expression::make_array_index(a1, ref, nil, NULL, loc);
+  a1->array_index_expression()->set_needs_bounds_check(false);
 
   a2 = Expression::make_temporary_reference(s2tmp, loc);
 
@@ -8086,9 +8086,9 @@ Builtin_call_expression::flatten_append(
  ref2 = Expression::make_temporary_reference(l1tmp, loc);
  Expression* off = Expression::make_integer_ul(i, int_type, loc);
  ref2 = Expression::make_binary(OPERATOR_PLUS, ref2, off, loc);
- // FIXME: Mark this index as not requiring bounds checks.
- Expression* lhs = Expression::make_index(ref, ref2, NULL, NULL,
-  loc);
+ Expression* lhs = Expression::make_array_index(ref, ref2, NULL,
+ NULL, loc);
+  lhs->array_index_expression()->set_needs_bounds_check(false);
  gogo->lower_expression(function, inserter, );
  gogo->flatten_expression(function, inserter, );
  // The flatten pass runs after the write barrier pass, so we
@@ -11328,15 +11328,6 @@ Array_index_expression::do_get_backend(T
   if (length == NULL)
 length = cap_arg;
 
-  int code = (array_type->length() != NULL
- ? (this->end_ == NULL
-? RUNTIME_ERROR_ARRAY_INDEX_OUT_OF_BOUNDS
-: RUNTIME_ERROR_ARRAY_SLICE_OUT_OF_BOUNDS)
- : (this->end_ == NULL
-? RUNTIME_ERROR_SLICE_INDEX_OUT_OF_BOUNDS
-: RUNTIME_ERROR_SLICE_SLICE_OUT_OF_BOUNDS));
-  Bexpression* crash = gogo->runtime_error(code, loc)->get_backend(context);
-
   if (this->start_->type()->integer_type() == NULL
   && !Type::are_convertible(int_type, this->start_->type(), NULL))
 {
@@ -11344,31 +11335,46 @@ Array_index_expression::do_get_backend(T
   return context->backend()->error_expression();
 }
 
-  Bexpression* bad_index =
-Expression::check_bounds(this->start_, loc)->get_backend(context);
-
   Bexpression* start = this->start_->get_backend(context);
   start = gogo->backend()->convert_expression(int_btype, start, loc);
-  Bexpression* start_too_large =
-gogo->backend()->binary_expression((this->end_ == NULL
-   ? OPERATOR_GE
-   : OPERATOR_GT),
-   start,
-  (this->end_ == NULL
-   ? length
-   : capacity),
-   loc);
-  bad_index = gogo->backend()->binary_expression(OPERATOR_OROR, 
start_too_large,
-bad_index, loc);
+
+  Bexpression* crash = NULL;
+  Bexpression* bad_index = NULL;
+  if (this->needs_bounds_check_)
+{
+  int code = (array_type->length() != NULL
+  ? (this->end_ == NULL
+ ? RUNTIME_ERROR_ARRAY_INDEX_OUT_OF_BOUNDS
+   

Re: [PATCH] Don't ICE on && in dwarf2out (PR debug/89704)

2019-03-14 Thread Jason Merrill

On 3/14/19 7:01 PM, Jakub Jelinek wrote:

Hi!

On the following testcase, we have tree initializers that satisfy
initializer_constant_valid_p and do:
   rtl = rtl_for_decl_init (init, type);
   if (rtl)
 return add_const_value_attribute (die, rtl);
add_const_value_attribute has a list of rtl codes it expects in the
initializers (and wants to handle only the simplest cases anyway), but
doesn't cover what can appear in these initializers, MINUS for the pointer
subtraction and the *_EXTEND for the casts.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-14  Jakub Jelinek  

PR debug/89704
* dwarf2out.c (add_const_value_attribute): Return false for MINUS,
SIGN_EXTEND and ZERO_EXTEND.


OK.

Jason


[PATCH] fix ice in attribute copy (PR 89685)

2019-03-14 Thread Martin Sebor

To copy type attributes from struct A, the copy attribute (new
in GCC 9) accepts a pointer argument such as (struct A*)0, but
it isn't prepared for anything much more complicated than that.
So for example when it's passed something like (struct A*)(0, 1)
as the test case in PR 89685 does (a P1 regression), it fails
with an ICE.

The attached patch makes this handling more robust by letting
it accept all forms of type and member references.

Tested on x86_64-linux.

Martin
PR c/89685 - ICE on attribute copy with a compound expression

gcc/c-family/ChangeLog:

	PR c/89685
	* c-attribs.c (handle_copy_attribute): Handle references and
	non-constant expressions.

gcc/testsuite/ChangeLog:

	PR c/89685
	* gcc.dg/attr-copy-8.c: New test.
	* g++.dg/ext/attr-copy-2.C: New test.

Index: gcc/c-family/c-attribs.c
===
--- gcc/c-family/c-attribs.c	(revision 269657)
+++ gcc/c-family/c-attribs.c	(working copy)
@@ -2413,15 +2413,30 @@ handle_copy_attribute (tree *node, tree name, tree
 }
 
   /* Consider address-of expressions in the attribute argument
- as requests to copy from the referenced entity.  For constant
- expressions, consider those to be requests to copy from their
- type, such as in:
+ as requests to copy from the referenced entity. */
+  if (TREE_CODE (ref) == ADDR_EXPR)
+ref = TREE_OPERAND (ref, 0);
+
+  do
+{
+  /* Drill down into references to find the referenced decl.  */
+  tree_code refcode = TREE_CODE (ref);
+  if (refcode == ARRAY_REF
+	  || refcode == INDIRECT_REF)
+	ref = TREE_OPERAND (ref, 0);
+  else if (refcode == COMPONENT_REF)
+	ref = TREE_OPERAND (ref, 1);
+  else
+	break;
+} while (!DECL_P (ref));
+
+  /* For pointer expressions, consider those to be requests to copy
+ from their type, such as in:
struct __attribute__ (copy ((struct T *)0)) U { ... };
  which copies type attributes from struct T to the declaration
  of struct U.  */
-  if (TREE_CODE (ref) == ADDR_EXPR)
-ref = TREE_OPERAND (ref, 0);
-  else if (CONSTANT_CLASS_P (ref))
+  if ((CONSTANT_CLASS_P (ref) || EXPR_P (ref))
+  && POINTER_TYPE_P (TREE_TYPE (ref)))
 ref = TREE_TYPE (ref);
 
   if (DECL_P (decl))
Index: gcc/testsuite/gcc.dg/attr-copy-8.c
===
--- gcc/testsuite/gcc.dg/attr-copy-8.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/attr-copy-8.c	(working copy)
@@ -0,0 +1,97 @@
+/* PR c/89685 - ICE on attribute copy with a compound expression
+   { dg-do compile }
+   { dg-options "-Wall -Wno-unused-value -Wno-int-to-pointer-cast" } */
+
+#define ATTR(...) __attribute__ ((__VA_ARGS__))
+
+typedef struct ATTR (packed) A { ATTR (packed) unsigned bf: 1; } A;
+
+typedef struct B
+{
+  struct A a;
+  struct A *pa;
+} B;
+
+extern struct A a;
+extern struct A *pa;
+extern B b;
+extern B ab[1];
+extern B *pb;
+
+typedef struct C
+{
+  ATTR (copy ((struct A *)0)) short m_pa_0;
+  ATTR (copy ((struct A *)(1, 0))) int m_pa_1_0;
+  ATTR (copy ((struct A *)(0, 1))) long m_pa_0_1;
+
+  ATTR (copy (*(struct A *)0)) short m_xpa_0;
+  ATTR (copy (*(struct A *)(1, 0))) int m_xpa_1_0;
+  ATTR (copy (*(struct A *)(0, 1))) long m_xpa_0_1;
+
+  ATTR (copy (((struct A *)0)[0])) short m_arpa_0;
+  ATTR (copy (((struct A *)(1, 0))[0])) int m_arpa_1_0;
+  ATTR (copy (((struct A *)(0, 1))[0])) long m_arpa_0_1;
+
+  ATTR (copy (a)) short m_ra;
+  ATTR (copy (b.a)) int m_rb_a;
+  ATTR (copy (b.pa)) long m_rb_pa;
+
+  ATTR (copy ()) short m_ara;
+  ATTR (copy ()) int m_arb_a;
+  ATTR (copy (*b.pa)) long m_xb_pa;
+  ATTR (copy (b.pa[0])) long m_arb_pa;
+  
+  ATTR (copy (*pa)) short m_xpa;
+  ATTR (copy (pa[0])) short m_arpa;
+
+  ATTR (copy (ab[0].a)) int m_arab_a;
+  ATTR (copy (ab[1].pa)) long m_arab_pa;
+  ATTR (copy (*ab[2].pa)) int m_xarab_pa;
+  ATTR (copy (ab[3].pa->bf)) unsigned int m_arab_pa_bf: 1;
+
+  ATTR (copy (pb->a)) int m_pb_a;
+  ATTR (copy (pb->pa)) long m_pb_pa;
+  ATTR (copy (*pb->pa)) int m_xpb_pa;
+  ATTR (copy (pb->pa->bf)) unsigned int m_pb_pa_bf: 1;
+
+  ATTR (aligned (4), copy ((struct A *)(0))) short m_a4_pa_0;
+} C;
+
+
+_Static_assert (__builtin_has_attribute (((C*)0)->m_pa_0, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_pa_1_0, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_pa_0_1, packed));
+
+_Static_assert (__builtin_has_attribute (((C*)0)->m_xpa_0, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_xpa_1_0, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_xpa_0_1, packed));
+
+_Static_assert (__builtin_has_attribute (((C*)0)->m_arpa_0, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_arpa_1_0, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_arpa_0_1, packed));
+
+_Static_assert (__builtin_has_attribute (((C*)0)->m_ra, packed));
+_Static_assert (__builtin_has_attribute (((C*)0)->m_rb_a, packed));
+_Static_assert 

[PATCH] [AARCH64] Improve vector generation cost model

2019-03-14 Thread apinski
From: Andrew Pinski 

Hi,
  On OcteonTX2, ld1r and ld1 (with a single lane) are split
into two different micro-ops unlike most other targets.
This adds three extra costs to the cost table:
ld1_dup: used for "ld1r {v0.4s}, [x0]"
merge_dup: used for "dup v0.4s, v0.4s[0]" and "ins v0.4s[0], v0.4s[0]"
ld1_merge: used fir "ld1 {v0.4s}[0], [x0]"

OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

ChangeLog:
* config/arm/aarch-common-protos.h (vector_cost_table):
Add merge_dup, ld1_merge, and ld1_dup.
* config/aarch64/aarch64-cost-tables.h (qdf24xx_extra_costs):
Update for the new fields.
(thunderx_extra_costs): Likewise.
(thunderx2t99_extra_costs): Likewise.
(tsv110_extra_costs): Likewise.
* config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise.
(cortexa53_extra_costs): Likewise.
(cortexa57_extra_costs): Likewise.
(exynosm1_extra_costs): Likewise.
(xgene1_extra_costs): Likewise.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle vec_dup of a memory.
Hanlde vec_merge of a memory.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64-cost-tables.h | 20 +++
 gcc/config/aarch64/aarch64.c | 22 +
 gcc/config/arm/aarch-common-protos.h |  3 +++
 gcc/config/arm/aarch-cost-tables.h   | 25 +++-
 4 files changed, 61 insertions(+), 9 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index 5c9442e1b89..9a7c70ba595 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -123,7 +123,10 @@ const struct cpu_cost_table qdf24xx_extra_costs =
   },
   /* Vector */
   {
-COSTS_N_INSNS (1)  /* alu.  */
+COSTS_N_INSNS (1),  /* Alu.  */
+COSTS_N_INSNS (1), /* dup_merge.  */
+COSTS_N_INSNS (1), /* ld1_merge.  */
+COSTS_N_INSNS (1)  /* ld1_dup.  */
   }
 };
 
@@ -227,7 +230,10 @@ const struct cpu_cost_table thunderx_extra_costs =
   },
   /* Vector */
   {
-COSTS_N_INSNS (1)  /* Alu.  */
+COSTS_N_INSNS (1), /* Alu.  */
+COSTS_N_INSNS (1), /* dup_merge.  */
+COSTS_N_INSNS (1), /* ld1_merge.  */
+COSTS_N_INSNS (1)  /* ld1_dup.  */
   }
 };
 
@@ -330,7 +336,10 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
   },
   /* Vector */
   {
-COSTS_N_INSNS (1)  /* Alu.  */
+COSTS_N_INSNS (1), /* Alu.  */
+COSTS_N_INSNS (1), /* dup_merge.  */
+COSTS_N_INSNS (1), /* ld1_merge.  */
+COSTS_N_INSNS (1)  /* ld1_dup.  */
   }
 };
 
@@ -434,7 +443,10 @@ const struct cpu_cost_table tsv110_extra_costs =
   },
   /* Vector */
   {
-COSTS_N_INSNS (1)  /* alu.  */
+COSTS_N_INSNS (1), /* Alu.  */
+COSTS_N_INSNS (1), /* dup_merge.  */
+COSTS_N_INSNS (1), /* ld1_merge.  */
+COSTS_N_INSNS (1)  /* ld1_dup.  */
   }
 };
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b38505b0872..dc4d3d39af8 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10568,6 +10568,28 @@ cost_plus:
 }
   break;
 
+case VEC_DUPLICATE:
+  if (!speed)
+   return false;
+
+  if (GET_CODE (XEXP (x, 0)) == MEM)
+   *cost += extra_cost->vect.ld1_dup;
+  else
+   *cost += extra_cost->vect.merge_dup;
+  return true;
+
+case VEC_MERGE:
+  if (speed && GET_CODE (XEXP (x, 0)) == VEC_DUPLICATE)
+   {
+ if (GET_CODE (XEXP (XEXP (x, 0), 0)) == MEM)
+   *cost += extra_cost->vect.ld1_merge;
+ else
+   *cost += extra_cost->vect.merge_dup;
+ return true;
+   }
+  break;
+
+
 case TRUNCATE:
 
   /* Decompose muldi3_highpart.  */
diff --git a/gcc/config/arm/aarch-common-protos.h 
b/gcc/config/arm/aarch-common-protos.h
index 11cd5145bbc..dbc1282402a 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -131,6 +131,9 @@ struct fp_cost_table
 struct vector_cost_table
 {
   const int alu;
+  const int merge_dup;
+  const int ld1_merge;
+  const int ld1_dup;
 };
 
 struct cpu_cost_table
diff --git a/gcc/config/arm/aarch-cost-tables.h 
b/gcc/config/arm/aarch-cost-tables.h
index bc33efadc6c..a51bc668f56 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -121,7 +121,10 @@ const struct cpu_cost_table generic_extra_costs =
   },
   /* Vector */
   {
-COSTS_N_INSNS (1)  /* alu.  */
+COSTS_N_INSNS (1),  /* alu.  */
+COSTS_N_INSNS (1), /* dup_merge.  */
+COSTS_N_INSNS (1), /* ld1_merge.  */
+COSTS_N_INSNS (1)  /* ld1_dup.  */
   }
 };
 
@@ -224,7 +227,10 @@ const struct cpu_cost_table cortexa53_extra_costs =
   },
   /* Vector */
   {
-COSTS_N_INSNS (1)  /* alu.  */
+COSTS_N_INSNS (1),  /* alu.  */
+COSTS_N_INSNS (1), /* dup_merge.  */
+COSTS_N_INSNS (1), /* ld1_merge.  */
+COSTS_N_INSNS (1)  /* ld1_dup.  */
   }
 };
 
@@ -327,7 +333,10 @@ const struct cpu_cost_table cortexa57_extra_costs =
   },
   

[PATCH] Don't ICE on && in dwarf2out (PR debug/89704)

2019-03-14 Thread Jakub Jelinek
Hi!

On the following testcase, we have tree initializers that satisfy
initializer_constant_valid_p and do:
  rtl = rtl_for_decl_init (init, type);
  if (rtl)
return add_const_value_attribute (die, rtl);
add_const_value_attribute has a list of rtl codes it expects in the
initializers (and wants to handle only the simplest cases anyway), but
doesn't cover what can appear in these initializers, MINUS for the pointer
subtraction and the *_EXTEND for the casts.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-14  Jakub Jelinek  

PR debug/89704
* dwarf2out.c (add_const_value_attribute): Return false for MINUS,
SIGN_EXTEND and ZERO_EXTEND.

* gcc.dg/debug/pr89704.c: New test.

--- gcc/dwarf2out.c.jj  2019-03-13 21:21:58.0 +0100
+++ gcc/dwarf2out.c 2019-03-14 15:31:00.180137759 +0100
@@ -19670,6 +19670,9 @@ add_const_value_attribute (dw_die_ref di
 
 case HIGH:
 case CONST_FIXED:
+case MINUS:
+case SIGN_EXTEND:
+case ZERO_EXTEND:
   return false;
 
 case MEM:
--- gcc/testsuite/gcc.dg/debug/pr89704.c.jj 2019-03-14 15:36:39.239657822 
+0100
+++ gcc/testsuite/gcc.dg/debug/pr89704.c2019-03-14 15:36:34.559733459 
+0100
@@ -0,0 +1,14 @@
+/* PR debug/89704 */
+/* { dg-do compile } */
+
+typedef __INTPTR_TYPE__ intptr_t;
+
+int
+foo (void)
+{
+  lab1:;
+  lab2:;
+  static int i = (intptr_t) & - (intptr_t) &
+  static int j = (intptr_t) & - (intptr_t) &
+  return i;
+}

Jakub


[PATCH] Strip location wrappers inside of inchash::add_expr (PR c++/89709)

2019-03-14 Thread Jakub Jelinek
Hi!

As mentioned in the PR, r267272 added STRIP_ANY_LOCATION_WRAPPERS
at the start of operand_equal_p, but has not updated inchash::add_expr
which needs to follow what operand_equal_p does, so that if two trees
compare equal, they get the same hash value.
For location wrappers at the outermost level we wouldn't even try to
verify it, as they would be stripped before checking the hashes,
and in the usual case it wouldn't make a difference because if not
OEP_ADDRESS_OF STRIP_NOPS strips also the location wrappers.
But if OEP_ADDRESS_OF is set and we have e.g. a location wrapper in one
of the operands and not in the other one (the ICE is on
>, field>>>
vs.
, field>>>
which compares equal but has different hashes), then we need to make sure
they have the same hash.

The following patch fixes that and also moves the operand_equal_p location
wrapper stripping after the hash value computation code to check the hashes
in all cases.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-03-14  Jakub Jelinek  

PR c++/89709
* tree.c (inchash::add_expr): Strip any location wrappers.
* fold-const.c (operand_equal_p): Move stripping of location wrapper
after hash verification.

* g++.dg/cpp0x/constexpr-89709.C: New test.

--- gcc/tree.c.jj   2019-03-11 22:56:45.511835239 +0100
+++ gcc/tree.c  2019-03-14 11:39:55.898838708 +0100
@@ -7743,6 +7743,8 @@ add_expr (const_tree t, inchash::hash 
   return;
 }
 
+  STRIP_ANY_LOCATION_WRAPPER (t);
+
   if (!(flags & OEP_ADDRESS_OF))
 STRIP_NOPS (t);
 
--- gcc/fold-const.c.jj 2019-03-05 10:03:09.818780210 +0100
+++ gcc/fold-const.c2019-03-14 11:47:12.117791365 +0100
@@ -2942,9 +2942,6 @@ combine_comparisons (location_t loc,
 int
 operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
 {
-  STRIP_ANY_LOCATION_WRAPPER (arg0);
-  STRIP_ANY_LOCATION_WRAPPER (arg1);
-
   /* When checking, verify at the outermost operand_equal_p call that
  if operand_equal_p returns non-zero then ARG0 and ARG1 has the same
  hash value.  */
@@ -2967,6 +2964,9 @@ operand_equal_p (const_tree arg0, const_
return 0;
 }
 
+  STRIP_ANY_LOCATION_WRAPPER (arg0);
+  STRIP_ANY_LOCATION_WRAPPER (arg1);
+
   /* If either is ERROR_MARK, they aren't equal.  */
   if (TREE_CODE (arg0) == ERROR_MARK || TREE_CODE (arg1) == ERROR_MARK
   || TREE_TYPE (arg0) == error_mark_node
--- gcc/testsuite/g++.dg/cpp0x/constexpr-89709.C.jj 2019-03-14 
11:41:28.270346813 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-89709.C2019-03-14 
11:40:59.161816938 +0100
@@ -0,0 +1,18 @@
+// PR c++/89709
+// { dg-do compile { target c++11 } }
+// { dg-options "-O" }
+
+struct A { int i; };
+A a;
+
+constexpr int *
+foo ()
+{
+  return 
+}
+
+bool
+bar ()
+{
+  return foo () == 
+}


Jakub


Re: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64

2019-03-14 Thread Steve Ellcey
Double ping.

Steve Ellcey
sell...@marvell.com

On Tue, 2019-02-26 at 08:44 -0800, Steve Ellcey wrote:
> Ping.
> 
> Steve Ellcey
> sell...@marvell.com
> 
> 
> On Mon, 2019-02-11 at 10:46 -0800, Steve Ellcey wrote:
> > On Thu, 2019-02-07 at 18:13 +, Wilco Dijkstra wrote
> > > 
> > > Hi Steve,
> > > 
> > > > > After special cases you could do something like t = mask2 +
> > > > > (HWI_1U << shift);
> > > > > return t == (t & -t) to check for a valid bfi.
> > > > 
> > > > I am not sure I follow this logic and my attempts to use this
> > > > did
> > > > not
> > > > work so I kept my original code.
> > > 
> > > It's similar to the initial code in aarch64_bitmask_imm, but
> > > rather
> > > than adding
> > > the lowest bit to the value to verify it is a mask (val + (val &
> > > -val)), we use the
> > > shift instead. If the shift is exactly right, it reaches the
> > > first
> > > set bit of the mask.
> > > Adding the low bit to a valid mask always results in zero or a
> > > single set bit.
> > > The standard idiom to check that is t == (t & -t).
> > > 
> > > > > +  "bfi\t%0, %1, 0, %P2"
> > > > > 
> > > > > This could emit a width that may be 32 too large in SImode if
> > > > > bit 31 is set
> > > > > (there is another use of %P in aarch64.md which may have the
> > > > > same
> > > > > issue).
> > > > 
> > > > I am not sure why having bit 31 set would be a problem.  Sign
> > > > extension?
> > > 
> > > Yes, if bit 31 is set, %P will emit 33 for a 32-bit constant which
> > > is obviously wrong.
> > > Your patch avoids this for bfi by explicitly computing the correct
> > > value.
> > > 
> > > This looks good to me (and creates useful bfi's as expected), but I
> > > can't approve.
> > > 
> > > Wilco
> > 
> > Thanks for looking this over.  I have updated the mask check to use
> > your method and retested to make sure it still works.  Can one of the
> > aarch64 maintainers approve this patch?
> > 
> > 2018-02-11  Steve Ellcey  
> > 
> > PR rtl-optimization/87763
> > * config/aarch64/aarch64-protos.h
> > (aarch64_masks_and_shift_for_bfi_p):
> > New prototype.
> > * config/aarch64/aarch64.c (aarch64_masks_and_shift_for_bfi_p):
> > New function.
> > * config/aarch64/aarch64.md (*aarch64_bfi5_shift):
> > New instruction.
> > (*aarch64_bfi4_noand): Ditto.
> > (*aarch64_bfi4_noshift): Ditto.
> > (*aarch64_bfi4_noshift_alt): Ditto.
> > 
> > 2018-02-11  Steve Ellcey  
> > 
> > PR rtl-optimization/87763
> > * gcc.target/aarch64/combine_bfxil.c: Change some bfxil checks
> > to bfi.
> > * gcc.target/aarch64/combine_bfi_2.c: New test.
> > 
> > 


Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 11:01:37PM +0100, Eric Botcazou wrote:
> > Not if the >> 3 shift is arithmetic shift.
> 
> Sorry, I don't understand how this can work.

For some configurations, libasan defines SHADOW_OFFSET to
__asan_shadow_memory_dynamic_address (exported uptr symbol from libasan),
so SHADOW_OFFSET would be also __asan_shadow_memory_dynamic_address and
#define MEM_TO_SHADOW(mem) ((sptr(mem) >> SHADOW_SCALE) + (SHADOW_OFFSET))

For the different sizes of the address space hole:
[0x0800UL,0xf800UL)
[0x8000UL,0x8000UL)
[0x0008UL,0xfff8UL)
[0x0010UL,0xfff0UL)
it would then be up to the asan initialization to figure out what value
of the __asan_shadow_memory_dynamic_address it wants to use.

Say for the largest hole, sptr(0)>>3 is 0,
sptr(0x0800UL)>>3 is 0x0100UL,
sptr(0xf800UL)>>3 is 0xff00UL,
sptr(0xUL)>>3 is 0xUL.
The VA has 8TiB before hole and 8TiB after hole, and needs 2TiB of shadow
memory.  You pick some 2TiB region, either in the area below hole, or above
hole, where nothing is mapped, let's say you pick
[0x0200UL,0x0400UL) as the shadow memory
and then __asan_shadow_memory_dynamic_address will be
0x0200UL + sptr(0x0800UL)>>3, i.e.
0x0300UL.
kLowMemBeg is 0, kLowMemEnd is 0x0200UL-1 (first region
where you can have normal data), then there would be a shadow memory
corresponding to all of memory above the hole (i.e.
0x0200UL..0x0300UL) followed immediately
by shadow memory for kLowMemBeg..kLowMemEnd (0x0300UL..
0x0340UL-1), followed by
kShadowGapBeg 0x0340UL through kShadowGapEnd
0x0380UL-1 and finally again shadow memory for the normal
memory between 0x0400UL..0x0800UL
(i.e. 0x0380UL..0x0400UL-1).
Note, the 0x0200UL choice was just an example, I believe
it would work if you just tried to mmap without MAP_FIXED a 2TiB region
for the shadow and whatever you get used together with the start and end
of VA hole to compute everything else.

Similarly for all the other VA hole sizes, just instead of 2TiB you need
32TiB, 512TiB or 1PiB of shadow memory (always size of memory before
VA hole divided by 4 (== size of both regions outside of hole divided by 8)).

gcc would then emit whatever sequence the memory model emits to
access external __asan_shadow_memory_dynamic_address symbol, shifted
arithmetically address >> 3 and added that to value of
__asan_shadow_memory_dynamic_address.

Jakub


Re: [patch] Fix ASAN failures on SPARC64/Linux

2019-03-14 Thread Eric Botcazou
> Not if the >> 3 shift is arithmetic shift.

Sorry, I don't understand how this can work.

-- 
Eric Botcazou


Re: [PATCH] Fix -gdwarf-5 -gsplit-dwarf ICEs (PR debug/89498)

2019-03-14 Thread Alexandre Oliva
On Mar 12, 2019, Jakub Jelinek  wrote:

> Note, for DWARF before v4, there is no DW_FORM_sec_offset and
> DW_FORM_data4 or DW_FORM_data8 depending on whether it is 32-bit or 64-bit
> DWARF is used instead.

Ah, yes, thanks.  I made these fixes, changed some dashes to underlines
in "identifiers", and uploaded it for now to the end of
https://people.redhat.com/aoliva/papers/sfn/dwarf6-sfn-lvu.txt


=
Extensions for DWARF v2 to v5
=

This section documents how location views are implemented as
extensions to earlier versions of DWARF.  Being backports of the
proposed extensions above, these were NOT submitted to the DWARF
committee.

View number assignment takes place in the line number table, just as
proposed above.

Attributes and Forms


Location lists are not extensible in DWARF up to v5 in a backward
compatible way.  Thus, augmenting them with view lists relies on a
separate optional attribute:

  DW_AT_GNU_locviews | 0x2137

This attribute should only be present in a DIE that has a
DW_AT_location attribute that references a location list, rather than
a single location.  If this attribute is omitted, the view numbers
associated with any implied or explicit location ranges in the
location are to be taken as zero.

Locviews lists, referenced by DW_AT_GNU_locviews attributes, are to be
placed in the same section as the location lists.

When location lists are referenced in DW_AT_location attributes by an
absolute address in DW_FORM_sec_offset (DWARF v4) or DW_FORM_data
(DWARF up to v3), the corresponding DW_AT_GNU_locviews attribute can
be of the same form, with an absolute address as well.

When generating split (extension to) DWARF up to v4, DW_AT_location is
represented with an offset from the base address in the beginning of
the .debug_loc.dwo section, in DW_FORM_sec_offset (DWARF v4) or
DW_FORM_data (DWARF up to v3).  DW_AT_GNU_locviews should then be
an offset from the same base address, also of the same form.

When generating split DWARF v5, DW_AT_location is given in
DW_FORM_loclistx, an offset, from the base address after the
.debug_loclists section header, that in turn holds the offset, from
the same base address, of the beginning of the actual location list.
DW_AT_GNU_locviews attributes that augment such location lists shall
use a DW_FORM_sec_offset value, to be interpreted as an offset from
the same base address, rather than from the beginning of the
.debug_loclists section.

For split DWARF v5, it is not recommended for DW_AT_GNU_locviews to
use DW_FORM_loclistx.  Having entries in the loclists index table
referencing locviews rather than loclists might surprise some
consumers, and using out-of-range indirect addresses instead of index
table entries might also surprise them for out-of-range indexing of
the table and for not ultimately referencing a location list.



Separate Locviews lists
---

The locviews list is represented as a sequence of pairs of uleb128
numbers, each pair corresponding to an address range in the location
list, i.e. not including the terminator, in DWARF up to v4, and
covering only bounded location descriptions in DWARF v5.  The view
numbers pairs are matched to address ranges in the order they appear.

Where a proposed DWARFv6 combined location+view list could be
represented as:

  DW_LLE_base_address base_address
  DW_LLE_view_pairV1, V2
  DW_LLE_start_endA1, A2, location_expression
  DW_LLE_start_endA3, A4, location_expression
  DW_LLE_view_pairV5, V6
  DW_LLE_start_length A5, L6, location_expression
  DW_LLE_default  location_expression
  DW_LLE_end_of_list

... the locview list output separately for DWARF up to v5 would be:

  V1, V2, 0, 0, V5, V6

for a DWARFv5 location list like the v6 one above, minus the
DW_LLE_view_pair entries.  The view pair for A3 and A4, that could be
omitted in the combined v6 list, has to be explicitly put in up to v5,
because for each bounded location description, there must be a pair of
corresponding view numbers to be matched to the pair of addresses
given by it.

Up to DWARFv4, the location list matching the locview list above would
look as follows:

  -1, base_address
  A1, A2location_expression
  A3, A4location_expression
  A5, A5+L6 location_expression
  (a default expression cannot be represented)
  0, 0



Inlined Entry Point View


This extension proposal is yet to be submitted to the DWARF committee.

Optimizing subprograms into which a subroutine was inlined, the entry
point of an inlined subroutine may be at an address associated with
other views, in the enclosing subprogram, in the inlined subroutine,
or even in other inlined subroutines.

It may thus be useful to indicate which of the views associated with
the entry point address logically corresponds to the view in which
execution of the inlined subprogram begins, so that, when a
breakpoint is set at the 

Re: [C++ Patch] PR 89571 ("[9 Regression] ICE in nothrow_spec_p, at cp/except.c:1238")

2019-03-14 Thread Paolo Carlini

Hi,

On 14/03/19 21:03, Jason Merrill wrote:

On 3/12/19 7:21 AM, Paolo Carlini wrote:

Hi,

this is just an error recovery ICE but maybe we can fix it in time 
for 9.1.0, like c++/89488, which is somehow related. Indeed, the 
problem is again a !DEFERRED_NOEXCEPT_SPEC_P gcc_assert triggering 
where, earlier, process_subob_fn had maybe_instantiate_noexcept 
returning false (in c++/89448 the assertion triggered almost 
immediately, when merge_exception_specifiers was called).


Anyway, what we committed for c++/89488 doesn't help here because the 
error_mark_node assigned to *spec_p is propagated back only up to 
build_over_call, where mark_used returns false but isn't acted upon 
because of the additional && !(complain & tf_error). Thus I had the 
idea of simply removing the latter (by itself a bit invasive at this 
stage) but that leads to duplicated diagnostics because then, in 
cp_parser_decltype_expr, when cp_parser_postfix_expression returns 
error_mark_node we go on and try a full expression too, and emit the 
same diagnostics again.


Given the above we can imagine reworking the parser, which seems to 
me a bit tricky at this time, or we can go back to my initial 
proposal for c++/89488, attached below, which doesn't set *spec_p to 
error_mark_node when maybe_instantiate_noexcept returns false. My 
rationale being that, elsewhere, we very often discard completely the 
return value of maybe_instantiate_noexcept, thus it doesn't seem a 
big deal only using it to avoid immediately calling 
marge_exception_specifiers and ICE. If we do this, for c++/89448 we 
emit the additional "cannot convert" error, which may not be a bad 
thing, given that we used to emit it forever, and the same do both 
clang and edg, thus it looks like considering that 'zl ()' not 
completely broken may make sense for error-recovery purposes...


The thing is, we need to defer instantiation of a noexcept-specifier 
that depends on a DMI we haven't parsed yet, since it can succeed 
later and we don't want to have given up and said false.


But other cases of failed instantiation we can and should diagnose 
once and then recover from.  What do you think about this approach?


Oh, nice, I didn't go that far... However, I definitely wanted to spend 
more time on that if (spec ==error_mark_node) return false; which seemed 
at the center of the issues we were having.


Anyway, I'm reassigning the bug to you.

Thanks! Paolo.



Re: [C++ debug PATCH] [PR88534] accept VAR_DECL in class literal template parms

2019-03-14 Thread Jason Merrill

On 3/14/19 4:20 PM, Marek Polacek wrote:

On Thu, Mar 14, 2019 at 05:14:49PM -0300, Alexandre Oliva wrote:

P0732R2 / C++ 2a introduce class literals as template parameters.  The
front-end uses VAR_DECLs constructed from such literals to bind the
template PARM_DECLs, but dwarf2out.c used to reject such VAR_DECLs.

Taking DECL_INITIAL from such VAR_DECLs enables the generation of
DW_AT_const_value for them, at least when the class literal can
actually be represented as such.

Regstrapped on x86_64- and i686-linux-gnu.  Ok to install?


for  gcc/ChangeLog

PR c++/88534
PR c++/88537
* dwarf2out.c (generic_parameter_die): Follow DECL_INITIAL of
VAR_DECL args.

for  gcc/ChangeLog

PR c++/88534
PR c++/88537
* g++.dg/cpp2a/pr88534.C: New.
* g++.dg/cpp2a/pr88537.C: New.
---
  gcc/dwarf2out.c  |7 
  gcc/testsuite/g++.dg/cpp2a/pr88534.C |   65 ++
  gcc/testsuite/g++.dg/cpp2a/pr88537.C |   16 
  3 files changed, 88 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/pr88534.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/pr88537.C

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 830681447..478d9b9b289b1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -13601,6 +13601,13 @@ generic_parameter_die (tree parm, tree arg,
dw_die_ref tmpl_die = NULL;
const char *name = NULL;
  
+  /* C++2a accepts class literals as template parameters, and var

+ decls with initializers represent them.  The VAR_DECLs would be
+ rejected, but we can take the DECL_INITIAL constructor and
+ attempt to expand it.  */
+  if (TREE_CODE (arg) == VAR_DECL)


You can use VAR_P for this.


OK with that change.

Jason



Re: C++ PATCH for c++/89612 - ICE with member friend template with noexcept

2019-03-14 Thread Jason Merrill

On 3/7/19 4:52 PM, Marek Polacek wrote:

This was one of those PRs where the more you poke, the more ICEs turn up.
This patch fixes the ones I could find.  The original problem was that
maybe_instantiate_noexcept got a TEMPLATE_DECL created for the member
friend template in do_friend.  Its noexcept-specification was deferred,
so we went to the block with push_access_scope, but that crashes on a
TEMPLATE_DECL.  One approach could be to somehow not defer noexcept-specs
for friend templates, I guess, but I didn't want to do that.


How does it make sense to instantiate the noexcept-specifier of a 
template?  We should only get there for fully-instantiated function decls.



Lastly, I found an invalid testcase that was breaking because a template code
leaked to constexpr functions.  This I fixed similarly to the recent explicit
PR fix (r269131).


This spot should probably also use build_converted_constant_expr.

Jason


Re: [C++ debug PATCH] [PR88534] accept VAR_DECL in class literal template parms

2019-03-14 Thread Marek Polacek
On Thu, Mar 14, 2019 at 05:14:49PM -0300, Alexandre Oliva wrote:
> P0732R2 / C++ 2a introduce class literals as template parameters.  The
> front-end uses VAR_DECLs constructed from such literals to bind the
> template PARM_DECLs, but dwarf2out.c used to reject such VAR_DECLs.
> 
> Taking DECL_INITIAL from such VAR_DECLs enables the generation of
> DW_AT_const_value for them, at least when the class literal can
> actually be represented as such.
> 
> Regstrapped on x86_64- and i686-linux-gnu.  Ok to install?
> 
> 
> for  gcc/ChangeLog
> 
>   PR c++/88534
>   PR c++/88537
>   * dwarf2out.c (generic_parameter_die): Follow DECL_INITIAL of
>   VAR_DECL args.
> 
> for  gcc/ChangeLog
> 
>   PR c++/88534
>   PR c++/88537
>   * g++.dg/cpp2a/pr88534.C: New.
>   * g++.dg/cpp2a/pr88537.C: New.
> ---
>  gcc/dwarf2out.c  |7 
>  gcc/testsuite/g++.dg/cpp2a/pr88534.C |   65 
> ++
>  gcc/testsuite/g++.dg/cpp2a/pr88537.C |   16 
>  3 files changed, 88 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/pr88534.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/pr88537.C
> 
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 830681447..478d9b9b289b1 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -13601,6 +13601,13 @@ generic_parameter_die (tree parm, tree arg,
>dw_die_ref tmpl_die = NULL;
>const char *name = NULL;
>  
> +  /* C++2a accepts class literals as template parameters, and var
> + decls with initializers represent them.  The VAR_DECLs would be
> + rejected, but we can take the DECL_INITIAL constructor and
> + attempt to expand it.  */
> +  if (TREE_CODE (arg) == VAR_DECL)

You can use VAR_P for this.

Marek


[C++ debug PATCH] [PR88534] accept VAR_DECL in class literal template parms

2019-03-14 Thread Alexandre Oliva
P0732R2 / C++ 2a introduce class literals as template parameters.  The
front-end uses VAR_DECLs constructed from such literals to bind the
template PARM_DECLs, but dwarf2out.c used to reject such VAR_DECLs.

Taking DECL_INITIAL from such VAR_DECLs enables the generation of
DW_AT_const_value for them, at least when the class literal can
actually be represented as such.

Regstrapped on x86_64- and i686-linux-gnu.  Ok to install?


for  gcc/ChangeLog

PR c++/88534
PR c++/88537
* dwarf2out.c (generic_parameter_die): Follow DECL_INITIAL of
VAR_DECL args.

for  gcc/ChangeLog

PR c++/88534
PR c++/88537
* g++.dg/cpp2a/pr88534.C: New.
* g++.dg/cpp2a/pr88537.C: New.
---
 gcc/dwarf2out.c  |7 
 gcc/testsuite/g++.dg/cpp2a/pr88534.C |   65 ++
 gcc/testsuite/g++.dg/cpp2a/pr88537.C |   16 
 3 files changed, 88 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/pr88534.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/pr88537.C

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 830681447..478d9b9b289b1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -13601,6 +13601,13 @@ generic_parameter_die (tree parm, tree arg,
   dw_die_ref tmpl_die = NULL;
   const char *name = NULL;
 
+  /* C++2a accepts class literals as template parameters, and var
+ decls with initializers represent them.  The VAR_DECLs would be
+ rejected, but we can take the DECL_INITIAL constructor and
+ attempt to expand it.  */
+  if (TREE_CODE (arg) == VAR_DECL)
+arg = DECL_INITIAL (arg);
+
   if (!parm || !DECL_NAME (parm) || !arg)
 return NULL;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/pr88534.C 
b/gcc/testsuite/g++.dg/cpp2a/pr88534.C
new file mode 100644
index 0..54faf385f11aa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/pr88534.C
@@ -0,0 +1,65 @@
+// { dg-do compile { target c++2a } }
+// { dg-options "-g" }
+
+typedef __SIZE_TYPE__ size_t;
+
+namespace std
+{
+
+template 
+struct integer_sequence
+{
+  typedef T value_type;
+  static constexpr size_t size () noexcept { return sizeof...(I); }
+};
+
+template 
+using make_integer_sequence = integer_sequence;
+
+template 
+using index_sequence = integer_sequence;
+
+template 
+using make_index_sequence = make_integer_sequence;
+}
+
+template  struct S
+{
+  T content[N];
+  using char_type = T;
+  template 
+  constexpr S (const T ()[N], std::index_sequence) noexcept : 
content{input[I]...} { }
+  constexpr S (const T ()[N]) noexcept : S (input, 
std::make_index_sequence ()) { }
+  constexpr size_t size () const noexcept
+  {
+if (content[N - 1] == '\0')
+  return N - 1;
+else
+  return N;
+  }
+  constexpr T operator[] (size_t i) const noexcept
+  {
+return content[i];
+  }
+  constexpr const T *begin () const noexcept
+  {
+return content;
+  }
+  constexpr const T *end () const noexcept
+  {
+return content + size ();
+  }
+};
+
+template  S (const T (&)[N]) -> S;
+
+template 
+struct F
+{
+};
+
+auto
+foo ()
+{
+  F<"test"> f;
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/pr88537.C 
b/gcc/testsuite/g++.dg/cpp2a/pr88537.C
new file mode 100644
index 0..d558d45f57830
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/pr88537.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++2a } }
+// { dg-options "-g" }
+
+struct pair {
+   unsigned a;
+   unsigned b;
+   constexpr pair(unsigned _a, unsigned _b) noexcept: a{_a}, b{_b} { }
+};
+
+template  void fnc() {
+   
+}
+
+void f() {
+fnc();
+}

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe


Re: [C++ Patch] PR 89571 ("[9 Regression] ICE in nothrow_spec_p, at cp/except.c:1238")

2019-03-14 Thread Jason Merrill

On 3/12/19 7:21 AM, Paolo Carlini wrote:

Hi,

this is just an error recovery ICE but maybe we can fix it in time for 
9.1.0, like c++/89488, which is somehow related. Indeed, the problem is 
again a !DEFERRED_NOEXCEPT_SPEC_P gcc_assert triggering where, earlier, 
process_subob_fn had maybe_instantiate_noexcept returning false (in 
c++/89448 the assertion triggered almost immediately, when 
merge_exception_specifiers was called).


Anyway, what we committed for c++/89488 doesn't help here because the 
error_mark_node assigned to *spec_p is propagated back only up to 
build_over_call, where mark_used returns false but isn't acted upon 
because of the additional && !(complain & tf_error). Thus I had the idea 
of simply removing the latter (by itself a bit invasive at this stage) 
but that leads to duplicated diagnostics because then, in 
cp_parser_decltype_expr, when cp_parser_postfix_expression returns 
error_mark_node we go on and try a full expression too, and emit the 
same diagnostics again.


Given the above we can imagine reworking the parser, which seems to me a 
bit tricky at this time, or we can go back to my initial proposal for 
c++/89488, attached below, which doesn't set *spec_p to error_mark_node 
when maybe_instantiate_noexcept returns false. My rationale being that, 
elsewhere, we very often discard completely the return value of 
maybe_instantiate_noexcept, thus it doesn't seem a big deal only using 
it to avoid immediately calling marge_exception_specifiers and ICE. If 
we do this, for c++/89448 we emit the additional "cannot convert" error, 
which may not be a bad thing, given that we used to emit it forever, and 
the same do both clang and edg, thus it looks like considering that 'zl 
()' not completely broken may make sense for error-recovery purposes...


The thing is, we need to defer instantiation of a noexcept-specifier 
that depends on a DMI we haven't parsed yet, since it can succeed later 
and we don't want to have given up and said false.


But other cases of failed instantiation we can and should diagnose once 
and then recover from.  What do you think about this approach?



commit 9b633e74683e5b22102b4bc6811cdb05756602b0
Author: Jason Merrill 
Date:   Thu Mar 14 14:10:37 2019 -0400

* pt.c (maybe_instantiate_noexcept): Only return false for 
defaulted.
(regenerate_decl_from_template): Use it for noexcept-specs.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 2ab6e273c3a..9367d49629c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -23991,12 +23991,17 @@ regenerate_decl_from_template (tree decl, tree tmpl, 
tree args)
   if (args_depth > parms_depth)
args = get_innermost_template_args (args, parms_depth);
 
-  specs = tsubst_exception_specification (TREE_TYPE (code_pattern),
- args, tf_error, NULL_TREE,
- /*defer_ok*/false);
-  if (specs && specs != error_mark_node)
-   TREE_TYPE (decl) = build_exception_variant (TREE_TYPE (decl),
-   specs);
+  /* Instantiate a dynamic exception-specification.  */
+  if (tree raises = TYPE_RAISES_EXCEPTIONS (TREE_TYPE (code_pattern)))
+   if (TREE_VALUE (raises))
+ {
+   specs = tsubst_exception_specification (TREE_TYPE (code_pattern),
+   args, tf_error, NULL_TREE,
+   /*defer_ok*/false);
+   if (specs && specs != error_mark_node)
+ TREE_TYPE (decl) = build_exception_variant (TREE_TYPE (decl),
+ specs);
+ }
 
   /* Merge parameter declarations.  */
   decl_parm = skip_artificial_parms_for (decl,
@@ -24062,6 +24067,8 @@ regenerate_decl_from_template (tree decl, tree tmpl, 
tree args)
   if (DECL_DECLARED_INLINE_P (code_pattern)
  && !DECL_DECLARED_INLINE_P (decl))
DECL_DECLARED_INLINE_P (decl) = 1;
+
+  maybe_instantiate_noexcept (decl, tf_error);
 }
   else if (VAR_P (decl))
 {
@@ -24187,7 +24194,11 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
complain)
   static hash_set* fns = new hash_set;
   bool added = false;
   if (DEFERRED_NOEXCEPT_PATTERN (noex) == NULL_TREE)
-   spec = get_defaulted_eh_spec (fn, complain);
+   {
+ spec = get_defaulted_eh_spec (fn, complain);
+ if (spec == error_mark_node)
+   return false;
+   }
   else if (!(added = !fns->add (fn)))
{
  /* If hash_set::add returns true, the element was already there.  */
@@ -24247,7 +24258,10 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
complain)
fns->remove (fn);
 
   if (spec == error_mark_node)
-   return false;
+   {
+ gcc_assert (seen_error ());
+ spec = noexcept_false_spec;
+   }
 
   TREE_TYPE (fn) = 

[C++ PATCH] Don't require 'bool' in a concept definition.

2019-03-14 Thread Jason Merrill
The concepts overhaul isn't ready yet, but we should at least support concept
definitions without bool; that has been the one thing that prevents people from
writing code that works in both TS and C++20 concepts.

Tested x86_64-pc-linux-gnu, applying to trunk.

* parser.c (cp_parser_decl_specifier_seq): Support C++20
concept-definition syntax without 'bool'.
---
 gcc/cp/parser.c  | 9 +
 gcc/testsuite/g++.dg/concepts/no-bool1.C | 4 
 gcc/cp/ChangeLog | 5 +
 3 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/concepts/no-bool1.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 14da1a14501..b8a0245ce57 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -13964,6 +13964,15 @@ cp_parser_decl_specifier_seq (cp_parser* parser,
 case RID_CONCEPT:
   ds = ds_concept;
   cp_lexer_consume_token (parser->lexer);
+ /* In C++20 a concept definition is just 'concept name = expr;'
+Support that syntax by pretending we've seen 'bool'.  */
+ if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)
+ && cp_lexer_nth_token_is (parser->lexer, 2, CPP_EQ))
+   {
+ cp_parser_set_decl_spec_type (decl_specs, boolean_type_node,
+   token, /*type_definition*/false);
+ decl_specs->any_type_specifiers_p = true;
+   }
   break;
 
  /* function-specifier:
diff --git a/gcc/testsuite/g++.dg/concepts/no-bool1.C 
b/gcc/testsuite/g++.dg/concepts/no-bool1.C
new file mode 100644
index 000..be9b73a0689
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/no-bool1.C
@@ -0,0 +1,4 @@
+// { dg-do compile { target c++14 } }
+// { dg-additional-options -fconcepts }
+
+template  concept True = true;
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 054a891bcef..6c96c24430f 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,8 @@
+2019-03-14  Jason Merrill  
+
+   * parser.c (cp_parser_decl_specifier_seq): Support C++20
+   concept-definition syntax without 'bool'.
+
 2019-03-14  Jakub Jelinek  
 
PR c++/89512

base-commit: a95b8a4616e5112976bf5ba3e63e344fa74c578a
-- 
2.20.1



Re: V4 [PATCH] i386: Handle REG_EH_REGION note

2019-03-14 Thread Jakub Jelinek
On Fri, Mar 15, 2019 at 03:36:15AM +0800, H.J. Lu wrote:
> Like this?

Yes.

> gcc/
> 
>   PR target/89650
>   * config/i386/i386.c (remove_partial_avx_dependency): Handle
>   REG_EH_REGION note.
> 
> gcc/testsuite/
> 
>   PR target/89650
>   * g++.target/i386/pr89650.C: New test.

Ok for trunk.

Jakub


V4 [PATCH] i386: Handle REG_EH_REGION note

2019-03-14 Thread H.J. Lu
On 3/14/19, Jakub Jelinek  wrote:
> On Thu, Mar 14, 2019 at 09:08:17PM +0800, H.J. Lu wrote:
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -2819,6 +2819,8 @@ remove_partial_avx_dependency (void)
>>rtx set;
>>rtx v4sf_const0 = NULL_RTX;
>>
>> +  auto_vec splitted_insn;
>
> Perhaps throwing_insns or flow_transfer_insns or control_flow_insns
> instead?
>
>> +  unsigned int i;
>> +  FOR_EACH_VEC_ELT (splitted_insn, i, insn)
>
> I actually meant something like:
>   if (control_flow_insn_p (insn))
> {
>   /* Split the block after insn.  There will be a
>  fallthru edge, which is OK so we keep it.  We
>  have to create the exception edges ourselves.  */
>   split_block (bb, insn);
>   rtl_make_eh_edge (NULL, bb, BB_END (bb));
> }
>
> only inside of the FOR_EACH_VEC_ELT.  If there is more than one
> insn that needs splitting in the bb, it will be in the vector as well
> and handled in some later iteration.
>

Like this?

-- 
H.J.
From 0bb8c37ce83d7886c8deb7c2ba9eb5ddf4d02dc0 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 12 Mar 2019 08:55:24 +0800
Subject: [PATCH] i386: Handle REG_EH_REGION note

When we split:

(insn 18 17 76 2 (set (reg:SF 88 [ _19 ])
(float:SF (mem/c:SI (symbol_ref:DI ("d") [flags 0x2]  ) [1 d+0 S4 A32]))) "x.ii":4:20 170 {*floatsisf2}
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil)))

to

(insn 94 17 18 2 (set (reg:V4SF 115)
(vec_merge:V4SF (vec_duplicate:V4SF (float:SF (mem/c:SI (symbol_ref:DI ("d") [flags 0x2]  ) [1 d+0 S4 A32])))
(reg:V4SF 114)
(const_int 1 [0x1]))) "x.ii":4:20 -1
 (nil))
(insn 18 94 76 2 (set (reg:SF 88 [ _19 ])
(subreg:SF (reg:V4SF 115) 0)) "x.ii":4:20 112 {*movsf_internal}
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil)))

we must copy the REG_EH_REGION note to the first insn and split the block
after the newly added insn.  The REG_EH_REGION on the second insn will be
removed later since it no longer traps.

gcc/

	PR target/89650
	* config/i386/i386.c (remove_partial_avx_dependency): Handle
	REG_EH_REGION note.

gcc/testsuite/

	PR target/89650
	* g++.target/i386/pr89650.C: New test.
---
 gcc/config/i386/i386.c  | 30 +
 gcc/testsuite/g++.target/i386/pr89650.C | 19 
 2 files changed, 49 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/pr89650.C

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 783a810437b..70e10119213 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2819,6 +2819,8 @@ remove_partial_avx_dependency (void)
   rtx set;
   rtx v4sf_const0 = NULL_RTX;
 
+  auto_vec control_flow_insns;
+
   FOR_EACH_BB_FN (bb, cfun)
 {
   FOR_BB_INSNS (bb, insn)
@@ -2875,6 +2877,17 @@ remove_partial_avx_dependency (void)
 	  set_insn = emit_insn_before (set, insn);
 	  df_insn_rescan (set_insn);
 
+	  if (cfun->can_throw_non_call_exceptions)
+	{
+	  /* Handle REG_EH_REGION note.  */
+	  rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
+	  if (note)
+		{
+		  control_flow_insns.safe_push (set_insn);
+		  add_reg_note (set_insn, REG_EH_REGION, XEXP (note, 0));
+		}
+	}
+
 	  src = gen_rtx_SUBREG (dest_mode, vec, 0);
 	  set = gen_rtx_SET (dest, src);
 
@@ -2925,6 +2938,23 @@ remove_partial_avx_dependency (void)
   df_insn_rescan (set_insn);
   df_process_deferred_rescans ();
   loop_optimizer_finalize ();
+
+  if (!control_flow_insns.is_empty ())
+	{
+	  free_dominance_info (CDI_DOMINATORS);
+
+	  unsigned int i;
+	  FOR_EACH_VEC_ELT (control_flow_insns, i, insn)
+	if (control_flow_insn_p (insn))
+	  {
+		/* Split the block after insn.  There will be a fallthru
+		   edge, which is OK so we keep it.  We have to create
+		   the exception edges ourselves.  */
+		bb = BLOCK_FOR_INSN (insn);
+		split_block (bb, insn);
+		rtl_make_eh_edge (NULL, bb, BB_END (bb));
+	  }
+	}
 }
 
   bitmap_obstack_release (NULL);
diff --git a/gcc/testsuite/g++.target/i386/pr89650.C b/gcc/testsuite/g++.target/i386/pr89650.C
new file mode 100644
index 000..4b253cbf467
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr89650.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2 -flive-range-shrinkage -fno-tree-dce -fno-dce -fnon-call-exceptions -mavx" }
+
+int d, e;
+struct g {
+  float f;
+  g(float h) : f(h + d) {}
+  ~g() {}
+};
+struct i {
+  int a;
+  int b : 4;
+  int 
+  i(int h) : a(), b(), c(h) {}
+};
+int main() {
+  i j(e);
+  g k[]{1, 2};
+}
-- 
2.20.1



Re: duplicate using declarations

2019-03-14 Thread Paolo Carlini

Hi Nathan,

On 14/03/19 16:59, Nathan Sidwell wrote:

I see this is DR 36.
http://wiki.edg.com/pub/Wg21kona2019/CoreWorkingGroup/cwg_active.html#36

I still think we should drop this check, for one thing we don't apply 
it consistently (eg, naming an unscoped enum value).


I see that for example current icc enforces the check whereas current 
clang doesn't. Given that the DR is still Open and we changed our 
behavior relatively recently, first blush it doesn't seem 
straightforward to me that we should change it again. That said, I would 
be curious to know why the example has been removed... Not a big deal, 
anyway, if dropping the check makes sense in the bigger picture 
certainly I'm not going to insist ;)


Paolo.



Re: [PATCH] Fix PR89710

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 03:10:59PM +0100, Richard Biener wrote:
> I've added a testcase.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk
> or should it wait for GCC10?

I meant something like following where we'd clean it up everything right
away after we DCE some returns_twice calls.

Except that doesn't work well, because the temporary
  cfun->calls_setjmp = true;
hack not only allows gimple_purge_dead_abnormal_call_edges through the
condition your patch was removing, but also makes then 
call_can_make_abnormal_goto
and stmt_can_make_abnormal_goto to return true.  So, maybe on top of this
patch add a bool force argument to gimple_purge_dead_abnormal_call_edges
and gimple_purge_all_dead_abnormal_call_edges and instead of the temporary
cfun->calls_setjmp = true;
in this patch pass true as force.  Plus, if we lost the last returns_twice
call and cfun->has_nonlocal_decl isn't set either, instead of calling
gimple_purge_all_dead_abnormal_call_edges just on the selected bbs call
gimple_purge_dead_abnormal_call_edges with force on on all the bbs
because if there are no nonlocal labels and no cfun->calls_setjmp used to be
set and is no longer set, then we don't need any AB edges from any calls?

--- gcc/tree-ssa-dce.c.jj   2019-01-10 11:43:17.044334047 +0100
+++ gcc/tree-ssa-dce.c  2019-03-14 17:35:47.158038514 +0100
@@ -1201,6 +1201,8 @@ eliminate_unnecessary_stmts (void)
   gimple *stmt;
   tree call;
   vec h;
+  bool calls_setjmp = cfun->calls_setjmp;
+  bitmap need_ab_cleanup = NULL;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
 fprintf (dump_file, "\nEliminating unnecessary statements:\n");
@@ -1287,6 +1289,14 @@ eliminate_unnecessary_stmts (void)
}
  if (!is_gimple_debug (stmt))
something_changed = true;
+ if (calls_setjmp
+ && is_gimple_call (stmt)
+ && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE))
+   {
+ if (need_ab_cleanup == NULL)
+   need_ab_cleanup = BITMAP_ALLOC (NULL);
+ bitmap_set_bit (need_ab_cleanup, bb->index);
+   }
  remove_dead_stmt (, bb);
}
  else if (is_gimple_call (stmt))
@@ -1358,6 +1368,17 @@ eliminate_unnecessary_stmts (void)
 
   h.release ();
 
+  if (need_ab_cleanup)
+{
+  bool saved_calls_setjmp = cfun->calls_setjmp;
+  cfun->calls_setjmp = true;
+  if (gimple_purge_all_dead_abnormal_call_edges (need_ab_cleanup))
+   cfg_altered = true;
+  cfun->calls_setjmp = saved_calls_setjmp;
+
+  BITMAP_FREE (need_ab_cleanup);
+}
+
   /* Since we don't track liveness of virtual PHI nodes, it is possible that we
  rendered some PHI nodes unreachable while they are still in use.
  Mark them for renaming.  */


Jakub


Re: [PATCH] Fix up tree-ssa-strlen.c ICEs (PR tree-optimization/89703)

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 10:08:55AM -0600, Martin Sebor wrote:
> What solution would you like to see for these (pointer vs integer)
> mismatches?

For GCC 9, I'd probably like most adding the similar pointer vs. non-pointer
test to the return value like there is for the arguments, unless there is
some deep reason why the return value is special.  But we aren't talking
about implicit declarations and even when it is not a prototype, I think
if somebody writes char *strlen (); he must assume we'll not try to optimize
his code.  The warning would be still there when one enables it.

Jakub


Re: [PATCH] Fix up tree-ssa-strlen.c ICEs (PR tree-optimization/89703)

2019-03-14 Thread Martin Sebor

On 3/14/19 2:43 AM, Jakub Jelinek wrote:

On Thu, Mar 14, 2019 at 12:50:28AM +, Joseph Myers wrote:

The C FE sadly passes through some really bad prototypes of builtin
functions as "harmless":
   /* Accept "harmless" mismatches in function types such
  as missing qualifiers or pointer vs same size integer
  mismatches.  This is for the ffs and fprintf builtins.
  However, with -Wextra in effect, diagnose return and
  argument types that are incompatible according to
  language rules.  */


Note that this isn't just about pre-standard headers with unexpected types
(if it were, it might be obsolete).  It's also needed for building glibc
(to build glibc with -Wextra with GCC trunk, one of the -Wno- options
needed is -Wno-builtin-declaration-mismatch), because of how various code
creates function aliases between functions that have the same ABI but
different types (long versus int, etc.).


I was mainly woried about the pointer vs same sized integer mismatches.
Seems we refuse them on arguments:
   /* Fail for types with incompatible modes/sizes.  */
   if (TYPE_MODE (TREE_VALUE (oldargs))
   != TYPE_MODE (TREE_VALUE (newargs)))
 return NULL_TREE;
   /* Fail for function and object pointer mismatches.  */
   if ((FUNCTION_POINTER_TYPE_P (oldtype)
!= FUNCTION_POINTER_TYPE_P (newtype))
   || POINTER_TYPE_P (oldtype) != POINTER_TYPE_P (newtype))
 return NULL_TREE;
but not on the return type, where there is just:
   if (TYPE_MODE (oldrettype) != TYPE_MODE (newrettype))
 return NULL_TREE;
Whether a builtin returns a pointer or integer is at least for the
middle-end quite important difference, so I'm just surprised more bugs in
other passes haven't been filed yet.  I can deal with that in the
tree-ssa-strlen.c pass with the patch I've posted, just am worried about
potential issues elsewhere.


What solution would you like to see for these (pointer vs integer)
mismatches?

I think they should not only be diagnosed even without -Wextra but
at a minimum, treated as ordinary library functions, same as if
their arguments fail to match in the same way(*).

(Since some of the impetus for the fix was these sorts of ICEs,
I suspect I either missed this case when I fixed PR86125 or
I intentionally left it conservative because of the comment.)

[*] Making a library call will just crash at runtime, except
possibly much later, so I think it would be best to issue
a warning by default, like Clang does. It might also be worth
considering replacing the calls with a trap (with some new
option to disable it).

Whatever solution we can agree on I will tighten it in GCC 10.

Martin


Re: duplicate using declarations

2019-03-14 Thread Nathan Sidwell

I see this is DR 36.
http://wiki.edg.com/pub/Wg21kona2019/CoreWorkingGroup/cwg_active.html#36

I still think we should drop this check, for one thing we don't apply it 
consistently (eg, naming an unscoped enum value).


On 3/14/19 11:47 AM, Nathan Sidwell wrote:

Jason, Paolo,

This concerns:
namespace N
{
   int i;
}

void
f ()
{
   using N::i;
   using N::i;   // { dg-error "declared" }
}

which is extracted from g++.dg/lookup/using53.C added fixing 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20420


The comment mentions C++11 7.3.3/10, which does give this as an error, 
but confusingly permits the duplicate decl if we were at namespace 
scope.  That seems wrong to me, and indeed in the current WD, (moved to 
9.8/10) the fn example is removed.  Presumably making it as well formed 
as the namespace example.


Ok with dropping this check entirely? There's a bunch of obsolete using 
decl code I'd like to kill before adding support to modules.


nathan




--
Nathan Sidwell


duplicate using declarations

2019-03-14 Thread Nathan Sidwell

Jason, Paolo,

This concerns:
namespace N
{
  int i;
}

void
f ()
{
  using N::i;
  using N::i;   // { dg-error "declared" }
}

which is extracted from g++.dg/lookup/using53.C added fixing 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20420


The comment mentions C++11 7.3.3/10, which does give this as an error, 
but confusingly permits the duplicate decl if we were at namespace 
scope.  That seems wrong to me, and indeed in the current WD, (moved to 
9.8/10) the fn example is removed.  Presumably making it as well formed 
as the namespace example.


Ok with dropping this check entirely? There's a bunch of obsolete using 
decl code I'd like to kill before adding support to modules.


nathan

--
Nathan Sidwell


Re: [PATCH] Adjust gcc.dg/uninit-pred-8_b.c (PR89551)

2019-03-14 Thread Richard Biener
On Thu, 14 Mar 2019, Christophe Lyon wrote:

> On Thu, 14 Mar 2019 at 15:35, Jakub Jelinek  wrote:
> >
> > On Thu, Mar 14, 2019 at 03:32:39PM +0100, Christophe Lyon wrote:
> > > > 2019-03-04  Richard Biener  
> > > >
> > > > PR testsuite/89551
> > > > * gcc.dg/uninit-pred-8_b.c: Force logical-op-non-short-circuit
> > > > the way that makes the testcase PASS.
> > >
> > > Hi Richard,
> > >
> > > I think you forgot to backport this to gcc-8-branch when you committed 
> > > r269650?
> >
> > That param is not supported in 8.x or older.
> 
> OK. I noticed this because I saw the same regressions I reported on
> trunk on some arm configs:
> https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00125.html

I will backport the param and the patch tomorrow unless somebody beats me.
It's just a testsuite issue after all.

Richard.


Re: [PATCH] Adjust gcc.dg/uninit-pred-8_b.c (PR89551)

2019-03-14 Thread Christophe Lyon
On Thu, 14 Mar 2019 at 15:35, Jakub Jelinek  wrote:
>
> On Thu, Mar 14, 2019 at 03:32:39PM +0100, Christophe Lyon wrote:
> > > 2019-03-04  Richard Biener  
> > >
> > > PR testsuite/89551
> > > * gcc.dg/uninit-pred-8_b.c: Force logical-op-non-short-circuit
> > > the way that makes the testcase PASS.
> >
> > Hi Richard,
> >
> > I think you forgot to backport this to gcc-8-branch when you committed 
> > r269650?
>
> That param is not supported in 8.x or older.

OK. I noticed this because I saw the same regressions I reported on
trunk on some arm configs:
https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00125.html


>
> > > --- gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (revision 269361)
> > > +++ gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (working copy)
> > > @@ -1,6 +1,7 @@
> > > -
> > >  /* { dg-do compile } */
> > > -/* { dg-options "-Wuninitialized -O2" } */
> > > +/* ???  Jump threading makes a mess of the 
> > > logical-op-non-short-circuit=0 case
> > > +   so force it our way.  */
> > > +/* { dg-options "-Wuninitialized -O2 --param 
> > > logical-op-non-short-circuit=1" } */
> > >
> > >  int g;
> > >  void bar();
>
> Jakub


Re: [PATCH] Adjust gcc.dg/uninit-pred-8_b.c (PR89551)

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 03:32:39PM +0100, Christophe Lyon wrote:
> > 2019-03-04  Richard Biener  
> >
> > PR testsuite/89551
> > * gcc.dg/uninit-pred-8_b.c: Force logical-op-non-short-circuit
> > the way that makes the testcase PASS.
> 
> Hi Richard,
> 
> I think you forgot to backport this to gcc-8-branch when you committed 
> r269650?

That param is not supported in 8.x or older.

> > --- gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (revision 269361)
> > +++ gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (working copy)
> > @@ -1,6 +1,7 @@
> > -
> >  /* { dg-do compile } */
> > -/* { dg-options "-Wuninitialized -O2" } */
> > +/* ???  Jump threading makes a mess of the logical-op-non-short-circuit=0 
> > case
> > +   so force it our way.  */
> > +/* { dg-options "-Wuninitialized -O2 --param 
> > logical-op-non-short-circuit=1" } */
> >
> >  int g;
> >  void bar();

Jakub


Re: [PATCH] Adjust gcc.dg/uninit-pred-8_b.c (PR89551)

2019-03-14 Thread Richard Biener
On March 14, 2019 3:32:39 PM GMT+01:00, Christophe Lyon 
 wrote:
>On Mon, 4 Mar 2019 at 11:25, Richard Biener  wrote:
>>
>>
>> The CFG cleanup change made us remove an extra forwarder which
>somehow
>> makes VRP jump threading go berzerk.  Fortunately only on
>> logical-op-non-short-circuit=0 targets so the easy way to fix the
>> testcase is to force that our way.
>>
>> Tested on powerpc64le-linux-gnu and x86_64-linux-gnu.
>>
>> I'll hold off applying this for a bit in case Jeff wants to
>> analyze why/how we're doing extra jump threading just because
>> of the lack of that extra forwarder...
>>
>> Richard.
>>
>> 2019-03-04  Richard Biener  
>>
>> PR testsuite/89551
>> * gcc.dg/uninit-pred-8_b.c: Force
>logical-op-non-short-circuit
>> the way that makes the testcase PASS.
>
>Hi Richard,
>
>I think you forgot to backport this to gcc-8-branch when you committed
>r269650?

Whoops, quite possible. I'll fixup later or tomorrow. 

Richard. 

>Christophe
>
>
>>
>> Index: gcc/testsuite/gcc.dg/uninit-pred-8_b.c
>> ===
>> --- gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (revision 269361)
>> +++ gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (working copy)
>> @@ -1,6 +1,7 @@
>> -
>>  /* { dg-do compile } */
>> -/* { dg-options "-Wuninitialized -O2" } */
>> +/* ???  Jump threading makes a mess of the
>logical-op-non-short-circuit=0 case
>> +   so force it our way.  */
>> +/* { dg-options "-Wuninitialized -O2 --param
>logical-op-non-short-circuit=1" } */
>>
>>  int g;
>>  void bar();



Re: [PATCH] Adjust gcc.dg/uninit-pred-8_b.c (PR89551)

2019-03-14 Thread Christophe Lyon
On Mon, 4 Mar 2019 at 11:25, Richard Biener  wrote:
>
>
> The CFG cleanup change made us remove an extra forwarder which somehow
> makes VRP jump threading go berzerk.  Fortunately only on
> logical-op-non-short-circuit=0 targets so the easy way to fix the
> testcase is to force that our way.
>
> Tested on powerpc64le-linux-gnu and x86_64-linux-gnu.
>
> I'll hold off applying this for a bit in case Jeff wants to
> analyze why/how we're doing extra jump threading just because
> of the lack of that extra forwarder...
>
> Richard.
>
> 2019-03-04  Richard Biener  
>
> PR testsuite/89551
> * gcc.dg/uninit-pred-8_b.c: Force logical-op-non-short-circuit
> the way that makes the testcase PASS.

Hi Richard,

I think you forgot to backport this to gcc-8-branch when you committed r269650?

Christophe


>
> Index: gcc/testsuite/gcc.dg/uninit-pred-8_b.c
> ===
> --- gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (revision 269361)
> +++ gcc/testsuite/gcc.dg/uninit-pred-8_b.c  (working copy)
> @@ -1,6 +1,7 @@
> -
>  /* { dg-do compile } */
> -/* { dg-options "-Wuninitialized -O2" } */
> +/* ???  Jump threading makes a mess of the logical-op-non-short-circuit=0 
> case
> +   so force it our way.  */
> +/* { dg-options "-Wuninitialized -O2 --param logical-op-non-short-circuit=1" 
> } */
>
>  int g;
>  void bar();


[PATCH][DOC][OBVIOUS] Remove dead option from manual (PR other/89712).

2019-03-14 Thread Martin Liška
Hi.

This is removal of not supported option.
I'm going to install it as obvious.

Martin

gcc/ChangeLog:

2019-03-14  Martin Liska  

PR other/89712
* doc/invoke.texi: Remove -fdump-class-hierarchy option.
---
 gcc/doc/invoke.texi | 1 -
 1 file changed, 1 deletion(-)


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index df0883f2fc9..0a941519dbc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -583,7 +583,6 @@ Objective-C and Objective-C++ Dialects}.
 -fdisable-tree-@var{pass-name}=@var{range-list} @gol
 -fdump-debug  -fdump-earlydebug @gol
 -fdump-noaddr  -fdump-unnumbered  -fdump-unnumbered-links @gol
--fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol
 -fdump-final-insns@r{[}=@var{file}@r{]} @gol
 -fdump-ipa-all  -fdump-ipa-cgraph  -fdump-ipa-inline @gol
 -fdump-lang-all @gol



Re: [PATCH] Fix PR89710

2019-03-14 Thread Richard Biener
On Thu, 14 Mar 2019, Richard Biener wrote:

> On Thu, 14 Mar 2019, Jakub Jelinek wrote:
> 
> > On Thu, Mar 14, 2019 at 11:23:03AM +0100, Richard Biener wrote:
> > > I am testing the following.  There's IMHO also a missed optimization
> > > (for CFG-cleanup?) that when a block does not end up in a call
> > > outgoing abnormal edges can be purged?  In this case it is
> > > IPA inlining leaving us with this - the inliner calls
> > > gimple_purge_dead_abnormal_call_edges on the return block
> > > but both cfun->has_nonlocal_label and cfun->calls_setjmp are
> > > false so the function does nothing.  This is probably
> > > a premature check there?
> > 
> > I think it would be better to keep the invariant that there are no abnormal
> > non-EH edges if both of the cfun bools are false, so that we don't need to
> > do useless work in the 99.9% of cases.
> 
> I agree.
> 
> > In the inliner case, is it that we haven't yet updated those or something
> > similar (if yes, could we update them earlier)?
> 
> It's the edge still being present when DCE recomputes the flag.
> 
> > In the DCE case, can't we when clearing the flag schedule a cleanup and only
> > clear it afterwards?
> 
> Well, I guess the cleanup would just work iff we'd keep the flag
> check in call_can_make_abnormal_goto but remove the flag check
> in gimple_purge_dead_abnormal_call_edges.  Not immediately
> after DCE but at least inlining gets rid of the edge (and the
> abnormal dispatcher) then.
> 
> Does that sound reasonable?
> 
> Thus like the following, testing in progress.

I've added a testcase.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk
or should it wait for GCC10?

Thanks,
Richard.

2019-03-14  Richard Biener  

PR tree-optimization/89710
* tree-cfg.c (gimple_purge_dead_abnormal_call_edges): Remove
premature check.

* gcc.dg/tree-ssa/pr89710.c: New testcase.

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 269678)
+++ gcc/tree-cfg.c  (working copy)
@@ -8714,10 +8714,6 @@ gimple_purge_dead_abnormal_call_edges (b
   edge_iterator ei;
   gimple *stmt = last_stmt (bb);
 
-  if (!cfun->has_nonlocal_label
-  && !cfun->calls_setjmp)
-return false;
-
   if (stmt && stmt_can_make_abnormal_goto (stmt))
 return false;
 
Index: gcc/testsuite/gcc.dg/tree-ssa/pr89710.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr89710.c (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr89710.c (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fixup_cfg3" } */
+
+void
+gm (int *);
+
+__attribute__ ((returns_twice)) void
+jg (void)
+{
+}
+
+void
+eb (void)
+{
+  int r6 = 0;
+
+  if (r6 != 0)
+gm ();
+}
+
+void
+gm (int *r6)
+{
+  jg ();
+
+  for (;;)
+{
+  eb ();
+  *r6 = 0;
+}
+}
+
+/* After IPA inlining all abnormal edges and the dispatcher should vanish.  */
+/* { dg-final { scan-tree-dump-not "ABNORMAL_DISPATCHER" "fixup_cfg3" } } */


Re: [PATCH] Fix mips subreg handling (PR target/89378)

2019-03-14 Thread Jeff Law
On 3/14/19 7:52 AM, Jakub Jelinek wrote:
> Hi!
> 
> On the gcc.dg/vect/pr88598-3.c testcase, gen_vec_extract... is called with
> operands[1] that is already a subreg - (subreg:V4SF (reg:V4SI 201 [ _31 ]) 0)
> and wraps it in another SUBREG, which is invalid in RTL.
> 
> In all the following 4 spots I've verified that the operand's mode is some
> 128-bit vector mode and the mode we want to use is another 128-bit vector
> mode, so we are just VCEing the operand to another same sized mode, so we can
> call gen_lowpart to handle everything for us (there are no worries about big
> vs. little endian etc.).
> 
> Paul Hua has kindly bootstrapped/regtested this on mips (mentioned in the
> PR), ok for trunk?
> 
> 2019-03-14  Jakub Jelinek  
> 
>   PR target/89378
>   * config/mips/mips.c (mips_expand_vec_cond_expr): Use gen_lowpart
>   instead of gen_rtx_SUBREG.
>   * config/mips/mips-msa.md (vec_extract): Likewise.
Also note my mips64/mips64el-linux toolchains built and tested without
regressions.  My bootstrap had to restart due to network issues and is
still ongoing.

jeff


[PATCH 3/3] Fix GNU coding style in lto-common.c.

2019-03-14 Thread marxin

gcc/lto/ChangeLog:

2019-03-14  Martin Liska  

* lto-common.c: Update coding style.
* lto.c (materialize_cgraph): Likewise.
---
 gcc/lto/lto-common.c | 214 ++-
 gcc/lto/lto.c|  30 +++---
 2 files changed, 128 insertions(+), 116 deletions(-)

diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index 01470c17080..aa64d8b779e 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -76,15 +76,15 @@ hash_name (const void *p)
 static int
 eq_name (const void *p1, const void *p2)
 {
-  const struct lto_section_slot *s1 =
-(const struct lto_section_slot *) p1;
-  const struct lto_section_slot *s2 =
-(const struct lto_section_slot *) p2;
+  const struct lto_section_slot *s1
+= (const struct lto_section_slot *) p1;
+  const struct lto_section_slot *s2
+= (const struct lto_section_slot *) p2;
 
   return strcmp (s1->name, s2->name) == 0;
 }
 
-/* Free lto_section_slot */
+/* Free lto_section_slot.  */
 
 static void
 free_with_string (void *arg)
@@ -95,9 +95,9 @@ free_with_string (void *arg)
   free (arg);
 }
 
-/* Create section hash table */
+/* Create section hash table.  */
 
-htab_t 
+htab_t
 lto_obj_create_section_hash_table (void)
 {
   return htab_create (37, hash_name, eq_name, free_with_string);
@@ -145,9 +145,9 @@ lto_splay_tree_id_equal_p (splay_tree_key key, unsigned HOST_WIDE_INT id)
   return *(unsigned HOST_WIDE_INT *) key == id;
 }
 
-/* Insert a splay tree node into tree T with ID as key and FILE_DATA as value. 
+/* Insert a splay tree node into tree T with ID as key and FILE_DATA as value.
The ID is allocated separately because we need HOST_WIDE_INTs which may
-   be wider than a splay_tree_key. */
+   be wider than a splay_tree_key.  */
 
 static void
 lto_splay_tree_insert (splay_tree t, unsigned HOST_WIDE_INT id,
@@ -164,13 +164,13 @@ static splay_tree
 lto_splay_tree_new (void)
 {
   return splay_tree_new (lto_splay_tree_compare_ids,
-	 	 lto_splay_tree_delete_id,
+			 lto_splay_tree_delete_id,
 			 NULL);
 }
 
 /* Decode the content of memory pointed to by DATA in the in decl
-   state object STATE. DATA_IN points to a data_in structure for
-   decoding. Return the address after the decoded object in the
+   state object STATE.  DATA_IN points to a data_in structure for
+   decoding.  Return the address after the decoded object in the
input.  */
 
 static const uint32_t *
@@ -255,7 +255,7 @@ hash_canonical_type (tree type)
 {
   hstate.add_int (TYPE_PRECISION (type));
   if (!type_with_interoperable_signedness (type))
-hstate.add_int (TYPE_UNSIGNED (type));
+	hstate.add_int (TYPE_UNSIGNED (type));
 }
 
   if (VECTOR_TYPE_P (type))
@@ -437,8 +437,8 @@ gimple_register_canonical_type (tree t)
 TYPE_CANONICAL (t) = TYPE_CANONICAL (TYPE_MAIN_VARIANT (t));
   else
 {
-  gimple_register_canonical_type_1 (TYPE_MAIN_VARIANT (t),
-	hash_canonical_type (TYPE_MAIN_VARIANT (t)));
+  hashval_t h = hash_canonical_type (TYPE_MAIN_VARIANT (t));
+  gimple_register_canonical_type_1 (TYPE_MAIN_VARIANT (t), h);
   TYPE_CANONICAL (t) = TYPE_CANONICAL (TYPE_MAIN_VARIANT (t));
 }
 }
@@ -460,7 +460,7 @@ lto_register_canonical_types (tree node, bool first_p)
   || TREE_CODE (node) == ARRAY_TYPE)
 lto_register_canonical_types (TREE_TYPE (node), first_p);
 
- if (!first_p) 
+ if (!first_p)
 gimple_register_canonical_type (node);
 }
 
@@ -534,7 +534,7 @@ mentions_vars_p_decl_with_vis (tree t)
   if (mentions_vars_p_decl_common (t))
 return true;
 
-  /* Accessor macro has side-effects, use field-name here. */
+  /* Accessor macro has side-effects, use field-name here.  */
   CHECK_NO_VAR (DECL_ASSEMBLER_NAME_RAW (t));
   return false;
 }
@@ -594,7 +594,7 @@ mentions_vars_p_type (tree t)
   CHECK_VAR (TYPE_MIN_VALUE_RAW (t));
   CHECK_VAR (TYPE_MAX_VALUE_RAW (t));
 
-  /* Accessor is for derived node types only. */
+  /* Accessor is for derived node types only.  */
   CHECK_NO_VAR (TYPE_LANG_SLOT_1 (t));
 
   CHECK_VAR (TYPE_CONTEXT (t));
@@ -748,7 +748,7 @@ mentions_vars_p (tree t)
 }
 
 
-/* Return the resolution for the decl with index INDEX from DATA_IN. */
+/* Return the resolution for the decl with index INDEX from DATA_IN.  */
 
 static enum ld_plugin_symbol_resolution
 get_resolution (struct data_in *data_in, unsigned index)
@@ -758,7 +758,7 @@ get_resolution (struct data_in *data_in, unsigned index)
   ld_plugin_symbol_resolution_t ret;
   /* We can have references to not emitted functions in
 	 DECL_FUNCTION_PERSONALITY at least.  So we can and have
-	 to indeed return LDPR_UNKNOWN in some cases.   */
+	 to indeed return LDPR_UNKNOWN in some cases.  */
   if (data_in->globals_resolution.length () <= index)
 	return LDPR_UNKNOWN;
   ret = data_in->globals_resolution[index];
@@ -1090,7 +1090,7 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
   if (code == VAR_DECL)
 	{
 	  compare_values (DECL_HARD_REGISTER);

[PATCH 2/3] Add lto-dump tool.

2019-03-14 Thread marxin

gcc/ChangeLog:

2019-03-14  Hrishikesh Kulkarni  
Martin Liska  

* Makefile.in: Add lto-dump.texi.
* cgraph.h: Add new functions dump_visibility and
dump_type_name.
* doc/gcc.texi: Include lto-dump section.
* doc/lto-dump.texi: New file.
* dumpfile.c (dump_switch_p_1): Use parse_dump_option.
(parse_dump_option): Factor out this function.
* dumpfile.h (enum dump_flag): Add new value TDF_ERROR.
(parse_dump_option): Export the function.
* symtab.c (symtab_node::dump_visibility): New function.
(symtab_node::dump_type_name): Likewise.

gcc/lto/ChangeLog:

2019-03-14  Hrishikesh Kulkarni  
Martin Liska  

* Make-lang.in: Add lto_dump-related definition.
* config-lang.in: Likewise.
* lang.opt: Add new language LTODump and options related
to LTO dump tool.
* lto-common.c (lto_read_decls): Support type statistics dump.
(lto_file_read): Likewise for object files.
* lto-dump.c: New file.
* lto-lang.c (lto_option_lang_mask): Move from ..
* lto.c (lto_option_lang_mask): .. here.
* lto.h (lto_option_lang_mask): New declaration.
---
 gcc/Makefile.in|   2 +-
 gcc/cgraph.h   |   6 +
 gcc/doc/gcc.texi   |   5 +
 gcc/doc/lto-dump.texi  | 131 
 gcc/dumpfile.c |  85 ++
 gcc/dumpfile.h |   5 +
 gcc/lto/Make-lang.in   |  20 ++-
 gcc/lto/config-lang.in |   4 +-
 gcc/lto/lang.opt   |  62 
 gcc/lto/lto-common.c   |  40 +
 gcc/lto/lto-dump.c | 344 +
 gcc/lto/lto-lang.c |   6 -
 gcc/lto/lto.c  |   6 +
 gcc/lto/lto.h  |   2 +
 gcc/symtab.c   |  17 ++
 15 files changed, 692 insertions(+), 43 deletions(-)
 create mode 100644 gcc/doc/lto-dump.texi
 create mode 100644 gcc/lto/lto-dump.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 508c674cbdc..c7e4ae94103 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3158,7 +3158,7 @@ TEXI_GCC_FILES = gcc.texi gcc-common.texi gcc-vers.texi frontends.texi	\
 	 gcov.texi trouble.texi bugreport.texi service.texi		\
 	 contribute.texi compat.texi funding.texi gnu.texi gpl_v3.texi	\
 	 fdl.texi contrib.texi cppenv.texi cppopts.texi avr-mmcu.texi	\
-	 implement-c.texi implement-cxx.texi gcov-tool.texi gcov-dump.texi
+	 implement-c.texi implement-cxx.texi gcov-tool.texi gcov-dump.texi lto-dump.texi
 
 # we explicitly use $(srcdir)/doc/tm.texi here to avoid confusion with
 # the generated tm.texi; the latter might have a more recent timestamp,
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 9a19d83fffb..e2c2c00b5ed 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -119,6 +119,12 @@ public:
   /* Return dump name with assembler name.  */
   const char *dump_asm_name () const;
 
+  /* Return visibility name.  */
+  const char *dump_visibility () const;
+
+  /* Return type_name name.  */
+  const char *dump_type_name () const;
+
   /* Add node into symbol table.  This function is not used directly, but via
  cgraph/varpool node creation routines.  */
   void register_symbol (void);
diff --git a/gcc/doc/gcc.texi b/gcc/doc/gcc.texi
index 5297b9cb5d0..4d03e3a6d96 100644
--- a/gcc/doc/gcc.texi
+++ b/gcc/doc/gcc.texi
@@ -68,6 +68,8 @@ Texts being (a) (see below), and with the Back-Cover Texts being (b)
 * gcov: (gcc) Gcov.@command{gcov}---a test coverage program.
 * gcov-tool: (gcc) Gcov-tool.  @command{gcov-tool}---an offline gcda profile processing program.
 * gcov-dump: (gcc) Gcov-dump.  @command{gcov-dump}---an offline gcda and gcno profile dump tool.
+* lto-dump: (gcc) lto-dump.@command{lto-dump}---Tool for
+dumping LTO object files.
 @end direntry
 This file documents the use of the GNU compilers.
 @sp 1
@@ -142,6 +144,8 @@ Introduction, gccint, GNU Compiler Collection (GCC) Internals}.
 * Gcov::@command{gcov}---a test coverage program.
 * Gcov-tool::   @command{gcov-tool}---an offline gcda profile processing program.
 * Gcov-dump::   @command{gcov-dump}---an offline gcda and gcno profile dump tool.
+* lto-dump::@command{lto-dump}---Tool for dumping LTO
+object files.
 * Trouble:: If you have trouble using GCC.
 * Bugs::How, why and where to report bugs.
 * Service:: How To Get Help with GCC
@@ -170,6 +174,7 @@ Introduction, gccint, GNU Compiler Collection (GCC) Internals}.
 @include gcov.texi
 @include gcov-tool.texi
 @include gcov-dump.texi
+@include lto-dump.texi
 @include trouble.texi
 @include bugreport.texi
 @include service.texi
diff --git a/gcc/doc/lto-dump.texi b/gcc/doc/lto-dump.texi
new file mode 100644
index 000..d84397581c0
--- /dev/null
+++ b/gcc/doc/lto-dump.texi
@@ -0,0 +1,131 @@
+@c Copyright (C) 2018-2019 Free Software Foundation, Inc.
+@c This is part of the GCC manual.
+@c For copying conditions, see the file gcc.texi.
+
+@ignore
+@c man begin COPYRIGHT

[PATCH 0/3][stage1] [PATCH v2] LTO dump tool

2019-03-14 Thread marxin
Hi.

The patch series is a remake of Hrishikesh's LTO dump tool that was
part of GSoC last year. I decided to split the patch into 3 parts, where
the first one is purely mechanical splitting code into a newly created
one. The second part introduces the new tool. And the last one is about
GNU coding style fixes.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
Ready to be installed after stage1 opens?

Thanks,
Martin

marxin (3):
  Split part of functionality from lto.c to lto-common.c.
  Add lto-dump tool.
  Fix GNU coding style in lto-common.c.

 gcc/Makefile.in|2 +-
 gcc/cgraph.h   |6 +
 gcc/doc/gcc.texi   |5 +
 gcc/doc/lto-dump.texi  |  131 ++
 gcc/dumpfile.c |   85 +-
 gcc/dumpfile.h |5 +
 gcc/lto/Make-lang.in   |   22 +-
 gcc/lto/config-lang.in |4 +-
 gcc/lto/lang.opt   |   62 +
 gcc/lto/lto-common.c   | 2883 
 gcc/lto/lto-common.h   |   33 +
 gcc/lto/lto-dump.c |  344 +
 gcc/lto/lto-lang.c |7 +-
 gcc/lto/lto.c  | 2869 +--
 gcc/lto/lto.h  |2 +
 gcc/symtab.c   |   17 +
 16 files changed, 3612 insertions(+), 2865 deletions(-)
 create mode 100644 gcc/doc/lto-dump.texi
 create mode 100644 gcc/lto/lto-common.c
 create mode 100644 gcc/lto/lto-common.h
 create mode 100644 gcc/lto/lto-dump.c

-- 
2.21.0



[PATCH] Fix mips subreg handling (PR target/89378)

2019-03-14 Thread Jakub Jelinek
Hi!

On the gcc.dg/vect/pr88598-3.c testcase, gen_vec_extract... is called with
operands[1] that is already a subreg - (subreg:V4SF (reg:V4SI 201 [ _31 ]) 0)
and wraps it in another SUBREG, which is invalid in RTL.

In all the following 4 spots I've verified that the operand's mode is some
128-bit vector mode and the mode we want to use is another 128-bit vector
mode, so we are just VCEing the operand to another same sized mode, so we can
call gen_lowpart to handle everything for us (there are no worries about big
vs. little endian etc.).

Paul Hua has kindly bootstrapped/regtested this on mips (mentioned in the
PR), ok for trunk?

2019-03-14  Jakub Jelinek  

PR target/89378
* config/mips/mips.c (mips_expand_vec_cond_expr): Use gen_lowpart
instead of gen_rtx_SUBREG.
* config/mips/mips-msa.md (vec_extract): Likewise.

--- gcc/config/mips/mips.c.jj   2019-03-11 22:57:00.113599336 +0100
+++ gcc/config/mips/mips.c  2019-03-13 15:46:53.597014213 +0100
@@ -22265,7 +22265,7 @@ mips_expand_vec_cond_expr (machine_mode
  if (mode != vimode)
{
  xop1 = gen_reg_rtx (vimode);
- emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0));
+ emit_move_insn (xop1, gen_lowpart (vimode, operands[1]));
}
  emit_move_insn (src1, xop1);
}
@@ -22282,7 +22282,7 @@ mips_expand_vec_cond_expr (machine_mode
  if (mode != vimode)
{
  xop2 = gen_reg_rtx (vimode);
- emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0));
+ emit_move_insn (xop2, gen_lowpart (vimode, operands[2]));
}
  emit_move_insn (src2, xop2);
}
--- gcc/config/mips/mips-msa.md.jj  2019-01-01 12:37:22.657884713 +0100
+++ gcc/config/mips/mips-msa.md 2019-03-13 15:45:01.442789417 +0100
@@ -346,12 +346,12 @@ (define_expand "vec_extractmode));
   gcc_assert (INTVAL (n) < GET_MODE_NUNITS (V16QImode));
   emit_insn (gen_msa_sldi_b (wd, ws, ws, n));
   temp = gen_reg_rtx (mode);
-  emit_move_insn (temp, gen_rtx_SUBREG (mode, wd, 0));
+  emit_move_insn (temp, gen_lowpart (mode, wd));
 }
   emit_insn (gen_msa_vec_extract_ (operands[0], temp));
   DONE;

Jakub


Re: [PR 89546] Add forgotten requeing in propagate_subaccesses_across_link

2019-03-14 Thread Richard Biener
On Thu, 14 Mar 2019, Martin Jambor wrote:

> Hi,
> 
> PR 89546 is a nasty miscompilation bug that is in the compiler since
> r247604 (May 2017) but is surprisingly hard to trigger.  Since that
> revision, SRA does not consider all aggregates that are assigned a value
> in a gimple assignment statement from another aggregate as automatically
> containing data.  Instead, it records existence of such assignments and
> then propagates this information from RHSs to LHSs in a simple queue
> driven algorithm.  That of course relies on the fact that whenever a
> part of a LHSs is marked as containing data, all relevant parts of that
> aggregate which are on the RHS of some other assignment are re-queued.
> Well, it turns out I forgot to add that to one spot where that has to be
> done.
> 
> Fixed with the patch below.  Bootstrapped and tested on x86_64-linux.
> OK for trunk and gcc 8?

OK.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 
> 2019-03-14  Martin Jambor  
> 
>   PR tree-optimization/89546
>   * tree-sra.c (propagate_subaccesses_across_link): Requeue new_acc if
>   any propagation to its children took place.
> 
>   testsuite/
>   * gcc.dg/tree-ssa/pr89546.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr89546.c | 100 
>  gcc/tree-sra.c  |   8 +-
>  2 files changed, 106 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr89546.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89546.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr89546.c
> new file mode 100644
> index 000..c4645ae1858
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89546.c
> @@ -0,0 +1,100 @@
> +/* { dg-do run } */
> +/* { dg-options "-O" } */
> +
> +struct I
> +{
> +  int i;
> +};
> +
> +struct A
> +{
> +  struct I i1;
> +  struct I i2;
> +  struct I i3;
> +};
> +
> +struct B
> +{
> +  struct I i0;
> +  struct A a;
> +};
> +
> +struct C
> +{
> +  struct I i00;
> +  struct B b;
> +};
> +
> +volatile int v;
> +
> +void __attribute__((noipa))
> +consume_i (struct I i)
> +{
> +  v = i.i;
> +}
> +
> +void __attribute__((noipa))
> +consume_a (struct A a)
> +{
> +  v = a.i1.i;
> +}
> +
> +void __attribute__((noipa))
> +consume_b (struct B b)
> +{
> +  v = b.a.i1.i;
> +}
> +
> +void __attribute__((noipa))
> +consume_c (struct C c)
> +{
> +  v = c.b.a.i1.i;
> +}
> +
> +
> +
> +
> +int __attribute__((noipa))
> +foo (struct I input)
> +{
> +  struct I i1, i2, i3;
> +  struct A a1, a2, a3;
> +  struct B b1;
> +  struct C c;
> +
> +  i1 = input;
> +  a1.i1 = i1;
> +  b1.a = a1;
> +
> +  i2.i = 1;
> +  b1.i0 = i2;
> +
> +  c.b = b1;
> +
> +  a2 = c.b.a;
> +  a3 = a2;
> +  i3 = a3.i1;
> +
> +  int t = c.b.i0.i;
> +  a2.i3.i = 4;
> +  consume_i (i1);
> +  consume_i (i2);
> +  consume_b (b1);
> +  consume_a (a1);
> +  consume_a (a2);
> +  consume_a (a3);
> +  consume_c (c);
> +
> +  return i3.i + t;
> +}
> +
> +int
> +main (int argc, char *argv[])
> +{
> +  struct I s;
> +  s.i = 1234;
> +  int i = foo (s);
> +  if (i != 1235)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index ca3858d5fc7..fd51a3d0323 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2752,8 +2752,12 @@ propagate_subaccesses_across_link (struct access 
> *lacc, struct access *racc)
>  
> rchild->grp_hint = 1;
> new_acc->grp_hint |= new_acc->grp_read;
> -   if (rchild->first_child)
> - ret |= propagate_subaccesses_across_link (new_acc, rchild);
> +   if (rchild->first_child
> +   && propagate_subaccesses_across_link (new_acc, rchild))
> + {
> +   ret = 1;
> +   add_access_to_work_queue (new_acc);
> + }
>   }
> else
>   {
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PR 89546] Add forgotten requeing in propagate_subaccesses_across_link

2019-03-14 Thread Martin Jambor
Hi,

PR 89546 is a nasty miscompilation bug that is in the compiler since
r247604 (May 2017) but is surprisingly hard to trigger.  Since that
revision, SRA does not consider all aggregates that are assigned a value
in a gimple assignment statement from another aggregate as automatically
containing data.  Instead, it records existence of such assignments and
then propagates this information from RHSs to LHSs in a simple queue
driven algorithm.  That of course relies on the fact that whenever a
part of a LHSs is marked as containing data, all relevant parts of that
aggregate which are on the RHS of some other assignment are re-queued.
Well, it turns out I forgot to add that to one spot where that has to be
done.

Fixed with the patch below.  Bootstrapped and tested on x86_64-linux.
OK for trunk and gcc 8?

Thanks,

Martin



2019-03-14  Martin Jambor  

PR tree-optimization/89546
* tree-sra.c (propagate_subaccesses_across_link): Requeue new_acc if
any propagation to its children took place.

testsuite/
* gcc.dg/tree-ssa/pr89546.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr89546.c | 100 
 gcc/tree-sra.c  |   8 +-
 2 files changed, 106 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr89546.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr89546.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr89546.c
new file mode 100644
index 000..c4645ae1858
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr89546.c
@@ -0,0 +1,100 @@
+/* { dg-do run } */
+/* { dg-options "-O" } */
+
+struct I
+{
+  int i;
+};
+
+struct A
+{
+  struct I i1;
+  struct I i2;
+  struct I i3;
+};
+
+struct B
+{
+  struct I i0;
+  struct A a;
+};
+
+struct C
+{
+  struct I i00;
+  struct B b;
+};
+
+volatile int v;
+
+void __attribute__((noipa))
+consume_i (struct I i)
+{
+  v = i.i;
+}
+
+void __attribute__((noipa))
+consume_a (struct A a)
+{
+  v = a.i1.i;
+}
+
+void __attribute__((noipa))
+consume_b (struct B b)
+{
+  v = b.a.i1.i;
+}
+
+void __attribute__((noipa))
+consume_c (struct C c)
+{
+  v = c.b.a.i1.i;
+}
+
+
+
+
+int __attribute__((noipa))
+foo (struct I input)
+{
+  struct I i1, i2, i3;
+  struct A a1, a2, a3;
+  struct B b1;
+  struct C c;
+
+  i1 = input;
+  a1.i1 = i1;
+  b1.a = a1;
+
+  i2.i = 1;
+  b1.i0 = i2;
+
+  c.b = b1;
+
+  a2 = c.b.a;
+  a3 = a2;
+  i3 = a3.i1;
+
+  int t = c.b.i0.i;
+  a2.i3.i = 4;
+  consume_i (i1);
+  consume_i (i2);
+  consume_b (b1);
+  consume_a (a1);
+  consume_a (a2);
+  consume_a (a3);
+  consume_c (c);
+
+  return i3.i + t;
+}
+
+int
+main (int argc, char *argv[])
+{
+  struct I s;
+  s.i = 1234;
+  int i = foo (s);
+  if (i != 1235)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index ca3858d5fc7..fd51a3d0323 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2752,8 +2752,12 @@ propagate_subaccesses_across_link (struct access *lacc, 
struct access *racc)
 
  rchild->grp_hint = 1;
  new_acc->grp_hint |= new_acc->grp_read;
- if (rchild->first_child)
-   ret |= propagate_subaccesses_across_link (new_acc, rchild);
+ if (rchild->first_child
+ && propagate_subaccesses_across_link (new_acc, rchild))
+   {
+ ret = 1;
+ add_access_to_work_queue (new_acc);
+   }
}
  else
{
-- 
2.21.0



Re: [PATCH][i386] Fix DECL_RESULT building for resolver

2019-03-14 Thread Segher Boessenkool
On Thu, Mar 14, 2019 at 12:29:37PM +0100, Richard Biener wrote:
> The following properly sets DECL_CONTEXT of the result decl we build,
> otherwise LTO happily merges those across different resolver decls
> ending up in IPA PTA ICEing on this invalidity.
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu, will apply
> as obvious.
> 
> Wonder if other targets copied from this, ah, powerpc did,
> will commit a powerpc fix alongside (w/o further testing).

Let's hope it works then ;-)

Thanks for the fix!


Segher


Re: V3 [PATCH] i386: Handle REG_EH_REGION note

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 09:08:17PM +0800, H.J. Lu wrote:
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2819,6 +2819,8 @@ remove_partial_avx_dependency (void)
>rtx set;
>rtx v4sf_const0 = NULL_RTX;
>  
> +  auto_vec splitted_insn;

Perhaps throwing_insns or flow_transfer_insns or control_flow_insns
instead?

> +   unsigned int i;
> +   FOR_EACH_VEC_ELT (splitted_insn, i, insn)

I actually meant something like:
if (control_flow_insn_p (insn))
  {
/* Split the block after insn.  There will be a
   fallthru edge, which is OK so we keep it.  We
   have to create the exception edges ourselves.  */
split_block (bb, insn);
rtl_make_eh_edge (NULL, bb, BB_END (bb));
  }

only inside of the FOR_EACH_VEC_ELT.  If there is more than one
insn that needs splitting in the bb, it will be in the vector as well
and handled in some later iteration.

Jakub


V3 [PATCH] i386: Handle REG_EH_REGION note

2019-03-14 Thread H.J. Lu
On 3/14/19, H.J. Lu  wrote:
> On Thu, Mar 14, 2019 at 4:34 PM Jakub Jelinek  wrote:
>>
>> On Thu, Mar 14, 2019 at 11:30:21AM +0800, H.J. Lu wrote:
>> > We need to split the basic block if we create new insns, which may
>> > throw exceptions, in the middle of the basic blocks.
>> >
>> > Tested on AVX2 and AVX512 machines with and without
>> >
>> > --with-arch=native
>> >
>> > OK for trunk?
>>
>> That looks much better, I see you chose to follow what lower-subreg.c
>> does
>> here.  I just wonder if instead of the sbitmap of blocks to check for
>> splitting it
>> wouldn't be more efficient to have an auto_vec holding the
>> exact
>> set_insns that need splitting after them and just grab the bb using
>> BLOCK_FOR_INSN.
>>
>
> Good idea.  I will work on it.
>

Here it is.  OK for trunk?

Thanks.

-- 
H.J.
From 64ac61ce7095c49406ea34b05f1c22fbb98ca3f7 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 12 Mar 2019 08:55:24 +0800
Subject: [PATCH] i386: Handle REG_EH_REGION note

When we split:

(insn 18 17 76 2 (set (reg:SF 88 [ _19 ])
(float:SF (mem/c:SI (symbol_ref:DI ("d") [flags 0x2]  ) [1 d+0 S4 A32]))) "x.ii":4:20 170 {*floatsisf2}
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil)))

to

(insn 94 17 18 2 (set (reg:V4SF 115)
(vec_merge:V4SF (vec_duplicate:V4SF (float:SF (mem/c:SI (symbol_ref:DI ("d") [flags 0x2]  ) [1 d+0 S4 A32])))
(reg:V4SF 114)
(const_int 1 [0x1]))) "x.ii":4:20 -1
 (nil))
(insn 18 94 76 2 (set (reg:SF 88 [ _19 ])
(subreg:SF (reg:V4SF 115) 0)) "x.ii":4:20 112 {*movsf_internal}
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil)))

we must copy the REG_EH_REGION note to the first insn and split the block
after the newly added insn.  The REG_EH_REGION on the second insn will be
removed later since it no longer traps.

gcc/

	PR target/89650
	* config/i386/i386.c (remove_partial_avx_dependency): Handle
	REG_EH_REGION note.

gcc/testsuite/

	PR target/89650
	* g++.target/i386/pr89650.C: New test.
---
 gcc/config/i386/i386.c  | 42 +
 gcc/testsuite/g++.target/i386/pr89650.C | 19 +++
 2 files changed, 61 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/pr89650.C

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1f94a45909d..93ff46f329a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2819,6 +2819,8 @@ remove_partial_avx_dependency (void)
   rtx set;
   rtx v4sf_const0 = NULL_RTX;
 
+  auto_vec splitted_insn;
+
   FOR_EACH_BB_FN (bb, cfun)
 {
   FOR_BB_INSNS (bb, insn)
@@ -2875,6 +2877,17 @@ remove_partial_avx_dependency (void)
 	  set_insn = emit_insn_before (set, insn);
 	  df_insn_rescan (set_insn);
 
+	  if (cfun->can_throw_non_call_exceptions)
+	{
+	  /* Handle REG_EH_REGION note.  */
+	  rtx note = find_reg_note (insn, REG_EH_REGION, NULL_RTX);
+	  if (note)
+		{
+		  splitted_insn.safe_push (set_insn);
+		  add_reg_note (set_insn, REG_EH_REGION, XEXP (note, 0));
+		}
+	}
+
 	  src = gen_rtx_SUBREG (dest_mode, vec, 0);
 	  set = gen_rtx_SET (dest, src);
 
@@ -2925,6 +2938,35 @@ remove_partial_avx_dependency (void)
   df_insn_rescan (set_insn);
   df_process_deferred_rescans ();
   loop_optimizer_finalize ();
+
+  if (!splitted_insn.is_empty ())
+	{
+	  free_dominance_info (CDI_DOMINATORS);
+
+	  unsigned int i;
+	  FOR_EACH_VEC_ELT (splitted_insn, i, insn)
+	{
+	  rtx_insn *end;
+	  edge fallthru;
+
+	  bb = BLOCK_FOR_INSN (insn);
+	  end = BB_END (bb);
+
+	  while (insn != end)
+		if (control_flow_insn_p (insn))
+		  {
+		/* Split the block after insn.  There will be a
+		   fallthru edge, which is OK so we keep it.  We
+		   have to create the exception edges ourselves.  */
+		fallthru = split_block (bb, insn);
+		rtl_make_eh_edge (NULL, bb, BB_END (bb));
+		bb = fallthru->dest;
+		insn = BB_HEAD (bb);
+		  }
+		else
+		  insn = NEXT_INSN (insn);
+	}
+	}
 }
 
   bitmap_obstack_release (NULL);
diff --git a/gcc/testsuite/g++.target/i386/pr89650.C b/gcc/testsuite/g++.target/i386/pr89650.C
new file mode 100644
index 000..4b253cbf467
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr89650.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2 -flive-range-shrinkage -fno-tree-dce -fno-dce -fnon-call-exceptions -mavx" }
+
+int d, e;
+struct g {
+  float f;
+  g(float h) : f(h + d) {}
+  ~g() {}
+};
+struct i {
+  int a;
+  int b : 4;
+  int 
+  i(int h) : a(), b(), c(h) {}
+};
+int main() {
+  i j(e);
+  g k[]{1, 2};
+}
-- 
2.20.1



Re: C++ PATCH for c++/89612 - ICE with member friend template with noexcept

2019-03-14 Thread Marek Polacek
Ping.

On Thu, Mar 07, 2019 at 04:52:55PM -0500, Marek Polacek wrote:
> This was one of those PRs where the more you poke, the more ICEs turn up.
> This patch fixes the ones I could find.  The original problem was that
> maybe_instantiate_noexcept got a TEMPLATE_DECL created for the member
> friend template in do_friend.  Its noexcept-specification was deferred,
> so we went to the block with push_access_scope, but that crashes on a
> TEMPLATE_DECL.  One approach could be to somehow not defer noexcept-specs
> for friend templates, I guess, but I didn't want to do that.
> 
> So the approach I did take in the end was to handle TEMPLATE_DECLs in
> maybe_instantiate_noexcept.
> 
> That broke in register_parameter_specializations but we don't need this
> code anyway, so let's do away with it -- the current_class_{ref,ptr}
> code is enough to fix the PR that register_parameter_specializations was
> introduced for.
> 
> Another issue was that since here we are instantiating a deferred noexcept,
> in a instantiate_class_template context, processing_template_decl was 0.
> Let's pretend it's on for the tsubst purposes here, and for 
> build_noexcept_spec
> too, so that the *_dependent_expression_p functions work.
> 
> Lastly, I found an invalid testcase that was breaking because a template code
> leaked to constexpr functions.  This I fixed similarly to the recent explicit
> PR fix (r269131).
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2019-03-07  Marek Polacek  
> 
>   PR c++/89612 - ICE with member friend template with noexcept.
>   * except.c (build_noexcept_spec): Call instantiate_non_dependent_expr
>   before perform_implicit_conversion_flags.  Add processing_template_decl
>   sentinel.
>   * pt.c (maybe_instantiate_noexcept): For function templates, use their
>   template result (function decl).  Don't set up local specializations.
>   Temporarily turn on processing_template_decl.  Update the template type
>   too.
> 
>   * g++.dg/cpp0x/noexcept36.C: New test.
>   * g++.dg/cpp1y/noexcept1.C: New test.
>   * g++.dg/cpp1z/noexcept-type21.C: New test.
> 
> diff --git gcc/cp/except.c gcc/cp/except.c
> index 139e871d7a7..d97b8d40542 100644
> --- gcc/cp/except.c
> +++ gcc/cp/except.c
> @@ -1285,10 +1285,13 @@ build_noexcept_spec (tree expr, tsubst_flags_t 
> complain)
>if (TREE_CODE (expr) != DEFERRED_NOEXCEPT
>&& !value_dependent_expression_p (expr))
>  {
> +  expr = instantiate_non_dependent_expr_sfinae (expr, complain);
> +  /* Don't let perform_implicit_conversion_flags create more template
> +  codes.  */
> +  processing_template_decl_sentinel s;
>expr = perform_implicit_conversion_flags (boolean_type_node, expr,
>   complain,
>   LOOKUP_NORMAL);
> -  expr = instantiate_non_dependent_expr (expr);
>expr = cxx_constant_value (expr);
>  }
>if (TREE_CODE (expr) == INTEGER_CST)
> diff --git gcc/cp/pt.c gcc/cp/pt.c
> index 906cfe0a58c..44a2c4606f8 100644
> --- gcc/cp/pt.c
> +++ gcc/cp/pt.c
> @@ -24174,6 +24174,17 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
> complain)
>  
>if (DECL_CLONED_FUNCTION_P (fn))
>  fn = DECL_CLONED_FUNCTION (fn);
> +
> +  tree orig_fn = NULL_TREE;
> +  /* For a member friend template we can get a TEMPLATE_DECL.  Let's use
> + its FUNCTION_DECL for the rest of this function -- push_access_scope
> + doesn't accept TEMPLATE_DECLs.  */
> +  if (DECL_FUNCTION_TEMPLATE_P (fn))
> +{
> +  orig_fn = fn;
> +  fn = DECL_TEMPLATE_RESULT (fn);
> +}
> +
>fntype = TREE_TYPE (fn);
>spec = TYPE_RAISES_EXCEPTIONS (fntype);
>  
> @@ -24204,37 +24215,41 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
> complain)
> push_deferring_access_checks (dk_no_deferred);
> input_location = DECL_SOURCE_LOCATION (fn);
>  
> -   /* A new stack interferes with pop_access_scope.  */
> -   {
> - /* Set up the list of local specializations.  */
> - local_specialization_stack lss (lss_copy);
> -
> - tree save_ccp = current_class_ptr;
> - tree save_ccr = current_class_ref;
> - /* If needed, set current_class_ptr for the benefit of
> -tsubst_copy/PARM_DECL.  */
> - tree tdecl = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (fn));
> - if (DECL_NONSTATIC_MEMBER_FUNCTION_P (tdecl))
> -   {
> - tree this_parm = DECL_ARGUMENTS (tdecl);
> - current_class_ptr = NULL_TREE;
> - current_class_ref = cp_build_fold_indirect_ref (this_parm);
> - current_class_ptr = this_parm;
> -   }
> +   tree save_ccp = current_class_ptr;
> +   tree save_ccr = current_class_ref;
> +   /* If needed, set current_class_ptr for the benefit of
> +  tsubst_copy/PARM_DECL.  */
> +   tree tdecl = DECL_TEMPLATE_RESULT 

Re: [wwwdocs] Update snapshots.html to refer to sha512sum

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 11:45:53AM +, Jonathan Wakely wrote:
> The checksum file uses SHA512 not MD5SUM, so the instructions should
> be updated accordingly.
> 
> SInce there's only one tarfile these days (not separate ones for
> gcc-base, gcc-g++, gcc-fortran, etc) it doesn't seem necessary to
> filter out the "OK" lines with grep, but I could add --quiet to the
> command if we want that. I added --ignore-missing because I doubt most
> people also download the README and index.html files, so the option
> avoids gettings warnings about those files.
> 
> OK for CVS?

Ok, thanks.

> Index: htdocs/snapshots.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/snapshots.html,v
> retrieving revision 1.24
> diff -u -r1.24 snapshots.html
> --- htdocs/snapshots.html 30 Sep 2018 14:38:47 -  1.24
> +++ htdocs/snapshots.html 14 Mar 2019 11:42:57 -
> @@ -35,15 +35,15 @@
>  files so that autoconf et al aren't needed. This is documented in
>  comments in contrib/gcc_update.
>  
> -The program md5sum  which is included with the 
> +The program sha512sum  which is included with the 
>  https://www.gnu.org/software/coreutils/;>GNU Coreutils
>   can be used to verify the integrity of a snapshot or release.
>  The release script generates the file 
> -MD5SUMS that provides a 128-bit checksum for every
> -file in the tarball. Use the following command to verify the
> -sources:
> +sha512.sum that provides a 512-bit checksum for the tarball
> +and other files in the snapshot directory. Use the following command
> +to verify the sources:
>   
> -md5sum --check MD5SUMS | grep -v OK$
> +sha512sum --check --ignore-missing sha512.sum
>  
>  

Re: [PATCH] Fix create_dispatcher_calls ICE (PR ipa/89684)

2019-03-14 Thread Richard Biener
On Wed, 13 Mar 2019, Jakub Jelinek wrote:

> Hi!
> 
> create_dispatcher_calls iterates over ipa_ref *s referring the current node
> and wants to remove them all and create new ones.  The problem is
> that if there are any ipa_ref objects where two or more of them are from the
> same cgraph node to the current node (in the testcases there are both cases
> where there are two or more foo -> foo self-references, or two or more
> foo.avx -> foo references), when we ref->remove_reference () one of them,
> that method will actually move some other ipa_ref that was last in
> references vector over to that and thus invalidate what some other ipa_ref
> pointers point to (the pointers still point to some elements of some
> references vector, but could either point past the last one, or could point
> to an element that was overwritten with something else).
> 
> The following patch fixes it by remove_reference all references immediately,
> but push into the vector their content first; besides that
> ref->remove_reference we are just accessing a couple of fields from that,
> not calling any other methods.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

LGTM.

Richard.

> 2019-03-13  Jakub Jelinek  
> 
>   PR ipa/89684
>   * multiple_target.c (create_dispatcher_calls): Change
>   references_to_redirect from vector of ipa_ref * to vector of ipa_ref.
>   In the node->iterate_referring loop, push *ref rather than ref, call
>   ref->remove_reference () and always pass 0 to iterate_referring.
> 
> --- gcc/multiple_target.c.jj  2019-03-13 09:23:35.345567298 +0100
> +++ gcc/multiple_target.c 2019-03-13 10:12:49.071371927 +0100
> @@ -103,10 +103,16 @@ create_dispatcher_calls (struct cgraph_n
>  inode->resolve_alias (cgraph_node::get (resolver_decl));
>  
>auto_vec edges_to_redirect;
> -  auto_vec references_to_redirect;
> +  /* We need to capture the references by value rather than just pointers to 
> them
> + and remove them right away, as removing them later would invalidate what
> + some other reference pointers point to.  */
> +  auto_vec references_to_redirect;
>  
> -  for (unsigned i = 0; node->iterate_referring (i, ref); i++)
> -references_to_redirect.safe_push (ref);
> +  while (node->iterate_referring (0, ref))
> +{
> +  references_to_redirect.safe_push (*ref);
> +  ref->remove_reference ();
> +}
>  
>/* We need to remember NEXT_CALLER as it could be modified in the loop.  */
>for (cgraph_edge *e = node->callers; e ; e = e->next_caller)
> @@ -146,13 +152,11 @@ create_dispatcher_calls (struct cgraph_n
>   }
>  
> symtab_node *source = ref->referring;
> -   ref->remove_reference ();
> source->create_reference (inode, IPA_REF_ADDR);
>   }
> else if (ref->use == IPA_REF_ALIAS)
>   {
> symtab_node *source = ref->referring;
> -   ref->remove_reference ();
> source->create_reference (inode, IPA_REF_ALIAS);
> source->add_to_same_comdat_group (inode);
>   }
> --- gcc/testsuite/gcc.target/i386/pr89684.c.jj2019-03-13 
> 10:12:05.677080120 +0100
> +++ gcc/testsuite/gcc.target/i386/pr89684.c   2019-03-13 10:11:56.390231682 
> +0100
> @@ -0,0 +1,23 @@
> +/* PR ipa/89684 */
> +/* { dg-do compile } */
> +/* { dg-require-ifunc "" } */
> +
> +void bar (int, void (*) (void));
> +
> +__attribute__((target_clones ("default", "avx")))
> +void foo (void)
> +{
> +  bar (0, foo);
> +  bar (0, foo);
> +}
> +
> +__attribute__((target_clones ("default", "avx", "avx2")))
> +void baz (void)
> +{
> +  bar (0, foo);
> +  bar (0, foo);
> +  bar (0, foo);
> +  bar (0, foo);
> +  bar (0, foo);
> +  bar (0, foo);
> +}
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: Re : add tsv110 pipeline scheduling

2019-03-14 Thread James Greenhalgh
On Sat, Feb 23, 2019 at 01:28:22PM +, wuyuan (E) wrote:
> Hi ,James:
> Sorry for not responding to your email in time because of Chinese New Year’s 
> holiday and urgent work. The three questions you mentioned last email are due 
> to my misunderstanding of pipeline.
> the first question, These instructions will occupy both the tsv110_ls* and 
> tsv110_fsu* Pipeline at the same time.

Hi Wuyuan,

Please accept my apologies for how long it has taken me to revisit your
patch and review it.

I have two questions:

> +(define_insn_reservation "tsv110_crypto_sha256_fast" 2
> +  (and (eq_attr "tune" "tsv110")
> +   (eq_attr "type" "crypto_sha1_fast"))
> +  "tsv110_fsu1")

I think you intended to check for type crypto_sha256_fast here.

> +;; ALU ops with shift
> +(define_insn_reservation "tsv110_alu_shift" 2
> +  (and (eq_attr "tune" "tsv110")
> +   (eq_attr "type" "extend,\
> + alu_shift_imm,alu_shift_reg,\
> + crc,logic_shift_imm,logic_shift_reg,\
> + mov_shift,mvn_shift,\
> + mov_shift_reg,mvn_shift_reg"))
> +  "tsv110_mdu")
> +  
> +(define_insn_reservation "tsv110_alus_shift" 2
> +  (and (eq_attr "tune" "tsv110")
> +   (eq_attr "type" "alus_shift_imm,alus_shift_reg,\
> + logics_shift_imm,logics_shift_reg"))
> +  "tsv110_alu2")

Is this the correct description? This code says that ALU operations with
shift are executed in MDU, but ALU operations with shift that are also
flag setting are executed in ALU2?

Otherwise, this patch is OK for trunk. Thank you for your patience.

Best Regards,
James

> rewritten as follows:
> (define_insn_reservation
>   "tsv110_neon_ld4_lane" 9
>   (and (eq_attr "tune" "tsv110")
>(eq_attr "type" "neon_load4_all_lanes,neon_load4_all_lanes_q,\
>  neon_load4_one_lane,neon_load4_one_lane_q"))
>   "(tsv110_ls1 + tsv110_fsu1)|(tsv110_ls1 + tsv110_fsu2)|(tsv110_ls2 + 
> tsv110_fsu1)|(tsv110_ls2 + tsv110_fsu2)")
> 
> the second question, These instructions will use tsv110_fsu1 Pipeline or 
> tsv110_fsu2 Pipeline.
> rewritten as follows:
> (define_insn_reservation  "tsv110_neon_abd_aba" 4
>   (and (eq_attr "tune" "tsv110")
>(eq_attr "type" "neon_abd,neon_arith_acc"))
>   "tsv110_fsu1|tsv110_fsu2")
> 
> the third question, These instructions will use tsv110_fsu1 Pipeline or 
> tsv110_fsu2 Pipeline.
> rewritten as follows:
> (define_insn_reservation  "tsv110_neon_abd_aba_q" 4
>   (and (eq_attr "tune" "tsv110")
>(eq_attr "type" "neon_arith_acc_q"))
>   "tsv110_fsu1|tsv110_fsu2")
> 
> In addition to the above changes, I asked hardware engineers and colleagues 
> to review my  patch and modify some of the errors. The detailed patches are 
> as follows:
> 
>   * config/aarch64/aarch64-cores.def (tsv1100): Change scheduling model.
>   * config/aarch64/aarch64.md : Add "tsv110.md"
>   * config/aarch64/tsv110.md: New file.
> 


Re: [PATCH] Fix miscompilation due to REG_EQUAL note on a paradoxical subreg operation (PR rtl-optimization/89679)

2019-03-14 Thread Eric Botcazou
> Below is one patch, more in the spirit of
> https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00418.html
> where we refuse to add REG_EQUAL note in that case.
> 
> Attached is another variant, which keeps it around just in case something
> would make use of it, but makes sure we don't fwprop into such REG_EQUAL
> notes replacing a paradoxical subreg in there with something else.
> 
> Both patches were bootstrapped/regtested on
> {x86_64,i686,powerpc64le,s390x,aarch64}-linux, ok for trunk (which one)?
> 
> 2019-03-13  Jakub Jelinek  
> 
>   PR rtl-optimization/89679
>   * expmed.c (expand_mult_const): Don't add a REG_EQUAL note if it
>   would contain a paradoxical SUBREG.

My vote would be for this one.

-- 
Eric Botcazou


[wwwdocs] Update snapshots.html to refer to sha512sum

2019-03-14 Thread Jonathan Wakely

The checksum file uses SHA512 not MD5SUM, so the instructions should
be updated accordingly.

SInce there's only one tarfile these days (not separate ones for
gcc-base, gcc-g++, gcc-fortran, etc) it doesn't seem necessary to
filter out the "OK" lines with grep, but I could add --quiet to the
command if we want that. I added --ignore-missing because I doubt most
people also download the README and index.html files, so the option
avoids gettings warnings about those files.

OK for CVS?

Index: htdocs/snapshots.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/snapshots.html,v
retrieving revision 1.24
diff -u -r1.24 snapshots.html
--- htdocs/snapshots.html	30 Sep 2018 14:38:47 -	1.24
+++ htdocs/snapshots.html	14 Mar 2019 11:42:57 -
@@ -35,15 +35,15 @@
 files so that autoconf et al aren't needed. This is documented in
 comments in contrib/gcc_update.
 
-The program md5sum  which is included with the 
+The program sha512sum  which is included with the 
 https://www.gnu.org/software/coreutils/;>GNU Coreutils
  can be used to verify the integrity of a snapshot or release.
 The release script generates the file 
-MD5SUMS that provides a 128-bit checksum for every
-file in the tarball. Use the following command to verify the
-sources:
+sha512.sum that provides a 512-bit checksum for the tarball
+and other files in the snapshot directory. Use the following command
+to verify the sources:
  
-md5sum --check MD5SUMS | grep -v OK$
+sha512sum --check --ignore-missing sha512.sum
 
 

[PATCH][i386] Fix DECL_RESULT building for resolver

2019-03-14 Thread Richard Biener


The following properly sets DECL_CONTEXT of the result decl we build,
otherwise LTO happily merges those across different resolver decls
ending up in IPA PTA ICEing on this invalidity.

Bootstrap & regtest running on x86_64-unknown-linux-gnu, will apply
as obvious.

Wonder if other targets copied from this, ah, powerpc did,
will commit a powerpc fix alongside (w/o further testing).

Richard.

2019-03-14  Richard Biener  

PR target/89711
* config/i386/i386.c (make_resolver_func): Properly set
DECL_CONTEXT on the RESULT_DECL.
* config/rs6000/rs6000.c (make_resolver_func): Likewise.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 269678)
+++ gcc/config/i386/i386.c  (working copy)
@@ -32572,6 +32572,7 @@ make_resolver_func (const tree default_d
 }
   /* Build result decl and add to function_decl. */
   t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node);
+  DECL_CONTEXT (t) = decl;
   DECL_ARTIFICIAL (t) = 1;
   DECL_IGNORED_P (t) = 1;
   DECL_RESULT (decl) = t;
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 269678)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -37467,6 +37467,7 @@ make_resolver_func (const tree default_d
 
   /* Build result decl and add to function_decl.  */
   tree t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, 
ptr_type_node);
+  DECL_CONTEXT (t) = decl;
   DECL_ARTIFICIAL (t) = 1;
   DECL_IGNORED_P (t) = 1;
   DECL_RESULT (decl) = t;



Re: [PATCH] Fix PR89710

2019-03-14 Thread Richard Biener
On Thu, 14 Mar 2019, Jakub Jelinek wrote:

> On Thu, Mar 14, 2019 at 11:23:03AM +0100, Richard Biener wrote:
> > I am testing the following.  There's IMHO also a missed optimization
> > (for CFG-cleanup?) that when a block does not end up in a call
> > outgoing abnormal edges can be purged?  In this case it is
> > IPA inlining leaving us with this - the inliner calls
> > gimple_purge_dead_abnormal_call_edges on the return block
> > but both cfun->has_nonlocal_label and cfun->calls_setjmp are
> > false so the function does nothing.  This is probably
> > a premature check there?
> 
> I think it would be better to keep the invariant that there are no abnormal
> non-EH edges if both of the cfun bools are false, so that we don't need to
> do useless work in the 99.9% of cases.

I agree.

> In the inliner case, is it that we haven't yet updated those or something
> similar (if yes, could we update them earlier)?

It's the edge still being present when DCE recomputes the flag.

> In the DCE case, can't we when clearing the flag schedule a cleanup and only
> clear it afterwards?

Well, I guess the cleanup would just work iff we'd keep the flag
check in call_can_make_abnormal_goto but remove the flag check
in gimple_purge_dead_abnormal_call_edges.  Not immediately
after DCE but at least inlining gets rid of the edge (and the
abnormal dispatcher) then.

Does that sound reasonable?

Thus like the following, testing in progress.

Richard.

2019-03-14  Richard Biener  

* tree-cfg.c (gimple_purge_dead_abnormal_call_edges): Remove
premature check.

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 0dc94ea41d4..69837328279 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -8670,10 +8665,6 @@ gimple_purge_dead_abnormal_call_edges (basic_block bb)
   edge_iterator ei;
   gimple *stmt = last_stmt (bb);
 
-  if (!cfun->has_nonlocal_label
-  && !cfun->calls_setjmp)
-return false;
-
   if (stmt && stmt_can_make_abnormal_goto (stmt))
 return false;
 


Re: [PR fortran/60091, patch] - Misleading error messages in rank-2 pointer assignment to rank-1 target

2019-03-14 Thread Dominique d'Humières



> Le 13 mars 2019 à 13:39, Harald Anlauf  a écrit :
> 
> Hi Thomas,
> 
> I am not so convinced that "plain english" messages are the way to go,
> even if they appear better readable at first sight, if conciseness is
> lost.

Well, "Syntax error" is concise, but not really helpful!

>  The main reason is that there three variants of the messages,
> depending on context.  One of them refers to expecting either a
> bounds-specification-list or a bounds-remapping-list.
> 
> Do you prefer sth. like
> 
> "All lower bounds and all or none of the upper bounds must be specified"

This is what I expect for p(:,:)=>a and I won’t complain if "use the later for 
bound remapping" is added
when the target is a rank 1 array.

For mat(2, 6)   => arr I would prefer the above error rather than "Expected 
bounds specification ».

For p(1:,1:)=>a where a is a rank 1 target, I’ll go with

"all the upper bounds must be specified for bound remapping"

> or
> 
> "Either all or none of the upper bounds must be specified at (1)"

This is what I expect for p(1:3,1:)=>a with the same remark as above about 
bound remapping.

> (which we currently print in another context where it is wrong),
> 
> while other compilers print:
> 
> E.g. Crayftn:
> 
>  p(1 ,2:3) => t
>^
> ftn-1768 crayftn: ERROR SUB, File = ptr-remap.f90, Line = 3, Column = 5 
>  Invalid bounds-spec-list or bounds-remapping-list for this pointer 
> assignment.
> 
> E.g. Intel:
> 
> ptr-remap.f90(3): error #8524: The syntax of this data pointer assignment is 
> incorrect: either 'bound spec' or 'bound remapping' is expected in this 
> context.   [1]
>  p(1 ,2:3) => t
> ^
> 
> Pointer remapping belongs IMHO to the 'more advanced’ features

Agreed

> and requires
> some technical insight to get it right, which is why I think the related
> error messages should be more technical and concise.

Here I disagree, the error message should try to tell the user what is wrong
without requiring any access to the standard.

> 
> I'll think for another day or two.
> 
> Thanks,
> Harald

These defines should probably be swapped

+#define BOUNDS_SPEC_LIST "list of %"
+#define BOUNDS_REMAPPING_LIST "list of %"

to match

> R1035 bounds-spec is lower-bound-expr :
> R1036 bounds-remapping is lower-bound-expr : upper-bound-exp

Dominique



Re: [PATCH] Fix PR89710

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 11:23:03AM +0100, Richard Biener wrote:
> I am testing the following.  There's IMHO also a missed optimization
> (for CFG-cleanup?) that when a block does not end up in a call
> outgoing abnormal edges can be purged?  In this case it is
> IPA inlining leaving us with this - the inliner calls
> gimple_purge_dead_abnormal_call_edges on the return block
> but both cfun->has_nonlocal_label and cfun->calls_setjmp are
> false so the function does nothing.  This is probably
> a premature check there?

I think it would be better to keep the invariant that there are no abnormal
non-EH edges if both of the cfun bools are false, so that we don't need to
do useless work in the 99.9% of cases.
In the inliner case, is it that we haven't yet updated those or something
similar (if yes, could we update them earlier)?
In the DCE case, can't we when clearing the flag schedule a cleanup and only
clear it afterwards?

Jakub


Re: [PATCH][RFC] Teach GIMPLE FE to build proper CFG + SSA (+ loops)

2019-03-14 Thread Richard Biener
On Wed, 13 Mar 2019, Richard Biener wrote:

> On Wed, 13 Mar 2019, Bin.Cheng wrote:
> 
> > On Wed, Mar 13, 2019 at 3:58 AM Richard Biener  wrote:
> > >
> > >
> > > This makes an attempt at fixing the most annoying parts of the GIMPLE
> > > FE unit testing - the lack of proper CFG preservation and hoops you
> > > need to jump through to make the CFG and SSA builders happy.
> > >
> > > Due to this the __GIMPLE specifiers takes two new flags, "cfg"
> > > for GIMPLE-with-a-CFG and "ssa" for GIMPLE-with-a-CFG-and-SSA.
> > > When there is a CFG you need to start basic-block boundaries with
> > >
> > > __BB(index):
> > >
> > > where 'index' is the basic-block index.  That implicitely defines
> > > a label __BBindex for use in goto stmts and friends (but doesn't
> > > actually add a GIMPLE_LABEL).
> > >
> > > The parsing code isn't defensive right now so you need to watch
> > > out to not use index 0 or 1 (entry and exit block) which are
> > > only implicitely present.
> > >
> > > As a proof of concept I added one BB annotation - loop_header(num)
> > > where "num" is the loop number.  This means you can now also
> > > have loop structures preserved (to some extent - the loop tree is
> > > not explicitely represented nor are loop fathers, both are re-built
> > > via fixup).
> > >
> > > I've not yet adjusted -gimple dumping.
> > >
> > > I've adjusted all testcases I could find.
> > >
> > > The CFG build inside the frontend is a bit awkward (spread out...),
> > > likewise some functionality is asking for split-out.
> > >
> > > This is mainly a RFC for whether you think this is forward looking
> > > enough.  Future enhancements would include EH and abnormal edges
> > > via stmt annotations:
> > Thanks very much for doing this.  Do we have a centralized
> > wiki/document on the formal definition?  It would be very helpful even
> > it's still evolving, otherwise we would need to dig into messages for
> > fragmented clues.
> 
> Earlier this week I transformed the old 
> https://gcc.gnu.org/wiki/GimpleFrontEnd page into one documenting
> the current state but this doesn't yet include any kind of documentation.
> 
> I suppose special syntax documentation should go into our texinfo
> docs, I guess into a new section in the internals manual.  Currently
> we only have documentation for the -fgimple switch.
> 
> Note that I view -gimple dumping as the thing that should give you
> a clue about expected syntax.  I hope to spend a little more time
> on the patch to make it ready for GCC9.

I've settled on another change, forcing explicit gotos for fallthru
edges to the next block.

Otherwise I've cleaned up the patch and removed no longer necessary
support for IFN_PHI from the middle-end.  During cleanup I introduced
a gimple_parser class to hold the extra parsing state and I added
some extra error handling.

Bootstrapped on x86_64-unknown-linux-gnu, testing is still in progress.

I will commit this to make testcase extraction to GIMPLE more feasible
for GCC 9+ (we could even backport the GIMPLE FE changes to GCC 8 if 
needed I guess).  At least it will give me motivation to do further
required changes when I run into the next big LTO testcase running
into a loop optimizer issue... (non-loop passes should work fine
already).

Richard.

>From 5cbec540caa8e86f2167ac980159dbe6933717c4 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Wed, 13 Mar 2019 13:58:10 +0100
Subject: [PATCH] gimplefe-cfg-ssa

2019-03-12  Richard Biener  

c/
* c-tree.h (enum c_declspec_il): New.
(struct c_declspecs): Merge gimple_p and rtl_p into declspec_il
enum bitfield.
* c-parser.c (c_parser_declaration_or_fndef): Adjust accordingly.
Pass start pass and declspec_il to c_parser_parse_gimple_body.
(c_parser_declspecs): Adjust.
* gimple-parser.c: Include cfg.h, cfghooks.h, cfganal.h, tree-cfg.h,
gimple-iterator.h, cfgloop.h, tree-phinodes.h, tree-into-ssa.h
and bitmap.h.
(struct gimple_parser): New.
(gimple_parser::push_edge): New method.
(c_parser_gimple_parse_bb_spec): New helper.
(c_parser_parse_gimple_body): Get start pass and IL specification.
Initialize SSA and CFG.
(c_parser_gimple_compound_statement): Handle CFG and SSA build.
Build a gimple_parser parsing state and pass it along.
(c_parser_gimple_statement): Change intermittend __PHI internal
function argument for the edge.
(c_parser_gimple_or_rtl_pass_list): Handle ssa, cfg flags.
(c_parser_gimple_goto_stmt): Record edges to build.
(c_parser_gimple_if_stmt): Likewise.
* gimple-parser.h (c_parser_parse_gimple_body): Adjust.
(c_parser_gimple_or_rtl_pass_list): Likewise.
* gimple-pretty-print.c: Include cfgloop.h.
(dump_gimple_phi): Adjust.
(dump_gimple_bb_header): Dump loop header for GIMPLE.
(pp_cfg_jump): Adjust.
(dump_implicit_edges): Dump fallthru to next block 

Re: patch to fix PR85860

2019-03-14 Thread Uros Bizjak
Hello!

> The following patch fixes
>
>  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85860
>
> The patch was successfully bootstrapped and tested on x86-64.
>
> Committed as rev. 269662 to trunk and as rev. 269663 to gcc-8-branch.

Index: testsuite/gcc.target/i386/pr85860.c
===
--- testsuite/gcc.target/i386/pr85860.c (nonexistent)
+++ testsuite/gcc.target/i386/pr85860.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do compile { target lp64 } } */

{ target int128 }

+/* { dg-options "-O2 -fno-guess-branch-probability
-flive-range-shrinkage -mbmi2" } */

Uros.


[PATCH] Fix PR89710

2019-03-14 Thread Richard Biener


I am testing the following.  There's IMHO also a missed optimization
(for CFG-cleanup?) that when a block does not end up in a call
outgoing abnormal edges can be purged?  In this case it is
IPA inlining leaving us with this - the inliner calls
gimple_purge_dead_abnormal_call_edges on the return block
but both cfun->has_nonlocal_label and cfun->calls_setjmp are
false so the function does nothing.  This is probably
a premature check there?

Anyhow, the fix below is safer at this point (also for backporting),
but it's the 2nd spot of this kind I fix so I wonder if we should
fix the above for GCC 9?  The real issue might be that we
are removing the "last" returns_twice call but retain
abnormal edges not originating from it (and we'll never cleanup
the rest because of those early outs).  DCE is the one that
recomputes calls_setjmp but does not remove abnormal edges.
Other passes might also elide such calls but they leave
calls_setjmp as-is.  I'm a bit nervous with an idea that
DCE can purge all abnormal edges if !cfun->calls_setjmp
&& !cfun->has_nonlocal_label.  Anyway, looks a bit like a
can of worms and it's better to not rely on abnormal edges
not being present when the block is empty.

Richard.

2019-03-14  Richard Biener  

PR tree-optimization/89710
* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Use
safe_dyn_cast.

* gcc.dg/torture/pr89710.c: New testcase.

Index: gcc/tree-ssa-loop-ch.c
===
--- gcc/tree-ssa-loop-ch.c  (revision 269678)
+++ gcc/tree-ssa-loop-ch.c  (working copy)
@@ -100,7 +100,7 @@ should_duplicate_loop_header_p (basic_bl
   return false;
 }
 
-  gcond *last = dyn_cast  (last_stmt (header));
+  gcond *last = safe_dyn_cast  (last_stmt (header));
   if (!last)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
Index: gcc/testsuite/gcc.dg/torture/pr89710.c
===
--- gcc/testsuite/gcc.dg/torture/pr89710.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr89710.c  (working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+
+void
+gm (int *);
+
+__attribute__ ((returns_twice)) void
+jg (void)
+{
+}
+
+void
+eb (void)
+{
+  int r6 = 0;
+
+  if (r6 != 0)
+gm ();
+}
+
+void
+gm (int *r6)
+{
+  jg ();
+
+  for (;;)
+{
+  eb ();
+  *r6 = 0;
+}
+}


diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 0dc94ea41d4..69837328279 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -2585,13 +2585,8 @@ debug_cfg_stats (void)
 static bool
 call_can_make_abnormal_goto (gimple *t)
 {
-  /* If the function has no non-local labels, then a call cannot make an
+  /* If the call has no side effects, then it cannot make an
  abnormal transfer of control.  */
-  if (!cfun->has_nonlocal_label
-  && !cfun->calls_setjmp)
-   return false;
-
-  /* Likewise if the call has no side effects.  */
   if (!gimple_has_side_effects (t))
 return false;
 
@@ -8670,10 +8665,6 @@ gimple_purge_dead_abnormal_call_edges (basic_block bb)
   edge_iterator ei;
   gimple *stmt = last_stmt (bb);
 
-  if (!cfun->has_nonlocal_label
-  && !cfun->calls_setjmp)
-return false;
-
   if (stmt && stmt_can_make_abnormal_goto (stmt))
 return false;
 


Re: [PATCH] Make hashtab/hash-table elt removal not fail on removing non-existent entry

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 09:57:27AM +0100, Richard Biener wrote:
> I'd say make it work, thus your patch below is OK.

Ok, here is an updated patch with some self-tests coverage as well that I'll
bootstrap/regtest.

2019-03-14  Jason Merrill  
Jakub Jelinek  

* hash-table.h (remove_elt_with_hash): Return if slot is NULL rather
than if is_empty (*slot).
* hash-set-tests.c (test_set_of_strings): Add tests for addition of
existing elt and for elt removal.
* hash-map-tests.c (test_map_of_strings_to_int): Add test for removal
of already removed elt.

* hashtab.c (htab_remove_elt_with_hash): Return if slot is NULL rather
than if *slot is HTAB_EMPTY_ENTRY.

--- gcc/hash-table.h.jj 2019-01-01 12:37:17.446970209 +0100
+++ gcc/hash-table.h2019-03-14 08:48:18.861131498 +0100
@@ -940,7 +940,7 @@ hash_table
 ::remove_elt_with_hash (const compare_type , hashval_t hash)
 {
   value_type *slot = find_slot_with_hash (comparable, hash, NO_INSERT);
-  if (is_empty (*slot))
+  if (slot == NULL)
 return;
 
   Descriptor::remove (*slot);
--- gcc/hash-set-tests.c.jj 2019-01-01 12:37:21.648901265 +0100
+++ gcc/hash-set-tests.c2019-03-14 10:06:04.688950764 +0100
@@ -48,11 +48,26 @@ test_set_of_strings ()
   ASSERT_EQ (false, s.add (red));
   ASSERT_EQ (false, s.add (green));
   ASSERT_EQ (false, s.add (blue));
+  ASSERT_EQ (true, s.add (green));
 
   /* Verify that the values are now within the set.  */
   ASSERT_EQ (true, s.contains (red));
   ASSERT_EQ (true, s.contains (green));
   ASSERT_EQ (true, s.contains (blue));
+  ASSERT_EQ (3, s.elements ());
+
+  /* Test removal.  */
+  s.remove (red);
+  ASSERT_EQ (false, s.contains (red));
+  ASSERT_EQ (true, s.contains (green));
+  ASSERT_EQ (true, s.contains (blue));
+  ASSERT_EQ (2, s.elements ());
+
+  s.remove (red);
+  ASSERT_EQ (false, s.contains (red));
+  ASSERT_EQ (true, s.contains (green));
+  ASSERT_EQ (true, s.contains (blue));
+  ASSERT_EQ (2, s.elements ());
 }
 
 /* Run all of the selftests within this file.  */
--- gcc/hash-map-tests.c.jj 2019-01-21 20:46:21.963092085 +0100
+++ gcc/hash-map-tests.c2019-03-14 10:06:56.994105872 +0100
@@ -78,6 +78,10 @@ test_map_of_strings_to_int ()
   ASSERT_EQ (5, m.elements ());
   ASSERT_EQ (NULL, m.get (eric));
 
+  m.remove (eric);
+  ASSERT_EQ (5, m.elements ());
+  ASSERT_EQ (NULL, m.get (eric));
+
   /* A plain char * key is hashed based on its value (address), rather
  than the string it points to.  */
   char *another_ant = static_cast  (xcalloc (4, 1));
--- libiberty/hashtab.c.jj  2019-01-01 12:38:46.868503025 +0100
+++ libiberty/hashtab.c 2019-03-14 08:47:49.172612001 +0100
@@ -725,7 +725,7 @@ htab_remove_elt_with_hash (htab_t htab,
   PTR *slot;
 
   slot = htab_find_slot_with_hash (htab, element, hash, NO_INSERT);
-  if (*slot == HTAB_EMPTY_ENTRY)
+  if (slot == NULL)
 return;
 
   if (htab->del_f)


Jakub


Re: [PATCH] Make hashtab/hash-table elt removal not fail on removing non-existent entry

2019-03-14 Thread Richard Biener
On Thu, 14 Mar 2019, Jakub Jelinek wrote:

> On Wed, Mar 13, 2019 at 10:08:09PM -0400, Jason Merrill wrote:
> > > The following testcase ICEs since my recent cxx_eval_loop_expr changes.
> > > The problem is that the Forget saved values of SAVE_EXPRs. inside of the
> > > loop can remove SAVE_EXPRs from new_ctx.values and if that is the last
> > > iteration, we can also do the loop at the end of function (which has been
> > > added there mainly to handle cases where the main loop breaks earlier)
> > > and new_ctx.values->remove ICEs because *iter is already not in the table.
> > 
> > It shouldn't ICE, remove_elt_with_hash is documented to do nothing if the
> > element is not in the table.
> > 
> > Does this untested patch fix the test?
> 
> This changed in https://gcc.gnu.org/ml/gcc-patches/2000-02/msg00988.html
> Before that change, it has been documented that:
> "Naturally the hash table must already exist."
> That patch changed it to do
>   slot = htab_find_slot (htab, element, 0);
>   if (*slot == EMPTY_ENTRY)
> return;
> but htab_find_slot actually never returns *slot == EMPTY_ENTRY if called
> with insert 0:
> +   if (!insert)
> + return NULL;
> It can return *slot == EMPTY_ENTRY only if insert is non-zero, and
> otherwise it must be (for both cases) a non-deleted non-empty entry equal to
> the one we are looking for.
> 
> Thus, I believe what is documented since that patch has never worked and we
> are wasting cycles (if not optimized out) testing something impossible.
> 
> So, I think our options are either to make it work, like below (untested),
> or remove those ifs altogether and put back the
> "Naturally the hash table must already exist." comment.

I'd say make it work, thus your patch below is OK.

Richard.

> 2019-03-14  Jason Merrill  
>   Jakub Jelinek  
> 
>   * hash-table.h (remove_elt_with_hash): Return if slot is NULL rather
>   than if is_empty (*slot).
> 
>   * hashtab.c (htab_remove_elt_with_hash): Return if slot is NULL rather
>   than if *slot is HTAB_EMPTY_ENTRY.
> 
> --- gcc/hash-table.h.jj   2019-01-01 12:37:17.446970209 +0100
> +++ gcc/hash-table.h  2019-03-14 08:48:18.861131498 +0100
> @@ -940,7 +940,7 @@ hash_table
>  ::remove_elt_with_hash (const compare_type , hashval_t hash)
>  {
>value_type *slot = find_slot_with_hash (comparable, hash, NO_INSERT);
> -  if (is_empty (*slot))
> +  if (slot == NULL)
>  return;
>  
>Descriptor::remove (*slot);
> --- libiberty/hashtab.c.jj2019-01-01 12:38:46.868503025 +0100
> +++ libiberty/hashtab.c   2019-03-14 08:47:49.172612001 +0100
> @@ -725,7 +725,7 @@ htab_remove_elt_with_hash (htab_t htab,
>PTR *slot;
>  
>slot = htab_find_slot_with_hash (htab, element, hash, NO_INSERT);
> -  if (*slot == HTAB_EMPTY_ENTRY)
> +  if (slot == NULL)
>  return;
>  
>if (htab->del_f)
> 
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] Fix up tree-ssa-strlen.c ICEs (PR tree-optimization/89703)

2019-03-14 Thread Richard Biener
On Wed, 13 Mar 2019, Jakub Jelinek wrote:

> Hi!
> 
> The C FE sadly passes through some really bad prototypes of builtin
> functions as "harmless":
>   /* Accept "harmless" mismatches in function types such
>  as missing qualifiers or pointer vs same size integer
>  mismatches.  This is for the ffs and fprintf builtins.
>  However, with -Wextra in effect, diagnose return and
>  argument types that are incompatible according to
>  language rules.  */
> In the strlen pass, we'd better have pointers where we expect pointers and
> rightly sized integers where we expect integers.  The following patch
> ensures that by calling gimple_builtin_call_types_compatible_p against
> the builtin_decl_explicit decl.
> 
> Furthermore, I've noticed that a bunch of later added builtins that
> the strlen pass now handles weren't added to valid_builtin_call, so the
> patch adds that too (that is to handle cases where e.g. somebody declares
> strncat as pure, or strcmp as const etc.).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2019-03-13  Jakub Jelinek  
> 
>   PR tree-optimization/89703
>   * tree-ssa-strlen.c (valid_builtin_call): Punt if stmt call types
>   aren't compatible also with builtin_decl_explicit.  Check pure
>   or non-pure status of BUILT_IN_STR{{,N}CMP,N{LEN,{CAT,CPY}{,_CHK}}}
>   and BUILT_IN_STPNCPY{,_CHK}.
> 
>   * gcc.c-torture/compile/pr89703-1.c: New test.
>   * gcc.c-torture/compile/pr89703-2.c: New test.
> 
> --- gcc/tree-ssa-strlen.c.jj  2019-02-26 21:33:38.992826180 +0100
> +++ gcc/tree-ssa-strlen.c 2019-03-13 19:53:20.152551962 +0100
> @@ -971,12 +971,21 @@ valid_builtin_call (gimple *stmt)
>  return false;
>  
>tree callee = gimple_call_fndecl (stmt);
> +  tree decl = builtin_decl_explicit (DECL_FUNCTION_CODE (callee));
> +  if (decl
> +  && decl != callee
> +  && !gimple_builtin_call_types_compatible_p (stmt, decl))
> +return false;
> +
>switch (DECL_FUNCTION_CODE (callee))
>  {
>  case BUILT_IN_MEMCMP:
>  case BUILT_IN_MEMCMP_EQ:
> +case BUILT_IN_STRCMP:
> +case BUILT_IN_STRNCMP:
>  case BUILT_IN_STRCHR:
>  case BUILT_IN_STRLEN:
> +case BUILT_IN_STRNLEN:
>/* The above functions should be pure.  Punt if they aren't.  */
>if (gimple_vdef (stmt) || gimple_vuse (stmt) == NULL_TREE)
>   return false;
> @@ -991,10 +1000,16 @@ valid_builtin_call (gimple *stmt)
>  case BUILT_IN_MEMSET:
>  case BUILT_IN_STPCPY:
>  case BUILT_IN_STPCPY_CHK:
> +case BUILT_IN_STPNCPY:
> +case BUILT_IN_STPNCPY_CHK:
>  case BUILT_IN_STRCAT:
>  case BUILT_IN_STRCAT_CHK:
>  case BUILT_IN_STRCPY:
>  case BUILT_IN_STRCPY_CHK:
> +case BUILT_IN_STRNCAT:
> +case BUILT_IN_STRNCAT_CHK:
> +case BUILT_IN_STRNCPY:
> +case BUILT_IN_STRNCPY_CHK:
>/* The above functions should be neither const nor pure.  Punt if they
>aren't.  */
>if (gimple_vdef (stmt) == NULL_TREE || gimple_vuse (stmt) == NULL_TREE)
> --- gcc/testsuite/gcc.c-torture/compile/pr89703-1.c.jj2019-03-13 
> 19:56:10.619785685 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr89703-1.c   2019-03-13 
> 19:55:48.490146110 +0100
> @@ -0,0 +1,13 @@
> +/* PR tree-optimization/89703 */
> +
> +typedef __SIZE_TYPE__ size_t;
> +extern char *strlen (const char *);
> +extern char *strnlen (const char *, size_t);
> +extern char c[2];
> +
> +void
> +foo (char **q)
> +{
> +  q[0] = strlen (c);
> +  q[1] = strnlen (c, 2);
> +}
> --- gcc/testsuite/gcc.c-torture/compile/pr89703-2.c.jj2019-03-13 
> 19:56:13.495738838 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr89703-2.c   2019-03-13 
> 19:55:55.310035032 +0100
> @@ -0,0 +1,13 @@
> +/* PR tree-optimization/89703 */
> +
> +typedef __SIZE_TYPE__ size_t;
> +extern void *memcpy (void *, const void *, size_t);
> +extern char *strlen (const char *);
> +extern char c[2];
> +
> +void
> +foo (char **q)
> +{
> +  memcpy (c, "a", 2);
> +  q[0] = strlen (c);
> +}
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: V2 [PATCH] i386: Handle REG_EH_REGION note

2019-03-14 Thread H.J. Lu
On Thu, Mar 14, 2019 at 4:34 PM Jakub Jelinek  wrote:
>
> On Thu, Mar 14, 2019 at 11:30:21AM +0800, H.J. Lu wrote:
> > We need to split the basic block if we create new insns, which may
> > throw exceptions, in the middle of the basic blocks.
> >
> > Tested on AVX2 and AVX512 machines with and without
> >
> > --with-arch=native
> >
> > OK for trunk?
>
> That looks much better, I see you chose to follow what lower-subreg.c does
> here.  I just wonder if instead of the sbitmap of blocks to check for 
> splitting it
> wouldn't be more efficient to have an auto_vec holding the exact
> set_insns that need splitting after them and just grab the bb using
> BLOCK_FOR_INSN.
>

Good idea.  I will work on it.

Thanks.

-- 
H.J.


Re: V2 [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-14 Thread Uros Bizjak
On Thu, Mar 14, 2019 at 9:22 AM H.J. Lu  wrote:

> > > We can't use the %X5 since %X5 is used on operands.
> >
> > So, please introduce some other modifier ("X" was not to be taken
> > literally, but *some* letter). Why are you overloading 'P'?
>
> Here is the updated patch with the 'M' modifier.
>
>
> gcc/
>
> PR target/89523
> * config/i386/i386.c (ix86_print_operand): Handle 'M' to add
> addr32 prefix to VSIB address for X32.
> * config/i386/sse.md (*avx512pf_gatherpfsf_mask): Prepend
> "%M2" to opcode.
> (*avx512pf_gatherpfdf_mask): Likewise.
> (*avx512pf_scatterpfsf_mask): Likewise.
> (*avx512pf_scatterpfdf_mask): Likewise.
> (*avx2_gathersi): Prepend "%M3" to opcode.
> (*avx2_gathersi_2): Prepend "%M2" to opcode.
> (*avx2_gatherdi): Prepend "%M3" to opcode.
> (*avx2_gatherdi_2): Prepend "%M2" to opcode.
> (*avx2_gatherdi_3): Prepend "%M3" to opcode.
> (*avx2_gatherdi_4): Prepend "%M2" to opcode.`
> (*avx512f_gathersi): Prepend "%M4" to opcode.
> (*avx512f_gathersi_2): Prepend "%M3" to opcode.
> (*avx512f_gatherdi): Prepend "%M4" to opcode.
> (*avx512f_gatherdi_2): Prepend "%M3" to opcode.
> (*avx512f_scattersi): Prepend "%M0" to opcode.
> (*avx512f_scatterdi): Likewise.
>
> gcc/testsuite/
>
> PR target/89523
> * gcc.target/i386/pr89523-1a.c: New test.
> * gcc.target/i386/pr89523-1b.c: Likewise.
> * gcc.target/i386/pr89523-2.c: Likewise.
> * gcc.target/i386/pr89523-3.c: Likewise.
> * gcc.target/i386/pr89523-4.c: Likewise.
> * gcc.target/i386/pr89523-5.c: Likewise.
> * gcc.target/i386/pr89523-6.c: Likewise.
> * gcc.target/i386/pr89523-7.c: Likewise.
> * gcc.target/i386/pr89523-8.c: Likewise.
> * gcc.target/i386/pr89523-9.c: Likewise.

The patch looks safe and relatively non-intrusive, so OK for gcc-9 and
backports.

Thanks,
Uros.


Re: [PATCH] Fix up tree-ssa-strlen.c ICEs (PR tree-optimization/89703)

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 12:50:28AM +, Joseph Myers wrote:
> > The C FE sadly passes through some really bad prototypes of builtin
> > functions as "harmless":
> >   /* Accept "harmless" mismatches in function types such
> >  as missing qualifiers or pointer vs same size integer
> >  mismatches.  This is for the ffs and fprintf builtins.
> >  However, with -Wextra in effect, diagnose return and
> >  argument types that are incompatible according to
> >  language rules.  */
> 
> Note that this isn't just about pre-standard headers with unexpected types 
> (if it were, it might be obsolete).  It's also needed for building glibc 
> (to build glibc with -Wextra with GCC trunk, one of the -Wno- options 
> needed is -Wno-builtin-declaration-mismatch), because of how various code 
> creates function aliases between functions that have the same ABI but 
> different types (long versus int, etc.).

I was mainly woried about the pointer vs same sized integer mismatches.
Seems we refuse them on arguments:
  /* Fail for types with incompatible modes/sizes.  */
  if (TYPE_MODE (TREE_VALUE (oldargs))
  != TYPE_MODE (TREE_VALUE (newargs)))
return NULL_TREE;
  /* Fail for function and object pointer mismatches.  */
  if ((FUNCTION_POINTER_TYPE_P (oldtype)
   != FUNCTION_POINTER_TYPE_P (newtype))
  || POINTER_TYPE_P (oldtype) != POINTER_TYPE_P (newtype))
return NULL_TREE;
but not on the return type, where there is just:
  if (TYPE_MODE (oldrettype) != TYPE_MODE (newrettype))
return NULL_TREE;
Whether a builtin returns a pointer or integer is at least for the
middle-end quite important difference, so I'm just surprised more bugs in
other passes haven't been filed yet.  I can deal with that in the
tree-ssa-strlen.c pass with the patch I've posted, just am worried about
potential issues elsewhere.

Jakub


Re: V2 [PATCH] i386: Handle REG_EH_REGION note

2019-03-14 Thread Jakub Jelinek
On Thu, Mar 14, 2019 at 11:30:21AM +0800, H.J. Lu wrote:
> We need to split the basic block if we create new insns, which may
> throw exceptions, in the middle of the basic blocks.
> 
> Tested on AVX2 and AVX512 machines with and without
> 
> --with-arch=native
> 
> OK for trunk?

That looks much better, I see you chose to follow what lower-subreg.c does
here.  I just wonder if instead of the sbitmap of blocks to check for splitting 
it
wouldn't be more efficient to have an auto_vec holding the exact
set_insns that need splitting after them and just grab the bb using
BLOCK_FOR_INSN.

Jakub


Re: V2 [PATCH] x32: Add addr32 prefix to UNSPEC_VSIBADDR instructions

2019-03-14 Thread H.J. Lu
On Mon, Mar 4, 2019 at 10:09 PM Uros Bizjak  wrote:
>
> On Mon, Mar 4, 2019 at 2:54 PM H.J. Lu  wrote:
> >
> > On Sun, Mar 03, 2019 at 10:34:29PM +0100, Uros Bizjak wrote:
> > > On Sun, Mar 3, 2019 at 10:18 PM H.J. Lu  wrote:
> > > >
> > > > On Sun, Mar 3, 2019 at 9:27 AM Uros Bizjak  wrote:
> > > > >
> > > > > On Thu, Feb 28, 2019 at 8:10 PM H.J. Lu  wrote:
> > > > > >
> > > > > > 32-bit indices in VSIB address are sign-extended to 64 bits.  In 
> > > > > > x32,
> > > > > > when 32-bit indices are used as addresses, like in
> > > > > >
> > > > > > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> > > > > >
> > > > > > 32-bit indices, 0xf7fa3010, is sign-extended to 0xf7fa3010 
> > > > > > which
> > > > > > is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR 
> > > > > > instructions
> > > > > > for x32 if there is no base register nor symbol.
> > > > > >
> > > > > > This fixes 175.vpr and 254.gap in SPEC CPU 2000 on x32 with
> > > > > >
> > > > > > -Ofast -funroll-loops -march=haswell
> > > > >
> > > > > 1. Testcases 2 to 9 fail on fedora-29 with:
> > > > >
> > > > > In file included from /usr/include/features.h:452,
> > > > >  from /usr/include/bits/libc-header-start.h:33,
> > > > >  from /usr/include/stdlib.h:25,
> > > > >  from 
> > > > > /ssd/uros/gcc-build-fast/gcc/include/mm_malloc.h:27,
> > > > >  from 
> > > > > /ssd/uros/gcc-build-fast/gcc/include/xmmintrin.h:34,
> > > > >  from 
> > > > > /ssd/uros/gcc-build-fast/gcc/include/immintrin.h:29,
> > > > >  from
> > > > > /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.target/i386/pr89523-2.c:7:
> > > > > /usr/include/gnu/stubs.h:13:11: fatal error: gnu/stubs-x32.h: No such
> > > > > file or directory
> > > >
> > > > I will update tests to remove  "#include immintrin.h"
> > > >
> > > > > 2. Does the patch work with -maddress-mode={short,long}?
> > > >
> > > > Yes.
> > > >
> > > > > 3. The implementation is wrong. You should use operand substitution
> > > > > with VSIB address as operand, not substitution without operand.
> > > >
> > > > How can I add an addr32 prefix with operand substitution?  This is
> > > > very similar to "%^".  My updated patch will use "%^".
> > >
> > > Yes, using %^ is what I think would be the optimal solution. Other
> > > than that, in your proposed patch, operand-less %_ scans the entire
> > > current_output_insn to dig to the UNSPEC_VSIBADDR. You can just use
> > > operand substitution, and do e.g. "%X2vgatherpf0..." where 'X'
> > > processes operand 2 (vsib_address_operand) and conditionally outputs
> > > addr32.
> > >
> > > BTW: In a new version of the patch, please specify what is changed
> > > from the previous version. Otherwise, review of a new version is more
> > > or less a guesswork what changed.
> > >
> >
> > Here is the updated patch.  The change is
> >
> > return "%P5vscatterpf1ps\t{%5%{%0%}|%X5%{%0%}}";
> >
> > instead of
> >
> > return "%^vscatterpf1ps\t{%5%{%0%}|%X5%{%0%}}";
>
> Did I miss some version of the patch that introduced %^? You used %_
> in your previous patch. Did your try with %^?

Yes.   It is very similar to

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg02109.html

with

+ case '^':
+   if (TARGET_X32)
+ {
+   subrtx_var_iterator::array_type array;
+   FOR_EACH_SUBRTX_VAR (iter, array,
+PATTERN (current_output_insn), ALL)
+ {
+   rtx addr = *iter;
+   if (!MEM_P (addr))
+ continue;
+   addr = XEXP (addr, 0);
+   if (GET_CODE (addr) == UNSPEC
+   && XINT (addr, 1) == UNSPEC_VSIBADDR)
+ {
+   /* NB: 32-bit indices in VSIB address are
+ sign-extended to 64 bits. In x32, if 32-bit
+ address 0xf7fa3010 is sign-extended to
+ 0xf7fa3010 which is invalid address.
+ Add addr32 prefix if there is no base register
+ nor symbol.  */
+   bool ok;
+   struct ix86_address parts;
+   ok = ix86_decompose_address (XVECEXP (addr, 0, 0),
+);
+   gcc_assert (ok && parts.index == NULL_RTX);
+   if (parts.base == NULL_RTX
+   && (parts.disp == NULL_RTX
+   || !symbolic_operand (parts.disp,
+ GET_MODE (parts.disp
+ fputs ("addr32 ", file);
+   break;
+ }
+ }
+ }

> > We can't use the %X5 since %X5 is used on operands.
>
> So, please introduce some other modifier ("X" was not to be taken
> literally, but *some* letter). Why are you overloading 'P'?

Here is the updated patch with the 'M' modifier.


H.J.


> I don't know why are you using operand 5 here, you can use operand 2 directly.


> Uros.
>
> > I also added a test for -maddress-mode=long.
> >
> >
> > H.J.
> > ---
> > 32-bit indices in VSIB address are sign-extended to 64 bits.  In x32,
> > when 32-bit indices are used as addresses, like in
> >
> > vgatherdps %ymm7, 0(,%ymm9,1), %ymm6
> >
> > 32-bit indices, 0xf7fa3010, is sign-extended to 0xf7fa3010 which
> > is invalid address.  Add addr32 prefix to UNSPEC_VSIBADDR instructions
> > for x32 if there is no 

[PATCH] Make hashtab/hash-table elt removal not fail on removing non-existent entry

2019-03-14 Thread Jakub Jelinek
On Wed, Mar 13, 2019 at 10:08:09PM -0400, Jason Merrill wrote:
> > The following testcase ICEs since my recent cxx_eval_loop_expr changes.
> > The problem is that the Forget saved values of SAVE_EXPRs. inside of the
> > loop can remove SAVE_EXPRs from new_ctx.values and if that is the last
> > iteration, we can also do the loop at the end of function (which has been
> > added there mainly to handle cases where the main loop breaks earlier)
> > and new_ctx.values->remove ICEs because *iter is already not in the table.
> 
> It shouldn't ICE, remove_elt_with_hash is documented to do nothing if the
> element is not in the table.
> 
> Does this untested patch fix the test?

This changed in https://gcc.gnu.org/ml/gcc-patches/2000-02/msg00988.html
Before that change, it has been documented that:
"Naturally the hash table must already exist."
That patch changed it to do
  slot = htab_find_slot (htab, element, 0);
  if (*slot == EMPTY_ENTRY)
return;
but htab_find_slot actually never returns *slot == EMPTY_ENTRY if called
with insert 0:
+ if (!insert)
+   return NULL;
It can return *slot == EMPTY_ENTRY only if insert is non-zero, and
otherwise it must be (for both cases) a non-deleted non-empty entry equal to
the one we are looking for.

Thus, I believe what is documented since that patch has never worked and we
are wasting cycles (if not optimized out) testing something impossible.

So, I think our options are either to make it work, like below (untested),
or remove those ifs altogether and put back the
"Naturally the hash table must already exist." comment.

2019-03-14  Jason Merrill  
Jakub Jelinek  

* hash-table.h (remove_elt_with_hash): Return if slot is NULL rather
than if is_empty (*slot).

* hashtab.c (htab_remove_elt_with_hash): Return if slot is NULL rather
than if *slot is HTAB_EMPTY_ENTRY.

--- gcc/hash-table.h.jj 2019-01-01 12:37:17.446970209 +0100
+++ gcc/hash-table.h2019-03-14 08:48:18.861131498 +0100
@@ -940,7 +940,7 @@ hash_table
 ::remove_elt_with_hash (const compare_type , hashval_t hash)
 {
   value_type *slot = find_slot_with_hash (comparable, hash, NO_INSERT);
-  if (is_empty (*slot))
+  if (slot == NULL)
 return;
 
   Descriptor::remove (*slot);
--- libiberty/hashtab.c.jj  2019-01-01 12:38:46.868503025 +0100
+++ libiberty/hashtab.c 2019-03-14 08:47:49.172612001 +0100
@@ -725,7 +725,7 @@ htab_remove_elt_with_hash (htab_t htab,
   PTR *slot;
 
   slot = htab_find_slot_with_hash (htab, element, hash, NO_INSERT);
-  if (*slot == HTAB_EMPTY_ENTRY)
+  if (slot == NULL)
 return;
 
   if (htab->del_f)


Jakub


Re: [PR72741] Encode OpenACC 'routine' directive's level of parallelism inside Fortran module files

2019-03-14 Thread Thomas Schwinge
Hi Thomas!

On Wed, 13 Mar 2019 23:13:46 +0100, Thomas Koenig  wrote:
> Am 13.03.19 um 18:50 schrieb Thomas Schwinge:
> >> There are many ways to deal with it without bumping MOD_VERSION in a
> >> backwards but not forwards compatible way, so that a newer compiler will be
> >> able to parse old *.mod files, and newer compiler new ones as long as this
> >> problematic stuff doesn't appear in.
> > Like the attached, actually pretty simple now?
> 
> Can you explain a) how this works

I'll be happy to elaborate, but I'm not sure at which level you'd like me
to explain?

This is basically the very same thing that's being done for other 'struct
symbol_attribute' flag fields, by interpreting the 'enum
oacc_routine_lop' values as individual flags, and with the corollary (to
maintain MOD_VERSION compatibility as best as we can) that we don't
stream out its default value (so doing correspondingly to when a "real"
flag's value is 'false').

> and b) how you tested it?

I had mentioned that in the commit message: with the relevant old/new GCC
combinations, using the included test case.

Happy to explain further, if necessary.


Grüße
 Thomas