[PATCH] rs6000: MMA test case ICEs using -O3

2021-03-30 Thread Peter Bergner via Gcc-patches
The mma_assemble_input_operand predicate does not accept reg+reg indexed
addresses which can lead to ICEs.  The problem is that the quad_address_p
function only accepts reg+offset addresses that are valid for quad word
accesses, but not reg+reg addresses which are also valid for quad word
accesses when dealing with vector types.  The solution used here is to
call memory_operand, which uses rs6000_legitimate_address_p to ensure
the address is valid.  For reg+offset addresses, it uses quad_address_p like
before, but for reg+reg addresses, it calls legitimate_indexed_address_p,
which fixes this specific ICE.
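
As a rough illustration of the difference described above, here is a toy
model of the old and new acceptance rules.  It is only a sketch: the
Address type and the toy_* / *_mem_ok helpers are hypothetical and are not
GCC's RTL or the real predicates, and the 16-byte offset rule is an
assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the two memory address shapes discussed above.
// These types are illustrative only; they are not GCC's RTL.
enum class AddrKind { RegOffset, RegReg };

struct Address {
  AddrKind kind;
  std::int64_t offset;  // only meaningful for RegOffset
};

// Assumption for this sketch: a quad-word reg+offset access needs an
// offset that is a multiple of 16.
static bool toy_quad_address_p(const Address &a) {
  return a.kind == AddrKind::RegOffset && a.offset % 16 == 0;
}

// Old rule: only reg+offset addresses valid for quad-word access pass.
bool old_mma_input_mem_ok(const Address &a) {
  return toy_quad_address_p(a);
}

// New rule: reg+offset is checked as before, and reg+reg (indexed)
// addresses are accepted as well.
bool new_mma_input_mem_ok(const Address &a) {
  if (a.kind == AddrKind::RegReg)
    return true;  // stands in for the indexed-address check
  return toy_quad_address_p(a);
}
```

In this model an indexed address is rejected by the old rule and accepted
by the new one, while reg+offset addresses are treated identically by both.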

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
I also compiled some non-trivial DGEMM and SGEMM test cases that use our
MMA builtins and I don't see any generated code differences.

Ok for trunk?

The same bad test in mma_assemble_input_operand exists in GCC 10, but I
have been unable to get it to ICE there with this test case.  I assume
we still want to fix it there too?  If so, ok for GCC 10 after some trunk
burn in?

Peter


gcc/
PR target/99842
* config/rs6000/predicates.md:

gcc/testsuite/
PR target/99842
* g++.target/powerpc/pr99842.C: New.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 859af75dfbd..e48c6eee19e 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1171,8 +1171,7 @@
 (define_special_predicate "mma_assemble_input_operand"
   (match_test "(mode == V16QImode
&& (vsx_register_operand (op, mode)
-   || (MEM_P (op)
-   && quad_address_p (XEXP (op, 0), mode, false))))"))
+   || memory_operand (op, mode)))"))
 
 ;; Return 1 if this operand is valid for an MMA disassemble insn.
 (define_predicate "mma_disassemble_output_operand"
diff --git a/gcc/testsuite/g++.target/powerpc/pr99842.C b/gcc/testsuite/g++.target/powerpc/pr99842.C
new file mode 100644
index 00000000000..d84de3b4570
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr99842.C
@@ -0,0 +1,188 @@
+/* PR target/99842 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power10 -w" } */
+
+/* Verify we do not ICE on the following source.  */
+
+enum { a, b, c, d };
+template  struct e;
+template  struct e {
+  typedef h f;
+};
+template  struct ac;
+template  struct ac : ac {};
+template  struct l;
+template  class n;
+template  class o;
+template  class ag;
+template  class af;
+template  struct ad;
+template  struct an {
+  typedef n::ai, ac::aj> f;
+};
+template  struct am { typedef o f; };
+template ::ao,
+  typename = typename ac::av>
+struct ak;
+template  struct ak {
+  typedef typename am::f f;
+};
+template  struct aq;
+template  struct aq { typedef ar at; };
+template  ap bf(const typename ad::f *);
+template  ap aw(typename ad::f *ax) { return bf(ax); }
+typedef __attribute__((altivec(vector__))) double au;
+template <> struct ad { typedef double f; };
+template <> au bf(const double *ax) { return __builtin_vec_vsx_ld(0, ax); }
+template  struct az {};
+template  class o : public l {
+public:
+  typedef typename ac::ah ah;
+  template  al +=(const o &);
+};
+template  struct l {};
+template  struct ac> {
+  typedef typename ba::ah ah;
+  enum { ai, aj };
+};
+template 
+class af
+: public ak<
+  af, const n>,
+ n, bd>,
+  int, int>::f {};
+template  struct be;
+template  void bi(bj, bg bm, g) {
+  typename an::f bk(bm);
+}
+template  void bl(bj, bg bm, g bp) {
+  be::bn(a, bm, bp);
+}
+template  struct bo;
+class bs {
+public:
+  bs(double *, int);
+  double ()(int, int) { return bq[br]; }
+  template  bw bt(int i, int j) {
+double  = operator()(i, j);
+return aw();
+  }
+  double *bq;
+  int br;
+};
+class ca : public bs {
+public:
+  ca(double *by, int bz) : bs(by, bz) {}
+};
+template  class ce : public am::f {
+protected:
+  template  void cb(l) {
+af, const n>,
+   n>
+cc;
+bl(0, cc, az());
+  }
+  template  void ch(long);
+  template  void ch(l cf) { cb(cf); }
+};
+template 
+struct ac> {
+  typedef cg ah;
+  typedef int av;
+};
+template 
+class n : public ce> {
+public:
+  template  n(ab p) { n::template ch(p); }
+};
+template  struct ac> {
+  typedef ba ao;
+  typedef typename e::f ah;
+  typedef typename aq::av, typename ac::av, bc>::at av;
+};
+template  class cm;
+template 
+class ag
+: public cm::av, typename ac::av, int>::at> {
+};
+template 
+class cm : public ak, n>>::f {};
+template 
+template 
+al ::operator+=(const o &) {
+  af, const n>,
+ n>
+  co;
+  bi(0, co, int());
+}
+enum { cp };
+template  struct cq;
+template  struct cr {
+  enum { q };
+  enum { ae = cq::at };
+};
+template <> struct cq {
+  enum { at = d };
+};
+struct t {
+  template  static void bn(ba, bb, s) {
+typedef typename bb::ah x;
+x u;
+bo::bn(0, 0, ca(0, 0), ca(, 1), 0, 0, 0);
+  }
+};

[committed] analyzer: remove old decl of region::dump_to_pp

2021-03-30 Thread David Malcolm via Gcc-patches
This was made redundant in the GCC 11 rewrite of state
(808f4dfeb3a95f50f15e71148e5c1067f90a126d).

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as d0b7c821754e2b16e9e84d877082105799adf238.

gcc/analyzer/ChangeLog:
* region.h (region::dump_to_pp): Remove old decl.
---
 gcc/analyzer/region.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index ea24b38b6a1..175a82a0cb2 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -128,11 +128,6 @@ public:
  pretty_printer *pp) const;
   label_text get_desc (bool simple=true) const;
 
-  void dump_to_pp (const region_model ,
-  pretty_printer *pp,
-  const char *prefix,
-  bool is_last_child) const;
-
   virtual void dump_to_pp (pretty_printer *pp, bool simple) const = 0;
   void dump (bool simple) const;
 
-- 
2.26.2



[committed] analyzer: only call get_diagnostic_tree when it's needed

2021-03-30 Thread David Malcolm via Gcc-patches
impl_sm_context::get_diagnostic_tree could be expensive, and
I find myself needing to put a breakpoint on it to debug
PR analyzer/99771, so only call it if we're about to use
the result.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7917-g0f9aa35c79a0fe195d5076375b5794246cf44819.

gcc/analyzer/ChangeLog:
* sm-file.cc (fileptr_state_machine::on_stmt): Only call
get_diagnostic_tree if the result will be used.
* sm-malloc.cc (malloc_state_machine::on_stmt): Likewise.
(malloc_state_machine::on_deallocator_call): Likewise.
(malloc_state_machine::on_realloc_call): Likewise.
* sm-sensitive.cc
(sensitive_state_machine::warn_for_any_exposure): Likewise.
* sm-taint.cc (taint_state_machine::on_stmt): Likewise.
---
 gcc/analyzer/sm-file.cc  |  2 +-
 gcc/analyzer/sm-malloc.cc| 10 +++---
 gcc/analyzer/sm-sensitive.cc |  8 +---
 gcc/analyzer/sm-taint.cc |  4 +++-
 4 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/gcc/analyzer/sm-file.cc b/gcc/analyzer/sm-file.cc
index 7a81c8ff632..d64c313e31c 100644
--- a/gcc/analyzer/sm-file.cc
+++ b/gcc/analyzer/sm-file.cc
@@ -344,7 +344,6 @@ fileptr_state_machine::on_stmt (sm_context *sm_ctxt,
if (is_named_call_p (callee_fndecl, "fclose", call, 1))
  {
tree arg = gimple_call_arg (call, 0);
-   tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
 
sm_ctxt->on_transition (node, stmt, arg, m_start, m_closed);
 
@@ -356,6 +355,7 @@ fileptr_state_machine::on_stmt (sm_context *sm_ctxt,
 
if (sm_ctxt->get_state (stmt, arg) == m_closed)
  {
+   tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
sm_ctxt->warn (node, stmt, arg,
   new double_fclose (*this, diag_arg));
sm_ctxt->set_next_state (stmt, arg, m_stop);
diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index ef250c80915..ae03b068a88 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -1674,11 +1674,11 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
   if (TREE_CODE (op) == MEM_REF)
{
  tree arg = TREE_OPERAND (op, 0);
- tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
 
  state_t state = sm_ctxt->get_state (stmt, arg);
  if (unchecked_p (state))
{
+ tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
  sm_ctxt->warn (node, stmt, arg,
 new possible_null_deref (*this, diag_arg));
  const allocation_state *astate = as_a_allocation_state (state);
@@ -1686,12 +1686,14 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
}
  else if (state == m_null)
{
+ tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
  sm_ctxt->warn (node, stmt, arg,
 new null_deref (*this, diag_arg));
  sm_ctxt->set_next_state (stmt, arg, m_stop);
}
  else if (freed_p (state))
{
+ tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
  const allocation_state *astate = as_a_allocation_state (state);
  sm_ctxt->warn (node, stmt, arg,
 new use_after_free (*this, diag_arg,
@@ -1738,7 +1740,6 @@ malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
   if (argno >= gimple_call_num_args (call))
 return;
   tree arg = gimple_call_arg (call, argno);
-  tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
 
   state_t state = sm_ctxt->get_state (call, arg);
 
@@ -1752,6 +1753,7 @@ malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
   if (!astate->m_deallocators->contains_p (d))
{
  /* Wrong allocator.  */
+ tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
  pending_diagnostic *pd
= new mismatching_deallocation (*this, diag_arg,
astate->m_deallocators,
@@ -1766,6 +1768,7 @@ malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
   else if (state == d->m_freed)
 {
   /* freed -> stop, with warning.  */
+  tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
   sm_ctxt->warn (node, call, arg,
 new double_free (*this, diag_arg, d->m_name));
   sm_ctxt->set_next_state (call, arg, m_stop);
@@ -1773,6 +1776,7 @@ malloc_state_machine::on_deallocator_call (sm_context *sm_ctxt,
   else if (state == m_non_heap)
 {
   /* non-heap -> stop, with warning.  */
+  tree diag_arg = sm_ctxt->get_diagnostic_tree (arg);
   sm_ctxt->warn (node, call, arg,
 new free_of_non_heap (*this, diag_arg,
   d->m_name));
@@ -1806,7 +1810,6 @@ 

[committed] analyzer testsuite: fix typo

2021-03-30 Thread David Malcolm via Gcc-patches
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/symbolic-1.c: Fix typo.
---
 gcc/testsuite/gcc.dg/analyzer/symbolic-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/symbolic-1.c b/gcc/testsuite/gcc.dg/analyzer/symbolic-1.c
index 9d228e6331c..feab9ce3a2d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/symbolic-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/symbolic-1.c
@@ -1,6 +1,6 @@
 #include "analyzer-decls.h"
 
-/* The example from store2.h  */
+/* The example from store.h  */
 
 void test_1 (char a, char b, char c, char d, char e, char f,
 int i, int j)
-- 
2.26.2



Re: [PATCH v9] Practical improvement to libgcc complex divide

2021-03-30 Thread Patrick McGehearty via Gcc-patches

Thank you for your interest in this project.

On 3/27/2021 6:18 PM, Bernhard Reutner-Fischer wrote:
> On Fri, 26 Mar 2021 23:14:41 +
> Patrick McGehearty via Gcc-patches  wrote:
>
>> Changes in Version 9 since Version 8:
>>
>> Revised code to meet gcc coding standards in all files, especially
>>  with respect to adding spaces around operations and removing
>>  excess () in #define macro definitions.
>
> Did you gather cycle counter diffs for current against new for the
> normal and some edge cases?

I don't see additional value in looking at cycle counters in addition
to the timing information I presented. I prefer wall clock time to
cycle counters, perhaps because cycle counters are less reliably
implemented on some platforms (that's a separate discussion, though).
Every platform will have different counters and different timing.

Let me run through a rough operation count analysis of the double
precision case. I will also postulate instruction latencies
that might be typical of a recent processor. Real processors
will differ somewhat but not by large amounts.

The current method uses Smith's method. Showing half of that method:
  if (fabs(c) < fabs(d))
    {
  ratio = c / d;
  denom = (c * ratio) + d;
  xx = ((a * ratio) + b) / denom;
  yy = ((b * ratio) - a) / denom;
    }


It requires:
two fabs computations (typically 1 cycle each)
one floating point compare (typically 2 cycles)
1 branch decision (50-50 prediction reliability)
    (1 cycle if predicted, 9 cycles if not predicted; ave=5)
3 multiplies, 3 divides, 3 add/subtract operations
  (assume 5 cycle latency for add/mul and 15 cycle for divide)
    (75 cycles)
Total: 84 cycles (typical average, old method)

I omit the NAN analysis as that is unchanged.
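
For reference, both halves of Smith's method can be sketched as follows.
The first branch is the fragment shown above; the second branch is the
mirror image with the roles of c and d swapped.  The function name
smith_divide is made up for this sketch, and NaN/infinity handling is left
out here as well.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Smith's method for (a + ib) / (c + id).  Dividing by the larger of
// |c| and |d| first keeps the intermediate ratio at or below 1 in
// magnitude.  No NaN or infinity recovery is attempted in this sketch.
std::complex<double> smith_divide(double a, double b, double c, double d) {
  double ratio, denom, xx, yy;
  if (std::fabs(c) < std::fabs(d)) {
    ratio = c / d;
    denom = (c * ratio) + d;
    xx = ((a * ratio) + b) / denom;
    yy = ((b * ratio) - a) / denom;
  } else {
    ratio = d / c;
    denom = (d * ratio) + c;
    xx = ((b * ratio) + a) / denom;
    yy = (b - (a * ratio)) / denom;
  }
  return std::complex<double>(xx, yy);
}
```

For example, smith_divide(1, 2, 3, 4) takes the first branch and
smith_divide(1, 2, 4, 3) takes the second.  Either way the path costs
3 multiplies, 3 divides and 3 add/subtract operations, which is where the
operation count above comes from.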

The cost of the new method is data dependent.
First, assume all scaling tests are false and ratio does not underflow.
Further, assume that data is consistently in those ranges, to allow
all branch predictions except the first to be predicted.
I'll also assume we have an excellent branch predictor mechanism
that can handle predicting

Then we have:
5 additional fabs, and 4 additional compare and branches (all predicted)
and one || operation for 18 additional cycles.
[This optimistic count assumes fabs(a,b) > RMIN, allowing
us to bypass the fabs < RMAX2 tests.]

New total: 84+18 = 102 (optimistic average, new method)

If we can't bypass the fabs < RMAX2 tests, we need to add
four more fabs, and four && operations, and four compare
operations for 12 more cycles.
Possible total: 84+18+12 = 114

Finally, if we hit a scaling operation, then we have four
multiply operations for an additional 20 cycles plus
we've mispredicted the branch for 9 more.
114+29 = 143 cycles.

This analysis is not particularly accurate. It does not account for
some architectures having several floating point units or for
speculative execution hiding latency or many other variations in
computer architecture. Still, it's good enough for a ballpark estimate
that says the typical case will be around 15-25% slower and the most
extreme cases will be around 70% slower.
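
The arithmetic behind this estimate can be written out as a small sketch.
The latencies are the postulated ones from the text, not measurements, and
all constant and function names are invented for this example.  Summing
the quoted figures for the scaling path (20 cycles for four multiplies
plus 9 for the mispredicted branch, on top of 114) gives 143 cycles.

```cpp
#include <cassert>

// Latencies postulated in the estimate above; illustrative, not measured.
constexpr int kFabs = 1;
constexpr int kCompare = 2;
constexpr int kBranchAverage = 5;  // 1 if predicted, 9 if not, 50-50
constexpr int kAddOrMultiply = 5;
constexpr int kDivide = 15;
constexpr int kMispredict = 9;

// Old method: 2 fabs, 1 compare, 1 branch, 3 mul, 3 div, 3 add/sub.
constexpr int old_method_cycles() {
  return 2 * kFabs + kCompare + kBranchAverage
         + 3 * kAddOrMultiply   // multiplies
         + 3 * kDivide          // divides
         + 3 * kAddOrMultiply;  // add/subtract
}
static_assert(old_method_cycles() == 84, "matches the total in the text");

// New method, optimistic case: the quoted 18 extra cycles.
constexpr int new_optimistic_cycles() { return old_method_cycles() + 18; }

// RMAX2 tests not bypassed: the quoted 12 further cycles.
constexpr int new_without_bypass_cycles() {
  return new_optimistic_cycles() + 12;
}

// Scaling path: four multiplies plus one mispredicted branch.
constexpr int new_with_scaling_cycles() {
  return new_without_bypass_cycles() + 4 * kAddOrMultiply + kMispredict;
}

// Relative slowdown of a new-method path against the old method.
constexpr double slowdown_percent(int new_cycles) {
  return 100.0 * (new_cycles - old_method_cycles()) / old_method_cycles();
}
```

Under these assumptions the optimistic path is about 21% slower than the
old method and the scaling path about 70% slower, in line with the
ballpark figures quoted above.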

In the patch writeup, I report that for double precision the
"moderate set", which represents the typical case, shows a 4% to 24%
measured slowdown. The "full set", which contains a substantial number
of values that might cause branch predictors to go wrong, shows 38% to
56% overhead. Apparently my 'back of the envelope' analysis above is
not too far off.

Oh, I failed to cover the case where fabs(ratio) < RMIN.
That happens in the full set about 1-2% of the time.
It requires two additional divide operations for an extra
30 cycles, plus maybe 10 cycles for the mispredicted branch.
That might bump up the worst case to be about double the
normal case. That's acceptable when you consider that slow
path prevents the code from returning a completely wrong result.

As a quick aside, the test for ratio underflowing reduces the error
rate by nearly a factor of 5, from 1.70% to 0.35%, in the full exponent
range data set with about half the total overhead of the new method. Adding
all the other tests and scaling code reduces errors larger than 12
bits to roughly 1 in 10 million and 18 bits or larger to roughly 1 in
100 million. Those remaining errors are due to the issue of
subtraction cancellation, which is a harder problem than avoiding
underflow/overflow issues.




> Can you please share data for -Os between current and your proposed new?

As for data for -Os for current and proposed new, it's not very interesting.
There are no differences as complex divide is not inlined by gcc.
Instead, there is a simple call to __divdc3.
I checked for inlining with gcc 4.8, gcc 8.3, and gcc 11.0.1 as well as with
a compiler that has my code changes.

If one uses -fcx-fortran-rules then Smith's method for complex divide is
inlined with -Os.
If one uses -fcx-limited-range then the naive method for complex divide
is inlined with -Os.

The new code does not change 

Re: [PATCH] c++: placeholder type constraint and argument packs [PR99815]

2021-03-30 Thread Jason Merrill via Gcc-patches

On 3/30/21 4:35 PM, Patrick Palka wrote:

When checking dependence of a placeholder type constraint, if the first
template argument of the constraint is an argument pack, we need to
expand it so that we properly separate the implicit 'auto' argument from
the rest.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


gcc/cp/ChangeLog:

PR c++/99815
* pt.c (placeholder_type_constraint_dependent_p): Expand
argument packs to separate the first non-pack argument
from the rest.

gcc/testsuite/ChangeLog:

PR c++/99815
* g++.dg/cpp2a/concepts-placeholder5.C: New test.
---
  gcc/cp/pt.c   |  5 +++
  .../g++.dg/cpp2a/concepts-placeholder5.C  | 32 +++
  2 files changed, 37 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a056ecefd1d..dc6f2f37f9b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28189,6 +28189,11 @@ placeholder_type_constraint_dependent_p (tree t)
tree id = unpack_concept_check (t);
tree args = TREE_OPERAND (id, 1);
tree first = TREE_VEC_ELT (args, 0);
+  if (ARGUMENT_PACK_P (first))
+{
+  args = expand_template_argument_pack (args);
+  first = TREE_VEC_ELT (args, 0);
+}
gcc_checking_assert (TREE_CODE (first) == WILDCARD_DECL
   || is_auto (first));
for (int i = 1; i < TREE_VEC_LENGTH (args); ++i)
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C b/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C
new file mode 100644
index 00000000000..eaea41a36eb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C
@@ -0,0 +1,32 @@
+// PR c++/99815
+// { dg-do compile { target c++20 } }
+
+template 
+struct is_same { static constexpr bool value = false; };
+
+template 
+struct is_same { static constexpr bool value = true; };
+
+template 
+concept C = is_same::value; // { dg-error "wrong number" }
+
+template  void f() {
+  C auto x = 0; // { dg-error "constraints" }
+}
+
+template void f(); // { dg-bogus "" }
+template void f(); // { dg-message "required from here" }
+template void f<>(); // { dg-message "required from here" }
+template void f(); // { dg-message "required from here" }
+
+template  void g() {
+  C auto x = 0; // { dg-error "constraints" }
+}
+
+template void g<>(); // { dg-bogus "" }
+template void g(); // { dg-message "required from here" }
+
+template  void h() {
+  C auto x = 0; // { dg-error "constraints" }
+  C auto y = 0;
+}





Re: [PATCH] c++: Adjust mangling of __alignof__ [PR88115]

2021-03-30 Thread Jason Merrill via Gcc-patches

On 3/30/21 3:17 PM, Patrick Palka wrote:

We currently mangle __alignof__ as a vendor extended operator,
but that's problematic for the reasons mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88115#c6.

This patch changes the mangling of __alignof__ to instead use the
new "vendor extended expression" syntax that's proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112.  Clang does
the same thing already, so after this patch both GCC and Clang agree
about the mangling of __alignof__(type) and __alignof__(expr).
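
As a minimal sketch of the string being produced, the vendor extended
expression form is the letter 'u', a length-prefixed name, zero or more
template arguments, and a closing 'E'.  The helper names below are
hypothetical and are not functions in mangle.c; the argument encodings are
passed in as already-mangled strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A source name is the decimal length followed by the identifier.
std::string source_name(const std::string &id) {
  return std::to_string(id.size()) + id;
}

// Vendor extended expression: 'u', the source name, each already-mangled
// template argument, then a terminating 'E'.
std::string vendor_extended_expr(const std::string &name,
                                 const std::vector<std::string> &args) {
  std::string out = "u" + source_name(name);
  for (const std::string &arg : args)
    out += arg;
  out += 'E';
  return out;
}

// Wraps an expression encoding in decltype ("DT" ... "E") and appends it
// to a function-name prefix, as in the alignof7.C expectations.
std::string with_decltype_param(const std::string &prefix,
                                const std::string &expr) {
  return prefix + "DT" + expr + "E";
}
```

With the type argument T_, vendor_extended_expr("__alignof__", {"T_"})
yields u11__alignof__T_E, and wrapping it after the prefix _Z2f3IiEv gives
the string that the updated alignof7.C test scans for.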

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


gcc/cp/ChangeLog:

PR c++/88115
* mangle.c (write_expression): Adjust the mangling of
__alignof__.

include/ChangeLog:

PR c++/88115
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_VENDOR_EXPR.

libiberty/ChangeLog:

PR c++/88115
* cp-demangle.c (d_dump, d_make_comp, d_expression_1)
(d_count_templates_scopes): Handle DEMANGLE_COMPONENT_VENDOR_EXPR.
(d_print_comp_inner): Likewise.
: Revert r11-4926
change.
: Likewise.
* testsuite/demangle-expected: Adjust __alignof__ mangling
tests.

gcc/testsuite/ChangeLog:

PR c++/88115
* g++.dg/cpp0x/alignof7.C: Adjust expected mangling.
---
  gcc/cp/mangle.c   |  8 ++---
  gcc/testsuite/g++.dg/cpp0x/alignof7.C |  4 +--
  include/demangle.h|  3 ++
  libiberty/cp-demangle.c   | 47 +++
  libiberty/testsuite/demangle-expected |  4 +--
  5 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 0a9e5aa79a0..57ce9a6710f 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3124,11 +3124,9 @@ write_expression (tree expr)
  if (abi_version_at_least (15))
{
  /* We used to mangle __alignof__ like alignof.  */
- write_string ("v111__alignof__");
- if (TYPE_P (TREE_OPERAND (expr, 0)))
-   write_type (TREE_OPERAND (expr, 0));
- else
-   write_expression (TREE_OPERAND (expr, 0));
+ write_string ("u11__alignof__");
+ write_template_arg (TREE_OPERAND (expr, 0));
+ write_char ('E');
  return;
}
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/alignof7.C b/gcc/testsuite/g++.dg/cpp0x/alignof7.C
index a4d7f24a4d7..2369b879392 100644
--- a/gcc/testsuite/g++.dg/cpp0x/alignof7.C
+++ b/gcc/testsuite/g++.dg/cpp0x/alignof7.C
@@ -18,5 +18,5 @@ template void f4(std::size_t);
  
  // { dg-final { scan-assembler "_Z2f1IiEvDTatT_E" } }

  // { dg-final { scan-assembler "_Z2f2IiEvDTaztlT_EE" } }
-// { dg-final { scan-assembler "_Z2f3IiEvDTv111__alignof__T_E" } }
-// { dg-final { scan-assembler "_Z2f4IiEvDTv111__alignof__tlT_EE" } }
+// { dg-final { scan-assembler "_Z2f3IiEvDTu11__alignof__T_EE" } }
+// { dg-final { scan-assembler "_Z2f4IiEvDTu11__alignof__XtlT_" } }
diff --git a/include/demangle.h b/include/demangle.h
index 23b47265d94..b45234e6887 100644
--- a/include/demangle.h
+++ b/include/demangle.h
@@ -408,6 +408,9 @@ enum demangle_component_type
   number which involves neither modifying the mangled string nor
   allocating a new copy of the literal in memory.  */
DEMANGLE_COMPONENT_LITERAL_NEG,
+  /* A vendor's builtin expression.  The left subtree holds the name of
+ the type, and the right subtree is a template argument list.  */
+  DEMANGLE_COMPONENT_VENDOR_EXPR,
/* A libgcj compiled resource.  The left subtree is the name of the
   resource.  */
DEMANGLE_COMPONENT_JAVA_RESOURCE,
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index d3e798455cc..a528b7b5ed3 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -815,6 +815,9 @@ d_dump (struct demangle_component *dc, int indent)
  case DEMANGLE_COMPONENT_LITERAL_NEG:
printf ("negative literal\n");
break;
+case DEMANGLE_COMPONENT_VENDOR_EXPR:
+  printf ("vendor expression\n");
+  break;
  case DEMANGLE_COMPONENT_JAVA_RESOURCE:
printf ("java resource\n");
break;
@@ -976,6 +979,7 @@ d_make_comp (struct d_info *di, enum demangle_component_type type,
  case DEMANGLE_COMPONENT_TRINARY_ARG1:
  case DEMANGLE_COMPONENT_LITERAL:
  case DEMANGLE_COMPONENT_LITERAL_NEG:
+case DEMANGLE_COMPONENT_VENDOR_EXPR:
  case DEMANGLE_COMPONENT_COMPOUND_NAME:
  case DEMANGLE_COMPONENT_VECTOR_TYPE:
  case DEMANGLE_COMPONENT_CLONE:
@@ -3345,6 +3349,7 @@ d_unresolved_name (struct d_info *di)
  ::= st 
  ::= 
::= 
+   ::= u  * E # vendor extended expression
  ::= 
  
 ::= 

@@ -3425,6 +3430,15 @@ d_expression_1 (struct d_info *di)
return d_make_comp (di, DEMANGLE_COMPONENT_INITIALIZER_LIST,
  

Re: [PATCH] dwarf2out: Fix up ranges for -gdwarf-5 -gsplit-dwarf [PR99490]

2021-03-30 Thread Jason Merrill via Gcc-patches

On 3/12/21 3:20 AM, Jakub Jelinek wrote:

Hi!

For -gdwarf-4 -gsplit-dwarf we used to emit .debug_ranges section
(so in the binaries/shared libraries) with DW_AT_ranges from skeleton
units as well as .debug_info.dwo pointing to it through DW_FORM_sec_offset
(and DW_AT_GNU_ranges_base pointing into section, not sure for what
reason exactly).
When DWARF5 support was being added, we've started using .debug_rnglists
section, added DW_AT_rnglists_base to the DW_TAG_skeleton_unit, kept
DW_AT_ranges with DW_FORM_sec_offset in the skeleton and switched
over to DW_FORM_rnglistx for DW_AT_ranges in .debug_info.dwo.
But the DWARF5 spec actually means for the ranges section (at least
everything for those DW_AT_ranges in .debug_info.dwo) to sit
in .debug_rnglists.dwo section next to the .debug_info.dwo, rather than
having consumers look it up in the binary/shared library instead.
Based on some discussions in the DWARF discuss mailing list:
http://lists.dwarfstd.org/pipermail/dwarf-discuss-dwarfstd.org/2021-March/thread.html#4765
this patch mostly follows what LLVM emits for that right now:
1) small .debug_rnglists section (when needed) just to cover the
skeleton DW_AT_ranges (if present); the content of the section
uses the Split DWARFy DW_RLE_* codes with addrx encodings where
possible
2) DW_AT_ranges in the skeleton uses DW_FORM_sec_offset (difference
from LLVM which uses DW_FORM_rnglistx, which makes it larger
and ambiguous)
3) DW_AT_rnglists_base attribute is gone from the skeleton (again,
unlike LLVM where it is just confusing what exactly it means because
it is inherited; it would make sense if we emitted DW_FORM_rnglistx
in non-split DWARF, but unless ranges are shared, I'm afraid we'd
make DWARF larger with fewer relocations by that)
4) usually big .debug_rnglists.dwo section again with using DW_RLE_*x*
where possible
5) DW_AT_ranges with DW_FORM_rnglistx from .debug_info.dwo referring to
that .debug_rnglists.dwo ranges

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-12  Jakub Jelinek  

PR debug/99490
* dwarf2out.c (debug_ranges_dwo_section): New variable.
(DW_RANGES_IDX_SKELETON): Define.
(struct dw_ranges): Add begin_entry and end_entry members.
(DEBUG_DWO_RNGLISTS_SECTION): Define.
(add_ranges_num): Adjust r initializer for addition of *_entry
members.
(add_ranges_by_labels): For -gsplit-dwarf and force_direct,
set idx to DW_RANGES_IDX_SKELETON.
(index_rnglists): Don't set r->idx if it is equal to
DW_RANGES_IDX_SKELETON.  Initialize r->begin_entry and
r->end_entry for -gsplit-dwarf if those will be needed by
output_rnglists.
(output_rnglists): Add DWO argument.  If true, switch to
debug_ranges_dwo_section rather than debug_ranges_section.
Adjust l1/l2 label indexes.  Only output the offset table when
dwo is true and don't include in there the skeleton range
entry if present.  For -gsplit-dwarf, skip ranges that belong
to the other rnglists section.  Change return type from void
to bool and return true if there are any range entries for
the other section.  For dwarf_split_debug_info use
DW_RLE_startx_endx, DW_RLE_startx_length and DW_RLE_base_addressx
entries instead of DW_RLE_start_end, DW_RLE_start_length and
DW_RLE_base_address.
(init_sections_and_labels): Initialize debug_ranges_dwo_section
if -gsplit-dwarf and DWARF >= 5.  Adjust ranges_section_label
and range_base_label indexes.
(dwarf2out_finish): Call index_rnglists earlier before finalizing
.debug_addr.  Never emit DW_AT_rnglists_base attribute.  For
-gsplit-dwarf and DWARF >= 5 call output_rnglists up to twice
with different dwo arguments.
(dwarf2out_c_finalize): Clear debug_ranges_dwo_section.

--- gcc/dwarf2out.c.jj  2021-03-10 17:36:37.037537129 +0100
+++ gcc/dwarf2out.c 2021-03-11 12:50:00.402418873 +0100
@@ -171,6 +171,7 @@ static GTY(()) section *debug_line_str_s
  static GTY(()) section *debug_str_dwo_section;
  static GTY(()) section *debug_str_offsets_section;
  static GTY(()) section *debug_ranges_section;
+static GTY(()) section *debug_ranges_dwo_section;
  static GTY(()) section *debug_frame_section;
  
  /* Maximum size (in bytes) of an artificially generated label.  */

@@ -3152,11 +3153,17 @@ struct GTY(()) dw_ranges {
/* If this is positive, it's a block number, otherwise it's a
   bitwise-negated index into dw_ranges_by_label.  */
int num;
+  /* If idx is equal to DW_RANGES_IDX_SKELETON, it should be emitted
+ into .debug_rnglists section rather than .debug_rnglists.dwo
+ for -gsplit-dwarf and DWARF >= 5.  */
+#define DW_RANGES_IDX_SKELETON ((1U << 31) - 1)
/* Index for the range list for DW_FORM_rnglistx.  */
unsigned int idx : 31;
/* True if this range might be 

[PATCH] c++: placeholder type constraint and argument packs [PR99815]

2021-03-30 Thread Patrick Palka via Gcc-patches
When checking dependence of a placeholder type constraint, if the first
template argument of the constraint is an argument pack, we need to
expand it so that we properly separate the implicit 'auto' argument from
the rest.
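
To illustrate the idea with a toy model (not GCC's tree representation):
if the first argument is itself a pack such as {auto, int}, looking at
element 0 yields the pack rather than the placeholder, so the packs are
flattened before splitting.  Arg, single, pack, expand_packs and
first_argument are hypothetical names used only in this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a template argument: a single argument or a pack.
struct Arg {
  bool is_pack;
  std::string name;                   // used when !is_pack
  std::vector<std::string> elements;  // used when is_pack
};

Arg single(const std::string &name) { return Arg{false, name, {}}; }
Arg pack(const std::vector<std::string> &elements) {
  return Arg{true, "", elements};
}

// Flattens every pack in place so each slot holds one argument.
std::vector<std::string> expand_packs(const std::vector<Arg> &args) {
  std::vector<std::string> flat;
  for (const Arg &a : args) {
    if (a.is_pack)
      flat.insert(flat.end(), a.elements.begin(), a.elements.end());
    else
      flat.push_back(a.name);
  }
  return flat;
}

// Returns the first non-pack argument, which for a placeholder type
// constraint is expected to be the implicit 'auto'.
std::string first_argument(const std::vector<Arg> &args) {
  std::vector<std::string> flat = expand_packs(args);
  assert(!flat.empty());
  return flat.front();
}
```

Here first_argument({pack({"auto", "int"}), single("char")}) returns
"auto", and the remaining flattened arguments are the ones left to be
checked for dependence.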

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

PR c++/99815
* pt.c (placeholder_type_constraint_dependent_p): Expand
argument packs to separate the first non-pack argument
from the rest.

gcc/testsuite/ChangeLog:

PR c++/99815
* g++.dg/cpp2a/concepts-placeholder5.C: New test.
---
 gcc/cp/pt.c   |  5 +++
 .../g++.dg/cpp2a/concepts-placeholder5.C  | 32 +++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index a056ecefd1d..dc6f2f37f9b 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28189,6 +28189,11 @@ placeholder_type_constraint_dependent_p (tree t)
   tree id = unpack_concept_check (t);
   tree args = TREE_OPERAND (id, 1);
   tree first = TREE_VEC_ELT (args, 0);
+  if (ARGUMENT_PACK_P (first))
+{
+  args = expand_template_argument_pack (args);
+  first = TREE_VEC_ELT (args, 0);
+}
   gcc_checking_assert (TREE_CODE (first) == WILDCARD_DECL
   || is_auto (first));
   for (int i = 1; i < TREE_VEC_LENGTH (args); ++i)
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C b/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C
new file mode 100644
index 00000000000..eaea41a36eb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-placeholder5.C
@@ -0,0 +1,32 @@
+// PR c++/99815
+// { dg-do compile { target c++20 } }
+
+template 
+struct is_same { static constexpr bool value = false; };
+
+template 
+struct is_same { static constexpr bool value = true; };
+
+template 
+concept C = is_same::value; // { dg-error "wrong number" }
+
+template  void f() {
+  C auto x = 0; // { dg-error "constraints" }
+}
+
+template void f(); // { dg-bogus "" }
+template void f(); // { dg-message "required from here" }
+template void f<>(); // { dg-message "required from here" }
+template void f(); // { dg-message "required from here" }
+
+template  void g() {
+  C auto x = 0; // { dg-error "constraints" }
+}
+
+template void g<>(); // { dg-bogus "" }
+template void g(); // { dg-message "required from here" }
+
+template  void h() {
+  C auto x = 0; // { dg-error "constraints" }
+  C auto y = 0;
+}
-- 
2.31.1.133.g84d06cdc06



Re: [PATCH, rs6000 V3] Update "prefix" attribute for Power10 [PR99133]

2021-03-30 Thread Segher Boessenkool
Hi!

On Tue, Mar 30, 2021 at 02:38:32PM -0500, Pat Haugen wrote:
> Update prefixed attribute for Power10.
> 
> This patch creates a new attribute, "maybe_prefixed", which is used to mark
> those instructions that may have a prefixed form. The existing "prefixed"
> attribute is now used to mark all instructions that are prefixed form.

It doesn't yet set maybe_prefixed on most insns that will need it?  What
does that mean for say movsi that can be plwz?

> +;; Whether this insn has a prefixed form and a non-prefixed form.
> +(define_attr "maybe_prefixed" "no,yes"
> +  (if_then_else (eq_attr "type" "load,fpload,vecload,store,fpstore,vecstore,
> +  integer,add")
> + (const_string "yes")
> + (const_string "no")))

Ah.  Okay :-)  It probably is better to set the maybe_prefixed attribute
explicitly, but this will do for now.  Status quo and all that.

I don't see how the "add" and "integer" cases can work...

>  (define_attr "prefixed" "no,yes"
>(cond [(ior (match_test "!TARGET_PREFIXED")
> -   (match_test "!NONJUMP_INSN_P (insn)"))
> +   (match_test "!NONJUMP_INSN_P (insn)")
> +   (eq_attr "maybe_prefixed" "no"))
>(const_string "no")

It's a cond, so you can have separate cases instead of ior, if that
reads better (the generated compiler code will be equivalent).  It can
help if you want to place some comments, for example ;-)


Okay for trunk.  Thank you!


Segher


[PATCH] aarch64: Fix up *add3_poly_1 [PR99813]

2021-03-30 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, Uai constraint stands for
aarch64_sve_scalar_inc_dec_immediate
while Uav for
aarch64_sve_addvl_addpl_immediate.
Both *add3_aarch64 and *add3_poly_1 patterns use
  * return aarch64_output_sve_scalar_inc_dec (operands[2]);
  * return aarch64_output_sve_addvl_addpl (operands[2]);
in that order, but the former with Uai,Uav order, while the
latter with Uav,Uai instead.  This patch swaps the constraints
so that they match the output.
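
To make the mismatch concrete, here is a toy C++ model of how an alternative index ties a constraint to an output routine.  It is a sketch only: the Pattern struct, output_for and the short routine labels are made up for illustration and are not GCC code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Toy model: the alternative whose constraint accepts the operand is
// chosen, and the output routine with the same index is used.
struct Pattern {
  std::array<std::string, 2> constraints;  // one constraint per alternative
  std::array<std::string, 2> outputs;      // one output routine per alternative
};

// Returns the label of the output routine that would run for an operand
// satisfying `constraint`, or "" if no alternative accepts it.
std::string output_for(const Pattern &p, const std::string &constraint) {
  for (std::size_t i = 0; i < p.constraints.size(); ++i)
    if (p.constraints[i] == constraint)
      return p.outputs[i];
  return "";
}

// *add3_poly_1 before the patch: Uav,Uai against inc_dec,addvl_addpl.
const Pattern before_fix{{"Uav", "Uai"}, {"inc_dec", "addvl_addpl"}};
// After the patch the constraint order matches the output order.
const Pattern after_fix{{"Uai", "Uav"}, {"inc_dec", "addvl_addpl"}};
```

In this model a Uai operand reaches the addvl/addpl routine before the swap and the inc/dec routine after it.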

Bootstrapped/regtested on aarch64-linux, ok for trunk?

2021-03-30  Jakub Jelinek  
Richard Sandiford  

PR target/99813
* config/aarch64/aarch64.md (*add3_poly_1): Swap Uai and Uav
constraints on operands[2] and similarly 0 and rk constraints
on operands[1] corresponding to that.

* g++.target/aarch64/sve/pr99813.C: New test.

--- gcc/config/aarch64/aarch64.md.jj2021-02-25 23:07:07.851319165 +0100
+++ gcc/config/aarch64/aarch64.md   2021-03-30 11:13:35.994077470 +0200
@@ -2050,8 +2050,8 @@
   [(set
 (match_operand:GPI 0 "register_operand" "=r,r,r,r,r,r,")
 (plus:GPI
- (match_operand:GPI 1 "register_operand" "%rk,rk,rk,rk,rk,0,rk")
- (match_operand:GPI 2 "aarch64_pluslong_or_poly_operand" "I,r,J,Uaa,Uav,Uai,Uat")))]
+ (match_operand:GPI 1 "register_operand" "%rk,rk,rk,rk,0,rk,rk")
+ (match_operand:GPI 2 "aarch64_pluslong_or_poly_operand" "I,r,J,Uaa,Uai,Uav,Uat")))]
   "TARGET_SVE && operands[0] != stack_pointer_rtx"
   "@
   add\\t%0, %1, %2
--- gcc/testsuite/g++.target/aarch64/sve/pr99813.C.jj   2021-03-30 
11:22:13.430290522 +0200
+++ gcc/testsuite/g++.target/aarch64/sve/pr99813.C  2021-03-30 
11:22:55.526819721 +0200
@@ -0,0 +1,27 @@
+// PR target/99813
+/* { dg-do assemble { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O3 -march=armv8.2-a+sve -fvect-cost-model=unlimited -fno-tree-dominator-opts -mtune=cortex-a72" } */
+
+long a, b;
+bool c[2][14][2][16], f[2][14][2][16];
+bool d;
+char e[2][4][2][6];
+void g() {
+  a = 0;
+  for (int h = 0; h < 2; ++h)
+for (int i = 0; i < 14; ++i)
+  for (int j = 0; j < 2; ++j)
+for (int k = 0; k < 16; ++k)
+  c[h][i][j][k] = 0;
+  d = 0;
+  for (int h; h < 2; ++h)
+for (int i = 0; i < 4; ++i)
+  for (int j = 0; j < 2; ++j)
+for (int k = 0; k < 6; ++k)
+  e[h][i][j][k] = 6;
+  for (int h = 0; h < 2; ++h)
+for (int i = 0; i < 14; ++i)
+  for (int j = 0; j < 2; ++j)
+for (int k = 0; k < 16; ++k)
+  f[h][i][j][k] = b = 9;
+}

Jakub



Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2021-03-30 Thread Jan Hubicka
Hi,
this patch backports the initial support to the gcc10 branch.  Since the
trunk and the branch have diverged, there is a non-trivial change to
cpuinfo discovery.  I do:

--- a/libgcc/config/i386/cpuinfo.c
+++ b/libgcc/config/i386/cpuinfo.c
@@ -111,6 +111,12 @@ get_amd_cpu (unsigned int family, unsigned int model)
   if (model >= 0x30)
 __cpu_model.__cpu_subtype = AMDFAM17H_ZNVER2;
   break;
+case 0x19:
+  __cpu_model.__cpu_type = AMDFAM19H;
+  /* AMD family 19h version 1.  */
+  if (model <= 0x0f)
+   __cpu_model.__cpu_subtype = AMDFAM19H_ZNVER3;
+  break;
 default:
   break;
 }

While your patch also sets ZNVER3 for the case where VAES is supported,
that would require backporting more of the logic detecting VAES.  Is that
necessary?  I see it may make znver3 the default on a future znver4 if
it stays with amdfam19, but we did not do this before.
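
As a standalone illustration of the hunk above, here is a small C++ sketch of the family 19h classification.  The enum and struct names are simplified stand-ins chosen for the example, not the real libgcc declarations, and only the family 0x19 case from the hunk is modelled.

```cpp
#include <cassert>

// Simplified stand-ins for the libgcc cpu type/subtype enums.
enum CpuType { CPU_TYPE_UNKNOWN, CPU_AMDFAM19H };
enum CpuSubtype { CPU_SUBTYPE_UNKNOWN, CPU_AMDFAM19H_ZNVER3 };

struct CpuModel {
  CpuType type;
  CpuSubtype subtype;
};

// Mirrors the new case 0x19: the type is always set for the family,
// the subtype only for models 0x00..0x0f.
CpuModel classify_amd(unsigned family, unsigned model) {
  CpuModel m{CPU_TYPE_UNKNOWN, CPU_SUBTYPE_UNKNOWN};
  switch (family) {
    case 0x19:
      m.type = CPU_AMDFAM19H;
      if (model <= 0x0f)
        m.subtype = CPU_AMDFAM19H_ZNVER3;
      break;
    default:
      break;
  }
  return m;
}
```

A family 19h part with a model above 0x0f gets the type but no subtype here, which is the behaviour the question about VAES is weighing.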

Bootstrapped/regtested x86_64-linux.  With -march=native on a znver3
machine we get the right flags, but trunk in addition passes:

-mno-amx-bf16
-mno-amx-int8
-mno-amx-tile
-mno-avxvnni
-mno-hreset
-mno-kl
-mno-serialize
-mno-tsxldtrk
-mno-uintr
-mno-widekl

These are options we did not backport.
On top of that I plan to backport the tuning patches, with the exception of
gather, which seems a bit controversial and can wait for gcc11.

Honza

2021-03-30  Jan Hubicka  

Backport

Venkataramanan Kumar  
Sharavan Kumar  
* common/config/i386/cpuinfo.h (get_amd_cpu) recognize znver3.
* common/config/i386/i386-common.c (processor_names): Add
znver3.
(processor_alias_table): Add znver3 and AMDFAM19H entry.
* common/config/i386/i386-cpuinfo.h (processor_types): Add
AMDFAM19H.
(processor_subtypes): AMDFAM19H_ZNVER3.
* config.gcc (i[34567]86-*-linux* | ...): Likewise.
* config/i386/driver-i386.c: (host_detect_local_cpu): Let
-march=native recognize znver3 processors.
* config/i386/i386-c.c (ix86_target_macros_internal): Add
znver3.
* config/i386/i386-options.c (m_znver3): New definition.
(m_ZNVER): Include m_znver3.
(processor_cost_table): Add znver3.
* config/i386/i386.c (ix86_reassociation_width): Likewise.
* config/i386/i386.h (TARGET_znver3): New definition.
(enum processor_type): Add PROCESSOR_ZNVER3.
* config/i386/i386.md (define_attr "cpu"): Add znver3.
* config/i386/x86-tune-sched.c: (ix86_issue_rate): Likewise.
(ix86_adjust_cost): Likewise.
* config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS:
Likewise.
* config/i386/znver1.md: Add new reservations for znver3.
* doc/extend.texi: Add details about znver3.
* doc/invoke.texi: Likewise.

gcc/testsuite/ChangeLog:

2021-03-30  Jan Hubicka  

* gcc.target/i386/funcspec-56.inc: Handle new march.

libgcc/ChangeLog:

2021-03-30  Jan Hubicka  

* config/i386/cpuinfo.c (get_amd_cpu): Support amdfam19.
* config/i386/cpuinfo.h (enum processor_types): Add AMDFAM19H.
(enum processor_subtypes): Add AMDFAM19H_ZNVER3.

diff --git a/gcc/common/config/i386/i386-common.c 
b/gcc/common/config/i386/i386-common.c
index 1e4d25f052a..97335d42af1 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1582,7 +1582,8 @@ const char *const processor_names[] =
   "btver1",
   "btver2",
   "znver1",
-  "znver2"
+  "znver2",
+  "znver3"
 };
 
 /* Guarantee that the array is aligned with enum processor_type.  */
@@ -1775,6 +1776,16 @@ const pta processor_alias_table[] =
   | PTA_CLZERO | PTA_CLFLUSHOPT | PTA_XSAVEC | PTA_XSAVES
   | PTA_SHA | PTA_LZCNT | PTA_POPCNT | PTA_CLWB | PTA_RDPID
   | PTA_WBNOINVD},
+  {"znver3", PROCESSOR_ZNVER2, CPU_ZNVER2,
+PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+  | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
+  | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_AVX2
+  | PTA_BMI | PTA_BMI2 | PTA_F16C | PTA_FMA | PTA_PRFCHW
+  | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT | PTA_FSGSBASE
+  | PTA_RDRND | PTA_MOVBE | PTA_MWAITX | PTA_ADX | PTA_RDSEED
+  | PTA_CLZERO | PTA_CLFLUSHOPT | PTA_XSAVEC | PTA_XSAVES
+  | PTA_SHA | PTA_LZCNT | PTA_POPCNT | PTA_CLWB | PTA_RDPID
+  | PTA_WBNOINVD | PTA_VAES | PTA_VPCLMULQDQ | PTA_PKU},
   {"btver1", PROCESSOR_BTVER1, CPU_GENERIC,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
   | PTA_SSSE3 | PTA_SSE4A | PTA_ABM | PTA_CX16 | PTA_PRFCHW
diff --git a/gcc/config.gcc b/gcc/config.gcc
index d093b6b7f79..6fcdd771d4c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -662,7 +662,7 @@ pentium4 pentium4m pentiumpro prescott lakemont"
 # 64-bit x86 processors supported by --with-arch=.  Each processor
 # MUST be separated by exactly one space.
 x86_64_archs="amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
-bdver3 bdver4 znver1 znver2 btver1 btver2 k8 k8-sse3 opteron \
+bdver3 bdver4 

[PATCH, rs6000 V3] Update "prefix" attribute for Power10 [PR99133]

2021-03-30 Thread Pat Haugen via Gcc-patches
Update prefixed attribute for Power10.

This patch creates a new attribute, "maybe_prefixed", which is used to mark
those instructions that may have a prefixed form. The existing "prefixed"
attribute is now used to mark all instructions that are prefixed form.

This patch differs from the prior version in that it doesn't modify the
existing settings of the "prefixed" attribute but just adds the new attribute
and sets/tests it appropriately.

Bootstrap/regtest on powerpc64le (Power10) and powerpc64 (Power8 32/64) with no
new regressions. Ok for trunk?

-Pat


2021-03-30  Pat Haugen  

gcc/
PR target/99133
* config/rs6000/altivec.md (xxspltiw_v4si, xxspltiw_v4sf_inst,
xxspltidp_v2df_inst, xxsplti32dx_v4si_inst, xxsplti32dx_v4sf_inst,
xxblend_, xxpermx_inst, xxeval): Mark prefixed.
* config/rs6000/mma.md (mma_, mma_,
mma_, mma_, mma_, mma_,
mma_, mma_, mma_, mma_):
Likewise.
* config/rs6000/rs6000.c (rs6000_final_prescan_insn): Adjust test. 
* config/rs6000/rs6000.md (define_attr "maybe_prefixed"): New.
(define_attr "prefixed"): Update initializer.
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 27a269b9e72..21f1cc6f15b 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -826,7 +826,8 @@ (define_insn "xxspltiw_v4si"
 UNSPEC_XXSPLTIW))]
  "TARGET_POWER10"
  "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")])
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "yes")])
 
 (define_expand "xxspltiw_v4sf"
   [(set (match_operand:V4SF 0 "register_operand" "=wa")
@@ -845,7 +846,8 @@ (define_insn "xxspltiw_v4sf_inst"
 UNSPEC_XXSPLTIW))]
  "TARGET_POWER10"
  "xxspltiw %x0,%1"
- [(set_attr "type" "vecsimple")])
+ [(set_attr "type" "vecsimple")
+  (set_attr "prefixed" "yes")])
 
 (define_expand "xxspltidp_v2df"
   [(set (match_operand:V2DF 0 "register_operand" )
@@ -864,7 +866,8 @@ (define_insn "xxspltidp_v2df_inst"
 UNSPEC_XXSPLTID))]
   "TARGET_POWER10"
   "xxspltidp %x0,%1"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "prefixed" "yes")])
 
 (define_expand "xxsplti32dx_v4si"
   [(set (match_operand:V4SI 0 "register_operand" "=wa")
@@ -893,7 +896,8 @@ (define_insn "xxsplti32dx_v4si_inst"
 UNSPEC_XXSPLTI32DX))]
   "TARGET_POWER10"
   "xxsplti32dx %x0,%2,%3"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "prefixed" "yes")])
 
 (define_expand "xxsplti32dx_v4sf"
   [(set (match_operand:V4SF 0 "register_operand" "=wa")
@@ -921,7 +925,8 @@ (define_insn "xxsplti32dx_v4sf_inst"
 UNSPEC_XXSPLTI32DX))]
   "TARGET_POWER10"
   "xxsplti32dx %x0,%2,%3"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "prefixed" "yes")])
 
 (define_insn "xxblend_"
   [(set (match_operand:VM3 0 "register_operand" "=wa")
@@ -931,7 +936,8 @@ (define_insn "xxblend_"
UNSPEC_XXBLEND))]
   "TARGET_POWER10"
   "xxblendv %x0,%x1,%x2,%x3"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "prefixed" "yes")])
 
 (define_expand "xxpermx"
   [(set (match_operand:V2DI 0 "register_operand" "+wa")
@@ -975,7 +981,8 @@ (define_insn "xxpermx_inst"
 UNSPEC_XXPERMX))]
   "TARGET_POWER10"
   "xxpermx %x0,%x1,%x2,%x3,%4"
-  [(set_attr "type" "vecsimple")])
+  [(set_attr "type" "vecsimple")
+   (set_attr "prefixed" "yes")])
 
 (define_expand "vstrir_"
   [(set (match_operand:VIshort 0 "altivec_register_operand")
@@ -3623,7 +3630,8 @@ (define_insn "xxeval"
 UNSPEC_XXEVAL))]
"TARGET_POWER10"
"xxeval %0,%1,%2,%3,%4"
-   [(set_attr "type" "vecsimple")])
+   [(set_attr "type" "vecsimple")
+(set_attr "prefixed" "yes")])
 
 (define_expand "vec_unpacku_hi_v16qi"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index a00d3a3de26..1f6fc03d2ac 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -540,7 +540,8 @@ (define_insn "mma_"
MMA_VVI4I4I8))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4,%5"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
   [(set (match_operand:XO 0 "fpr_reg_operand" "=")
@@ -553,7 +554,8 @@ (define_insn "mma_"
MMA_AVVI4I4I8))]
   "TARGET_MMA"
   " %A0,%x2,%x3,%4,%5,%6"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
   [(set (match_operand:XO 0 "fpr_reg_operand" "=")
@@ -565,7 +567,8 @@ (define_insn "mma_"
MMA_VVI4I4I2))]
   "TARGET_MMA"
   " %A0,%x1,%x2,%3,%4,%5"
-  [(set_attr "type" "mma")])
+  [(set_attr "type" "mma")
+   (set_attr "prefixed" "yes")])
 
 (define_insn "mma_"
   [(set (match_operand:XO 0 "fpr_reg_operand" "=")

[PATCH] c++: Adjust mangling of __alignof__ [PR88115]

2021-03-30 Thread Patrick Palka via Gcc-patches
We currently mangle __alignof__ as a vendor extended operator,
but that's problematic for the reasons mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88115#c6.

This patch changes the mangling of __alignof__ to instead use the
new "vendor extended expression" syntax that's proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112.  Clang does
the same thing already, so after this patch both GCC and Clang agree
about the mangling of __alignof__(type) and __alignof__(expr).
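
For reference, a minimal C++ sketch of the string this produces, assuming the "u <source-name> <template-arg>* E" shape from the ABI proposal.  The helper names are made up for the example and the template arguments are passed in already mangled; this is not the mangle.c code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// <source-name> ::= <length> <identifier>
std::string source_name(const std::string &id) {
  return std::to_string(id.size()) + id;
}

// Vendor extended expression: 'u', the source name, each (already
// mangled) template argument, then a terminating 'E'.
std::string vendor_expr(const std::string &name,
                        const std::vector<std::string> &mangled_args) {
  std::string out = "u" + source_name(name);
  for (const std::string &arg : mangled_args)
    out += arg;
  out += 'E';
  return out;
}
```

For __alignof__(T) with T the first template parameter (T_), this gives u11__alignof__T_E, and wrapping it in DT ... E yields the DTu11__alignof__T_EE that the adjusted alignof7.C test scans for.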

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

PR c++/88115
* mangle.c (write_expression): Adjust the mangling of
__alignof__.

include/ChangeLog:

PR c++/88115
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_VENDOR_EXPR.

libiberty/ChangeLog:

PR c++/88115
* cp-demangle.c (d_dump, d_make_comp, d_expression_1)
(d_count_templates_scopes): Handle DEMANGLE_COMPONENT_VENDOR_EXPR.
(d_print_comp_inner): Likewise.
: Revert r11-4926
change.
: Likewise.
* testsuite/demangle-expected: Adjust __alignof__ mangling
tests.

gcc/testsuite/ChangeLog:

PR c++/88115
* g++.dg/cpp0x/alignof7.C: Adjust expected mangling.
---
 gcc/cp/mangle.c   |  8 ++---
 gcc/testsuite/g++.dg/cpp0x/alignof7.C |  4 +--
 include/demangle.h|  3 ++
 libiberty/cp-demangle.c   | 47 +++
 libiberty/testsuite/demangle-expected |  4 +--
 5 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 0a9e5aa79a0..57ce9a6710f 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -3124,11 +3124,9 @@ write_expression (tree expr)
  if (abi_version_at_least (15))
{
  /* We used to mangle __alignof__ like alignof.  */
- write_string ("v111__alignof__");
- if (TYPE_P (TREE_OPERAND (expr, 0)))
-   write_type (TREE_OPERAND (expr, 0));
- else
-   write_expression (TREE_OPERAND (expr, 0));
+ write_string ("u11__alignof__");
+ write_template_arg (TREE_OPERAND (expr, 0));
+ write_char ('E');
  return;
}
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/alignof7.C 
b/gcc/testsuite/g++.dg/cpp0x/alignof7.C
index a4d7f24a4d7..2369b879392 100644
--- a/gcc/testsuite/g++.dg/cpp0x/alignof7.C
+++ b/gcc/testsuite/g++.dg/cpp0x/alignof7.C
@@ -18,5 +18,5 @@ template void f4(std::size_t);
 
 // { dg-final { scan-assembler "_Z2f1IiEvDTatT_E" } }
 // { dg-final { scan-assembler "_Z2f2IiEvDTaztlT_EE" } }
-// { dg-final { scan-assembler "_Z2f3IiEvDTv111__alignof__T_E" } }
-// { dg-final { scan-assembler "_Z2f4IiEvDTv111__alignof__tlT_EE" } }
+// { dg-final { scan-assembler "_Z2f3IiEvDTu11__alignof__T_EE" } }
+// { dg-final { scan-assembler "_Z2f4IiEvDTu11__alignof__XtlT_" } }
diff --git a/include/demangle.h b/include/demangle.h
index 23b47265d94..b45234e6887 100644
--- a/include/demangle.h
+++ b/include/demangle.h
@@ -408,6 +408,9 @@ enum demangle_component_type
  number which involves neither modifying the mangled string nor
  allocating a new copy of the literal in memory.  */
   DEMANGLE_COMPONENT_LITERAL_NEG,
+  /* A vendor's builtin expression.  The left subtree holds the name of
+ the type, and the right subtree is a template argument list.  */
+  DEMANGLE_COMPONENT_VENDOR_EXPR,
   /* A libgcj compiled resource.  The left subtree is the name of the
  resource.  */
   DEMANGLE_COMPONENT_JAVA_RESOURCE,
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index d3e798455cc..a528b7b5ed3 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -815,6 +815,9 @@ d_dump (struct demangle_component *dc, int indent)
 case DEMANGLE_COMPONENT_LITERAL_NEG:
   printf ("negative literal\n");
   break;
+case DEMANGLE_COMPONENT_VENDOR_EXPR:
+  printf ("vendor expression\n");
+  break;
 case DEMANGLE_COMPONENT_JAVA_RESOURCE:
   printf ("java resource\n");
   break;
@@ -976,6 +979,7 @@ d_make_comp (struct d_info *di, enum 
demangle_component_type type,
 case DEMANGLE_COMPONENT_TRINARY_ARG1:
 case DEMANGLE_COMPONENT_LITERAL:
 case DEMANGLE_COMPONENT_LITERAL_NEG:
+case DEMANGLE_COMPONENT_VENDOR_EXPR:
 case DEMANGLE_COMPONENT_COMPOUND_NAME:
 case DEMANGLE_COMPONENT_VECTOR_TYPE:
 case DEMANGLE_COMPONENT_CLONE:
@@ -3345,6 +3349,7 @@ d_unresolved_name (struct d_info *di)
 ::= st 
 ::= 
::= 
+   ::= u  * E # vendor extended 
expression
 ::= 
 
::= 
@@ -3425,6 +3430,15 @@ d_expression_1 (struct d_info *di)
   return d_make_comp (di, DEMANGLE_COMPONENT_INITIALIZER_LIST,
  type, d_exprlist (di, 'E'));
 }
+  else if (peek == 'u')
+{

c++: duplicate const static members [PR 99283]

2021-03-30 Thread Nathan Sidwell


This is the bug that keeps on giving.  Reducing it has been successful 
at hitting other defects. In this case, some more specialization hash 
table fun, plus an issue with reading in a definition of a duplicated 
declaration.  At least I discovered a null context check is no longer 
needed.


PR c++/99283
gcc/cp/
* module.cc (dumper::operator): Make less brittle.
(trees_out::core_bools): VAR_DECLs always have a context.
(trees_out::key_mergeable): Use same_type_p for asserting.
(trees_in::read_var_def): Propagate
DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P.
gcc/testsuite/
* g++.dg/modules/pr99283-5.h: New.
* g++.dg/modules/pr99283-5_a.H: New.
* g++.dg/modules/pr99283-5_b.H: New.
* g++.dg/modules/pr99283-5_c.C: New.

--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 8a1cfbdfdcb..fab6b573d24 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -4325,8 +4325,8 @@ dumper::operator () (const char *format, ...)
 	case 'N': /* Name.  */
 	  {
 	tree t = va_arg (args, tree);
-	if (t && TREE_CODE (t) == OVERLOAD)
-	  t = OVL_FIRST (t);
+	while (t && TREE_CODE (t) == OVERLOAD)
+	  t = OVL_FUNCTION (t);
 	fputc ('\'', dumps->stream);
 	dumps->nested_name (t);
 	fputc ('\'', dumps->stream);
@@ -5206,8 +5206,7 @@ trees_out::core_bools (tree t)
   else if (code == VAR_DECL)
 	{
 	  /* This is DECL_INITIALIZED_P.  */
-	  if (DECL_CONTEXT (t)
-	  && TREE_CODE (DECL_CONTEXT (t)) != FUNCTION_DECL)
+	  if (TREE_CODE (DECL_CONTEXT (t)) != FUNCTION_DECL)
 	/* We'll set this when reading the definition.  */
 	flag_1 = false;
 	}
@@ -10331,8 +10330,8 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 	  if (mk & MK_tmpl_alias_mask)
 		/* It should be in both tables.  */
 		gcc_checking_assert
-		  (match_mergeable_specialization (false, entry)
-		   == TREE_TYPE (existing));
+		  (same_type_p (match_mergeable_specialization (false, entry),
+TREE_TYPE (existing)));
 	  if (mk & MK_tmpl_tmpl_mask)
 		existing = DECL_TI_TEMPLATE (existing);
 	}
@@ -10345,7 +10344,10 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 	}
 
 	  /* The walkabout should have found ourselves.  */
-	  gcc_checking_assert (existing == decl);
+	  gcc_checking_assert (TREE_CODE (decl) == TYPE_DECL
+			   ? same_type_p (TREE_TYPE (decl),
+	  TREE_TYPE (existing))
+			   : existing == decl);
 	}
 }
   else if (mk != MK_unique)
@@ -11513,7 +11515,11 @@ trees_in::read_var_def (tree decl, tree maybe_template)
   if (DECL_EXTERNAL (decl))
 	DECL_NOT_REALLY_EXTERN (decl) = true;
   if (VAR_P (decl))
-	DECL_INITIALIZED_P (decl) = true;
+	{
+	  DECL_INITIALIZED_P (decl) = true;
+	  if (maybe_dup && DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (maybe_dup))
+	DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = true;
+	}
   DECL_INITIAL (decl) = init;
   if (!dyn_init)
 	;
diff --git c/gcc/testsuite/g++.dg/modules/pr99283-5.h w/gcc/testsuite/g++.dg/modules/pr99283-5.h
new file mode 100644
index 000..3c3421f2d1c
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99283-5.h
@@ -0,0 +1,9 @@
+template
+struct __traits
+{
+  static const int __digits = 8;
+  static const _Value __min = 0;
+};
+
+template
+const _Value __traits<_Value>::__min;
diff --git c/gcc/testsuite/g++.dg/modules/pr99283-5_a.H w/gcc/testsuite/g++.dg/modules/pr99283-5_a.H
new file mode 100644
index 000..6406dfe8102
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99283-5_a.H
@@ -0,0 +1,14 @@
+// PR 99283 part 5
+// { dg-additional-options -fmodule-header }
+// { dg-module-cmi {} }
+
+#include "pr99283-5.h"
+
+template
+const int __traits<_Value>::__digits;
+
+template
+void Foo ()
+{
+  __traits::__digits;
+}
diff --git c/gcc/testsuite/g++.dg/modules/pr99283-5_b.H w/gcc/testsuite/g++.dg/modules/pr99283-5_b.H
new file mode 100644
index 000..3f4237e2e4f
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99283-5_b.H
@@ -0,0 +1,12 @@
+// { dg-additional-options {-fmodule-header -fno-module-lazy} }
+// { dg-module-cmi {} }
+
+#include "pr99283-5.h"
+
+template
+void Bar ()
+{
+  __traits::__min;
+}
+
+import  "pr99283-5_a.H";
diff --git c/gcc/testsuite/g++.dg/modules/pr99283-5_c.C w/gcc/testsuite/g++.dg/modules/pr99283-5_c.C
new file mode 100644
index 000..cc7e795c829
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99283-5_c.C
@@ -0,0 +1,5 @@
+// { dg-additional-options {-fmodules-ts -fno-module-lazy} }
+
+import  "pr99283-5_b.H";
+
+static_assert(!__traits::__min);


Re: [PATCH] c++: Fix ICE on PTRMEM_CST in lambda in inline var initializer [PR99790]

2021-03-30 Thread Jason Merrill via Gcc-patches

On 3/30/21 5:38 AM, Jakub Jelinek wrote:

Hi!

The following testcase ICEs (since the addition of inline var support),
because the lambda contains PTRMEM_CST but finish_function is called for the
lambda quite early, while it is still being parsed (from
finish_lambda_function), when the containing class is still incomplete.  That
means that during
genericization cplus_expand_constant keeps the PTRMEM_CST unmodified, but
later nothing lowers it when the class is finalized.
Using sizeof etc. on the class in such contexts is rejected by both g++ and
clang++, and when the PTRMEM_CST appears e.g. in static var initializers
rather than in functions, we handle it correctly because c_parse_final_cleanups
-> lower_var_init will handle those cplus_expand_constant when all classes
are already finalized.

The following patch fixes it by calling cplus_expand_constant again during
gimplification; as we are now unconditionally unit-at-a-time, I'd think
everything that could be completed will be before we start gimplification.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2021-03-30  Jakub Jelinek  

PR c++/99790
* cp-gimplify.c (cp_gimplify_expr): Handle PTRMEM_CST.

* g++.dg/cpp1z/pr99790.C: New test.

--- gcc/cp/cp-gimplify.c.jj 2021-03-20 17:01:59.791040946 +0100
+++ gcc/cp/cp-gimplify.c2021-03-29 14:27:37.532223156 +0200
@@ -660,6 +660,14 @@ cp_gimplify_expr (tree *expr_p, gimple_s
ret = GS_UNHANDLED;
break;
  
+case PTRMEM_CST:

+  *expr_p = cplus_expand_constant (*expr_p);
+  if (TREE_CODE (*expr_p) == PTRMEM_CST)
+   ret = GS_ERROR;
+  else
+   ret = GS_OK;
+  break;
+
  case RETURN_EXPR:
if (TREE_OPERAND (*expr_p, 0)
  && (TREE_CODE (TREE_OPERAND (*expr_p, 0)) == INIT_EXPR
--- gcc/testsuite/g++.dg/cpp1z/pr99790.C.jj 2021-03-29 14:38:35.887814018 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/pr99790.C2021-03-29 14:37:41.828422188 
+0200
@@ -0,0 +1,9 @@
+// PR c++/99790
+// { dg-do compile { target c++17 } }
+
+struct A;
+struct B { void (*fn) (A *); };
+template 
+int foo (const T &);
+struct A { int a; static constexpr B b{[] (A *n) { n->*::a = 2; }}; };
+int a = foo (A::b);

Jakub





[PATCH] aarch64: PR target/99820: Guard on available SVE issue info before using

2021-03-30 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This fixes a simple segfault ICE when using the use_new_vector_costs tunable 
with a CPU tuning that it wasn't intended for.
I'm not adding a testcase here as we intend to remove the tunable for GCC 12 
anyway (the new costing logic will remain and will benefit
from this extra check, but the -moverride option will no longer exist).

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

PR target/99820
* config/aarch64/aarch64.c (aarch64_analyze_loop_vinfo): Check for
available issue_info before using it.


nvcn.patch
Description: nvcn.patch


Re: [wwwdoc] gcc-11/changes: Document RISC-V changes

2021-03-30 Thread Kito Cheng via Gcc-patches
Committed with a few tweaks.

On Wed, Mar 24, 2021 at 11:21 AM Kito Cheng  wrote:
>
> ---
>  htdocs/gcc-11/changes.html | 32 +++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index 69869309..ec136349 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -688,7 +688,37 @@ a work-in-progress.
>
>  
>
> -
> +RISC-V
> +
> +  Support big-endian for RISC-V, thanks to Marcus Comstedt.
> +  Implement new style of architecture extension test macros,
> +  each architecture extension has corresponding feature test macro, which
> +  could use to test its existence and version information.
> +  
> +  Legacy architecture extension test macro like 
> __riscv_atomic,
> +  but it will still supported for at least 2 release cycles.
> +  
> +  Support IFUNC for riscv*-*-linux*.
> +  Introduce --with-multilib-generator to configure time 
> option,
> +  this option could flexible config multi-lib settings, syntax is same as
> +  RISC-V's multilib-generator.
> +  
> +  Extend the sytax for multilib-generator, support expansion
> +  operator * to reduce the complexity of complicated 
> multi-lib
> +  re-use rule.
> +  
> +  Support -mcpu=* option, the behavior is aligned to RISC-V
> +  clang/LLVM, it will set pipeline model and architecture extension, like
> +  -mtune=* plus -march=*.
> +  
> +  Support for TLS stack protector canary access, thanks to Cooper Qu.
> +  
> +  Support __builtin_thread_pointer for RISC-V.
> +  Introduce shorten_memrefs optimization, which could 
> reduce the
> +  code size for memory access, thanks to Craig Blackmore.
> +  
> +
> +
>
>  
>
> --
> 2.30.2
>


[PATCH] testsuite, v2: Disable zero-scratch-regs-{8,9,10,11}.c on all but ... [PR97680]

2021-03-30 Thread Jakub Jelinek via Gcc-patches
On Tue, Mar 30, 2021 at 02:33:17PM +0200, Richard Biener via Gcc-patches wrote:
> > I don't know, perhaps.
> > Seems the target hook is only defined on
> > config/i386/i386.c:#undef TARGET_ZERO_CALL_USED_REGS
> > config/i386/i386.c:#define TARGET_ZERO_CALL_USED_REGS 
> > ix86_zero_call_used_regs
> > config/sparc/sparc.c:#undef TARGET_ZERO_CALL_USED_REGS
> > config/sparc/sparc.c:#define TARGET_ZERO_CALL_USED_REGS 
> > sparc_zero_call_used_regs
> > but apparently many of the tests actually succeed on various targets that
> > don't define those hooks.  E.g. I haven't seen them to fail on aarch64,
> > on arm only the -10.c fails, on powerpc*/s390* all {8,9,10,11} fail (plus
> > 5 is skipped on power*-aix*).
> > On ia64 according to testresults {6,7,8,9,10,11} fail, some with ICEs.
> > On mipsel according to testresults {9,10,11} fail, some with ICEs.
> > On nvptx at least 1-9 succeed, 10-11 don't know, don't have assert.h around.
> >
> > So, do we want to fill in negative dg-skip-if for the 6-11 tests or
> > positive?  In any case, is there any hope any of the maintainers or the
> > original submitter will change anything for GCC 12, or are we going to end
> > up with a very narrowly supported feature?
> 
> It looks like the latter - I've seen no attempt by the original authors to 
> make
> the feature work on more targets than they cared for.

So, like this instead?

I've kept {5,6,7} with aix,ia64,ia64 skipped because those seem like
outliers; it works pretty much everywhere but on those.
The rest have known good targets.
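
To spell out what the negated selector means, here is a small C++ sketch.  It is only an approximation: the glob matcher is a hand-written stand-in for the pattern matching DejaGnu does, the function names are made up, and the list is the one used for tests 8, 9 and 11 (test 10 leaves out arm).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal glob matcher supporting '*' and '?', which is enough for the
// target triplet patterns used here.
bool glob_match(const char *pat, const char *str) {
  if (*pat == '\0')
    return *str == '\0';
  if (*pat == '*')
    return glob_match(pat + 1, str)
           || (*str != '\0' && glob_match(pat, str + 1));
  if (*str == '\0')
    return false;
  if (*pat == '?' || *pat == *str)
    return glob_match(pat + 1, str + 1);
  return false;
}

// Models { ! { a b c } }: skip unless the triplet matches a listed pattern.
bool should_skip(const std::string &triplet) {
  static const std::vector<std::string> supported = {
      "i?86*-*-*",    "x86_64*-*-*", "sparc*-*-*",
      "aarch64*-*-*", "arm*-*-*",    "nvptx*-*-*"};
  for (const std::string &pat : supported)
    if (glob_match(pat.c_str(), triplet.c_str()))
      return false;
  return true;
}
```

So a target that nobody has listed is skipped by default, rather than run and reported as failing.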

Tested on x86_64-linux, verified all tests are run there.

2021-03-30  Jakub Jelinek  

PR testsuite/97680
* c-c++-common/zero-scratch-regs-6.c: Skip on ia64.
* c-c++-common/zero-scratch-regs-7.c: Likewise.
* c-c++-common/zero-scratch-regs-8.c: Change from dg-skip-if of
selected unsupported triplets to all targets but selected triplets
of supported targets.
* c-c++-common/zero-scratch-regs-9.c: Likewise.
* c-c++-common/zero-scratch-regs-10.c: Likewise.
* c-c++-common/zero-scratch-regs-11.c: Likewise.

--- gcc/testsuite/c-c++-common/zero-scratch-regs-6.c.jj 2020-10-31 
17:41:19.793739605 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-6.c2021-03-30 
16:19:00.509582587 +0200
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-skip-if "not implemented" { ia64*-*-* } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all-gpr-arg" } */
 
 #include "zero-scratch-regs-1.c"
--- gcc/testsuite/c-c++-common/zero-scratch-regs-7.c.jj 2020-10-31 
17:41:19.793739605 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-7.c2021-03-30 
16:19:12.059454807 +0200
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-skip-if "not implemented" { ia64*-*-* } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all-gpr" } */
 
 #include "zero-scratch-regs-1.c"
--- gcc/testsuite/c-c++-common/zero-scratch-regs-8.c.jj 2020-11-11 
01:46:03.392696119 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-8.c2021-03-30 
16:21:28.453945834 +0200
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { ! { i?86*-*-* x86_64*-*-* sparc*-*-* aarch64*-*-* arm*-*-* nvptx*-*-* } } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all-arg" } */
 
 #include "zero-scratch-regs-1.c"
--- gcc/testsuite/c-c++-common/zero-scratch-regs-9.c.jj 2020-11-11 
01:46:03.392696119 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-9.c2021-03-30 
16:21:41.711799156 +0200
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { ! { i?86*-*-* x86_64*-*-* sparc*-*-* aarch64*-*-* arm*-*-* nvptx*-*-* } } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all" } */
 
 #include "zero-scratch-regs-1.c"
--- gcc/testsuite/c-c++-common/zero-scratch-regs-10.c.jj2021-03-18 
15:32:56.459617723 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-10.c   2021-03-30 
16:21:55.017651951 +0200
@@ -1,6 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "not implemented" { powerpc*-*-* } } */
-/* { dg-skip-if "not implemented" { arm*-*-* } } */
+/* { dg-skip-if "not implemented" { ! { i?86*-*-* x86_64*-*-* sparc*-*-* aarch64*-*-* nvptx*-*-* } } } */
 /* { dg-options "-O2" } */
 
 #include 
--- gcc/testsuite/c-c++-common/zero-scratch-regs-11.c.jj2020-11-11 
01:46:03.392696119 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-11.c   2021-03-30 
16:22:04.439547999 +0200
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { ! { i?86*-*-* x86_64*-*-* sparc*-*-* aarch64*-*-* arm*-*-* nvptx*-*-* } } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all" } */
 
 #include "zero-scratch-regs-10.c"


Jakub



[PATCH] aarch64: PR target/99822 Don't allow zero register in first operand of SUBS/ADDS-immediate

2021-03-30 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

In this PR we end up generating an invalid instruction:
adds x1,xzr,#2

because the pattern accepts zero as an operand in the comparison, but the 
instruction doesn't.
Fix it by adjusting the predicate and constraints.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

PR target/99822
* config/aarch64/aarch64.md (sub3_compare1_imm): Do not allow zero
in operand 1.

gcc/testsuite/ChangeLog:

PR target/99822
* gcc.c-torture/compile/pr99822.c: New test.


zero-cmpimm.patch
Description: zero-cmpimm.patch


Re: [PATCH] slp: reject non-multiple of 2 laned SLP trees (PR99825)

2021-03-30 Thread Richard Biener
On Tue, 30 Mar 2021, Tamar Christina wrote:

> Hi Richi,
> 
> TWO_OPERANDS allows any order or number of combinations of + and - operations
> but the pattern matcher only supports pairs of operations.
> 
> This patch has the pattern matcher for complex numbers reject SLP trees where
> the lanes are not a multiple of 2.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.
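
For anyone following along without the source at hand, here is a standalone C++ sketch of the guard.  Only the empty and odd-length checks come from the patch; the type alias and the alternation loop are simplified assumptions about what the rest of the function does, not a copy of tree-vect-slp-patterns.c.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed shape: each lane is an (operand index, lane index) pair.
using lane_permutation = std::vector<std::pair<unsigned, unsigned>>;

// Returns true when the lanes alternate between operand `even` and
// operand `odd`, starting with `even`.
bool check_evenodd_blend(const lane_permutation &permutes,
                         unsigned even, unsigned odd) {
  // An empty or odd-length permute cannot be split into (even, odd) pairs.
  if (permutes.empty() || permutes.size() % 2 != 0)
    return false;

  const unsigned val[2] = {even, odd};
  for (std::size_t i = 0; i < permutes.size(); ++i)
    if (permutes[i].first != val[i % 2])
      return false;
  return true;
}
```

A three-lane permute is rejected up front in this sketch, before any pairwise reasoning is attempted.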

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/99825
>   * tree-vect-slp-patterns.c (vect_check_evenodd_blend):
>   Reject non-mult 2 lanes.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/99825
>   * gfortran.dg/vect/pr99825.f90: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gfortran.dg/vect/pr99825.f90 b/gcc/testsuite/gfortran.dg/vect/pr99825.f90
> new file mode 100644
> index ..b1423df32254e80dc2958b95083831173011afed
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/vect/pr99825.f90
> @@ -0,0 +1,15 @@
> +! { dg-do compile }
> +! { dg-additional-options "-march=armv8.3-a -O3" { target { aarch64*-*-* } } }
> +
> +program main
> +  complex, dimension(3, 2) :: a
> +  complex, dimension(2, 4) :: b
> +  complex, dimension(3, 4) :: c, res1
> +
> +  data a /0, (-5., -7.), (11., -13.), 0, 0, 0/
> +  data b /0, 0, 0, 0, 0, 0, 0, 0/
> +  data res1 /0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0/
> +
> +  c = matmul(a, conjg(b))
> +  if (any(res1 /= c)) stop 2
> +end program main
> diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
> index cdd61e9a5cf0c5698fd94c29c6615d293ce4c41a..b25655c987698fcca68377e3bc83ff7f3426 100644
> --- a/gcc/tree-vect-slp-patterns.c
> +++ b/gcc/tree-vect-slp-patterns.c
> @@ -332,7 +332,8 @@ static inline bool
>  vect_check_evenodd_blend (lane_permutation_t &permutes,
>  			 unsigned even, unsigned odd)
>  {
> -  if (permutes.length () == 0)
> +  if (permutes.length () == 0
> +  || permutes.length () % 2 != 0)
>  return false;
>  
>unsigned val[2] = {even, odd};
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] arm: Fix emission of Tag_ABI_VFP_args with MVE and -mfloat-abi=hard (PR target/99773)

2021-03-30 Thread Christophe Lyon via Gcc-patches
On Tue, 30 Mar 2021 at 14:59, Richard Earnshaw wrote:
>
>
>
> On 30/03/2021 13:37, Christophe Lyon via Gcc-patches wrote:
> > When compiling with -mfloat-abi=hard -march=armv8.1-m.main+mve, we
> > want to emit Tag_ABI_VFP_args even though we are not emitting
> > floating-point instructions (we need "+mve.fp" for that), because we
> > use MVE registers to pass FP arguments.
> >
> > This patch removes the condition on (! TARGET_SOFT_FLOAT) because this
> > is a case where TARGET_SOFT_FLOAT is true, and TARGET_HARD_FLOAT_ABI
> > is true too.
> >
> > 2021-03-30  Richard Earnshaw  
> >
> >   gcc/
> >   PR target/99773
> >   * config/arm/arm.c (arm_file_start): Fix emission of
> >   Tag_ABI_VFP_args attribute.
>
> For historical reasons, for commits I use rearn...@arm.com as my email
> address.
>
> Otherwise, OK.
>
> R.
>
> :)

Oops sorry ;-)

Pushed, thanks

Christophe

>
> > ---
> >   gcc/config/arm/arm.c | 11 ---
> >   1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index e89f5e2..518bfed 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -28150,14 +28150,11 @@ arm_file_start (void)
> > if (print_tune_info)
> >   arm_print_tune_info ();
> >
> > -  if (! TARGET_SOFT_FLOAT)
> > - {
> > -   if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
> > - arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
> > +  if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
> > + arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
> >
> > -   if (TARGET_HARD_FLOAT_ABI)
> > - arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
> > - }
> > +  if (TARGET_HARD_FLOAT_ABI)
> > + arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
> >
> > > /* Some of these attributes only apply when the corresponding features
> > >    are used.  However we don't have any easy way of figuring this out.
> >


RE: [PATCH] aarch64: PR target/99037 Fix RTL represntation in move_lo_quad patterns

2021-03-30 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Christophe Lyon 
> Sent: 30 March 2021 13:57
> To: Kyrylo Tkachov 
> Cc: Jakub Jelinek ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] aarch64: PR target/99037 Fix RTL represntation in
> move_lo_quad patterns
> 
> On Mon, 29 Mar 2021 at 12:56, Kyrylo Tkachov via Gcc-patches wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Jakub Jelinek 
> > > Sent: 29 March 2021 11:45
> > > To: Kyrylo Tkachov 
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH] aarch64: PR target/99037 Fix RTL represntation in
> > > move_lo_quad patterns
> > >
> > > On Mon, Mar 29, 2021 at 10:41:17AM +, Kyrylo Tkachov wrote:
> > > > Hi all,
> > > >
> > > > This patch fixes the RTL representation of the move_lo_quad patterns to
> > > > use aarch64_simd_or_scalar_imm_zero
> > > > for the zero part rather than a vec_duplicate of zero or a const_int 0.
> > > > The expander that generates them is also adjusted so that we use and
> > > > match the correct const_vector forms throughout.
> > > >
> > > > Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
> > > > Jakub, does this look like what you've had in mind?
> > >
> > > I'd probably also remove VQ_2E iterator that is now unused (the
> > > VQMOV_NO2E
> > > one is used in other patterns).  But otherwise yes.
> >
> > Thanks, I've pushed it to trunk.
> > gcc/ChangeLog:
> >
> > PR target/99037
> > * config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): Use
> > aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
> > matching const_int 0.
> > (move_lo_quad_internal_be_<mode>): Likewise.
> > (move_lo_quad_<mode>): Update for the above.
> > * config/aarch64/iterators.md (VQ_2E): Delete.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/99808
> > * gcc.target/aarch64/pr99808.c: New test.
> >
> 
> The new test fails with -mabi=ilp32:
> FAIL: gcc.target/aarch64/pr99808.c (test for excess errors)
> Excess errors:
> /gcc/testsuite/gcc.target/aarch64/pr99808.c:11:3: error: cannot
> convert a value of type 'long unsigned int' to vector type
> '__Float64x1_t' which has different size
> /gcc/testsuite/gcc.target/aarch64/pr99808.c:11:3: error: cannot
> convert a value of type 'long unsigned int' to vector type
> '__Float64x1_t' which has different size
> /gcc/testsuite/gcc.target/aarch64/pr99808.c:12:3: error: cannot
> convert a value of type 'long unsigned int' to vector type
> '__Float32x2_t' which has different size
> 
> Can you fix it?
> 

Oops, thanks for catching it. Fixed.

gcc/testsuite/ChangeLog:

PR target/99808
* gcc.target/aarch64/pr99808.c: Use ULL constant suffix.
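
The ILP32 failure quoted above comes down to the width of `long`. As a minimal sketch (the helper names are hypothetical and this is not the content of pr99808.c), the following shows why a `ULL` suffix is the portable choice when a constant must be 64 bits wide: C guarantees that `unsigned long long` has at least 64 bits, whereas `unsigned long` is only guaranteed 32 and is 32 bits wide under `-mabi=ilp32`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers, for illustration only.  */

/* Width in bits of the type a UL-suffixed constant gets.  Under LP64
   this is typically 64; under ILP32 it is 32.  */
static size_t
toy_bits_ul (void)
{
  return sizeof (unsigned long) * CHAR_BIT;
}

/* Width in bits of the type a ULL-suffixed constant gets.  The C
   standard requires at least 64 bits on every ABI.  */
static size_t
toy_bits_ull (void)
{
  return sizeof (unsigned long long) * CHAR_BIT;
}
```

A conversion to a 64-bit vector type such as `__Float64x1_t` needs a source of the same size, so a constant of type `unsigned long` only works where `long` happens to be 64 bits wide.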

> Thanks
> 
> Christophe
> 
> > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR target/99037
> > > > * config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): Use
> > > > aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
> > > > matching const_int 0.
> > > > (move_lo_quad_internal_be_<mode>): Likewise.
> > > > (move_lo_quad_<mode>): Update for the above.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR target/99808
> > > > * gcc.target/aarch64/pr99808.c: New test.

> > >
> > >
> > >
> > >   Jakub
> >


ilp32-test.patch
Description: ilp32-test.patch


Re: [PATCH] testsuite: Disable zero-scratch-regs-{8,9,10,11}.c on s390* [PR97680]

2021-03-30 Thread Andreas Krebbel via Gcc-patches
On 3/30/21 12:43 PM, Jakub Jelinek wrote:
> Hi!
> 
> These tests FAIL on s390*:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
>  In function 'foo8':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-10.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
>  In function 'foo8':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-11.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> In file included from 
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-8.c:5:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
>  In function 'foo':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-8.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
>  In function 'foo':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-9.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> Like on powerpc or arm, they need backend support which isn't there and
> likely should be added for GCC 12.
> 
> Ok to skip the test on s390* until then?

Fine with me.

Thanks!

Andreas

> 
> 2021-03-30  Jakub Jelinek  
> 
>   PR testsuite/97680
>   * c-c++-common/zero-scratch-regs-8.c: Skip on s390.
>   * c-c++-common/zero-scratch-regs-9.c: Likewise.
>   * c-c++-common/zero-scratch-regs-10.c: Likewise.
>   * c-c++-common/zero-scratch-regs-11.c: Likewise.
> 
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-8.c.jj   2020-11-11 
> 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-8.c  2021-03-30 
> 12:32:11.099667255 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all-arg" } */
>  
>  #include "zero-scratch-regs-1.c"
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-9.c.jj   2020-11-11 
> 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-9.c  2021-03-30 
> 12:32:26.707493760 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all" } */
>  
>  #include "zero-scratch-regs-1.c"
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-10.c.jj  2021-03-18 
> 15:32:56.459617723 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-10.c 2021-03-30 
> 12:31:56.468829910 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-skip-if "not implemented" { arm*-*-* } } */
>  /* { dg-options "-O2" } */
>  
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-11.c.jj  2020-11-11 
> 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-11.c 2021-03-30 
> 12:32:46.012279152 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all" } */
>  
>  #include "zero-scratch-regs-10.c"
> 
>   Jakub
> 



[GCC 12 PATCH 2/2] ipa-sra: Improve debug info for removed parameters (PR 93385)

2021-03-30 Thread Martin Jambor
Hi,

Whereas the previous patch fixed issues with code left behind after
IPA-SRA removed a parameter but only reset all affected debug bind
statements, this one updates them with expressions which can allow the
debugger to print the removed value - see the added test-case.

Even though I originally did not want to create DEBUG_EXPR_DECLs for
intermediate values, I ended up doing so, because otherwise the code
started creating statements like

   # DEBUG __aD.198693 => [(const struct _Alloc_nodeD.171110 
*)D#195]._M_tD.184726->_M_implD.171154

which is not only a bit scary but also makes gimple-fold ICE.
Therefore I decided they are probably quite necessary and kept
them.

The patch simply notes each removed SSA name present in a debug
statement and then works from it backwards, looking if it can
reconstruct the expression it represents (which can fail if a
non-degenerate PHI node is in the way).  If it can, it populates two
hash maps with those expressions so that 1) removed assignments are
replaced with a debug bind defining a new intermediate DEBUG_EXPR_DECL
and 2) existing debug binds that refer to SSA names that are being
removed now refer to corresponding DEBUG_EXPR_DECLs.

If a removed parameter is passed to another function, the debugging
information still cannot describe its value there - see the xfailed
test in the testcase.  I sort of know what needs to be done but the
handling of debug information for removed parameters is LTO unfriendly
in general and so needs a bit more work.

Bootstrapped and tested on x86_64-linux, i686-linux and aarch64-linux.
Also LTO-bootstrapped and LTO-profiledbootstrapped on x86_64-linux.

I would like to commit the patch early in stage1 for GCC 12 and will
be grateful for a review and any feedback.

Thanks,

Martin

gcc/ChangeLog:

2021-03-29  Martin Jambor  

PR ipa/93385
* ipa-param-manipulation.h (class ipa_param_body_adjustments): New
members remap_with_debug_expressions, m_dead_ssa_debug_equiv,
m_dead_stmt_debug_equiv and prepare_debug_expressions.  Added
parameter to mark_dead_statements.
* ipa-param-manipulation.c: Include tree-phinodes.h and cfgexpand.h.
(ipa_param_body_adjustments::mark_dead_statements): New parameter
debugstack, push into it all SSA names used in debug statements,
produce m_dead_ssa_debug_equiv mapping for the removed param.
(replace_with_mapped_expr): New function.
(ipa_param_body_adjustments::remap_with_debug_expressions): Likewise.
(ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
(ipa_param_body_adjustments::common_initialization): Gather and
process SSA names which will be removed but are in debug statements.  Simplify.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
new members.
* tree-inline.c (remap_gimple_stmt): Create a debug bind when possible
when avoiding a copy of an unnecessary statement.  Remap removed SSA
names in existing debug statements.
(tree_function_versioning): Do not create DEBUG_EXPR_DECL for removed
parameters if we have already done so.

gcc/testsuite/ChangeLog:

2021-03-29  Martin Jambor  

PR ipa/93385
* gcc.dg/guality/ipa-sra-1.c: New test.
---
 gcc/ipa-param-manipulation.c | 281 ++-
 gcc/ipa-param-manipulation.h |  12 +-
 gcc/testsuite/gcc.dg/guality/ipa-sra-1.c |  45 
 gcc/tree-inline.c|  45 ++--
 4 files changed, 306 insertions(+), 77 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/guality/ipa-sra-1.c

diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 3e07fd72fe2..a202501fc95 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "alloc-pool.h"
 #include "symbol-summary.h"
 #include "symtab-clones.h"
+#include "tree-phinodes.h"
+#include "cfgexpand.h"
 
 
 /* Actual prefixes of different newly synthetized parameters.  Keep in sync
@@ -989,10 +991,12 @@ phi_arg_will_live_p (gphi *phi, bitmap blocks_to_copy, tree arg)
 
 /* Populate m_dead_stmts given that DEAD_PARAM is going to be removed without
any replacement or splitting.  REPL is the replacement VAR_SECL to base any
-   remaining uses of a removed parameter on.  */
+   remaining uses of a removed parameter on.  Push all removed SSA names that
+   are used within debug statements to DEBUGSTACK.  */
 
 void
-ipa_param_body_adjustments::mark_dead_statements (tree dead_param)
+ipa_param_body_adjustments::mark_dead_statements (tree dead_param,
+                                                  vec<tree> *debugstack)
 {
   /* Current IPA analyses which remove unused parameters never remove a
  non-gimple register ones which have any use except as parameters in other
@@ -1004,6 +1008,7 @@ ipa_param_body_adjustments::mark_dead_statements (tree 

[GCC 12 PATCH 1/2] ipa-sra: Introduce a mini-DCE to tree-inline.c (PR 93385)

2021-03-30 Thread Martin Jambor
Hi,

PR 93385 reveals that if the user explicitly disables DCE, IPA-SRA
can leave behind statements which are useless because their results
are eventually not used but can have problematic side effects,
especially since their inputs are bogus now that useless parameters
have been removed.

This patch fixes the problem by doing a def-use walk when
materializing clones, marking which statements should not be copied
and which SSA_NAMEs do not need to be computed because eventually they
would be DCEd.
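
As a rough illustration of the def-use walk described above, here is a toy sketch. Everything in it (`toy_stmt`, `toy_mark_dead`, the integer value names) is hypothetical and far simpler than the real code: it ignores PHI nodes, debug statements and side effects, and it iterates to a fixed point instead of following use chains directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy three-address statement: dest = op (src[0], src[1]).  A source
   of -1 means "unused".  Hypothetical model, not a GCC data structure.  */
struct toy_stmt
{
  int dest;
  int src[2];
};

/* Mark every statement that directly or transitively consumes the
   removed value DEAD_NAME.  DEAD_STMT has one entry per statement and
   DEAD_NAME_P one entry per value name; the caller must clear both.  */
static void
toy_mark_dead (const struct toy_stmt *stmts, size_t n_stmts,
               int dead_name, bool *dead_stmt, bool *dead_name_p)
{
  dead_name_p[dead_name] = true;

  /* Repeat until no new statement becomes dead.  */
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (size_t i = 0; i < n_stmts; i++)
        {
          if (dead_stmt[i])
            continue;
          for (int k = 0; k < 2; k++)
            {
              int s = stmts[i].src[k];
              if (s >= 0 && dead_name_p[s])
                {
                  /* The statement reads a dead value, so it must not
                     be copied and its result is dead as well.  */
                  dead_stmt[i] = true;
                  dead_name_p[stmts[i].dest] = true;
                  changed = true;
                  break;
                }
            }
        }
    }
}
```

With value 0 standing for the removed parameter and statements `1 = f(0)`, `2 = g(1, 3)` and `4 = h(3)`, the first two statements end up marked dead while the third, which never touches the parameter, is kept.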

When an argument of a call within such a function is removed,
however, that change needs to be communicated to call redirection code.
This is call specific information and therefore cannot be reasonably
encoded in clone node summary and has to be put in call summaries.
Combining these with stuff in performed_splits in clone_info would be
very cumbersome and therefore this patch removes performed_splits and
moves all of its information into call summaries too.  This also has the
advantage that the code is hopefully a bit easier to understand and we
do not need any special dummy variables.

The new edge summaries are private to ipa-param-manipulation.c and
hopefully will never be needed elsewhere.  Each one simply contains 1) a
mapping from the original argument indices to the actual indices in the
call statement as it is now, 2) information needed to identify
arguments representing pass-through IPA-SRA splits which have
been added to the call arguments in place of an original
argument/reference and 3) a delta to the index where va_args may
start.
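
To make item 1) concrete, the index mapping can be pictured with a small sketch. The function name and the `-1` convention are assumptions made for illustration, and the sketch covers only removed arguments, not splits that add new ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fill INDEX_MAP so that INDEX_MAP[i] is the position of original
   argument i in the rewritten call, or -1 if that argument was
   dropped.  Returns the number of arguments that survive.
   Hypothetical helper, not part of GCC.  */
static int
toy_build_index_map (const bool *removed, size_t n_orig, int *index_map)
{
  int next = 0;
  for (size_t i = 0; i < n_orig; i++)
    index_map[i] = removed[i] ? -1 : next++;
  return next;
}
```

In this simplified model the delta mentioned in item 3) would be the returned count minus the original argument count, so dropping two of five arguments shifts the start of any variadic arguments by -2.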

Bootstrapped and tested on x86_64-linux, i686-linux and aarch64-linux.
Also LTO-bootstrapped and LTO-profiledbootstrapped on x86_64-linux.

I would like to commit the patch early in stage1 for GCC 12 and will
be grateful for a review and any feedback.

Thanks,

Martin


gcc/ChangeLog:

2021-03-24  Martin Jambor  

PR ipa/93385
* symtab-clones.h (clone_info): Removed member param_adjustments.
* ipa-param-manipulation.h: Adjust initial comment to reflect how we
deal with pass-through splits now.
(ipa_param_performed_split): Removed.
(ipa_param_adjustments::modify_call): Adjusted parameters.
(class ipa_param_body_adjustments): New members m_dead_stmts,
m_dead_ssas, mark_dead_statements, modify_call_argument and
m_new_call_arg_modification_info.  Adjusted parameters of
register_replacement, modify_gimple_stmt and modify_call_stmt.
(ipa_verify_edge_has_no_modifications): Declare.
* ipa-param-manipulation.c (struct pass_through_split_map): New type.
(ipa_edge_modification_info): Likewise.
(ipa_edge_modification_sum): Likewise.
(ipa_edge_modifications): New edge summary.
(ipa_verify_edge_has_no_modifications): New function.
(transitive_split_p): Removed.
(transitive_split_map): Likewise.
(init_transitive_splits): Likewise.
(ipa_param_adjustments::modify_call): Adjusted to use the new edge
summary instead of performed_splits.
(ipa_param_body_adjustments::register_replacement): Drop dummy
parameter, set base_index of the created ipa_param_body_replacement.
(phi_arg_will_live_p): New function.
(ipa_param_body_adjustments::mark_dead_statements): New method.
(ipa_param_body_adjustments::common_initialization): Call it.  Do not
create IPA_SRA dummy decls.
(ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
new members.
(simple_tree_swap_info): Removed.
(remap_split_decl_to_dummy): Likewise.
(record_argument_state_1): New function.
(record_argument_state): Likewise.
(ipa_param_body_adjustments::modify_call_stmt): New parameter
orig_stmt.  Do not work with dummy decls, save necessary info about
changes to ipa_edge_modifications.
(ipa_param_body_adjustments::modify_gimple_stmt): New parameter
orig_stmt, pass it to modify_call_stmt.
(ipa_param_body_adjustments::modify_cfun_body): Adjust call to
modify_gimple_stmt.
* tree-inline.c (remap_gimple_stmt): Do not copy dead statements,
reset dead debug statements, pass original statement to
modify_gimple_stmt.
(copy_phis_for_bb): Do not copy dead PHI nodes.
(expand_call_inline): Do not remap performed_splits.
(update_clone_info): Likewise.

gcc/testsuite/ChangeLog:

2021-03-22  Martin Jambor  

PR ipa/93385
* gcc.dg/ipa/pr93385.c: New test.
* gcc.dg/ipa/ipa-sra-23.c: Likewise.
* gcc.dg/ipa/ipa-sra-24.c: Likewise.
* g++.dg/ipa/ipa-sra-4.C: Likewise.
---
 gcc/cgraph.c  |  22 +-
 gcc/cgraphclones.c|   3 -
 gcc/ipa-param-manipulation.c  | 916 --
 gcc/ipa-param-manipulation.h  |  92 ++-
 gcc/symtab-clones.h   |  15 +-
 

Re: [PATCH] arm: Fix emission of Tag_ABI_VFP_args with MVE and -mfloat-abi=hard (PR target/99773)

2021-03-30 Thread Richard Earnshaw via Gcc-patches




On 30/03/2021 13:37, Christophe Lyon via Gcc-patches wrote:
> When compiling with -mfloat-abi=hard -march=armv8.1-m.main+mve, we
> want to emit Tag_ABI_VFP_args even though we are not emitting
> floating-point instructions (we need "+mve.fp" for that), because we
> use MVE registers to pass FP arguments.
>
> This patch removes the condition on (! TARGET_SOFT_FLOAT) because this
> is a case where TARGET_SOFT_FLOAT is true, and TARGET_HARD_FLOAT_ABI
> is true too.
>
> 2021-03-30  Richard Earnshaw
>
> 	gcc/
> 	PR target/99773
> 	* config/arm/arm.c (arm_file_start): Fix emission of
> 	Tag_ABI_VFP_args attribute.

For historical reasons, for commits I use rearn...@arm.com as my email
address.

Otherwise, OK.

R.

:)

> ---
>   gcc/config/arm/arm.c | 11 ---
>   1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index e89f5e2..518bfed 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -28150,14 +28150,11 @@ arm_file_start (void)
>     if (print_tune_info)
>       arm_print_tune_info ();
>
> -  if (! TARGET_SOFT_FLOAT)
> -    {
> -      if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
> -	arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
> +  if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
> +    arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
>
> -      if (TARGET_HARD_FLOAT_ABI)
> -	arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
> -    }
> +  if (TARGET_HARD_FLOAT_ABI)
> +    arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
>
>     /* Some of these attributes only apply when the corresponding features
>        are used.  However we don't have any easy way of figuring this out.



Re: [PATCH] aarch64: PR target/99037 Fix RTL represntation in move_lo_quad patterns

2021-03-30 Thread Christophe Lyon via Gcc-patches
On Mon, 29 Mar 2021 at 12:56, Kyrylo Tkachov via Gcc-patches wrote:
>
>
>
> > -Original Message-
> > From: Jakub Jelinek 
> > Sent: 29 March 2021 11:45
> > To: Kyrylo Tkachov 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] aarch64: PR target/99037 Fix RTL represntation in
> > move_lo_quad patterns
> >
> > On Mon, Mar 29, 2021 at 10:41:17AM +, Kyrylo Tkachov wrote:
> > > Hi all,
> > >
> > > This patch fixes the RTL representation of the move_lo_quad patterns to
> > > use aarch64_simd_or_scalar_imm_zero
> > > for the zero part rather than a vec_duplicate of zero or a const_int 0.
> > > The expander that generates them is also adjusted so that we use and
> > > match the correct const_vector forms throughout.
> > >
> > > Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
> > > Jakub, does this look like what you've had in mind?
> >
> > I'd probably also remove VQ_2E iterator that is now unused (the
> > VQMOV_NO2E
> > one is used in other patterns).  But otherwise yes.
>
> Thanks, I've pushed it to trunk.
> gcc/ChangeLog:
>
> PR target/99037
> * config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): Use
> aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
> matching const_int 0.
> (move_lo_quad_internal_be_<mode>): Likewise.
> (move_lo_quad_<mode>): Update for the above.
> * config/aarch64/iterators.md (VQ_2E): Delete.
>
> gcc/testsuite/ChangeLog:
>
> PR target/99808
> * gcc.target/aarch64/pr99808.c: New test.
>

The new test fails with -mabi=ilp32:
FAIL: gcc.target/aarch64/pr99808.c (test for excess errors)
Excess errors:
/gcc/testsuite/gcc.target/aarch64/pr99808.c:11:3: error: cannot
convert a value of type 'long unsigned int' to vector type
'__Float64x1_t' which has different size
/gcc/testsuite/gcc.target/aarch64/pr99808.c:11:3: error: cannot
convert a value of type 'long unsigned int' to vector type
'__Float64x1_t' which has different size
/gcc/testsuite/gcc.target/aarch64/pr99808.c:12:3: error: cannot
convert a value of type 'long unsigned int' to vector type
'__Float32x2_t' which has different size

Can you fix it?

Thanks

Christophe

> >
> > > gcc/ChangeLog:
> > >
> > > PR target/99037
> > > * config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): Use
> > > aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
> > > matching const_int 0.
> > > (move_lo_quad_internal_be_<mode>): Likewise.
> > > (move_lo_quad_<mode>): Update for the above.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR target/99808
> > > * gcc.target/aarch64/pr99808.c: New test.
> >
> >
> >
> >   Jakub
>


[PATCH] slp: reject non-multiple of 2 laned SLP trees (PR99825)

2021-03-30 Thread Tamar Christina via Gcc-patches
Hi Richi,

TWO_OPERANDS allows any order or number of combinations of + and - operations
but the pattern matcher only supports pairs of operations.

This patch has the pattern matcher for complex numbers reject SLP trees where
the lanes are not a multiple of 2.
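
The effect of the added length check can be sketched in isolation. The types and names below are hypothetical, and the model is deliberately reduced: it only checks that the operands alternate between the two expected values, whereas the real matcher looks at more than that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One lane of a toy permutation: which operand it reads and which
   lane of that operand.  Hypothetical model, not the GCC type.  */
struct toy_lane
{
  unsigned operand;
  unsigned lane;
};

/* Return true if LANES alternates between operand EVEN and operand ODD
   in whole pairs.  An empty or odd-length permutation is rejected up
   front, mirroring the check the patch adds.  */
static bool
toy_check_evenodd_blend (const struct toy_lane *lanes, size_t n,
                         unsigned even, unsigned odd)
{
  if (n == 0 || n % 2 != 0)
    return false;

  const unsigned expect[2] = { even, odd };
  for (size_t i = 0; i < n; i++)
    if (lanes[i].operand != expect[i % 2])
      return false;
  return true;
}
```

Without the `n % 2 != 0` test, a three-lane sequence such as operands 0, 1, 0 would pass the alternation loop even though its last lane has no partner.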

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/99825
* tree-vect-slp-patterns.c (vect_check_evenodd_blend):
Reject non-mult 2 lanes.

gcc/testsuite/ChangeLog:

PR tree-optimization/99825
* gfortran.dg/vect/pr99825.f90: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gfortran.dg/vect/pr99825.f90 b/gcc/testsuite/gfortran.dg/vect/pr99825.f90
new file mode 100644
index ..b1423df32254e80dc2958b95083831173011afed
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/pr99825.f90
@@ -0,0 +1,15 @@
+! { dg-do compile }
+! { dg-additional-options "-march=armv8.3-a -O3" { target { aarch64*-*-* } } }
+
+program main
+  complex, dimension(3, 2) :: a
+  complex, dimension(2, 4) :: b
+  complex, dimension(3, 4) :: c, res1
+
+  data a /0, (-5., -7.), (11., -13.), 0, 0, 0/
+  data b /0, 0, 0, 0, 0, 0, 0, 0/
+  data res1 /0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0/
+
+  c = matmul(a, conjg(b))
+  if (any(res1 /= c)) stop 2
+end program main
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index cdd61e9a5cf0c5698fd94c29c6615d293ce4c41a..b25655c987698fcca68377e3bc83ff7f3426 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -332,7 +332,8 @@ static inline bool
 vect_check_evenodd_blend (lane_permutation_t &permutes,
 			 unsigned even, unsigned odd)
 {
-  if (permutes.length () == 0)
+  if (permutes.length () == 0
+  || permutes.length () % 2 != 0)
 return false;
 
   unsigned val[2] = {even, odd};





[PATCH] arm: Fix emission of Tag_ABI_VFP_args with MVE and -mfloat-abi=hard (PR target/99773)

2021-03-30 Thread Christophe Lyon via Gcc-patches
When compiling with -mfloat-abi=hard -march=armv8.1-m.main+mve, we
want to emit Tag_ABI_VFP_args even though we are not emitting
floating-point instructions (we need "+mve.fp" for that), because we
use MVE registers to pass FP arguments.

This patch removes the condition on (! TARGET_SOFT_FLOAT) because this
is a case where TARGET_SOFT_FLOAT is true, and TARGET_HARD_FLOAT_ABI
is true too.
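
The change in condition can be summarised with a small truth-table sketch. The struct and function names are hypothetical stand-ins for the target macros named above; this is not code from arm.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros named in the patch.  */
struct toy_target
{
  bool soft_float;      /* models TARGET_SOFT_FLOAT */
  bool hard_float_abi;  /* models TARGET_HARD_FLOAT_ABI */
};

/* Condition for emitting Tag_ABI_VFP_args before the patch: the check
   was nested inside "if (! TARGET_SOFT_FLOAT)".  */
static bool
toy_emits_vfp_args_before (struct toy_target t)
{
  return !t.soft_float && t.hard_float_abi;
}

/* Condition after the patch: only the ABI matters.  */
static bool
toy_emits_vfp_args_after (struct toy_target t)
{
  return t.hard_float_abi;
}
```

For the MVE case described above, where both flags are true, the old condition suppresses the attribute and the new one emits it. The two agree whenever TARGET_SOFT_FLOAT is false.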

2021-03-30  Richard Earnshaw  

gcc/
PR target/99773
* config/arm/arm.c (arm_file_start): Fix emission of
Tag_ABI_VFP_args attribute.
---
 gcc/config/arm/arm.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e89f5e2..518bfed 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28150,14 +28150,11 @@ arm_file_start (void)
   if (print_tune_info)
arm_print_tune_info ();
 
-  if (! TARGET_SOFT_FLOAT)
-   {
- if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
-   arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
+  if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE)
+   arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
 
- if (TARGET_HARD_FLOAT_ABI)
-   arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
-   }
+  if (TARGET_HARD_FLOAT_ABI)
+   arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
 
   /* Some of these attributes only apply when the corresponding features
 are used.  However we don't have any easy way of figuring this out.
-- 
2.7.4



Re: [PATCH] testsuite: Disable zero-scratch-regs-{8,9,10,11}.c on s390* [PR97680]

2021-03-30 Thread Richard Biener via Gcc-patches
On Tue, Mar 30, 2021 at 1:56 PM Jakub Jelinek  wrote:
>
> On Tue, Mar 30, 2021 at 01:28:40PM +0200, Richard Biener wrote:
> > > Ok to skip the test on s390* until then?
> >
> > Can we change the test to do
> >
> > { dg-skip-if "not implemented" { ! { target x86_64-*-*  } } }
> >
> > instead?  IIRC it's nowhere implemented but on x86_64.
>
> I don't know, perhaps.
> Seems the target hook is only defined on
> config/i386/i386.c:#undef TARGET_ZERO_CALL_USED_REGS
> config/i386/i386.c:#define TARGET_ZERO_CALL_USED_REGS ix86_zero_call_used_regs
> config/sparc/sparc.c:#undef TARGET_ZERO_CALL_USED_REGS
> config/sparc/sparc.c:#define TARGET_ZERO_CALL_USED_REGS sparc_zero_call_used_regs
> but apparently many of the tests actually succeed on various targets that
> don't define those hooks.  E.g. I haven't seen them fail on aarch64,
> on arm only the -10.c fails, on powerpc*/s390* all {8,9,10,11} fail (plus
> 5 is skipped on power*-aix*).
> On ia64 according to testresults {6,7,8,9,10,11} fail, some with ICEs.
> On mipsel according to testresults {9,10,11} fail, some with ICEs.
> On nvptx at least 1-9 succeed, 10-11 don't know, don't have assert.h around.
>
> So, do we want to fill in negative dg-skip-if for the 6-11 tests or
> positive?  In any case, is there any hope any of the maintainers or the
> original submitter will change anything for GCC 12, or are we going to end
> up with a very narrowly supported feature?

It looks like the latter - I've seen no attempt by the original authors to make
the feature work on more targets than they cared for.

Richard.

> Jakub
>


Re: [PATCH] testsuite: Disable zero-scratch-regs-{8,9,10,11}.c on s390* [PR97680]

2021-03-30 Thread Jakub Jelinek via Gcc-patches
On Tue, Mar 30, 2021 at 01:28:40PM +0200, Richard Biener wrote:
> > Ok to skip the test on s390* until then?
> 
> Can we change the test to do
> 
> { dg-skip-if "not implemented" { ! { target x86_64-*-*  } } }
> 
> instead?  IIRC it's nowhere implemented but on x86_64.

I don't know, perhaps.
Seems the target hook is only defined on
config/i386/i386.c:#undef TARGET_ZERO_CALL_USED_REGS
config/i386/i386.c:#define TARGET_ZERO_CALL_USED_REGS ix86_zero_call_used_regs
config/sparc/sparc.c:#undef TARGET_ZERO_CALL_USED_REGS
config/sparc/sparc.c:#define TARGET_ZERO_CALL_USED_REGS sparc_zero_call_used_regs
but apparently many of the tests actually succeed on various targets that
don't define those hooks.  E.g. I haven't seen them fail on aarch64,
on arm only the -10.c fails, on powerpc*/s390* all {8,9,10,11} fail (plus
5 is skipped on power*-aix*).
On ia64 according to testresults {6,7,8,9,10,11} fail, some with ICEs.
On mipsel according to testresults {9,10,11} fail, some with ICEs.
On nvptx at least 1-9 succeed, 10-11 don't know, don't have assert.h around.

So, do we want to fill in negative dg-skip-if for the 6-11 tests or
positive?  In any case, is there any hope any of the maintainers or the
original submitter will change anything for GCC 12, or are we going to end
up with a very narrowly supported feature?

Jakub



Re: [PATCH] x86: Define __rdtsc and __rdtscp as macros

2021-03-30 Thread Uros Bizjak via Gcc-patches
On Tue, Mar 30, 2021 at 1:03 PM Jakub Jelinek  wrote:
>
> On Tue, Mar 30, 2021 at 12:59:16PM +0200, Richard Biener wrote:
> > > > > So yes, a better solution would be nice but I can't see any since the
> > > > > underlying issue is known since a long time and thus the pragmatic
> > > > > solution is the best (IMHO), also from a QOI perspective.  For
> > > > > intrinsics it also avoids differences with -O0 vs -O with what we
> > > > > accept and reject.
> > > >
> > > > Here is a simple patch for GCC 11 by defining __rdtsc and __rdtscp
> > > > as macros.   OK for master?
> > >
> > > I don't want to step on anyone's toes by approving this approach, so
> > > I'd like to ask Richard and Jakub if they agree with the solution.
> >
> > I'm OK with the solution for __rdtsc & friends.
>
> Ok for me too (temporarily until we have a fix for the general problem).
>
> > I suppose there's nothing that guarantees taking the address of an
> > intrinsic is going to work?
>
> I bet one gets tons of different errors that way.  After all, for -O0 a lot
> of intrinsics are macros.  And, for those that are inline functions, a lot
> of them will be rejected if an immediate argument doesn't have a constant
> value.

LGTM for the patch.

Thanks,
Uros.


Re: [PATCH] testsuite: Disable zero-scratch-regs-{8,9,10,11}.c on s390* [PR97680]

2021-03-30 Thread Richard Biener via Gcc-patches
On Tue, Mar 30, 2021 at 12:44 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> These tests FAIL on s390*:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
>  In function 'foo8':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-10.c  -Wc++-compat  (test for excess errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
>  In function 'foo8':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-11.c  -Wc++-compat  (test for excess errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> In file included from 
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-8.c:5:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
>  In function 'foo':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-8.c  -Wc++-compat  (test for excess errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
>  In function 'foo':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-9.c  -Wc++-compat  (test for excess errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> Like on powerpc or arm, they need backend support which isn't there and
> likely should be added for GCC 12.
>
> Ok to skip the test on s390* until then?

Can we change the test to do

{ dg-skip-if "not implemented" { ! { target x86_64-*-*  } } }

instead?  IIRC it's nowhere implemented but on x86_64.

> 2021-03-30  Jakub Jelinek  
>
> PR testsuite/97680
> * c-c++-common/zero-scratch-regs-8.c: Skip on s390.
> * c-c++-common/zero-scratch-regs-9.c: Likewise.
> * c-c++-common/zero-scratch-regs-10.c: Likewise.
> * c-c++-common/zero-scratch-regs-11.c: Likewise.
>
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-8.c.jj 2020-11-11 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-8.c    2021-03-30 12:32:11.099667255 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all-arg" } */
>
>  #include "zero-scratch-regs-1.c"
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-9.c.jj 2020-11-11 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-9.c    2021-03-30 12:32:26.707493760 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all" } */
>
>  #include "zero-scratch-regs-1.c"
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-10.c.jj 2021-03-18 15:32:56.459617723 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-10.c   2021-03-30 12:31:56.468829910 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-skip-if "not implemented" { arm*-*-* } } */
>  /* { dg-options "-O2" } */
>
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-11.c.jj 2020-11-11 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-11.c   2021-03-30 12:32:46.012279152 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 

[PATCH] tree-optimization/99824 - avoid excessive integer type precision in VN

2021-03-30 Thread Richard Biener
VN sometimes builds new integer types to handle accesses where precision
of the access type does not match the access size.  The way
ao_ref_init_from_vn_reference is computing the access size ignores
the access type in case the ref operands have an outermost
COMPONENT_REF which, in case it is an array for example, can be
way larger than the access size.  This can cause us to try
building an integer type with precision larger than WIDE_INT_MAX_PRECISION
eventually leading to memory corruption.

The following adjusts ao_ref_init_from_vn_reference to only lower
access sizes via the outermost COMPONENT_REF but otherwise honor
the access size as specified by the access type.
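
As a rough illustration of that rule (a standalone sketch, not the GCC
code): the size starts out as whatever the access type says, and the size
taken from the outermost component may only make it smaller.  The names
kUnknownSize and access_size_bits are hypothetical, and plain int64_t
stands in for poly_offset_int.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical marker for "size not known"; the patch initializes its
// poly_offset_int size to -1 for the same purpose.
constexpr int64_t kUnknownSize = -1;

// Sketch of the adjusted logic: TYPE_SIZE_BITS comes from the access type,
// COMPONENT_SIZE_BITS from the outermost COMPONENT_REF or BIT_FIELD_REF.
// The component size is used only when no size is known yet or when it
// lowers the size.
int64_t access_size_bits(int64_t type_size_bits, int64_t component_size_bits) {
  assert(type_size_bits == kUnknownSize || type_size_bits > 0);
  int64_t size = type_size_bits;
  if (component_size_bits != kUnknownSize
      && (size == kUnknownSize || component_size_bits < size))
    size = component_size_bits;
  return size;
}
```

With the testcase's 48-byte name field, a char-sized access stays at 8 bits
(access_size_bits (8, 384) gives 8) instead of growing to the 384 bits of
the whole array.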

It also places an assert in integer type building that we remain
in the limits of WIDE_INT_MAX_PRECISION.  I chose the shared code
where we set TYPE_MIN/MAX_VALUE because that will immediately
cross the wide_ints capacity otherwise.

Bootstrapped and tested on x86_64-unknown-linux-gnu on trunk,
regtest ongoing on the branch.  On trunk the issue does not
reproduce but is latent to some extent.

2021-03-30  Richard Biener  

PR tree-optimization/99824
* stor-layout.c (set_min_and_max_values_for_integral_type):
Assert the precision is within the bounds of
WIDE_INT_MAX_PRECISION.
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Use
the outermost component ref only to lower the access size
and initialize that from the access type.

* gcc.dg/torture/pr99824.c: New testcase.
---
 gcc/stor-layout.c  |  2 ++
 gcc/testsuite/gcc.dg/torture/pr99824.c | 33 ++
 gcc/tree-ssa-sccvn.c   | 24 +++
 3 files changed, 49 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr99824.c

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 784f131ebb8..94b8b21c7a8 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2838,6 +2838,8 @@ set_min_and_max_values_for_integral_type (tree type,
   if (precision < 1)
 return;
 
+  gcc_assert (precision <= WIDE_INT_MAX_PRECISION);
+
   TYPE_MIN_VALUE (type)
 = wide_int_to_tree (type, wi::min_value (precision, sgn));
   TYPE_MAX_VALUE (type)
diff --git a/gcc/testsuite/gcc.dg/torture/pr99824.c b/gcc/testsuite/gcc.dg/torture/pr99824.c
new file mode 100644
index 000..9022d4a4b8e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr99824.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+
+unsigned int
+strlenx(char *s)
+{
+  char *orig_s = s;
+  for (; *s; ++s)
+;
+  return s - orig_s;
+}
+
+struct i2c_adapter {
+char name[48];
+};
+
+struct {
+int instance;
+struct i2c_adapter i2c_adap[];
+} * init_cx18_i2c_cx;
+
+const struct i2c_adapter cx18_i2c_adap_template = {""};
+int init_cx18_i2c___trans_tmp_1;
+
+void
+init_cx18_i2c()
+{
+  int i = 0;
+  for (;; i++) {
+  init_cx18_i2c_cx->i2c_adap[i] = cx18_i2c_adap_template;
+  init_cx18_i2c___trans_tmp_1
+   = strlenx(init_cx18_i2c_cx->i2c_adap[i].name);
+  }
+}
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 1c0500ce61e..0567a2e9ff5 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -1002,22 +1002,26 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
   poly_offset_int size = -1;
   tree size_tree = NULL_TREE;
 
-  /* First get the final access size from just the outermost expression.  */
+  machine_mode mode = TYPE_MODE (type);
+  if (mode == BLKmode)
+size_tree = TYPE_SIZE (type);
+  else
+size = GET_MODE_BITSIZE (mode);
+  if (size_tree != NULL_TREE
+  && poly_int_tree_p (size_tree))
+size = wi::to_poly_offset (size_tree);
+
+  /* Lower the final access size from the outermost expression.  */
   op = &ops[0];
+  size_tree = NULL_TREE;
   if (op->opcode == COMPONENT_REF)
 size_tree = DECL_SIZE (op->op0);
   else if (op->opcode == BIT_FIELD_REF)
 size_tree = op->op0;
-  else
-{
-  machine_mode mode = TYPE_MODE (type);
-  if (mode == BLKmode)
-   size_tree = TYPE_SIZE (type);
-  else
-   size = GET_MODE_BITSIZE (mode);
-}
   if (size_tree != NULL_TREE
-  && poly_int_tree_p (size_tree))
+  && poly_int_tree_p (size_tree)
+  && (!known_size_p (size)
+ || known_lt (wi::to_poly_offset (size_tree), size)))
 size = wi::to_poly_offset (size_tree);
 
   /* Initially, maxsize is the same as the accessed element size.
-- 
2.26.2


Re: [PATCH] x86: Define __rdtsc and __rdtscp as macros

2021-03-30 Thread Jakub Jelinek via Gcc-patches
On Tue, Mar 30, 2021 at 12:59:16PM +0200, Richard Biener wrote:
> > > > So yes, a better solution would be nice but I can't see any since the
> > > > underlying issue is known since a long time and thus the pragmatic
> > > > solution is the best (IMHO), also from a QOI perspective.  For
> > > > intrinsics it also avoids differences with -O0 vs -O with what we
> > > > accept and reject.
> > >
> > > Here is a simple patch for GCC 11 by defining __rdtsc and __rdtscp
> > > as macros.   OK for master?
> >
> > I don't want to step on anyone's toes by approving this approach, so
> > I'd like to ask Richard and Jakub if they agree with the solution.
> 
> I'm OK with the solution for __rdtsc & friends.  

Ok for me too (temporarily until we have a fix for the general problem).

> I suppose there's nothing that guarantees taking the address of an
> intrinsic is going to work?

I bet one gets tons of different errors that way.  After all, for -O0 a lot
of intrinsics are macros.  And, for those that are inline functions, a lot
of them will be rejected if an immediate argument doesn't have a constant
value.

Jakub



Re: [PATCH] x86: Define __rdtsc and __rdtscp as macros

2021-03-30 Thread Richard Biener via Gcc-patches
On Tue, Mar 30, 2021 at 10:14 AM Uros Bizjak  wrote:
>
> On Fri, Mar 26, 2021 at 6:24 PM H.J. Lu  wrote:
> >
> > On Fri, Mar 26, 2021 at 5:09 AM Richard Biener
> >  wrote:
> > >
> > > On Fri, Mar 26, 2021 at 11:26 AM Jakub Jelinek  wrote:
> > > >
> > > > On Fri, Mar 26, 2021 at 11:13:21AM +0100, Richard Biener wrote:
> > > > > On Fri, Mar 26, 2021 at 9:34 AM Jakub Jelinek via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > On Thu, Mar 25, 2021 at 11:36:37AM -0700, H.J. Lu via Gcc-patches wrote:
> > > > > > > How can we move forward with it?  I'd like to resolve it in GCC 11.
> > > > > >
> > > > > > I think it is too late for GCC 11 for this.
> > > > > > Especially if the solution would be that we change the behavior of
> > > > > > existing attribute, we would need enough time to test everything in
> > > > > > the wild that we don't break it badly,
> > > > >
> > > > > But isn't the suggested change only going to make programs we reject
> > > > > now with an error accepted or ICEing?  Thus, no program that works
> > > > > right now should break.
> > > >
> > > > That is true, but even
> > > > accepts-invalid
> > > > and
> > > > ice-on-invalid-code
> > > > would be important regressions.
> > > > Changing the always_inline attribute behavior without at least avoiding
> > > > the first of those for our intrinsics would be bad, and we need to look
> > > > what people use always_inline in the wild for and what are their
> > > > expectations.
> > > > And for the intrinsics we need something maintainable, we have > 5000
> > > > intrinsics on i386 alone, > 4000 on aarch64, > 7000 on arm, > 600 on
> > > > rs6000, > 100 on sparc, I bet most of them rely on the current behavior.
> > > > I think the world doesn't end if we do it for GCC 12 only, do it right
> > > > for everything we are aware of and have many months to figure out what
> > > > impact it will have on programs in the wild.
> > >
> > > As said, my opinion is that this fallout doesn't "exist" in the wild
> > > since it can
> > > only exist for code we reject right now which in my definition of
> > > "out in the wild" makes it not exist.  I consider only code accepted by
> > > the compiler as valid "out in the wild" example.
> > >
> > > See also the behavior of always-inline with regard to the optimize 
> > > attribute.
> > >
> > > So yes, a better solution would be nice but I can't see any since the
> > > underlying issue is known since a long time and thus the pragmatic
> > > solution is the best (IMHO), also from a QOI perspective.  For intrinsics
> > > it also avoids differences with -O0 vs -O with what we accept and reject.
> >
> > Here is a simple patch for GCC 11 by defining __rdtsc and __rdtscp
> > as macros.   OK for master?
>
> I don't want to step on anyone's toes by approving this approach, so
> I'd like to ask Richard and Jakub if they agree with the solution.

I'm OK with the solution for __rdtsc & friends.  I suppose there's nothing
that guarantees taking the address of an intrinsic is going to work?

It of course still leaves the more general problem unsolved.

Richard.

> Uros,


[PATCH] aarch64: Tweak post-RA handling of CONST_INT moves [PR98136]

2021-03-30 Thread Richard Sandiford via Gcc-patches
This PR is a regression caused by r8-5967, where we replaced
a call to aarch64_internal_mov_immediate in aarch64_add_offset
with a call to aarch64_force_temporary, which in turn uses the
normal emit_move_insn{,_1} routines.

The problem is that aarch64_add_offset can be called while
outputting a thunk, where we require all instructions to be
valid without splitting.  However, the move expanders were
not splitting CONST_INT moves themselves.

I think the right fix is to make the move expanders work
even in this scenario, rather than require callers to handle
it as a special case.
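
To make the post-RA condition concrete, here is a minimal standalone
sketch.  It assumes a deliberately simplified notion of a
"single-instruction immediate" (one MOVZ or one MOVN); the real
aarch64_mov_imm_operand accepts further encodings such as bitmask
immediates.  The helper names are hypothetical and are not part of the
patch.

```cpp
#include <cassert>
#include <cstdint>

// Count the 16-bit chunks of VALUE that differ from FILL.
int chunks_differing(uint64_t value, uint16_t fill) {
  int n = 0;
  for (int shift = 0; shift < 64; shift += 16)
    if (static_cast<uint16_t>(value >> shift) != fill)
      ++n;
  return n;
}

// Simplified stand-in for the predicate: true when a lone MOVZ (all other
// chunks zero) or a lone MOVN (all other chunks 0xffff) builds VALUE.
bool single_insn_imm(uint64_t value) {
  return chunks_differing(value, 0x0000) <= 1
         || chunks_differing(value, 0xffff) <= 1;
}

// Shape of the CONST_INT arm of the new condition: only after register
// allocation, and only for constants that need more than one instruction,
// does the expander split the move itself.
bool must_split_now(uint64_t value, bool after_ra) {
  return after_ra && !single_insn_imm(value);
}
```

Under these assumptions 0x12345678 needs two instructions, so
must_split_now (0x12345678, true) holds, while before RA the same constant
is left for a later split pass.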

Tested on aarch64-linux-gnu, pushed to trunk so far.  I'm a bit
hesitant to apply it to GCC 10 so close to the release, so it'll
probably have to wait for GCC 10.4.

Thanks,
Richard


gcc/
PR target/98136
* config/aarch64/aarch64.md (mov): Pass multi-instruction
CONST_INTs to aarch64_expand_mov_immediate when called after RA.

gcc/testsuite/
PR target/98136
* g++.dg/pr98136.C: New test.
---
 gcc/config/aarch64/aarch64.md  | 17 +
 gcc/testsuite/g++.dg/pr98136.C | 26 ++
 2 files changed, 39 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr98136.C

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index dd1dc2bd7a8..a398c3d31ce 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1241,10 +1241,19 @@ (define_expand "mov"
 if (GET_CODE (operands[0]) == MEM && operands[1] != const0_rtx)
   operands[1] = force_reg (mode, operands[1]);
 
-/* FIXME: RR we still need to fix up what we are doing with
-   symbol_refs and other types of constants.  */
-if (CONSTANT_P (operands[1])
-&& !CONST_INT_P (operands[1]))
+/* Lower moves of symbolic constants into individual instructions.
+   Doing this now is sometimes necessary for correctness, since some
+   sequences require temporary pseudo registers.  Lowering now is also
+   often better for optimization, since more RTL passes get the
+   chance to optimize the individual instructions.
+
+   When called after RA, also split multi-instruction moves into
+   smaller pieces now, since we can't be sure that there
+   will be a following split pass.  */
+if (CONST_INT_P (operands[1])
+   ? (reload_completed
+  && !aarch64_mov_imm_operand (operands[1], mode))
+   : CONSTANT_P (operands[1]))
  {
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
diff --git a/gcc/testsuite/g++.dg/pr98136.C b/gcc/testsuite/g++.dg/pr98136.C
new file mode 100644
index 000..f3c27f6b09f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr98136.C
@@ -0,0 +1,26 @@
+// { dg-do compile { target { ilp32 || lp64 } } }
+
+struct AddIn
+{
+virtual ~AddIn() {}
+virtual void AddInCall()=0;
+};
+
+struct Base
+{
+    char b[32*1024*1024];   // Anything bigger than 16mb causes internal compiler error
+virtual ~Base() {}
+};
+
+struct Deriv : public Base,
+public AddIn
+{
+void AddInCall() {}
+};
+
+int main (int argc, char **argv)
+{
+Deriv deriv;
+deriv.AddInCall();
+return 0;
+}


Re: [PATCH] Aarch64: Prevent use of SIMD fcvtz[su] instruction variant with "nosimd"

2021-03-30 Thread Richard Sandiford via Gcc-patches
mihailo.stojano...@typhoon-hil.com writes:
> From: Mihailo Stojanovic 
>
> Hi all,
>
> Currently, SF->SI and DF->DI conversions on Aarch64 with the "nosimd"
> flag provided sometimes cause the emitting of a vector variant of the
> fcvtz[su] instruction (e.g. fcvtzu s0, s0).
>
> This modifies the corresponding pattern to only select the vector
> variant of the instruction when generating code with SIMD enabled.
>
> Tested on aarch64-linux-gnu.

Thanks for the patch and sorry for the slow review.

This distinction between FP and SIMD probably isn't useful in practice,
but given that it exists in both GCC and binutils, I agree they need
to be consistent.  I've pushed the patch to trunk.

Thanks,
Richard

>
> gcc/ChangeLog:
>
> * gcc/config/aarch64/aarch64.md
> (_trunc2): Set the "arch"
> attribute to disambiguate between SIMD and FP variants of the
> instruction.
>
> gcc/testsuite/ChangeLog:
>
> * gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c: New test.
> ---
>  gcc/config/aarch64/aarch64.md |  3 ++-
>  .../gcc.target/aarch64/fcvt_nosimd.c  | 23 +++
>  2 files changed, 25 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index b2abb5b5b3c..dd1dc2bd7a8 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -5989,7 +5989,8 @@
>"@
> fcvtz\t%0, %1
> fcvtz\t%0, %1"
> -  [(set_attr "type" "neon_fp_to_int_s,f_cvtf2i")]
> +  [(set_attr "type" "neon_fp_to_int_s,f_cvtf2i")
> +   (set_attr "arch" "simd,fp")]
>  )
>  
>  ;; Convert HF -> SI or DI
> diff --git a/gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c b/gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c
> new file mode 100644
> index 000..7b2ab65e307
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fcvt_nosimd.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv8-a+nosimd" } */
> +
> +#include <stdint.h>
> +
> +uint64_t test_double_to_uint64(double x) {
> +  return (uint64_t)x;
> +}
> +
> +int64_t test_double_to_int64(double x) {
> +  return (int64_t)x;
> +}
> +
> +uint32_t test_float_to_uint32(float x) {
> +  return (uint32_t)x;
> +}
> +
> +int32_t test_float_to_int32(float x) {
> +  return (int32_t)x;
> +}
> +
> +/* { dg-final { scan-assembler-not {\tfcvtz[su]\td[0-9]*, d[0-9]*} } } */
> +/* { dg-final { scan-assembler-not {\tfcvtz[su]\ts[0-9]*, s[0-9]*} } } */


[PATCH] testsuite: Disable zero-scratch-regs-{8,9,10,11}.c on s390* [PR97680]

2021-03-30 Thread Jakub Jelinek via Gcc-patches
Hi!

These tests FAIL on s390*:
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
 In function 'foo8':
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
compiler exited with status 1
FAIL: c-c++-common/zero-scratch-regs-10.c  -Wc++-compat  (test for excess errors)
Excess errors:
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
 In function 'foo8':
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
compiler exited with status 1
FAIL: c-c++-common/zero-scratch-regs-11.c  -Wc++-compat  (test for excess errors)
Excess errors:
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
In file included from 
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-8.c:5:
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
 In function 'foo':
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
compiler exited with status 1
FAIL: c-c++-common/zero-scratch-regs-8.c  -Wc++-compat  (test for excess errors)
Excess errors:
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
 In function 'foo':
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
compiler exited with status 1
FAIL: c-c++-common/zero-scratch-regs-9.c  -Wc++-compat  (test for excess errors)
Excess errors:
/builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
 sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
Like on powerpc or arm, they need backend support which isn't there and
likely should be added for GCC 12.

Ok to skip the test on s390* until then?

2021-03-30  Jakub Jelinek  

PR testsuite/97680
* c-c++-common/zero-scratch-regs-8.c: Skip on s390.
* c-c++-common/zero-scratch-regs-9.c: Likewise.
* c-c++-common/zero-scratch-regs-10.c: Likewise.
* c-c++-common/zero-scratch-regs-11.c: Likewise.

--- gcc/testsuite/c-c++-common/zero-scratch-regs-8.c.jj 2020-11-11 01:46:03.392696119 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-8.c    2021-03-30 12:32:11.099667255 +0200
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { s390*-*-* } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all-arg" } */
 
 #include "zero-scratch-regs-1.c"
--- gcc/testsuite/c-c++-common/zero-scratch-regs-9.c.jj 2020-11-11 01:46:03.392696119 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-9.c    2021-03-30 12:32:26.707493760 +0200
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { s390*-*-* } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all" } */
 
 #include "zero-scratch-regs-1.c"
--- gcc/testsuite/c-c++-common/zero-scratch-regs-10.c.jj 2021-03-18 15:32:56.459617723 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-10.c   2021-03-30 12:31:56.468829910 +0200
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { s390*-*-* } } */
 /* { dg-skip-if "not implemented" { arm*-*-* } } */
 /* { dg-options "-O2" } */
 
--- gcc/testsuite/c-c++-common/zero-scratch-regs-11.c.jj 2020-11-11 01:46:03.392696119 +0100
+++ gcc/testsuite/c-c++-common/zero-scratch-regs-11.c   2021-03-30 12:32:46.012279152 +0200
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
+/* { dg-skip-if "not implemented" { s390*-*-* } } */
 /* { dg-options "-O2 -fzero-call-used-regs=all" } */
 
 #include "zero-scratch-regs-10.c"

Jakub



[PATCH] c++: Fix ICE on PTRMEM_CST in lambda in inline var initializer [PR99790]

2021-03-30 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs (since the addition of inline var support),
because the lambda contains PTRMEM_CST but finish_function is called for the
lambda quite early during parsing it (from finish_lambda_function) when
the containing class is still incomplete.  That means that during
genericization cplus_expand_constant keeps the PTRMEM_CST unmodified, but
later nothing lowers it when the class is finalized.
Using sizeof etc. on the class in such contexts is rejected by both g++ and
clang++, and when the PTRMEM_CST appears e.g. in static var initializers
rather than in functions, we handle it correctly because c_parse_final_cleanups
-> lower_var_init will handle those cplus_expand_constant when all classes
are already finalized.

The following patch fixes it by calling cplus_expand_constant again during
gimplification.  As we are now unconditionally unit-at-a-time, I'd think
everything that could be completed will have been by the time we start
gimplification.
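
For readers unfamiliar with the construct: a PTRMEM_CST is the front end's
representation of a pointer-to-member constant such as &A::a.  The sketch
below is only an illustration with a hypothetical Point type, not the
testcase; it shows such a constant being used on a complete class, where,
on common ABIs, a pointer to data member can be lowered to the member's
offset.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type; complete at the point where the
// pointer-to-member constant is used.
struct Point {
  int x;
  int y;
};

// Store VALUE through a pointer-to-data-member constant and return the
// member's new value.
int set_y_via_member_pointer(Point &p, int value) {
  int Point::*member = &Point::y;  // the kind of constant at issue
  p.*member = value;
  return p.y;
}

// The byte offset such a constant corresponds to once the class layout
// is known.
std::size_t member_offset_of_y() {
  return offsetof(Point, y);
}
```

While the class is still incomplete the offset cannot be computed, which
is why the constant has to be expanded again later.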

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-30  Jakub Jelinek  

PR c++/99790
* cp-gimplify.c (cp_gimplify_expr): Handle PTRMEM_CST.

* g++.dg/cpp1z/pr99790.C: New test.

--- gcc/cp/cp-gimplify.c.jj 2021-03-20 17:01:59.791040946 +0100
+++ gcc/cp/cp-gimplify.c2021-03-29 14:27:37.532223156 +0200
@@ -660,6 +660,14 @@ cp_gimplify_expr (tree *expr_p, gimple_s
   ret = GS_UNHANDLED;
   break;
 
+case PTRMEM_CST:
+  *expr_p = cplus_expand_constant (*expr_p);
+  if (TREE_CODE (*expr_p) == PTRMEM_CST)
+   ret = GS_ERROR;
+  else
+   ret = GS_OK;
+  break;
+
 case RETURN_EXPR:
   if (TREE_OPERAND (*expr_p, 0)
  && (TREE_CODE (TREE_OPERAND (*expr_p, 0)) == INIT_EXPR
--- gcc/testsuite/g++.dg/cpp1z/pr99790.C.jj 2021-03-29 14:38:35.887814018 +0200
+++ gcc/testsuite/g++.dg/cpp1z/pr99790.C    2021-03-29 14:37:41.828422188 +0200
@@ -0,0 +1,9 @@
+// PR c++/99790
+// { dg-do compile { target c++17 } }
+
+struct A;
+struct B { void (*fn) (A *); };
+template <typename T>
+int foo (const T &);
+struct A { int a; static constexpr B b{[] (A *n) { n->*&A::a = 2; }}; };
+int a = foo (A::b);

Jakub



Re: [committed] testsuite: Expect a warning on aarch64 for declare-simd-coarray-lib.f90 [PR93660]

2021-03-30 Thread Christophe Lyon via Gcc-patches
On Mon, 29 Mar 2021 at 17:09, Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> aarch64 currently doesn't support declare simd where the return value and
> arguments have different sizes and warns about that case.  This change adds
> a dg-warning for that case like various other tests have already.
>
> Committed to trunk.
>

Thanks,

I pushed it to gcc-10 too.

Christophe

> 2021-03-29  Jakub Jelinek  
>
> PR fortran/93660
> * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Expect a mixed size
> declare simd warning on aarch64.
>
> --- gcc/testsuite/gfortran.dg/gomp/declare-simd-coarray-lib.f90.jj  2021-03-23 16:14:01.842299433 +0100
> +++ gcc/testsuite/gfortran.dg/gomp/declare-simd-coarray-lib.f90 2021-03-29 17:01:29.885703123 +0200
> @@ -5,7 +5,7 @@
>  ! Failed as TREE_TYPE(fndecl) did not include the
>  ! hidden caf_token/caf_offset arguments.
>  !
> -integer function f(x)
> +integer function f(x)  ! { dg-warning "GCC does not currently support mixed size types for 'simd' functions" "" { target aarch64*-*-* } }
> integer :: x[*]
> !$omp declare simd
> f = x[1]
>
> Jakub
>


Re: [PATCH] x86: Define __rdtsc and __rdtscp as macros

2021-03-30 Thread Uros Bizjak via Gcc-patches
On Fri, Mar 26, 2021 at 6:24 PM H.J. Lu  wrote:
>
> On Fri, Mar 26, 2021 at 5:09 AM Richard Biener
>  wrote:
> >
> > On Fri, Mar 26, 2021 at 11:26 AM Jakub Jelinek  wrote:
> > >
> > > On Fri, Mar 26, 2021 at 11:13:21AM +0100, Richard Biener wrote:
> > > > On Fri, Mar 26, 2021 at 9:34 AM Jakub Jelinek via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > On Thu, Mar 25, 2021 at 11:36:37AM -0700, H.J. Lu via Gcc-patches wrote:
> > > > > > How can we move forward with it?  I'd like to resolve it in GCC 11.
> > > > >
> > > > > I think it is too late for GCC 11 for this.
> > > > > Especially if the solution would be that we change the behavior of
> > > > > existing attribute, we would need enough time to test everything in
> > > > > the wild that we don't break it badly,
> > > >
> > > > But isn't the suggested change only going to make programs we reject now
> > > > with an error accepted or ICEing?  Thus, no program that works right now
> > > > should break.
> > >
> > > That is true, but even
> > > accepts-invalid
> > > and
> > > ice-on-invalid-code
> > > would be important regressions.
> > > Changing the always_inline attribute behavior without at least avoiding
> > > the first of those for our intrinsics would be bad, and we need to look
> > > what people use always_inline in the wild for and what are their
> > > expectations.
> > > And for the intrinsics we need something maintainable, we have > 5000
> > > intrinsics on i386 alone, > 4000 on aarch64, > 7000 on arm, > 600 on
> > > rs6000, > 100 on sparc, I bet most of them rely on the current behavior.
> > > I think the world doesn't end if we do it for GCC 12 only, do it right
> > > for everything we are aware of and have many months to figure out what
> > > impact it will have on programs in the wild.
> >
> > As said, my opinion is that this fallout doesn't "exist" in the wild
> > since it can
> > only exist for code we reject right now which in my definition of
> > "out in the wild" makes it not exist.  I consider only code accepted by
> > the compiler as valid "out in the wild" example.
> >
> > See also the behavior of always-inline with regard to the optimize 
> > attribute.
> >
> > So yes, a better solution would be nice but I can't see any since the
> > underlying issue is known since a long time and thus the pragmatic
> > solution is the best (IMHO), also from a QOI perspective.  For intrinsics
> > it also avoids differences with -O0 vs -O with what we accept and reject.
>
> Here is a simple patch for GCC 11 by defining __rdtsc and __rdtscp
> as macros.   OK for master?

I don't want to step on anyone's toes by approving this approach, so
I'd like to ask Richard and Jakub if they agree with the solution.

Uros,