Re: [PATCH] c++: do_class_deduction and dependent init [PR93383]

2021-04-21 Thread Patrick Palka via Gcc-patches
On Wed, 21 Apr 2021, Jason Merrill wrote:

> On 4/12/21 1:20 PM, Patrick Palka wrote:
> > Here we're crashing during deduction for a template placeholder from a
> > dependent initializer because one of the initializer's elements has an
> > empty TREE_TYPE, something which resolve_args and later unify_one_argument
> > don't expect.  And if the deduction from a dependent initializer
> > otherwise fails, we prematurely issue an error rather than reattempting
> > the deduction at instantiation time.
> > 
> > This patch makes do_class_deduction more tolerant about dependent
> > initializers, in a manner similar to what do_auto_deduction does: if
> > deduction from a dependent initializer fails, just return the original
> > placeholder unchanged.
> 
> Why doesn't the type_dependent_expression_p check in do_auto_deduction catch
> this already?

That check applies only when context != adc_unify, but here we have
context == adc_unify since we're being called from
convert_template_argument.

And currently, when 'auto' deduction fails for a dependent initializer,
do_auto_deduction will just silently return the original placeholder:

      int val = type_unification_real (tparms, targs, parms, &init, 1, 0,
                                       DEDUCE_CALL,
                                       NULL, /*explain_p=*/false);
      if (val > 0)
        {
          if (processing_template_decl)
            /* Try again at instantiation time.  */
            return type;

so I suppose this patch just makes do_class_deduction behave more
similarly to do_auto_deduction in this situation.
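
To make the intended control flow concrete, here is a minimal toy model of the two new early returns. It is only a sketch: `Arg`, `Outcome`, `Result` and `deduce_sketch` are hypothetical names, not GCC code, and real deduction is replaced by the assumption that it succeeds only when every argument has type "int".

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one element of the initializer.
struct Arg {
  bool has_type;     // false models an empty TREE_TYPE
  std::string type;  // meaningful only when has_type is true
};

enum class Outcome { Deduced, Deferred, Error };

struct Result {
  Outcome outcome;
  std::string type;  // deduced type, or the original placeholder
};

// Toy model of the control flow only, not of deduction itself.
Result deduce_sketch(const std::string& placeholder,
                     const std::vector<Arg>& args,
                     bool in_template, bool init_is_type_dependent) {
  // First hunk: inside a template, an untyped argument means we cannot
  // deduce ahead of time, so hand back the placeholder unchanged.
  if (in_template)
    for (const Arg& a : args)
      if (!a.has_type)
        return {Outcome::Deferred, placeholder};

  // Assumed stand-in for the real deduction step.
  bool ok = true;
  for (const Arg& a : args)
    if (!a.has_type || a.type != "int")
      ok = false;

  if (!ok) {
    // Second hunk: failure on a type-dependent initializer is not final;
    // try again at instantiation time instead of issuing an error.
    if (init_is_type_dependent)
      return {Outcome::Deferred, placeholder};
    return {Outcome::Error, placeholder};
  }
  return {Outcome::Deduced, placeholder + "<int>"};
}
```

In this model the error path is reached only for a non-dependent initializer, which matches the behavior described above for do_auto_deduction.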

> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
> > cmcstl2 and range-v3, and on all the testcases in PR93383, does this
> > look OK for trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/89565
> > PR c++/93383
> > PR c++/99200
> > * pt.c (do_class_deduction): If an argument has no type, don't
> > attempt deduction.  If deduction fails and the initializer is
> > type-dependent, try again at instantiation time.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/89565
> > PR c++/93383
> > PR c++/99200
> > * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice.
> > * g++.dg/cpp2a/nontype-class44.C: New test.
> > * g++.dg/cpp2a/nontype-class45.C: New test.
> > ---
> >   gcc/cp/pt.c  | 11 +++
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class39.C |  1 -
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class44.C | 11 +++
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class45.C | 32 
> >   4 files changed, 54 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
> > 
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index 0ce7fa359c1..612feac7976 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -29334,6 +29334,13 @@ do_class_deduction (tree ptype, tree tmpl, tree
> > init,
> > else
> >   args = make_tree_vector_single (init);
> >   +  /* If an argument is missing its type, we can't possibly deduce from
> > this
> > + (type-dependent) initializer ahead of time.  */
> > +  if (processing_template_decl)
> > +for (tree arg : *args)
> > +  if (!TREE_TYPE (arg))
> > +   return ptype;
> > /* Do this now to avoid problems with erroneous args later on.  */
> > args = resolve_args (args, complain);
> > if (args == NULL)
> > @@ -29419,6 +29426,10 @@ do_class_deduction (tree ptype, tree tmpl, tree
> > init,
> >   if (call == error_mark_node)
> >   {
> > +  if (type_dependent_expression_p (init))
> > +   /* Try again at instantiation time.  */
> > +   return ptype;
> > +
> > if (complain & tf_warning_or_error)
> > {
> >   error ("class template argument deduction failed:");
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> > b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> > index f5f79a71ec2..9b4da4f02ea 100644
> > --- a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> > +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> > @@ -1,6 +1,5 @@
> >   // PR c++/89565
> >   // { dg-do compile { target c++20 } }
> > -// { dg-ice "resolve_args" }
> > template 
> >   struct N{};
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
> > b/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
> > new file mode 100644
> > index 000..d91e800424f
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
> > @@ -0,0 +1,11 @@
> > +// PR c++/93383
> > +// { dg-do compile { target c++20 } }
> > +
> > +template  struct A {};
> > +
> > +template  struct B {
> > +  void foo(B<+a>);
> > +  void bar(B);
> > +  template  using type = B;
> > +  template  static inline auto y = A{0}; // { dg-error "deduction|no
> > match" }
> > +};
> > diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
> > b/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
> > new file mode 

Re: [PATCH] c++: do_class_deduction and dependent init [PR93383]

2021-04-21 Thread Jason Merrill via Gcc-patches

On 4/12/21 1:20 PM, Patrick Palka wrote:

Here we're crashing during deduction for a template placeholder from a
dependent initializer because one of the initializer's elements has an
empty TREE_TYPE, something which resolve_args and later unify_one_argument
don't expect.  And if the deduction from a dependent initializer
otherwise fails, we prematurely issue an error rather than reattempting
the deduction at instantiation time.

This patch makes do_class_deduction more tolerant about dependent
initializers, in a manner similar to what do_auto_deduction does: if
deduction from a dependent initializer fails, just return the original
placeholder unchanged.


Why doesn't the type_dependent_expression_p check in do_auto_deduction 
catch this already?



Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
cmcstl2 and range-v3, and on all the testcases in PR93383, does this
look OK for trunk?

gcc/cp/ChangeLog:

PR c++/89565
PR c++/93383
PR c++/99200
* pt.c (do_class_deduction): If an argument has no type, don't
attempt deduction.  If deduction fails and the initializer is
type-dependent, try again at instantiation time.

gcc/testsuite/ChangeLog:

PR c++/89565
PR c++/93383
PR c++/99200
* g++.dg/cpp2a/nontype-class39.C: Remove dg-ice.
* g++.dg/cpp2a/nontype-class44.C: New test.
* g++.dg/cpp2a/nontype-class45.C: New test.
---
  gcc/cp/pt.c  | 11 +++
  gcc/testsuite/g++.dg/cpp2a/nontype-class39.C |  1 -
  gcc/testsuite/g++.dg/cpp2a/nontype-class44.C | 11 +++
  gcc/testsuite/g++.dg/cpp2a/nontype-class45.C | 32 
  4 files changed, 54 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class45.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 0ce7fa359c1..612feac7976 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29334,6 +29334,13 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
else
  args = make_tree_vector_single (init);
  
+  /* If an argument is missing its type, we can't possibly deduce from this

+ (type-dependent) initializer ahead of time.  */
+  if (processing_template_decl)
+for (tree arg : *args)
+  if (!TREE_TYPE (arg))
+   return ptype;
/* Do this now to avoid problems with erroneous args later on.  */
args = resolve_args (args, complain);
if (args == NULL)
@@ -29419,6 +29426,10 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
  
if (call == error_mark_node)

  {
+  if (type_dependent_expression_p (init))
+   /* Try again at instantiation time.  */
+   return ptype;
+
if (complain & tf_warning_or_error)
{
  error ("class template argument deduction failed:");
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
index f5f79a71ec2..9b4da4f02ea 100644
--- a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
@@ -1,6 +1,5 @@
  // PR c++/89565
  // { dg-do compile { target c++20 } }
-// { dg-ice "resolve_args" }
  
  template 

  struct N{};
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
new file mode 100644
index 000..d91e800424f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
@@ -0,0 +1,11 @@
+// PR c++/93383
+// { dg-do compile { target c++20 } }
+
+template  struct A {};
+
+template  struct B {
+  void foo(B<+a>);
+  void bar(B);
+  template  using type = B;
+  template  static inline auto y = A{0}; // { dg-error "deduction|no 
match" }
+};
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
new file mode 100644
index 000..e7addf5f291
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
@@ -0,0 +1,32 @@
+// PR c++/99200
+// { dg-do compile { target c++20 } }
+
+template 
+struct A
+{
+  constexpr A (const char ()[N]) { for (int i = 0; i < N; i++) v[i] = s[i]; 
v[N] = 0; }
+  char v[N + 1];
+};
+
+template 
+struct B
+{
+  constexpr operator const char *() { return s.v; }
+};
+
+template 
+const char *
+foo ()
+{
+  return B<__PRETTY_FUNCTION__>{};
+}
+
+template 
+const char *
+bar ()
+{
+  return B<__FUNCTION__>{};
+}
+
+auto a = foo  ();
+auto b = bar  ();





Re: [PATCH] [libstdc++] Fix test timeout in stop_calback/destroy.cc

2021-04-21 Thread Thomas Rodgers

On 2021-04-21 14:22, Jakub Jelinek wrote:

> On Wed, Apr 21, 2021 at 07:23:30PM +0100, Jonathan Wakely via Gcc-patches wrote:
>
> > We should do this for gcc-11 too if an RM approves it, since acquire()
> > is currently broken.
>
> Ok, but please commit it soon, we'll need to do a RC2 tomorrow or on Friday
> and then ideally no changes at all.
>
> Jakub


Backported to releases/gcc-11

Note, there is a second patch that I just submitted that addresses a
related issue with counting_semaphore::release() on non-futex platforms.
It also needs to be backported to gcc-11, otherwise release() is broken
on non-Linux platforms.


[PATCH] [libstdc++] Fix "bare" notifications dropped by waiters check

2021-04-21 Thread Thomas Rodgers
From: Thomas Rodgers 

NOTE - This patch also needs to be backported to gcc-11 in order for
semaphore release() to work correctly on non-futex platforms.

Tested sparc-sun-solaris2.11

For types that track whether or not there are extant waiters (e.g.
semaphore) internally, the __atomic_notify_address_bare() call was
introduced to avoid the overhead of loading the atomic count of
waiters. For platforms that don't have Futex, however, there was
still a check for waiters, and seeing that there are none (because
in the bare case, the count is not incremented), the notification
is dropped. This commit addresses that case.
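
As a rough illustration of the dropped-notification case, the following sketch models only the early-return check. `WaiterPoolSketch` and its members are hypothetical and much simpler than the real code in atomic_wait.h; the wake itself is replaced by a counter.

```cpp
#include <atomic>

// Hypothetical, simplified stand-in for the waiter pool.
struct WaiterPoolSketch {
  std::atomic<int> waiters{0};  // count of registered (non-bare) waiters
  int wakeups_issued = 0;       // stands in for the condvar/futex wake

  bool waiting() const {
    return waiters.load(std::memory_order_acquire) > 0;
  }

  // Before the patch the check was `if (!waiting()) return;`, so a bare
  // notification was dropped whenever the count was zero.
  void notify(bool all, bool bare = false) {
    if (!(bare || waiting()))
      return;  // nobody registered, and the caller does not track waiters itself
    (void)all; // wake-one versus wake-all is not modeled here
    ++wakeups_issued;
  }
};
```

With a zero waiter count, `notify(true)` is a no-op in this model while `notify(true, /*bare=*/true)` still issues the wake, which is the behavior a semaphore that tracks its own waiters relies on.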

libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h: Always notify waiters in the
case of 'bare' address notification.
---
 libstdc++-v3/include/bits/atomic_wait.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 0ac5575190c..984ed70f16c 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -226,9 +226,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
   void
-  _M_notify(const __platform_wait_t* __addr, bool __all) noexcept
+  _M_notify(const __platform_wait_t* __addr, bool __all, bool __bare) 
noexcept
   {
-   if (!_M_waiting())
+   if (!(__bare || _M_waiting()))
  return;
 
 #ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
@@ -304,11 +304,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  }
 
void
-   _M_notify(bool __all)
+   _M_notify(bool __all, bool __bare = false)
{
  if (_M_addr == &_M_w._M_ver)
__atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL);
- _M_w._M_notify(_M_addr, __all);
+ _M_w._M_notify(_M_addr, __all, __bare);
}
 
template

Re: [PATCH v2] c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

2021-04-21 Thread Jason Merrill via Gcc-patches

On 4/21/21 1:30 PM, Marek Polacek wrote:

On Tue, Apr 20, 2021 at 10:40:52PM -0400, Jason Merrill wrote:

On 4/20/21 8:42 PM, Marek Polacek wrote:

Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.  I use a lambda so as not to repeat the
check.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11.2?
Think I also want to backport to 10.4, but then the patch can't use
a lambda.


It should be straightforward to rewrite the lambda as a local class; let's
use that version for all the branches.
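
For reference, the local-class form of such a helper could look roughly like the sketch below. `OperandSketch`, `template_depth` and `either_operand_dependent` are made-up names standing in for the tree operands, processing_template_decl and the dependence query; this is not the code from the patch.

```cpp
// Hypothetical operand type; `dependent` stands in for the result of the
// value- or type-dependence query.
struct OperandSketch {
  bool dependent;
};

// Stand-in for a global counter that must be bumped around each query.
static int template_depth = 0;

bool either_operand_dependent(const OperandSketch& op0,
                              const OperandSketch& op1) {
  // An unnamed local class with operator() plays the role of a lambda and
  // also works on branches that cannot use lambdas.
  struct {
    bool operator()(const OperandSketch& op) const {
      ++template_depth;
      bool r = op.dependent;
      --template_depth;
      return r;
    }
  } dep_p;

  // Query each operand separately rather than the whole expression.
  return dep_p(op0) || dep_p(op1);
}
```

The functor can touch the namespace-scope counter but, unlike a capturing lambda, cannot refer to the enclosing function's automatic variables, so everything it needs has to be passed as an argument.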


I was thinking that I'd use a lambda for GCC 11 and 12 and for GCC 10
play some functor games, but I can use the following for all three, if
you want.


Either is fine with me.


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11.2/10.4?


OK.


-- >8 --
Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.

gcc/cp/ChangeLog:

PR c++/100161
* pt.c (tsubst_copy_and_build) : Test op0 and
op1 separately for value- or type-dependence.

gcc/testsuite/ChangeLog:

PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.
---
  gcc/cp/pt.c   | 26 +++
  gcc/testsuite/g++.dg/warn/Wtype-limits6.C | 17 +++
  2 files changed, 34 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wtype-limits6.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7bcbe6dc3ce..c1acfc498b1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19906,15 +19906,23 @@ tsubst_copy_and_build (tree t,
  case MEMBER_REF:
  case DOTSTAR_EXPR:
{
-   /* If T was type-dependent, suppress warnings that depend on the range
-  of the types involved.  */
-   ++processing_template_decl;
-   const bool was_dep = (potential_constant_expression (t)
- ? value_dependent_expression_p (t)
- : type_dependent_expression_p (t));
-   --processing_template_decl;
-   tree op0 = RECUR (TREE_OPERAND (t, 0));
-   tree op1 = RECUR (TREE_OPERAND (t, 1));
+   /* If either OP0 or OP1 was value- or type-dependent, suppress
+  warnings that depend on the range of the types involved.  */
+   tree op0 = TREE_OPERAND (t, 0);
+   tree op1 = TREE_OPERAND (t, 1);
+   struct {
+ inline bool operator()(tree t) const {
+   ++processing_template_decl;
+   bool r = (potential_constant_expression (t)
+ ? value_dependent_expression_p (t)
+ : type_dependent_expression_p (t));
+   --processing_template_decl;
+   return r;
+ };
+   } dep_p;
+   const bool was_dep = dep_p (op0) || dep_p (op1);
+   op0 = RECUR (op0);
+   op1 = RECUR (op1);
  
  	warning_sentinel s1(warn_type_limits, was_dep);

warning_sentinel s2(warn_div_by_zero, was_dep);
diff --git a/gcc/testsuite/g++.dg/warn/Wtype-limits6.C 
b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
new file mode 100644
index 000..9d5886d5323
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
@@ -0,0 +1,17 @@
+// PR c++/100161
+// { dg-additional-options "-Wtype-limits" }
+
+void f(unsigned);
+
+template
+void g()
+{
+for (unsigned i = 0; i < n; i++) { // { dg-bogus "always false" }
+f(i);
+}
+}
+
+void h()
+{
+g<0>();
+}

base-commit: c2fc1702cb3a3d5cc9c40de47f63b4c8f3f1d09c





Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-21 Thread H.J. Lu via Gcc-patches
On Wed, Apr 21, 2021 at 4:24 PM Martin Sebor  wrote:
>
> On 4/21/21 2:58 PM, H.J. Lu wrote:
> > On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor  wrote:
> >>
> >> On 4/14/21 4:39 PM, H.J. Lu wrote:
> >>> commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
> >>> Author: H.J. Lu 
> >>> Date:   Fri Aug 21 09:42:49 2020 -0700
> >>>
> >>>   x86: Add target("general-regs-only") function attribute
> >>>
> >>> is incomplete since it is impossible to call integer intrinsics from
> >>> a function with general-regs-only target attribute.
> >>>
> >>> 1. Add general_regs_only function attribute to inform the compiler that
> >>> functions use only general purpose registers.  When making inlining
> >>> decisions on such functions, non-GPR compiler options are excluded.
> >>> 2. Add general_regs_only attribute to x86 intrinsics which use only
> >>> general purpose registers.
> >>>
> >> ...
> >>> --- a/gcc/doc/extend.texi
> >>> +++ b/gcc/doc/extend.texi
> >>> @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} 
> >>> attribute sets the name
> >>>of the section to record function entry instrumentation calls in when
> >>>enabled with @option{-pg -mrecord-mcount}
> >>>
> >>> +@item general_regs_only
> >>> +@cindex @code{general_regs_only} function attribute, x86
> >>> +The @code{general_regs_only} attribute on functions is used to
> >>> +inform the compiler that functions use only general purpose registers.
> >>
> >> I'll just reiterate basically the same comment as before: it's not
> >> clear from the very brief description above what the requirements
> >> are for using the attribute.  I'm guessing it can be applied to
> >> any function (inline or otherwise) but only has any effect when
> >> the function is actually inlined and otherwise doesn't constrain
> >> what the function can do.  (Whatever the constraints are, I think
> >> the manual should spell them out, and likewise for its effects.)
> >
> > That is correct.
> >
> >> Similarly it's not clear what should be expected when the function
> >> does use some other register.  Ideally, I think GCC would check and
> >> issue a nice error message whether or not the function is inlined
> >> or called.  I suspect that might only be possible for inline
> >> functions that are actually called and for which the back end must
> >> emit code.
> >
> > This is what GCC does today:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744
>
> Yes, that's the rather obscure error I think I commented on before
> and suggested should be improved.  Based on r99744-3.c I don't think
> this has changed in the improved patch.

My goal is to fix the inline failures, not to improve the compiler error
message.

> >
> >> Other than that, I'd suggest to improve the phrasing a bit:
> >>
> >> The @code{general_regs_only} function attribute indicates that
> >> the function uses only general purpose registers... [text
> >> explaining constraints and errors follows].
> >>
> >> Martin
> >
> > How about this
> >
> > @item general_regs_only
> > @cindex @code{general_regs_only} function attribute, x86
> > The @code{general_regs_only} attribute on functions is used to inform
> > the compiler that functions use only general purpose registers.  It
> > can be used together with the @code{always_inline} attribute to avoid
> > inlining failure when there is a mismatch in compiler vector options.
>
> Without an article the part "that functions use only general purpose
> registers" is unclear and/or grammatically incorrect.  What functions?
> If the function the attribute is applied to, it needs an article, e.g.,
> "the function" or "a function", and singular.  (Otherwise it could be
> read as talking about the functions called from the one with
> the attribute, or some other functions altogether).
>
> I tried to correct that above but, if you prefer, the following would
> be closer to your phrasing but more correct/accurate:
>
>The @code{general_regs_only} function attribute informs
>the compiler that the function uses only general purpose
>registers.
>
> I don't understand what the second sentence is trying to say, and
> without a better error message for the problem in r99744, I suspect
> few users will either.  I am suggesting to explain in the text you
> are adding, under what conditions inlining might fail without
> the attribute, and what effect the attribute has on the function
> that prevents the inlining failure.

How about this?

@item general_regs_only
@cindex @code{general_regs_only} function attribute, x86
The @code{general_regs_only} function attribute informs the compiler
that the function uses only general purpose registers.  When the
compiler inlines a function with the @code{always_inline} attribute,
target-specific compilation options may lead to inline failures.
The @code{general_regs_only} attribute, if applicable, can be used
together with the @code{always_inline} attribute to reduce inlining
failure.

> (If we can't explain what the effect is then 

Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-21 Thread Martin Sebor via Gcc-patches

On 4/21/21 2:58 PM, H.J. Lu wrote:

On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor  wrote:


On 4/14/21 4:39 PM, H.J. Lu wrote:

commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

  x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.

1. Add general_regs_only function attribute to inform the compiler that
functions use only general purpose registers.  When making inlining
decisions on such functions, non-GPR compiler options are excluded.
2. Add general_regs_only attribute to x86 intrinsics which use only
general purpose registers.


...

--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} attribute sets 
the name
   of the section to record function entry instrumentation calls in when
   enabled with @option{-pg -mrecord-mcount}

+@item general_regs_only
+@cindex @code{general_regs_only} function attribute, x86
+The @code{general_regs_only} attribute on functions is used to
+inform the compiler that functions use only general purpose registers.


I'll just reiterate basically the same comment as before: it's not
clear from the very brief description above what the requirements
are for using the attribute.  I'm guessing it can be applied to
any function (inline or otherwise) but only has any effect when
the function is actually inlined and otherwise doesn't constrain
what the function can do.  (Whatever the constraints are, I think
the manual should spell them out, and likewise for its effects.)


That is correct.


Similarly it's not clear what should be expected when the function
does use some other register.  Ideally, I think GCC would check and
issue a nice error message whether or not the function is inlined
or called.  I suspect that might only be possible for inline
functions that are actually called and for which the back end must
emit code.


This is what GCC does today:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744


Yes, that's the rather obscure error I think I commented on before
and suggested should be improved.  Based on r99744-3.c I don't think
this has changed in the improved patch.




Other than that, I'd suggest to improve the phrasing a bit:

The @code{general_regs_only} function attribute indicates that
the function uses only general purpose registers... [text
explaining constraints and errors follows].

Martin


How about this

@item general_regs_only
@cindex @code{general_regs_only} function attribute, x86
The @code{general_regs_only} attribute on functions is used to inform
the compiler that functions use only general purpose registers.  It
can be used together with the @code{always_inline} attribute to avoid
inlining failure when there is a mismatch in compiler vector options.


Without an article the part "that functions use only general purpose
registers" is unclear and/or grammatically incorrect.  What functions?
If the function the attribute is applied to, it needs an article, e.g.,
"the function" or "a function", and singular.  (Otherwise it could be
read as talking about the functions called from the one with
the attribute, or some other functions altogether).

I tried to correct that above but, if you prefer, the following would
be closer to your phrasing but more correct/accurate:

  The @code{general_regs_only} function attribute informs
  the compiler that the function uses only general purpose
  registers.

I don't understand what the second sentence is trying to say, and
without a better error message for the problem in r99744, I suspect
few users will either.  I am suggesting to explain in the text you
are adding, under what conditions inlining might fail without
the attribute, and what effect the attribute has on the function
that prevents the inlining failure.

(If we can't explain what the effect is then I wonder why
the attribute is being added at all instead of teaching GCC to
always behave as if the attribute were there when its absence
would otherwise lead to an error.)

Martin


[PATCH v11] Practical improvement to libgcc complex divide

2021-04-21 Thread Patrick McGehearty via Gcc-patches
Changes in this version from Version 10:
(thanks to Joseph Myers for catching these issues)

In gcc/c-family/c-cppbuiltin.c
Fixed three cases where XALLOCAVEC argument contained ...LIBGCC_...
but should have contained ...LIBGCC__...

These off-by-one errors managed to not fail previously due to the limited
lifetime of the resulting allocation and the fact that no additional
variables were placed on the stack during that lifetime. But it was
an error waiting to fail at the time of some future code or compiler
change.

In libgcc/config/rs6000/_divkc3.c
Added an #ifndef __LONG_DOUBLE_IEEE128__
and *LIBGCC_KF* values for the RBIG, RMIN, RMIN2, RMINSCAL, and RMAX2 defines.

- - - - - -

Correctness and performance test programs used during development of
this project may be found in the attachment to:
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg254210.html

Summary of Purpose

This patch to libgcc/libgcc2.c __divdc3 provides an
opportunity to gain important improvements to the quality of answers
for the default complex divide routine (half, float, double, extended,
long double precisions) when dealing with very large or very small exponents.

The current code correctly implements Smith's method (1962) [2]
further modified by c99's requirements for dealing with NaN (not a
number) results. When working with input values where the exponents
are greater than *_MAX_EXP/2 or less than -(*_MAX_EXP)/2, results are
substantially different from the answers provided by quad precision
more than 1% of the time. This error rate may be unacceptable for many
applications that cannot a priori restrict their computations to the
safe range. The proposed method reduces the frequency of
"substantially different" answers by more than 99% for double
precision at a modest cost of performance.
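
As a point of reference for the discussion that follows, here is a small sketch contrasting the textbook formula with Smith's method for (a+bi)/(c+di). The function names are made up for illustration, and neither function includes the C99 NaN handling or the additional scaling that this patch proposes.

```cpp
#include <cmath>
#include <complex>

// Textbook formula: ((ac + bd) + (bc - ad)i) / (c^2 + d^2).
std::complex<double> divide_naive(double a, double b, double c, double d) {
  double denom = c * c + d * d;  // overflows or underflows at extreme exponents
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}

// Smith's 1962 method: form the ratio of the smaller to the larger
// denominator component first, so that c*c and d*d are never computed.
std::complex<double> divide_smith(double a, double b, double c, double d) {
  if (std::fabs(c) < std::fabs(d)) {
    double ratio = c / d;
    double denom = c * ratio + d;
    return {(a * ratio + b) / denom, (b * ratio - a) / denom};
  }
  double ratio = d / c;
  double denom = d * ratio + c;
  return {(b * ratio + a) / denom, (b - a * ratio) / denom};
}
```

For inputs such as a = b = c = d = 1e200 the naive form overflows in c*c + d*d and yields NaN, while Smith's method returns 1 + 0i. The cases this patch targets are the ones where `ratio` itself underflows or the remaining products overflow.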

Differences between current gcc methods and the new method will be
described. Then accuracy and performance differences will be discussed.

Background

This project started with an investigation related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714.  Study of Beebe[1]
provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.

I developed two sets of test data by randomly distributing values over
a restricted range and the full range of input values. The current
complex divide handled the restricted range well enough, but failed on
the full range more than 1% of the time. Baudin and Smith's primary
test for "ratio" equals zero reduced the cases with 16 or more error
bits by a factor of 5, but still left too many flawed answers. Adding
debug print out to cases with substantial errors allowed me to see the
intermediate calculations for test values that failed. I noted that
for many of the failures, "ratio" was a subnormal. Changing the
"ratio" test from check for zero to check for subnormal reduced the 16
bit error rate by another factor of 12. This single modified test
provides the greatest benefit for the least cost, but the percentage
of cases with greater than 16 bit errors (double precision data) is
still greater than 0.027% (2.7 in 10,000).

Continued examination of remaining errors and their intermediate
computations led to the various tests of input value tests and scaling
to avoid under/overflow. The current patch does not handle some of the
rare and most extreme combinations of input values, but the random
test data is only showing 1 case in 10 million that has an error of
greater than 12 bits. That case has 18 bits of error and is due to
subtraction cancellation. These results are significantly better
than the results reported by Baudin and Smith.

Support for half, float, double, extended, and long double precision
is included as all are handled with suitable preprocessor symbols in a
single source routine. Since half precision is computed with float
precision as per current libgcc practice, the enhanced algorithm
provides no benefit for half precision and would cost performance.
Further investigation showed changing the half precision algorithm
to use the simple formula (real=a*c+b*d imag=b*c-a*d) caused no
loss of precision and modest improvement in performance.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.  This
use is made generic by defining appropriate __LIBGCC2_* macros in
c-cppbuiltin.c.

Tests are added for the case where both parts of the denominator have
exponents small enough that all input values can be scaled up, shifting
any subnormal values to normal values, without risking overflow. That
gained a clear improvement in accuracy. Similarly, when either
numerator was subnormal and the other 

[PATCH 3/3] VAX: Accept ASHIFT in address expressions

2021-04-21 Thread Maciej W. Rozycki
Fix regressions:

FAIL: gcc.c-torture/execute/20090113-2.c   -O1  (internal compiler error)
FAIL: gcc.c-torture/execute/20090113-2.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/execute/20090113-3.c   -O1  (internal compiler error)
FAIL: gcc.c-torture/execute/20090113-3.c   -O1  (test for excess errors)

triggering if LRA is used rather than old reload and caused by:

(plus:SI (plus:SI (mult:SI (reg:SI 30 [ _10 ])
(const_int 4 [0x4]))
(reg/f:SI 26 [ _6 ]))
(const_int 12 [0xc]))

coming from:

(insn 58 57 59 10 (set (reg:SI 33 [ _13 ])
(zero_extract:SI (mem:SI (plus:SI (plus:SI (mult:SI (reg:SI 30 [ _10 ])
(const_int 4 [0x4]))
(reg/f:SI 26 [ _6 ]))
(const_int 12 [0xc])) [4 _6->bits[_10]+0 S4 A32])
(reg:QI 56)
(reg:SI 53))) 
".../gcc/testsuite/gcc.c-torture/execute/20090113-2.c":64:12 490 
{*extzv_non_const}
 (expr_list:REG_DEAD (reg:QI 56)
(expr_list:REG_DEAD (reg:SI 53)
(expr_list:REG_DEAD (reg:SI 30 [ _10 ])
(expr_list:REG_DEAD (reg/f:SI 26 [ _6 ])
(nil))

being converted into:

(plus:SI (plus:SI (ashift:SI (reg:SI 30 [ _10 ])
(const_int 2 [0x2]))
(reg/f:SI 26 [ _6 ]))
(const_int 12 [0xc]))

which is an rtx the VAX backend currently does not recognize as a valid 
machine address, although apparently it is only inside MEM rtx's that 
indexed addressing is supposed to be canonicalized to a MULT rather than 
ASHIFT form.  Handle the ASHIFT form too throughout the backend then.
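
The equivalence being relied on is simply that a multiply by 4 and a left shift by 2 describe the same index scale. A minimal sketch of treating both forms alike follows; `CodeSketch`, `IndexTermSketch` and `index_scale` are hypothetical names rather than the backend's data structures, and the accepted scale set of 1, 2, 4 and 8 is an assumption made for the example.

```cpp
// Hypothetical miniature of an index term: a bare register,
// (mult reg N), or (ashift reg N).
enum class CodeSketch { Reg, Mult, Ashift };

struct IndexTermSketch {
  CodeSketch code;
  int constant;  // multiplier for Mult, shift count for Ashift, unused for Reg
};

// Returns the scale factor the term denotes, or 0 if it is not an
// acceptable index term in this model.
int index_scale(const IndexTermSketch& t) {
  int scale = 0;
  switch (t.code) {
    case CodeSketch::Reg:
      scale = 1;
      break;
    case CodeSketch::Mult:
      scale = t.constant;
      break;
    case CodeSketch::Ashift:
      if (t.constant < 0 || t.constant > 3)
        return 0;
      scale = 1 << t.constant;  // ashift by k is the same as mult by 2^k
      break;
  }
  // Assumed set of scales accepted by this sketch.
  return (scale == 1 || scale == 2 || scale == 4 || scale == 8) ? scale : 0;
}
```

Here `{Mult, 4}` and `{Ashift, 2}` both yield a scale of 4, so later code that consumes the scale does not need to know which form it came from.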

The change appears to also improve code generation with old reload and 
code size stats are as follows, collected from 18153 executables built 
in `check-c' GCC testing:

              samples average  median
--------------------------------------
regressions        47  0.702%  0.521%
unchanged       17503  0.000%  0.000%
progressions      603 -0.920% -0.403%
--------------------------------------
total           18153 -0.029%  0.000%

with a small number of outliers (over 5% size change):

old     new     change  %change filename

1885    1645    -240   -12.7320 pr53505.exe
1331    1221    -110    -8.2644 pr89634.exe
1553    1473    -80     -5.1513 stdatomic-vm.exe
1413    1341    -72     -5.0955 pr45830.exe
1415    1343    -72     -5.0883 stdatomic-vm.exe
25765   24463   -1302   -5.0533 strlen-5.exe
25765   24463   -1302   -5.0533 strlen-5.exe
25765   24463   -1302   -5.0533 strlen-5.exe
1191    1131    -60     -5.0377 20050527-1.exe

(all changes on the expansion side are below 5%).

gcc/
* config/vax/vax.c (print_operand_address, vax_address_cost_1)
(index_term_p): Handle ASHIFT too.
---
 gcc/config/vax/vax.c |   34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

gcc-vax-legitimate-address-ashift.diff
Index: gcc/gcc/config/vax/vax.c
===
--- gcc.orig/gcc/config/vax/vax.c
+++ gcc/gcc/config/vax/vax.c
@@ -333,12 +333,12 @@ print_operand_address (FILE * file, rtx
 
 case PLUS:
   /* There can be either two or three things added here.  One must be a
-REG.  One can be either a REG or a MULT of a REG and an appropriate
-constant, and the third can only be a constant or a MEM.
+REG.  One can be either a REG or a MULT/ASHIFT of a REG and an
+appropriate constant, and the third can only be a constant or a MEM.
 
 We get these two or three things and put the constant or MEM in
-OFFSET, the MULT or REG in IREG, and the REG in BREG.  If we have
-a register and can't tell yet if it is a base or index register,
+OFFSET, the MULT/ASHIFT or REG in IREG, and the REG in BREG.  If we
+have a register and can't tell yet if it is a base or index register,
 put it into REG1.  */
 
   reg1 = 0; ireg = 0; breg = 0; offset = 0;
@@ -355,12 +355,14 @@ print_operand_address (FILE * file, rtx
  offset = XEXP (addr, 1);
  addr = XEXP (addr, 0);
}
-  else if (GET_CODE (XEXP (addr, 1)) == MULT)
+  else if (GET_CODE (XEXP (addr, 1)) == MULT
+  || GET_CODE (XEXP (addr, 1)) == ASHIFT)
{
  ireg = XEXP (addr, 1);
  addr = XEXP (addr, 0);
}
-  else if (GET_CODE (XEXP (addr, 0)) == MULT)
+  else if (GET_CODE (XEXP (addr, 0)) == MULT
+  || GET_CODE (XEXP (addr, 0)) == ASHIFT)
{
  ireg = XEXP (addr, 0);
  addr = XEXP (addr, 1);
@@ -385,7 +387,7 @@ print_operand_address (FILE * file, rtx
  else
reg1 = addr;
}
-  else if (GET_CODE (addr) == MULT)
+  else if (GET_CODE (addr) == MULT || GET_CODE (addr) == ASHIFT)
ireg = addr;
   else
{
@@ -416,7 +418,8 @@ print_operand_address (FILE * file, rtx
}
  

[PATCH 2/3] VAX: Fix ill-formed `jbbi' insn operands

2021-04-21 Thread Maciej W. Rozycki
The insn has extraneous operand #3 that is aliased in RTL to operand #0 
with a constraint.  The operands specify a single-bit field in memory 
that the machine instruction produced both reads for the purpose of 
determining whether to branch or not and either clears or sets according 
to the machine operation selected with the `ccss' iterator.  The caller 
of the insn is supposed to supply the same rtx for both operands.

This odd arrangement happens to work with old reload, but breaks with 
libatomic if LRA is used instead:

.../libatomic/flag.c: In function 'atomic_flag_test_and_set':
.../libatomic/flag.c:36:1: error: unable to generate reloads for:
   36 | }
  | ^
(jump_insn 7 6 19 2 (unspec_volatile [
(set (pc)
(if_then_else (eq (zero_extract:SI (mem/v:QI (reg:SI 27) [-1  
S1 A8])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 1 [0x1]))
(label_ref:SI 25)
(pc)))
(set (zero_extract:SI (mem/v:QI (reg:SI 28) [-1  S1 A8])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 1 [0x1]))
] 100) ".../libatomic/flag.c":35:10 669 {jbbssiqi}
 (nil)
 -> 25)
during RTL pass: reload
.../libatomic/flag.c:36:1: internal compiler error: in curr_insn_transform, at 
lra-constraints.c:4098
0x1112c587 _fatal_insn(char const*, rtx_def const*, char const*, int, char 
const*)
.../gcc/rtl-error.c:108
0x10ee6563 curr_insn_transform
.../gcc/lra-constraints.c:4098
0x10eeaf87 lra_constraints(bool)
.../gcc/lra-constraints.c:5133
0x10ec97e3 lra(_IO_FILE*)
.../gcc/lra.c:2336
0x10e4633f do_reload
.../gcc/ira.c:5827
0x10e46b27 execute
.../gcc/ira.c:6013
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

Switch to using `match_dup' as expected then for a machine instruction 
that in its encoding only has one actual operand for the single-bit 
field.
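
As a rough illustration of why a single operand is enough: the operation reads a bit and then writes the same bit in the same memory location, so there is only one memory reference to describe. The sketch below models that in portable C++. It is an assumption-laden model, not the VAX instruction itself: it is not atomic or interlocked, and `test_and_set_bit` and `test_and_clear_bit` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Model of a "test the bit, then set it" operation on ONE memory operand.
// Returns the bit's previous value, which is what the branch would test.
// Note: unlike the real machine operation, this model is not atomic.
bool test_and_set_bit(std::uint8_t &mem, unsigned pos) {
  assert(pos < 8);
  const std::uint8_t mask = static_cast<std::uint8_t>(1u << pos);
  const bool was_set = (mem & mask) != 0;
  mem = static_cast<std::uint8_t>(mem | mask);
  return was_set;
}

// Companion model: test the bit, then clear it, on the same operand.
bool test_and_clear_bit(std::uint8_t &mem, unsigned pos) {
  assert(pos < 8);
  const std::uint8_t mask = static_cast<std::uint8_t>(1u << pos);
  const bool was_set = (mem & mask) != 0;
  mem = static_cast<std::uint8_t>(mem & ~mask);
  return was_set;
}
```

Both helpers take the location once and use it for the read and the write, which mirrors describing the second reference as a duplicate of operand #0 rather than as a separate operand tied to it by a constraint.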

gcc/
* config/vax/builtins.md (jbbi): Remove operand #3.
(sync_lock_test_and_set): Adjust accordingly.
(sync_lock_release): Likewise.
---
 gcc/config/vax/builtins.md |   10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

gcc-vax-jbbxxi-ops.diff
Index: gcc/gcc/config/vax/builtins.md
===
--- gcc.orig/gcc/config/vax/builtins.md
+++ gcc/gcc/config/vax/builtins.md
@@ -174,8 +174,7 @@
 
   label = gen_label_rtx ();
   emit_move_insn (operands[0], const1_rtx);
-  emit_jump_insn (gen_jbbssi (operands[1], const0_rtx, label,
-   operands[1]));
+  emit_jump_insn (gen_jbbssi (operands[1], const0_rtx, label));
   emit_move_insn (operands[0], const0_rtx);
   emit_label (label);
   DONE;
@@ -193,8 +192,7 @@
 FAIL;
 
   label = gen_label_rtx ();
-  emit_jump_insn (gen_jbbcci (operands[0], const0_rtx, label,
-   operands[0]));
+  emit_jump_insn (gen_jbbcci (operands[0], const0_rtx, label));
   emit_label (label);
   DONE;
 }")
@@ -204,13 +202,13 @@
 [(set (pc)
  (if_then_else
(eq (zero_extract:SI
- (match_operand:VAXint 0 "any_memory_operand" "")
+ (match_operand:VAXint 0 "any_memory_operand" "+")
  (const_int 1)
  (match_operand:SI 1 "general_operand" "nrmT"))
(const_int bit))
(label_ref (match_operand 2 "" ""))
(pc)))
- (set (zero_extract:SI (match_operand:VAXint 3 "any_memory_operand" "+0")
+ (set (zero_extract:SI (match_dup 0)
   (const_int 1)
   (match_dup 1))
  (const_int bit))]


[PATCH 1/3] VAX: Remove dead `adjacent_operands_p' function

2021-04-21 Thread Maciej W. Rozycki
This function has never been used and it is unclear what its intended 
purpose was.

gcc/
* config/vax/vax-protos.h (adjacent_operands_p): Remove 
prototype.
* config/vax/vax.c (adjacent_operands_p): Remove.
---
 gcc/config/vax/vax-protos.h |1 
 gcc/config/vax/vax.c|   73 
 2 files changed, 74 deletions(-)

gcc-vax-adjacent_operands.diff
Index: gcc/gcc/config/vax/vax-protos.h
===
--- gcc.orig/gcc/config/vax/vax-protos.h
+++ gcc/gcc/config/vax/vax-protos.h
@@ -24,7 +24,6 @@ extern void vax_expand_prologue (void);
 extern bool vax_acceptable_pic_operand_p (rtx, bool, bool);
 extern machine_mode vax_select_cc_mode (enum rtx_code, rtx, rtx);
 extern const char *cond_name (rtx);
-extern bool adjacent_operands_p (rtx, rtx, machine_mode);
 extern const char *rev_cond_name (rtx);
 extern void print_operand_address (FILE *, rtx);
 extern void print_operand (FILE *, rtx, int);
Index: gcc/gcc/config/vax/vax.c
===
--- gcc.orig/gcc/config/vax/vax.c
+++ gcc/gcc/config/vax/vax.c
@@ -2108,79 +2108,6 @@ vax_expand_addsub_di_operands (rtx * ope
 }
 }
 
-bool
-adjacent_operands_p (rtx lo, rtx hi, machine_mode mode)
-{
-  HOST_WIDE_INT lo_offset;
-  HOST_WIDE_INT hi_offset;
-
-  if (GET_CODE (lo) != GET_CODE (hi))
-return false;
-
-  if (REG_P (lo))
-return mode == SImode && REGNO (lo) + 1 == REGNO (hi);
-  if (CONST_INT_P (lo))
-return INTVAL (hi) == 0 && UINTVAL (lo) < 64;
-  if (CONST_INT_P (lo))
-return mode != SImode;
-
-  if (!MEM_P (lo))
-return false;
-
-  if (MEM_VOLATILE_P (lo) || MEM_VOLATILE_P (hi))
-return false;
-
-  lo = XEXP (lo, 0);
-  hi = XEXP (hi, 0);
-
-  if (GET_CODE (lo) == POST_INC /* || GET_CODE (lo) == PRE_DEC */)
-return rtx_equal_p (lo, hi);
-
-  switch (GET_CODE (lo))
-{
-case REG:
-case SYMBOL_REF:
-  lo_offset = 0;
-  break;
-case CONST:
-  lo = XEXP (lo, 0);
-  /* FALLTHROUGH */
-case PLUS:
-  if (!CONST_INT_P (XEXP (lo, 1)))
-   return false;
-  lo_offset = INTVAL (XEXP (lo, 1));
-  lo = XEXP (lo, 0);
-  break;
-default:
-  return false;
-}
-
-  switch (GET_CODE (hi))
-{
-case REG:
-case SYMBOL_REF:
-  hi_offset = 0;
-  break;
-case CONST:
-  hi = XEXP (hi, 0);
-  /* FALLTHROUGH */
-case PLUS:
-  if (!CONST_INT_P (XEXP (hi, 1)))
-   return false;
-  hi_offset = INTVAL (XEXP (hi, 1));
-  hi = XEXP (hi, 0);
-  break;
-default:
-  return false;
-}
-
-  if (GET_CODE (lo) == MULT || GET_CODE (lo) == PLUS)
-return false;
-
-  return rtx_equal_p (lo, hi)
-&& hi_offset - lo_offset == GET_MODE_SIZE (mode);
-}
-
 /* Output assembler code for a block containing the constant parts
of a trampoline, leaving space for the variable parts.  */
 


[PATCH 0/3] VAX backend preparatory updates for switching to LRA

2021-04-21 Thread Maciej W. Rozycki
Hi,

 According to the plan discussed in the context of the recent switch to
MODE_CC of the VAX backend I have been looking into switching the backend 
to LRA as well.

 It has turned out quite straightforward itself, with just a couple of 
minor issues triggered with a flip to LRA, one causing a build failure 
with target libatomic and another causing a C testsuite regression.  Also 
I have come across a piece of dead code which has never ever been used for 
anything and it is unclear to me what its intended purpose was.

 I have come up with this small patch series then, bundled together for 
easier reference although the individual changes are independent from each 
other.

 I think 3/3 is worth backporting to GCC 11 at one point, perhaps 11.2, so 
that it can be easily picked downstream, as it improves code generation 
with old reload and we may not have another major release still using it.

 OTOH switching to LRA regresses code generation seriously, by making the 
indexed and indirect VAX address modes severely underutilised, so while 
with these changes in place the backend can be switched to LRA with just a 
trivial change to remove the redefinition of TARGET_LRA_P, I think it is not 
yet the right time to do it.

 It is not a hard show-stopper though, so while I plan to look into LRA 
now to figure out what is missing there that the old reload has to satisfy 
the VAX backend, the switch to LRA can now be made anytime if so required 
and I am preempted for whatever reason (and nobody else gets to it).

 Questions, comments, OK to apply?

  Maciej


Re: [PATCH] [libstdc++] Fix test timeout in stop_calback/destroy.cc

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 07:23:30PM +0100, Jonathan Wakely via Gcc-patches wrote:
> We should do this for gcc-11 too if an RM approves it, since acquire()
> is currently broken.

Ok, but please commit it soon, we'll need to do a RC2 tomorrow or on Friday
and then ideally no changes at all.

Jakub



[patch,committed] libgomp.fortran/depobj-1.f90: Fix omp_depend_kind (was: [r12-20 Regression] FAIL: libgomp.fortran/depobj-1.f90 -O (test for excess errors) on Linux/x86_64)

2021-04-21 Thread Tobias Burnus

Unlike in gcc/testsuite/*/gomp/, using omp_lib is possible in the libgomp
testsuite and is the proper way to do it.

Committed as r12-52-g0c0bdcc60cfa0e87b3b9abc1460a7073b199d38a

Tobias

PS: omp_depend_kind = 2*c_intptr_t, which nowadays is usually 16,
but on 32-bit systems it is usually 8.
The test failed for two reasons: first, integer(kind=16) is not supported
on -m32 x86; second, the FE expects an 8-byte integer kind (kind=8) in this case.
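
The arithmetic behind those two numbers can be sketched as follows. This is only an illustration in C++ of "twice the pointer-sized integer width"; `depend_kind_bytes` and `host_depend_kind_bytes` are hypothetical names and do not appear in libgomp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte width of a kind that is two pointer-sized integers wide.
constexpr std::size_t depend_kind_bytes(std::size_t intptr_bytes) {
  return 2 * intptr_bytes;
}

// The same value for whatever target this file happens to be compiled for.
constexpr std::size_t host_depend_kind_bytes =
    depend_kind_bytes(sizeof(std::intptr_t));

// 64-bit pointers give 16 bytes, 32-bit pointers give 8 bytes.
static_assert(depend_kind_bytes(8) == 16, "64-bit case");
static_assert(depend_kind_bytes(4) == 8, "32-bit case");
```

A test that hard-codes 16 therefore only matches targets with 8-byte pointers, whereas taking the value from the library follows the target.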

On 21.04.21 19:44, sunil.k.pandey wrote:

On Linux/x86_64,

a61c4964cd71446232d62ec9b10a7d45b440dd9f is the first bad commit
commit a61c4964cd71446232d62ec9b10a7d45b440dd9f
Author: Tobias Burnus 
Date:   Wed Apr 21 10:58:29 2021 +0200

 Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'

caused

FAIL: libgomp.fortran/depobj-1.f90   -O  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-20/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="fortran.exp=libgomp.fortran/depobj-1.f90 --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="fortran.exp=libgomp.fortran/depobj-1.f90 --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf


Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-21 Thread H.J. Lu via Gcc-patches
On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor  wrote:
>
> On 4/14/21 4:39 PM, H.J. Lu wrote:
> > commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
> > Author: H.J. Lu 
> > Date:   Fri Aug 21 09:42:49 2020 -0700
> >
> >  x86: Add target("general-regs-only") function attribute
> >
> > is incomplete since it is impossible to call integer intrinsics from
> > a function with general-regs-only target attribute.
> >
> > 1. Add general_regs_only function attribute to inform the compiler that
> > functions use only general purpose registers.  When making inlining
> > decisions on such functions, non-GPR compiler options are excluded.
> > 2. Add general_regs_only attribute to x86 intrinsics which use only
> > general purpose registers.
> >
> ...
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} attribute 
> > sets the name
> >   of the section to record function entry instrumentation calls in when
> >   enabled with @option{-pg -mrecord-mcount}
> >
> > +@item general_regs_only
> > +@cindex @code{general_regs_only} function attribute, x86
> > +The @code{general_regs_only} attribute on functions is used to
> > +inform the compiler that functions use only general purpose registers.
>
> I'll just reiterate basically the same comment as before: it's not
> clear from the very brief description above what the requirements
> are for using the attribute.  I'm guessing it can be applied to
> any function (inline or otherwise) but only has any effect when
> the function is actually inlined and otherwise doesn't constrain
> what the function can do.  (Whatever the constraints are, I think
> the manual should spell them out, and likewise for its effects.)

That is correct.

> Similarly it's not clear what should be expected when the function
> does use some other register.  Ideally, I think GCC would check and
> issue a nice error message whether or not the function is inlined
> or called.  I suspect that might only be possible for inline
> functions that are actually called and for which the back end must
> emit code.

This is what GCC does today:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744

> Other than that, I'd suggest to improve the phrasing a bit:
>
>The @code{general_regs_only} function attribute indicates that
>the function uses only general purpose registers... [text
>explaining constraints and errors follows].
>
> Martin

How about this

@item general_regs_only
@cindex @code{general_regs_only} function attribute, x86
The @code{general_regs_only} attribute on functions is used to inform
the compiler that functions use only general purpose registers.  It
can be used together with the @code{always_inline} attribute to avoid
inlining failure when there is a mismatch in compiler vector options.

Thanks.

-- 
H.J.


[PATCH] testsuite/arm: Fix and rename arm_qbit_ok into arm_sat_ok effective-target

2021-04-21 Thread Christophe Lyon via Gcc-patches
The acle/saturation.c test uses the __[su]sat() and
__saturation_occurred() intrinsics, but __[su]sat() are defined in
arm_acle.h only if __ARM_FEATURE_SAT is true, while
__saturation_occurred() depends on __ARM_FEATURE_QBIT.

QBIT is a v5te feature, while SAT is available since v6, so the test
really needs __ARM_FEATURE_SAT, to have both available.

This patch renames arm_qbit_ok into arm_sat_ok and checks
__ARM_FEATURE_SAT. It updates acle/saturation.c accordingly.

This enables the test to pass on arm-eabi with default cpu/fpu/mode,
where arm_qbit previously used -march=armv5te; arm_sat now uses
-march=armv6.
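
For context, here is a portable sketch of what saturating to an N-bit range means, together with a flag saying whether clamping happened. It is a model written for illustration, not the arm_acle.h intrinsics: `ssat_model`, `usat_model` and the result structs are hypothetical names, and the `saturated` field only approximates what a sticky saturation flag would report.

```cpp
#include <cassert>
#include <cstdint>

struct SatResult {
  std::int32_t value;
  bool saturated;  // true if the input had to be clamped
};

// Clamp x to the signed range of `bits` bits: [-2^(bits-1), 2^(bits-1) - 1].
SatResult ssat_model(std::int32_t x, unsigned bits) {
  assert(bits >= 1 && bits <= 32);
  const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
  const std::int64_t lo = -hi - 1;
  if (x > hi) return {static_cast<std::int32_t>(hi), true};
  if (x < lo) return {static_cast<std::int32_t>(lo), true};
  return {x, false};
}

struct USatResult {
  std::uint32_t value;
  bool saturated;  // true if the input had to be clamped
};

// Clamp x to the unsigned range of `bits` bits: [0, 2^bits - 1].
USatResult usat_model(std::int32_t x, unsigned bits) {
  assert(bits <= 31);
  const std::int64_t hi = (std::int64_t{1} << bits) - 1;
  if (x < 0) return {0u, true};
  if (x > hi) return {static_cast<std::uint32_t>(hi), true};
  return {static_cast<std::uint32_t>(x), false};
}
```

The test needs both halves of this picture (the clamped value and the "did it saturate" query), which is why it requires an architecture level where both feature macros are defined.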

2021-04-22  Christophe Lyon  

gcc/
* doc/sourcebuild.texi (arm_qbit_ok): Rename into...
(arm_sat_ok): ...this.

gcc/testsuite/
* gcc.target/arm/acle/saturation.c: Use arm_sat_ok effective
target.
* lib/target-supports.exp
(check_effective_target_arm_qbit_ok_nocache): Rename into...
(check_effective_target_arm_sat_ok_nocache): ... this. Check
__ARM_FEATURE_SAT and use armv6.
---
 gcc/doc/sourcebuild.texi   |  6 ++--
 gcc/testsuite/gcc.target/arm/acle/saturation.c |  4 +--
 gcc/testsuite/lib/target-supports.exp  | 42 +-
 3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index b5bdd4f..4d9ec3c 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2041,9 +2041,9 @@ ARM Target supports options suitable for accessing the 
SIMD32 intrinsics from
 @code{arm_acle.h}.
 Some multilibs may be incompatible with these options.
 
-@item arm_qbit_ok
-@anchor{arm_qbit_ok}
-ARM Target supports options suitable for accessing the Q-bit manipulation
+@item arm_sat_ok
+@anchor{arm_sat_ok}
+ARM Target supports options suitable for accessing the saturation
 intrinsics from @code{arm_acle.h}.
 Some multilibs may be incompatible with these options.
 
diff --git a/gcc/testsuite/gcc.target/arm/acle/saturation.c 
b/gcc/testsuite/gcc.target/arm/acle/saturation.c
index 0b3fe51..a9f99e5 100644
--- a/gcc/testsuite/gcc.target/arm/acle/saturation.c
+++ b/gcc/testsuite/gcc.target/arm/acle/saturation.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_qbit_ok } */
-/* { dg-add-options arm_qbit } */
+/* { dg-require-effective-target arm_sat_ok } */
+/* { dg-add-options arm_sat } */
 
 #include 
 
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index a522da3..5fab170 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4168,24 +4168,24 @@ proc add_options_for_arm_simd32 { flags } {
 return "$flags $et_arm_simd32_flags"
 }
 
-# Return 1 if this is an ARM target supporting the saturation intrinsics
-# from arm_acle.h.  Some multilibs may be incompatible with these options.
-# Also set et_arm_qbit_flags to the best options to add.
-# arm_acle.h includes stdint.h which can cause trouble with incompatible
-# -mfloat-abi= options.
-
-proc check_effective_target_arm_qbit_ok_nocache { } {
-global et_arm_qbit_flags
-set et_arm_qbit_flags ""
-foreach flags {"" "-march=armv5te" "-march=armv5te -mfloat-abi=softfp" 
"-march=armv5te -mfloat-abi=hard"} {
-  if { [check_no_compiler_messages_nocache et_arm_qbit_flags object {
+# Return 1 if this is an ARM target supporting the __ssat and __usat
+# saturation intrinsics from arm_acle.h.  Some multilibs may be
+# incompatible with these options.  Also set et_arm_sat_flags to the
+# best options to add.  arm_acle.h includes stdint.h which can cause
+# trouble with incompatible -mfloat-abi= options.
+
+proc check_effective_target_arm_sat_ok_nocache { } {
+global et_arm_sat_flags
+set et_arm_sat_flags ""
+foreach flags {"" "-march=armv6" "-march=armv6 -mfloat-abi=softfp" 
"-march=armv6 -mfloat-abi=hard -mfpu=vfp"} {
+  if { [check_no_compiler_messages_nocache et_arm_sat_flags object {
#include 
int dummy;
-   #ifndef __ARM_FEATURE_QBIT
-   #error not QBIT
+   #ifndef __ARM_FEATURE_SAT
+   #error not SAT
#endif
   } "$flags"] } {
-   set et_arm_qbit_flags $flags
+   set et_arm_sat_flags $flags
return 1
   }
 }
@@ -4193,17 +4193,17 @@ proc check_effective_target_arm_qbit_ok_nocache { } {
   return 0
 }
 
-proc check_effective_target_arm_qbit_ok { } {
-return [check_cached_effective_target et_arm_qbit_flags \
-   check_effective_target_arm_qbit_ok_nocache]
+proc check_effective_target_arm_sat_ok { } {
+return [check_cached_effective_target et_arm_sat_flags \
+   check_effective_target_arm_sat_ok_nocache]
 }
 
-proc add_options_for_arm_qbit { flags } {
-if { ! [check_effective_target_arm_qbit_ok] } {
+proc add_options_for_arm_sat { flags } {
+if { ! [check_effective_target_arm_sat_ok] } {
return "$flags"
 }
-global et_arm_qbit_flags
-return 

Re: [PATCH 3/3] Use startswith in targets.

2021-04-21 Thread Iain Sandoe

Martin Liska  wrote:



gcc/config/darwin-c.c |   9 +-
gcc/config/darwin.c   | 141 +


The darwin changes are OK, looks like a nice cleanup,
thanks
Iain



Re: [PATCH] [libstdc++] Fix test timeout in stop_calback/destroy.cc

2021-04-21 Thread Thomas Rodgers

On 2021-04-21 11:23, Jonathan Wakely wrote:


On 21/04/21 10:10 -0700, Thomas Rodgers wrote:


[...snip...]


Please commit your patch to trunk, since that's what you had in your
original patch before I asked you to change it (causing the bug).

We should do this for gcc-11 too if an RM approves it, since acquire()
is currently broken.


Tested x86_64-pc-linux-gnu, committed to master.


Re: [PATCH 3/3] Use startswith in targets.

2021-04-21 Thread Paul Koning via Gcc-patches



> On Mar 19, 2021, at 5:21 AM, Martin Liska  wrote:
> 
> 
> gcc/ChangeLog:
> 
>   ...
>   * config/pdp11/pdp11.c (pdp11_output_ident): Likewise.

pdp11 is ok.  Thanks.

paul




Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 08:53:54PM +0100, Jonathan Wakely wrote:
> > > What would be IMHO a good idea would be to use configure test for
> > > #include 
> > > int t = std::thread::hardware_concurrency ();
> > > and in that case use that as a fallback to the previous implementation,
> > > that will be strictly an improvement.
> >
> > That does not make sense to me. Original motivation was removal of the 
> > complicated
> > implementation with a C++ standard function.

The reason it is larger and more complicated is that it answers a different
question than std::thread::hardware_concurrency().
hardware_concurrency() says how many threads the hw has available; if we
wanted just that, you could just keep the sysconf call.
The current implementation answers the question of how many threads this
process and its children can be scheduled on.
Consider some NUMA box with hundreds of threads, but where it might
be desirable to keep specific compilations limited to some cores and do
other compilations or whatever else on other cores.

Sure, hardware_concurrency() answer is better than no information about the
concurrency at all.  But I don't really see the costs of the current
implementation, it is < 100 lines of code that have been written once and
it is unlikely they will need changing - at this point the interfaces are
fairly stable, haven't been changed on the glibc side for years.  It is true
that when it was introduced in glibc, there were two big changes to it
soon afterwards.
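
To make the difference between the two questions concrete, here is a small sketch that asks the affinity question first and only falls back to hardware_concurrency(). It is an illustration under assumptions, not the lto-wrapper code: `allowed_cpus` and `usable_parallelism` are hypothetical names, the affinity query is Linux-only, and returning 1 when nothing is known is a choice made for this example.

```cpp
#include <cassert>
#include <thread>
#ifdef __linux__
#include <sched.h>
#endif

// Number of CPUs the calling process is allowed to run on, or 0 if unknown.
unsigned allowed_cpus() {
#ifdef __linux__
  cpu_set_t set;
  CPU_ZERO(&set);
  // Pid 0 means "the calling thread"; failure leaves the answer unknown.
  if (sched_getaffinity(0, sizeof(set), &set) == 0)
    return static_cast<unsigned>(CPU_COUNT(&set));
#endif
  return 0;
}

// Prefer the affinity-based answer, then the hardware count, then 1.
unsigned usable_parallelism() {
  unsigned n = allowed_cpus();
  if (n == 0)
    n = std::thread::hardware_concurrency();  // may itself return 0
  return n != 0 ? n : 1;
}
```

On a machine where the process has been restricted to a subset of cores, for example with taskset, the two answers can differ, and the affinity-based one is the one that matches how many jobs can actually run in parallel.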

Jakub



Re: [pushed] Darwin, X86 : Fix bootstrap break from flags changes.

2021-04-21 Thread Iain Sandoe

Martin Liška  wrote:


On 4/21/21 9:50 PM, Iain Sandoe wrote:




If you could take a quick look it would be appreciated - I don’t have
much state for the patch you applied.


The patch looks correct to me.


thanks!
Iain



Re: Patch freeze request

2021-04-21 Thread David Edelsohn via Gcc-patches
On Wed, Apr 21, 2021 at 3:42 PM Martin Liška  wrote:
>
> On 4/21/21 6:03 PM, David Edelsohn via Gcc-patches wrote:
> > I am requesting a freeze on non-bug fix patches to trunk.
> >
> > In the GCC 12 announcement, Jakub stated:
> >
> > "The trunk has branched for the GCC 11 release and is now open
> > again for general development, stage 1.  Please consider not
> > disrupting it too much during the RC phase of GCC 11 so it
> > is possible to test important fixes for 11.1 on it."
> >
> > Currently people are committing disruptive patches to GCC 12,
> > including patches that are breaking bootstrap.  This situation is
> > making it more difficult to test the patches that are necessary for
> > the GCC 11.1 Release.
> >
> > Currently GCC 11.1 Release is blocked and GCC 12 bootstrap is broken
> > for some targets that are affected by the problems in GCC 11.1.  All
> > of the GCC 12 patches not related to bug fixes are further
> > exacerbating the problems and delaying the release of GCC 11.1.
> >
> > Thanks, David
> >
>
> Hello.
>
> I'm sorry for the change that broke bootstrap. I reverted the revision
> in f2b4f212a97ae137c2a8ecafe7ed2cb5b5016b6b.

Thanks!
- David


Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Tom Tromey
> "Jakub" == Jakub Jelinek via Gcc-patches  writes:

Jakub> What would be IMHO a good idea would be to use configure test for
Jakub> #include 
Jakub> int t = std::thread::hardware_concurrency ();
Jakub> and in that case use that as a fallback to the previous implementation,
Jakub> that will be strictly an improvement.

FWIW, gdb had to do this.  The check is in gdbsupport/common.m4.
I've appended it for convenience.

Tom

  # Check for std::thread.  This does not work on some platforms, like
  # mingw and DJGPP.
  AC_LANG_PUSH([C++])
  AX_PTHREAD([threads=yes], [threads=no])
  if test "$threads" = "yes"; then
save_LIBS="$LIBS"
LIBS="$PTHREAD_LIBS $LIBS"
save_CXXFLAGS="$CXXFLAGS"
CXXFLAGS="$PTHREAD_CFLAGS $save_CXXFLAGS"
AC_CACHE_CHECK([for std::thread],
   gdb_cv_cxx_std_thread,
   [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
[[#include 
  void callback() { }]],
[[std::thread t(callback);]])],
  gdb_cv_cxx_std_thread=yes,
  gdb_cv_cxx_std_thread=no)])

# This check must be here, while LIBS includes any necessary
# threading library.
AC_CHECK_FUNCS([pthread_sigmask pthread_setname_np])

LIBS="$save_LIBS"
CXXFLAGS="$save_CXXFLAGS"
  fi
  if test "$gdb_cv_cxx_std_thread" = "yes"; then
AC_DEFINE(CXX_STD_THREAD, 1,
  [Define to 1 if std::thread works.])
  fi
  AC_LANG_POP
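
On the C++ side, a macro such as CXX_STD_THREAD from the check above could be consumed roughly as follows. This is a hedged sketch rather than code from gdb or GCC: `detected_threads` is a hypothetical helper, and the caller-supplied fallback stands in for whatever older implementation a project already has.

```cpp
#include <cassert>
// CXX_STD_THREAD is expected to come from the configure-generated header;
// if it is not defined, <thread> is never included or used.
#ifdef CXX_STD_THREAD
#include <thread>
#endif

// Return the library's thread-count hint when std::thread is known to work,
// otherwise the caller's fallback value.
unsigned detected_threads(unsigned fallback) {
  assert(fallback >= 1);
#ifdef CXX_STD_THREAD
  const unsigned n = std::thread::hardware_concurrency();
  if (n != 0)  // 0 means "not computable"
    return n;
#endif
  return fallback;
}
```

Keeping the decision in a configure-time macro avoids relying on library-internal macros, and the code still builds with standard libraries that do not provide a usable std::thread.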


Re: [pushed] Darwin, X86 : Fix bootstrap break from flags changes.

2021-04-21 Thread Martin Liška
On 4/21/21 9:50 PM, Iain Sandoe wrote:
> Hi,
> 
> @Martin

Hello.

Sorry for the troubles I caused.

> I am not sure from the commit messages for this series whether my change
> below is complete (it is enough to make bootstrap succeed so I have
> applied it anyway).
> 
> you mention TARGET_* but that is too general - however i386/darwin.h does
> have some other ISA-specific TARGET_* settings (mostly from well before
> my time) so I cannot be 100% sure if more should be removed/changed.
> 
> If you could take a quick look it would be appreciated - I don’t have much 
> state
> for the patch you applied.

The patch looks correct to me.

Thanks,
Martin

> 
> thanks
> Iain
> 
> =
> 
> The changes from r12-36-g1751bec027f030515889fcf4baa9c91501aafc85
> did not remove the uses of TARGET_ISA_* from i386/darwin.h.
> 
> Fixed thus.
> 
> gcc/ChangeLog:
> 
> * config/i386/darwin.h (TARGET_64BIT): Remove definition
> based on TARGET_ISA_64BIT.
> (TARGET_64BIT_P): Remove definition based on
> TARGET_ISA_64BIT_P().
> ---
> gcc/config/i386/darwin.h | 5 -
> 1 file changed, 5 deletions(-)
> 
> diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
> index c81db9bf09f..2657dfe266e 100644
> --- a/gcc/config/i386/darwin.h
> +++ b/gcc/config/i386/darwin.h
> @@ -25,11 +25,6 @@ along with GCC; see the file COPYING3.  If not see
> #undef DARWIN_X86
> #define DARWIN_X86 1
> 
> -#undef TARGET_64BIT
> -#define TARGET_64BIT TARGET_ISA_64BIT
> -#undef TARGET_64BIT_P
> -#define TARGET_64BIT_P(x) TARGET_ISA_64BIT_P(x)
> -
> #ifdef IN_LIBGCC2
> #undef TARGET_64BIT
> #ifdef __x86_64__



Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Jonathan Wakely via Gcc-patches
On Wed, 21 Apr 2021 at 20:41, Martin Liška  wrote:
>
> On 4/21/21 9:15 PM, Jakub Jelinek wrote:
> > On Wed, Apr 21, 2021 at 08:28:55PM +0200, Jakub Jelinek via Gcc-patches 
> > wrote:
> >>> There's a patch attempt for the problem with 
> >>> std::thread::hardware_concurrency where
> >>> it's used only if _GLIBCXX_HAS_GTHREADS is set.
> >>>
> >>> Does it help?
> >>> Thanks,
> >>> Martin
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>> PR bootstrap/100186
> >>> * lto-wrapper.c: Use hardware_concurrency only if
> >>> _GLIBCXX_HAS_GTHREADS.
> >>> ---
> >>>  gcc/lto-wrapper.c | 6 +-
> >>>  1 file changed, 5 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> >>> index 6ba401007f6..8a85b3e93a8 100644
> >>> --- a/gcc/lto-wrapper.c
> >>> +++ b/gcc/lto-wrapper.c
> >>> @@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
> >>>static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
> >>>
> >>>/* Number of CPUs that can be used for parallel LTRANS phase.  */
> >>> -  unsigned long nthreads_var = std::thread::hardware_concurrency ();
> >>> +  unsigned long nthreads_var = 0;
> >>> +
> >>> +#ifdef _GLIBCXX_HAS_GTHREADS
> >>> +  nthreads_var = std::thread::hardware_concurrency ();
> >>> +#endif
> >>
> >> _GLIBCXX_HAS_GTHREADS is a libstdc++ internal macro, it shouldn't be used
> >> outside of libstdc++.
> >> And, when using some other compiler or standard C++ library, nthreads_var
> >> will be always 0.  That isn't an improvement.
> >
> > What would be IMHO a good idea would be to use configure test for
> > #include 
> > int t = std::thread::hardware_concurrency ();
> > and in that case use that as a fallback to the previous implementation,
> > that will be strictly an improvement.
>
> That does not make sense to me. Original motivation was removal of the 
> complicated
> implementation with a C++ standard function.
>
> @Jonathan:
>
> What standard says about usage of std::thread::hardware_concurrency() [1]?
> The function is defined in C++11, how can one detect if threading is enabled
> in the C++ library and use it?

A configure test.

The standard says it's required to exist, but libstdc++ has
traditionally not defined things that are not supportable on the
target. So for targets with no multithreading, we didn't define
anything in . Even though that has changed in current releases
(so that std::thread exists even if it's useless), that won't help you
with previous releases. You need a configure test.

As I said in the PR, opinions differ on whether it's better to define
everything (even if it's useless) or not define them. Arguably, users
on small embedded targets do not want the library to waste space
defining a useless std::thread class that just throws an exception if
you ever try to create a new thread (but that's what the standard
requires).


[pushed] Darwin, X86 : Fix bootstrap break from flags changes.

2021-04-21 Thread Iain Sandoe

Hi,

@Martin
I am not sure from the commit messages for this series whether my change
below is complete (it is enough to make bootstrap succeed so I have
applied it anyway).

you mention TARGET_* but that is too general - however i386/darwin.h does
have some other ISA-specific TARGET_* settings (mostly from well before
my time) so I cannot be 100% sure if more should be removed/changed.

If you could take a quick look it would be appreciated - I don’t have
much state for the patch you applied.

thanks
Iain

=

The changes from r12-36-g1751bec027f030515889fcf4baa9c91501aafc85
did not remove the uses of TARGET_ISA_* from i386/darwin.h.

Fixed thus.

gcc/ChangeLog:

* config/i386/darwin.h (TARGET_64BIT): Remove definition
based on TARGET_ISA_64BIT.
(TARGET_64BIT_P): Remove definition based on
TARGET_ISA_64BIT_P().
---
 gcc/config/i386/darwin.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
index c81db9bf09f..2657dfe266e 100644
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -25,11 +25,6 @@ along with GCC; see the file COPYING3.  If not see
 #undef DARWIN_X86
 #define DARWIN_X86 1

-#undef TARGET_64BIT
-#define TARGET_64BIT TARGET_ISA_64BIT
-#undef TARGET_64BIT_P
-#define TARGET_64BIT_P(x) TARGET_ISA_64BIT_P(x)
-
 #ifdef IN_LIBGCC2
 #undef TARGET_64BIT
 #ifdef __x86_64__
--
2.24.1




Re: Patch freeze request

2021-04-21 Thread Martin Liška
On 4/21/21 6:03 PM, David Edelsohn via Gcc-patches wrote:
> I am requesting a freeze on non-bug fix patches to trunk.
> 
> In the GCC 12 announcement, Jakub stated:
> 
> "The trunk has branched for the GCC 11 release and is now open
> again for general development, stage 1.  Please consider not
> disrupting it too much during the RC phase of GCC 11 so it
> is possible to test important fixes for 11.1 on it."
> 
> Currently people are committing disruptive patches to GCC 12,
> including patches that are breaking bootstrap.  This situation is
> making it more difficult to test the patches that are necessary for
> the GCC 11.1 Release.
> 
> Currently GCC 11.1 Release is blocked and GCC 12 bootstrap is broken
> for some targets that are affected by the problems in GCC 11.1.  All
> of the GCC 12 patches not related to bug fixes are further
> exacerbating the problems and delaying the release of GCC 11.1.
> 
> Thanks, David
> 

Hello.

I'm sorry for the change that broke bootstrap. I reverted the revision
in f2b4f212a97ae137c2a8ecafe7ed2cb5b5016b6b.

Cheers,
Martin


Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Martin Liška
On 4/21/21 9:15 PM, Jakub Jelinek wrote:
> On Wed, Apr 21, 2021 at 08:28:55PM +0200, Jakub Jelinek via Gcc-patches wrote:
>>> There's a patch attempt for the problem with 
>>> std::thread::hardware_concurrency where
>>> it's used only if _GLIBCXX_HAS_GTHREADS is set.
>>>
>>> Does it help?
>>> Thanks,
>>> Martin
>>>
>>> gcc/ChangeLog:
>>>
>>> PR bootstrap/100186
>>> * lto-wrapper.c: Use hardware_concurrency only if
>>> _GLIBCXX_HAS_GTHREADS.
>>> ---
>>>  gcc/lto-wrapper.c | 6 +-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
>>> index 6ba401007f6..8a85b3e93a8 100644
>>> --- a/gcc/lto-wrapper.c
>>> +++ b/gcc/lto-wrapper.c
>>> @@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
>>>static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
>>>  
>>>/* Number of CPUs that can be used for parallel LTRANS phase.  */
>>> -  unsigned long nthreads_var = std::thread::hardware_concurrency ();
>>> +  unsigned long nthreads_var = 0;
>>> +
>>> +#ifdef _GLIBCXX_HAS_GTHREADS
>>> +  nthreads_var = std::thread::hardware_concurrency ();
>>> +#endif
>>
>> _GLIBCXX_HAS_GTHREADS is a libstdc++ internal macro, it shouldn't be used
>> outside of libstdc++.
>> And, when using some other compiler or standard C++ library, nthreads_var
>> will be always 0.  That isn't an improvement.
> 
> What would be IMHO a good idea would be to use configure test for
> #include <thread>
> int t = std::thread::hardware_concurrency ();
> and in that case use that as a fallback to the previous implementation,
> that will be strictly an improvement.

That does not make sense to me. The original motivation was to replace the
complicated implementation with a C++ standard function.

@Jonathan:

What does the standard say about the usage of std::thread::hardware_concurrency() [1]?
The function is defined in C++11; how can one detect whether threading is enabled
in the C++ library before using it?

[1] https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency

> 
>   Jakub
> 



Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 03:25:26PM -0400, David Edelsohn wrote:
> > What would be IMHO a good idea would be to use configure test for
> > #include <thread>
> > int t = std::thread::hardware_concurrency ();
> > and in that case use that as a fallback to the previous implementation,
> > that will be strictly an improvement.
> 
> Would it be good enough to add a compile-time test for GCC VERSION >=
> 10 and fallback to the value 1 otherwise?  If one performs the

No, that will make all cross-compilers LTO non-parallelized.
Adding a configure check for it is very easy.

Jakub



Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread David Edelsohn via Gcc-patches
On Wed, Apr 21, 2021 at 3:16 PM Jakub Jelinek wrote:
>
> On Wed, Apr 21, 2021 at 08:28:55PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > There's a patch attempt for the problem with 
> > > std::thread::hardware_concurrency where
> > > it's used only if _GLIBCXX_HAS_GTHREADS is set.
> > >
> > > Does it help?
> > > Thanks,
> > > Martin
> > >
> > > gcc/ChangeLog:
> > >
> > > PR bootstrap/100186
> > > * lto-wrapper.c: Use hardware_concurrency only if
> > > _GLIBCXX_HAS_GTHREADS.
> > > ---
> > >  gcc/lto-wrapper.c | 6 +-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> > > index 6ba401007f6..8a85b3e93a8 100644
> > > --- a/gcc/lto-wrapper.c
> > > +++ b/gcc/lto-wrapper.c
> > > @@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
> > >static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
> > >
> > >/* Number of CPUs that can be used for parallel LTRANS phase.  */
> > > -  unsigned long nthreads_var = std::thread::hardware_concurrency ();
> > > +  unsigned long nthreads_var = 0;
> > > +
> > > +#ifdef _GLIBCXX_HAS_GTHREADS
> > > +  nthreads_var = std::thread::hardware_concurrency ();
> > > +#endif
> >
> > _GLIBCXX_HAS_GTHREADS is a libstdc++ internal macro, it shouldn't be used
> > outside of libstdc++.
> > And, when using some other compiler or standard C++ library, nthreads_var
> > will be always 0.  That isn't an improvement.
>
> What would be IMHO a good idea would be to use configure test for
> #include <thread>
> int t = std::thread::hardware_concurrency ();
> and in that case use that as a fallback to the previous implementation,
> that will be strictly an improvement.

Would it be good enough to add a compile-time test for GCC VERSION >=
10 and fallback to the value 1 otherwise?  If one performs the
recommended bootstrap of the compiler, stage2 and stage3 will pick up
<thread> from the newly built libstdc++ that always defines
std::thread and will provide the expected feature.  The problem is the
stage1 compiler, which may be built with an older compiler, or a GCC
built without performing a three-stage bootstrap.

Thanks, David


Re: [PATCH] Use std::thread::hardware_concurrency in lto-wrapper.c.

2021-04-21 Thread Martin Liška
On 4/21/21 8:27 PM, Jakub Jelinek wrote:
> On Wed, Apr 21, 2021 at 08:06:26PM +0200, Martin Liška wrote:
>> On 4/21/21 7:56 PM, Jakub Jelinek wrote:
>>> On Wed, Apr 21, 2021 at 11:50:02AM -0400, David Edelsohn via Gcc-patches 
>>> wrote:
>>>> This patch broke bootstrap on AIX.
>>>>
>>>> std::thread is not provided in all instances.  GCC is not compiled
>>>> multi-threaded by default.
>>>>
>>>> error: `std::thread' has not been declared.
>>>
>>> In addition to that, I'm not really convinced it is a good idea,
>>> e.g. for Linux I believe the previous implementation was much better
>>> (as it counted only the cores/threads available to gcc subprocesses,
>>> rather than all available on the system, so e.g. when running some
>>> compilation job under taskset with std::thread::hardware_concurrency it
>>> will oversubscribe the available CPUs).
>>
>> Well, the previous implementation was over-complicated! I think we can live
>> with an implementation that covers 99% of the cases. It's parallel level
>> for LTO LTRANS run.
> 
> No, it was the standard way how to query that information.
> E.g. libgomp does that too.

I would not call it standard. A normal user can never handle all the corner
cases that are in the implementation.

Anyway, that's why we have an ISO C++ standard, so the
std::thread::hardware_concurrency implementation can (and likely should)
handle all this. Right?

Martin

> 
>   Jakub
> 



Re: [PATCH] c++: do_class_deduction and dependent init [PR93383]

2021-04-21 Thread Patrick Palka via Gcc-patches
On Mon, 12 Apr 2021, Patrick Palka wrote:

> Here we're crashing during deduction for a template placeholder from a
> dependent initializer because one of the initializer's elements has an
> empty TREE_TYPE, something which resolve_args and later unify_one_argument
> don't expect.  And if the deduction from a dependent initializer
> otherwise fails, we prematurely issue an error rather than reattempting
> the deduction at instantiation time.
> 
> This patch makes do_class_deduction more tolerant about dependent
> initializers, in a manner similar to what do_auto_deduction does: if
> deduction from a dependent initializer fails, just return the original
> placeholder unchanged.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
> cmcstl2 and range-v3, and on all the testcases in PR93383, does this
> look OK for trunk?

Ping.  This patch apparently also fixes the ICE reported in PR95291 and
its related/duplicate PRs.

> 
> gcc/cp/ChangeLog:
> 
>   PR c++/89565
>   PR c++/93383
>   PR c++/99200
>   * pt.c (do_class_deduction): If an argument has no type, don't
>   attempt deduction.  If deduction fails and the initializer is
>   type-dependent, try again at instantiation time.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/89565
>   PR c++/93383
>   PR c++/99200
>   * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice.
>   * g++.dg/cpp2a/nontype-class44.C: New test.
>   * g++.dg/cpp2a/nontype-class45.C: New test.
> ---
>  gcc/cp/pt.c  | 11 +++
>  gcc/testsuite/g++.dg/cpp2a/nontype-class39.C |  1 -
>  gcc/testsuite/g++.dg/cpp2a/nontype-class44.C | 11 +++
>  gcc/testsuite/g++.dg/cpp2a/nontype-class45.C | 32 
>  4 files changed, 54 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
> 
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 0ce7fa359c1..612feac7976 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -29334,6 +29334,13 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
>else
>  args = make_tree_vector_single (init);
>  
> +  /* If an argument is missing its type, we can't possibly deduce from this
> + (type-dependent) initializer ahead of time.  */
> +  if (processing_template_decl)
> +for (tree arg : *args)
> +  if (!TREE_TYPE (arg))
> + return ptype;
> +
>/* Do this now to avoid problems with erroneous args later on.  */
>args = resolve_args (args, complain);
>if (args == NULL)
> @@ -29419,6 +29426,10 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
>  
>if (call == error_mark_node)
>  {
> +  if (type_dependent_expression_p (init))
> + /* Try again at instantiation time.  */
> + return ptype;
> +
>if (complain & tf_warning_or_error)
>   {
> error ("class template argument deduction failed:");
> diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C 
> b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> index f5f79a71ec2..9b4da4f02ea 100644
> --- a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
> @@ -1,6 +1,5 @@
>  // PR c++/89565
>  // { dg-do compile { target c++20 } }
> -// { dg-ice "resolve_args" }
>  
>  template 
>  struct N{};
> diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C 
> b/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
> new file mode 100644
> index 000..d91e800424f
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class44.C
> @@ -0,0 +1,11 @@
> +// PR c++/93383
> +// { dg-do compile { target c++20 } }
> +
> +template  struct A {};
> +
> +template  struct B {
> +  void foo(B<+a>);
> +  void bar(B);
> +  template  using type = B;
> +  template  static inline auto y = A{0}; // { dg-error "deduction|no 
> match" }
> +};
> diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C 
> b/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
> new file mode 100644
> index 000..e7addf5f291
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class45.C
> @@ -0,0 +1,32 @@
> +// PR c++/99200
> +// { dg-do compile { target c++20 } }
> +
> +template 
> +struct A
> +{
> +  constexpr A (const char ()[N]) { for (int i = 0; i < N; i++) v[i] = 
> s[i]; v[N] = 0; }
> +  char v[N + 1];
> +};
> +
> +template 
> +struct B
> +{
> +  constexpr operator const char *() { return s.v; }
> +};
> +
> +template 
> +const char *
> +foo ()
> +{ 
> +  return B<__PRETTY_FUNCTION__>{};
> +}
> +
> +template 
> +const char *
> +bar ()
> +{ 
> +  return B<__FUNCTION__>{};
> +}
> +
> +auto a = foo  ();
> +auto b = bar  ();
> -- 
> 2.31.1.272.g89b43f80a5
> 
> 



Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 08:28:55PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > There's a patch attempt for the problem with 
> > std::thread::hardware_concurrency where
> > it's used only if _GLIBCXX_HAS_GTHREADS is set.
> > 
> > Does it help?
> > Thanks,
> > Martin
> > 
> > gcc/ChangeLog:
> > 
> > PR bootstrap/100186
> > * lto-wrapper.c: Use hardware_concurrency only if
> > _GLIBCXX_HAS_GTHREADS.
> > ---
> >  gcc/lto-wrapper.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> > index 6ba401007f6..8a85b3e93a8 100644
> > --- a/gcc/lto-wrapper.c
> > +++ b/gcc/lto-wrapper.c
> > @@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
> >static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
> >  
> >/* Number of CPUs that can be used for parallel LTRANS phase.  */
> > -  unsigned long nthreads_var = std::thread::hardware_concurrency ();
> > +  unsigned long nthreads_var = 0;
> > +
> > +#ifdef _GLIBCXX_HAS_GTHREADS
> > +  nthreads_var = std::thread::hardware_concurrency ();
> > +#endif
> 
> _GLIBCXX_HAS_GTHREADS is a libstdc++ internal macro, it shouldn't be used
> outside of libstdc++.
> And, when using some other compiler or standard C++ library, nthreads_var
> will be always 0.  That isn't an improvement.

What would be IMHO a good idea would be to use configure test for
#include <thread>
int t = std::thread::hardware_concurrency ();
and in that case use that as a fallback to the previous implementation,
that will be strictly an improvement.

Jakub
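
A minimal sketch of the arrangement suggested above, under stated
assumptions: HAVE_STD_THREAD_HARDWARE_CONCURRENCY is a hypothetical macro
that a configure test would define, and previous_cpu_count is only a
stand-in for the earlier platform-specific code in lto-wrapper.c.  Neither
name exists in GCC; they are here purely for illustration.

```cpp
#ifdef HAVE_STD_THREAD_HARDWARE_CONCURRENCY  /* hypothetical, set by configure */
# include <thread>
#endif

/* Stand-in for the previous platform-specific implementation.
   Returning 0 means "could not determine the CPU count".  */
static unsigned long
previous_cpu_count ()
{
  return 0;
}

/* Prefer the previous implementation; only if it gives no answer, and only
   if configure found a usable std::thread, ask the C++ library.  */
static unsigned long
ltrans_thread_count ()
{
  unsigned long n = previous_cpu_count ();
#ifdef HAVE_STD_THREAD_HARDWARE_CONCURRENCY
  if (n == 0)
    n = std::thread::hardware_concurrency ();  /* may itself return 0 */
#endif
  return n != 0 ? n : 1;  /* never report fewer than one job */
}
```

With this shape, a host whose C++ library lacks std::thread never sees the
<thread> include at all, and the result is never worse than what the
previous implementation reported.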



Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 08:28:56PM +0200, Jakub Jelinek wrote:
> On Wed, Apr 21, 2021 at 08:04:18PM +0200, Martin Liška wrote:
> > Hello.
> > 
> > There's a patch attempt for the problem with 
> > std::thread::hardware_concurrency where
> > it's used only if _GLIBCXX_HAS_GTHREADS is set.
> > 
> > Does it help?
> > Thanks,
> > Martin
> > 
> > gcc/ChangeLog:
> > 
> > PR bootstrap/100186
> > * lto-wrapper.c: Use hardware_concurrency only if
> > _GLIBCXX_HAS_GTHREADS.
> > ---
> >  gcc/lto-wrapper.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> > index 6ba401007f6..8a85b3e93a8 100644
> > --- a/gcc/lto-wrapper.c
> > +++ b/gcc/lto-wrapper.c
> > @@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
> >static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
> >  
> >/* Number of CPUs that can be used for parallel LTRANS phase.  */
> > -  unsigned long nthreads_var = std::thread::hardware_concurrency ();
> > +  unsigned long nthreads_var = 0;
> > +
> > +#ifdef _GLIBCXX_HAS_GTHREADS
> > +  nthreads_var = std::thread::hardware_concurrency ();
> > +#endif
> 
> _GLIBCXX_HAS_GTHREADS is a libstdc++ internal macro, it shouldn't be used
> outside of libstdc++.
> And, when using some other compiler or standard C++ library, nthreads_var
> will be always 0.  That isn't an improvement.

Not to mention that it won't work properly for system GCC 4.8 - 9.x
where defined(_GLIBCXX_HAS_GTHREADS) doesn't imply std::thread is available.

Jakub



Re: Patch freeze request

2021-04-21 Thread Iain Sandoe via Gcc-patches

David Edelsohn via Gcc-patches wrote:


> I am requesting a freeze on non-bug fix patches to trunk.
>
> In the GCC 12 announcement, Jakub stated:
>
> "The trunk has branched for the GCC 11 release and is now open
> again for general development, stage 1.  Please consider not
> disrupting it too much during the RC phase of GCC 11 so it
> is possible to test important fixes for 11.1 on it."
>
> Currently people are committing disruptive patches to GCC 12,
> including patches that are breaking bootstrap.  This situation is
> making it more difficult to test the patches that are necessary for
> the GCC 11.1 Release.
>
> Currently GCC 11.1 Release is blocked and GCC 12 bootstrap is broken
> for some targets that are affected by the problems in GCC 11.1.  All
> of the GCC 12 patches not related to bug fixes are further
> exacerbating the problems and delaying the release of GCC 11.1.


+1
Darwin bootstrap is also broken (probably by the flags changes, but
I have not had a chance to analyse it yet).

Iain




Re: [PATCH] Use std::thread::hardware_concurrency in lto-wrapper.c.

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 08:06:26PM +0200, Martin Liška wrote:
> On 4/21/21 7:56 PM, Jakub Jelinek wrote:
> > On Wed, Apr 21, 2021 at 11:50:02AM -0400, David Edelsohn via Gcc-patches 
> > wrote:
> >> This patch broke bootstrap on AIX.
> >>
> >> std::thread is not provided in all instances.  GCC is not compiled
> >> multi-threaded by default.
> >>
> >> error: `std::thread' has not been declared.
> > 
> > In addition to that, I'm not really convinced it is a good idea,
> > e.g. for Linux I believe the previous implementation was much better
> > (as it counted only the cores/threads available to gcc subprocesses,
> > rather than all available on the system, so e.g. when running some
> > compilation job under taskset with std::thread::hardware_concurrency it
> > will oversubscribe the available CPUs).
> 
> Well, the previous implementation was over-complicated! I think we can live
> with an implementation that covers 99% of the cases. It's parallel level
> for LTO LTRANS run.

No, it was the standard way how to query that information.
E.g. libgomp does that too.

Jakub



Re: [PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 08:04:18PM +0200, Martin Liška wrote:
> Hello.
> 
> There's a patch attempt for the problem with 
> std::thread::hardware_concurrency where
> it's used only if _GLIBCXX_HAS_GTHREADS is set.
> 
> Does it help?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   PR bootstrap/100186
>   * lto-wrapper.c: Use hardware_concurrency only if
>   _GLIBCXX_HAS_GTHREADS.
> ---
>  gcc/lto-wrapper.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
> index 6ba401007f6..8a85b3e93a8 100644
> --- a/gcc/lto-wrapper.c
> +++ b/gcc/lto-wrapper.c
> @@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
>static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
>  
>/* Number of CPUs that can be used for parallel LTRANS phase.  */
> -  unsigned long nthreads_var = std::thread::hardware_concurrency ();
> +  unsigned long nthreads_var = 0;
> +
> +#ifdef _GLIBCXX_HAS_GTHREADS
> +  nthreads_var = std::thread::hardware_concurrency ();
> +#endif

_GLIBCXX_HAS_GTHREADS is a libstdc++ internal macro, it shouldn't be used
outside of libstdc++.
And, when using some other compiler or standard C++ library, nthreads_var
will be always 0.  That isn't an improvement.

Jakub



Re: [PATCH] [libstdc++] Fix test timeout in stop_calback/destroy.cc

2021-04-21 Thread Jonathan Wakely via Gcc-patches

On 21/04/21 10:10 -0700, Thomas Rodgers wrote:

From: Thomas Rodgers 

A change was made to __atomic_semaphore::_S_do_try_acquire() to
(ideally) let the compare_exchange reload the value of __old rather than
always reloading it twice. This causes _M_acquire to spin indefinitely
if the value of __old is already 0.

libstdc++/ChangeLog:
* include/bits/semaphore_base.h: Always reload __old in
__atomic_semaphore::_S_do_try_acquire().
* testsuite/30_threads/stop_token/stop_callback/destroy.cc
re-enable testcase.
---
libstdc++-v3/include/bits/semaphore_base.h   | 16 ++--
.../stop_token/stop_callback/destroy.cc  |  2 --
2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/semaphore_base.h 
b/libstdc++-v3/include/bits/semaphore_base.h
index 35469e443b0..84b33423fff 100644
--- a/libstdc++-v3/include/bits/semaphore_base.h
+++ b/libstdc++-v3/include/bits/semaphore_base.h
@@ -196,9 +196,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__atomic_semaphore& operator=(const __atomic_semaphore&) = delete;

static _GLIBCXX_ALWAYS_INLINE bool
-_S_do_try_acquire(__detail::__platform_wait_t* __counter,
- __detail::__platform_wait_t& __old) noexcept
+_S_do_try_acquire(__detail::__platform_wait_t* __counter) noexcept
{
+  auto __old = __atomic_impl::load(__counter, memory_order::acquire);
  if (__old == 0)
return false;

@@ -211,18 +211,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_GLIBCXX_ALWAYS_INLINE void
_M_acquire() noexcept
{
-  auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
  auto const __pred =
-   [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+   [this] { return _S_do_try_acquire(&this->_M_counter); };
  std::__atomic_wait_address_bare(&_M_counter, __pred);
}

bool
_M_try_acquire() noexcept
{
-  auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
  auto const __pred =
-   [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+   [this] { return _S_do_try_acquire(&this->_M_counter); };
  return std::__detail::__atomic_spin(__pred);
}

@@ -231,9 +229,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_try_acquire_until(const chrono::time_point<_Clock,
   _Duration>& __atime) noexcept
  {
-   auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
auto const __pred =
- [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+ [this] { return _S_do_try_acquire(&this->_M_counter); };

return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
  }
@@ -243,9 +240,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime)
noexcept
  {
-   auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
auto const __pred =
- [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+ [this] { return _S_do_try_acquire(&this->_M_counter); };

return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
  }


As discussed on IRC (and repeating here for my benefit when I forget
how this works again) the bug is in _M_acquire() because my suggestion
to not refresh the __old value with an atomic_load on every loop means
we fail to meet the requirement in the second bullet here:

  -15- Effects: Repeatedly performs the following steps, in order:
  (15.1) — Evaluates try_acquire. If the result is true, returns.
  (15.2) — Blocks on *this until counter is greater than zero.

If we don't re-load the value, then we always see the initial zero,
and every call to _S_do_try_acquire() returns false immediately.
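
The effect can be reproduced with a toy counter.  This is only a sketch
using std::atomic<int>, not the libstdc++ semaphore code; the names
try_acquire_cached, try_acquire_reload and stale_value_demo are made up
for illustration.

```cpp
#include <atomic>

// Toy stand-in for the semaphore counter; not the libstdc++ type.
static std::atomic<int> counter{0};

// Buggy shape: the caller loaded `old` once, and the early return on zero
// never looks at the counter again.
static bool
try_acquire_cached (int &old)
{
  if (old == 0)
    return false;
  return counter.compare_exchange_strong (old, old - 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
}

// Fixed shape: reload the counter on every attempt.
static bool
try_acquire_reload ()
{
  int old = counter.load (std::memory_order_acquire);
  if (old == 0)
    return false;
  return counter.compare_exchange_strong (old, old - 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed);
}

// Returns true if the stale-value behaviour described above is observed.
static bool
stale_value_demo ()
{
  counter.store (0);
  int old = counter.load (std::memory_order_acquire);  // caches the zero
  counter.fetch_add (1, std::memory_order_release);    // a release() happens
  bool cached = try_acquire_cached (old);  // still sees the cached zero
  bool fresh = try_acquire_reload ();      // sees 1 and takes it
  return !cached && fresh && counter.load () == 0;
}
```

The cached variant can only make progress if something refreshes `old`,
and the early return on zero skips the compare-exchange that would have
done so.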

An alternative fix would be to keep the caching of the value in the
predicate for the try_acquire{,_for,_until} functions, but refresh it
for acquire, like so:

--- a/libstdc++-v3/include/bits/semaphore_base.h
+++ b/libstdc++-v3/include/bits/semaphore_base.h
@@ -223,9 +223,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_ALWAYS_INLINE void
 _M_acquire() noexcept
 {
-  auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
-  auto const __pred =
-   [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+  auto const __pred = [this] {
+   auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
+   return _S_do_try_acquire(&this->_M_counter, __old);
+  };
   std::__atomic_wait_address_bare(&_M_counter, __pred);
 }
 
This fixes the stop_callback/destroy.cc hang for me on x86-32.


Refreshing the value on every loop for try_acquire{,_for,_until} is
not necessary for correctness, because the standard says:

  An implementation should ensure that try_acquire does not consistently
  return false in the absence of contending semaphore operations.

But refreshing it does reduce the 

Re: [committed] aarch64: Always use .init/.fini_array for GNU/Linux

2021-04-21 Thread Joseph Myers
On Wed, 21 Apr 2021, Richard Sandiford wrote:

> which AFAICT would test preprocessing for the host rather than the target.

Ah, good point.  (GCC_GLIBC_VERSION_GTE_IFELSE can be used to deal with 
glibc version conditionals in a way that works, given appropriate 
configure options, even when the target headers aren't available, though 
at present the gcc_AC_INITFINI_ARRAY call comes before the glibc version 
information has been computed.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [committed] aarch64: Always use .init/.fini_array for GNU/Linux

2021-04-21 Thread Richard Sandiford via Gcc-patches
Joseph Myers  writes:
> On Wed, 21 Apr 2021, Andreas Schwab wrote:
>
>> On Apr 21 2021, Richard Sandiford via Gcc-patches wrote:
>> 
>> > However, AArch64 postdates glibc and binutils support for
>> > .init/fini_array by some distance, so it's safe to hard-code the
>> > result to "yes" for cross compilers.
>> 
>> Should there be an automatism for any other new architecture added?

I guess it depends on whether we want this to be done automatically
only for GNU/Linux or for other OSes too.  I'm a bit nervous about
enabling it for aarch64*-elf by default.

I guess we could have:

x86_64*-linux-gnu)
...all other current linux-gnu targets except aarch64
...
  ...=no
  ;;

*-*-linux-gnu*)
  ...=yes
  ;;

*)
  ...=no
  ;;

Then other targets that are similarly “new enough” can be removed
from the first list.  New architectures would automatically be “yes”.

> See what I said in 
> .  
> I think the configure test is valid for cross compilation and doesn't need 
> to check build = host = target at all.

Are you sure?  The default (non-ia64) case includes:

AC_PREPROC_IFELSE([AC_LANG_SOURCE([
#ifndef __ELF__
# error Not an ELF OS
#endif
#include 
#if defined __GLIBC_PREREQ
# if __GLIBC_PREREQ (2, 4)
# else
#  error GLIBC 2.4 required
# endif
#else
# if defined __sun__ && defined __svr4__
   /* Solaris ld.so.1 supports .init_array/.fini_array since Solaris 8.  */
# else
#  error The C library not known to support .init_array/.fini_array
# endif
#endif
])],, [gcc_cv_initfini_array=no]);;

which AFAICT would test preprocessing for the host rather than the target.

Thanks,
Richard


Re: [PATCH] Use std::thread::hardware_concurrency in lto-wrapper.c.

2021-04-21 Thread Martin Liška
On 4/21/21 7:56 PM, Jakub Jelinek wrote:
> On Wed, Apr 21, 2021 at 11:50:02AM -0400, David Edelsohn via Gcc-patches 
> wrote:
>> This patch broke bootstrap on AIX.
>>
>> std::thread is not provided in all instances.  GCC is not compiled
>> multi-threaded by default.
>>
>> error: `std::thread' has not been declared.
> 
> In addition to that, I'm not really convinced it is a good idea,
> e.g. for Linux I believe the previous implementation was much better
> (as it counted only the cores/threads available to gcc subprocesses,
> rather than all available on the system, so e.g. when running some
> compilation job under taskset with std::thread::hardware_concurrency it
> will oversubscribe the available CPUs).

Well, the previous implementation was over-complicated! I think we can live
with an implementation that covers 99% of the cases. It's parallel level
for LTO LTRANS run.

Martin

> 
> Sure, for some non-Linux targets you might want to copy what is done in
> libstdc++-v3/src/c++11/thread.cc, but I think it is better to copy/adjust the
> relevant configure bits and the thread.cc code is very short...
> 
>   Jakub
> 



[PATCH] Use hardware_concurrency only if _GLIBCXX_HAS_GTHREADS

2021-04-21 Thread Martin Liška
Hello.

There's a patch attempt for the problem with std::thread::hardware_concurrency 
where
it's used only if _GLIBCXX_HAS_GTHREADS is set.

Does it help?
Thanks,
Martin

gcc/ChangeLog:

PR bootstrap/100186
* lto-wrapper.c: Use hardware_concurrency only if
_GLIBCXX_HAS_GTHREADS.
---
 gcc/lto-wrapper.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 6ba401007f6..8a85b3e93a8 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1285,7 +1285,11 @@ run_gcc (unsigned argc, char *argv[])
   static char current_dir[] = { '.', DIR_SEPARATOR, '\0' };
 
   /* Number of CPUs that can be used for parallel LTRANS phase.  */
-  unsigned long nthreads_var = std::thread::hardware_concurrency ();
+  unsigned long nthreads_var = 0;
+
+#ifdef _GLIBCXX_HAS_GTHREADS
+  nthreads_var = std::thread::hardware_concurrency ();
+#endif
 
   /* Get the driver and options.  */
   collect_gcc = getenv ("COLLECT_GCC");
-- 
2.31.1



Re: [committed] aarch64: Always use .init/.fini_array for GNU/Linux

2021-04-21 Thread Joseph Myers
On Wed, 21 Apr 2021, Andreas Schwab wrote:

> On Apr 21 2021, Richard Sandiford via Gcc-patches wrote:
> 
> > However, AArch64 postdates glibc and binutils support for
> > .init/fini_array by some distance, so it's safe to hard-code the
> > result to "yes" for cross compilers.
> 
> Should there be an automatism for any other new architecture added?

See what I said in 
.  
I think the configure test is valid for cross compilation and doesn't need 
to check build = host = target at all.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Use std::thread::hardware_concurrency in lto-wrapper.c.

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 11:50:02AM -0400, David Edelsohn via Gcc-patches wrote:
> This patch broke bootstrap on AIX.
> 
> std::thread is not provided in all instances.  GCC is not compiled
> multi-threaded by default.
> 
> error: `std::thread' has not been declared.

In addition to that, I'm not really convinced it is a good idea,
e.g. for Linux I believe the previous implementation was much better
(as it counted only the cores/threads available to gcc subprocesses,
rather than all available on the system, so e.g. when running some
compilation job under taskset with std::thread::hardware_concurrency it
will oversubscribe the available CPUs).

Sure, for some non-Linux targets you might want to copy what is done in
libstdc++-v3/src/c++11/thread.cc, but I think it is better to copy/adjust the
relevant configure bits and the thread.cc code is very short...

Jakub
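
To make the distinction above concrete, here is a small Linux/glibc-only
sketch that queries both numbers.  The function names cpus_allowed and
cpus_online are made up for illustration; this is not the code that was in
lto-wrapper.c, and whether std::thread::hardware_concurrency matches one or
the other depends on the library implementation.

```cpp
#include <sched.h>   // sched_getaffinity, cpu_set_t, CPU_COUNT (Linux/glibc)
#include <unistd.h>  // sysconf

// CPUs this process is allowed to run on; 0 if the query fails.
static unsigned
cpus_allowed ()
{
  cpu_set_t set;
  CPU_ZERO (&set);
  if (sched_getaffinity (0, sizeof (set), &set) != 0)
    return 0;
  return static_cast<unsigned> (CPU_COUNT (&set));
}

// CPUs currently online on the whole system; 0 if the query fails.
static unsigned
cpus_online ()
{
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  return n > 0 ? static_cast<unsigned> (n) : 0;
}
```

Run under taskset with a restricted CPU list, cpus_allowed would be
expected to report the size of that list while cpus_online keeps reporting
the machine total, which is the oversubscription scenario described above.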



[r12-20 Regression] FAIL: libgomp.fortran/depobj-1.f90 -O (test for excess errors) on Linux/x86_64

2021-04-21 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

a61c4964cd71446232d62ec9b10a7d45b440dd9f is the first bad commit
commit a61c4964cd71446232d62ec9b10a7d45b440dd9f
Author: Tobias Burnus 
Date:   Wed Apr 21 10:58:29 2021 +0200

Fortran/OpenMP: Add 'omp depobj' and 'depend(mutexinoutset:'

caused

FAIL: libgomp.fortran/depobj-1.f90   -O  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-20/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="fortran.exp=libgomp.fortran/depobj-1.f90 
--target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="fortran.exp=libgomp.fortran/depobj-1.f90 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [Patch] libgomp/testsuite: Fix checks for dg-excess-errors

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 07:04:17PM +0200, Tobias Burnus wrote:
> OK for mainline and GCC 11?

Ok for 12 and 11.2 (i.e. after 11.1 GA).

> libgomp/testsuite: Fix checks for dg-excess-errors
> 
> For the tests modified below, the effective target line has to be effective
> when compiling for an offload target, except that variable-not-offloaded.c
> would compile with unified-shared memory and pr86416-*.c if long double/float128
> is supported.
> The previous check used a run-time device ability check. This new variant
> now enables those dg- lines when _compiling_ for nvptx or gcn.
> 
> libgomp/ChangeLog:
> 
>   * testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
>   New, based on check_effective_target_offload_target_nvptx.
>   (check_effective_target_offload_target_nvptx): Call it.
>   (check_effective_target_offload_target_amdgcn): New.
>   * testsuite/libgomp.c-c++-common/function-not-offloaded.c:
>   Require target offload_target_nvptx || offload_target_amdgcn.
>   * testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise.
>   * testsuite/libgomp.c/pr86416-1.c: Likewise.
>   * testsuite/libgomp.c/pr86416-2.c: Likewise.

Jakub



Re: [PATCH v2] c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

2021-04-21 Thread Marek Polacek via Gcc-patches
On Tue, Apr 20, 2021 at 10:40:52PM -0400, Jason Merrill wrote:
> On 4/20/21 8:42 PM, Marek Polacek wrote:
> > Recently, we made sure that we never call value_dependent_expression_p
> > on an expression that isn't potential_constant_expression.  That caused
> > this bogus warning with a non-type template parameter, something that
> > users don't want to see.
> > 
> > The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
> > which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
> > type of 'n' isn't dependent, so we think the whole 't' expression is
> > not dependent.  It seems we need to test both op0 and op1 separately
> > to suppress this warning.  I use a lambda so as not to repeat the
> > check.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11.2?
> > Think I also want to backport to 10.4, but then the patch can't use
> > a lambda.
> 
> It should be straightforward to rewrite the lambda as a local class; let's
> use that version for all the branches.

I was thinking that I'd use a lambda for GCC 11 and 12 and for GCC 10
play some functor games, but I can use the following for all three, if
you want.
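
For readers who have not seen the idiom, the lambda-to-local-class rewrite discussed above can be sketched as follows. This is a standalone illustration, not the code from pt.c: `either_negative_lambda` and `either_negative_local_class` are hypothetical names, and the `v < 0` predicate merely stands in for the real dependence check. The local-class form avoids C++11 syntax, which is presumably why it suits the older branch.

```cpp
// Version using a lambda (requires C++11).
inline bool either_negative_lambda(int a, int b)
{
  auto neg_p = [](int v) { return v < 0; };
  return neg_p(a) || neg_p(b);
}

// Same logic using an unnamed local class with operator(), which does
// not rely on any C++11 feature.
inline bool either_negative_local_class(int a, int b)
{
  struct {
    bool operator()(int v) const { return v < 0; }
  } neg_p;
  return neg_p(a) || neg_p(b);
}
```

Both functions apply the same predicate to each operand separately and combine the results, which mirrors the shape of the check described in the patch.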

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11.2/10.4?

-- >8 --
Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.

gcc/cp/ChangeLog:

PR c++/100161
* pt.c (tsubst_copy_and_build) : Test op0 and
op1 separately for value- or type-dependence.

gcc/testsuite/ChangeLog:

PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.
---
 gcc/cp/pt.c   | 26 +++
 gcc/testsuite/g++.dg/warn/Wtype-limits6.C | 17 +++
 2 files changed, 34 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wtype-limits6.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 7bcbe6dc3ce..c1acfc498b1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19906,15 +19906,23 @@ tsubst_copy_and_build (tree t,
 case MEMBER_REF:
 case DOTSTAR_EXPR:
   {
-   /* If T was type-dependent, suppress warnings that depend on the range
-  of the types involved.  */
-   ++processing_template_decl;
-   const bool was_dep = (potential_constant_expression (t)
- ? value_dependent_expression_p (t)
- : type_dependent_expression_p (t));
-   --processing_template_decl;
-   tree op0 = RECUR (TREE_OPERAND (t, 0));
-   tree op1 = RECUR (TREE_OPERAND (t, 1));
+   /* If either OP0 or OP1 was value- or type-dependent, suppress
+  warnings that depend on the range of the types involved.  */
+   tree op0 = TREE_OPERAND (t, 0);
+   tree op1 = TREE_OPERAND (t, 1);
+   struct {
+ inline bool operator()(tree t) const {
+   ++processing_template_decl;
+   bool r = (potential_constant_expression (t)
+ ? value_dependent_expression_p (t)
+ : type_dependent_expression_p (t));
+   --processing_template_decl;
+   return r;
+ };
+   } dep_p;
+   const bool was_dep = dep_p (op0) || dep_p (op1);
+   op0 = RECUR (op0);
+   op1 = RECUR (op1);
 
warning_sentinel s1(warn_type_limits, was_dep);
warning_sentinel s2(warn_div_by_zero, was_dep);
diff --git a/gcc/testsuite/g++.dg/warn/Wtype-limits6.C 
b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
new file mode 100644
index 000..9d5886d5323
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wtype-limits6.C
@@ -0,0 +1,17 @@
+// PR c++/100161
+// { dg-additional-options "-Wtype-limits" }
+
+void f(unsigned);
+
+template
+void g()
+{
+for (unsigned i = 0; i < n; i++) { // { dg-bogus "always false" }
+f(i);
+}
+}
+
+void h()
+{
+g<0>();
+}

base-commit: c2fc1702cb3a3d5cc9c40de47f63b4c8f3f1d09c
-- 
2.30.2



[PATCH] [libstdc++] Fix test timeout in stop_callback/destroy.cc

2021-04-21 Thread Thomas Rodgers
From: Thomas Rodgers 

A change was made to __atomic_semaphore::_S_do_try_acquire() to
(ideally) let the compare_exchange reload the value of __old rather than
always reloading it twice. This causes _M_acquire to spin indefinitely
if the value of __old is already 0.
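
The failure mode described above can be reproduced outside libstdc++ with a minimal sketch. The names `try_acquire_cached` and `try_acquire_reload` are hypothetical, and `std::atomic<int>` stands in for the platform wait type; this is an illustration of the idea, not the library's implementation.

```cpp
#include <atomic>

// Hypothetical "cached" variant: the caller supplies a previously loaded
// value.  If that value is 0 the function returns false without looking
// at the counter again, so a retry loop around it can never succeed.
inline bool try_acquire_cached(std::atomic<int>& counter, int& old_value)
{
  if (old_value == 0)
    return false;
  return counter.compare_exchange_strong(old_value, old_value - 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed);
}

// Variant that reloads the counter on every attempt, so a release that
// happened since the previous attempt is always observed.
inline bool try_acquire_reload(std::atomic<int>& counter)
{
  int old_value = counter.load(std::memory_order_acquire);
  if (old_value == 0)
    return false;
  return counter.compare_exchange_strong(old_value, old_value - 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed);
}
```

If the cached value was read as 0 and another thread then releases the semaphore, the cached variant keeps returning false while the reloading variant succeeds on its next call.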

libstdc++/ChangeLog:
* include/bits/semaphore_base.h: Always reload __old in
__atomic_semaphore::_S_do_try_acquire().
* testsuite/30_threads/stop_token/stop_callback/destroy.cc:
Re-enable testcase.
---
 libstdc++-v3/include/bits/semaphore_base.h   | 16 ++--
 .../stop_token/stop_callback/destroy.cc  |  2 --
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/semaphore_base.h 
b/libstdc++-v3/include/bits/semaphore_base.h
index 35469e443b0..84b33423fff 100644
--- a/libstdc++-v3/include/bits/semaphore_base.h
+++ b/libstdc++-v3/include/bits/semaphore_base.h
@@ -196,9 +196,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __atomic_semaphore& operator=(const __atomic_semaphore&) = delete;
 
 static _GLIBCXX_ALWAYS_INLINE bool
-_S_do_try_acquire(__detail::__platform_wait_t* __counter,
- __detail::__platform_wait_t& __old) noexcept
+_S_do_try_acquire(__detail::__platform_wait_t* __counter) noexcept
 {
+  auto __old = __atomic_impl::load(__counter, memory_order::acquire);
   if (__old == 0)
return false;
 
@@ -211,18 +211,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_ALWAYS_INLINE void
 _M_acquire() noexcept
 {
-  auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
   auto const __pred =
-   [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+   [this] { return _S_do_try_acquire(&this->_M_counter); };
   std::__atomic_wait_address_bare(&_M_counter, __pred);
 }
 
 bool
 _M_try_acquire() noexcept
 {
-  auto __old = __atomic_impl::load(&_M_counter, memory_order::acquire);
   auto const __pred =
-   [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+   [this] { return _S_do_try_acquire(&this->_M_counter); };
   return std::__detail::__atomic_spin(__pred);
 }
 
@@ -231,9 +229,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_try_acquire_until(const chrono::time_point<_Clock,
   _Duration>& __atime) noexcept
   {
-   auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
auto const __pred =
- [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+ [this] { return _S_do_try_acquire(&this->_M_counter); };
 
return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
   }
@@ -243,9 +240,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime)
noexcept
   {
-   auto __old = __atomic_impl::load(&_M_counter, memory_order_relaxed);
auto const __pred =
- [this, &__old] { return _S_do_try_acquire(&this->_M_counter, __old); };
+ [this] { return _S_do_try_acquire(&this->_M_counter); };
 
return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
   }
diff --git 
a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc 
b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc
index c2cfba027cb..061ed448c33 100644
--- a/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc
+++ b/libstdc++-v3/testsuite/30_threads/stop_token/stop_callback/destroy.cc
@@ -21,8 +21,6 @@
 // { dg-require-effective-target pthread }
 // { dg-require-gthreads "" }
 
-// { dg-skip-if "FIXME: times out" { *-*-* } }
-
 #include 
 #include 
 #include 
-- 
2.30.2



Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-21 Thread Martin Sebor via Gcc-patches

On 4/14/21 4:39 PM, H.J. Lu wrote:

commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

 x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.

1. Add general_regs_only function attribute to inform the compiler that
functions use only general purpose registers.  When making inlining
decisions on such functions, non-GPR compiler options are excluded.
2. Add general_regs_only attribute to x86 intrinsics which use only
general purpose registers.


...

--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} attribute sets the name
  of the section to record function entry instrumentation calls in when
  enabled with @option{-pg -mrecord-mcount}
  
+@item general_regs_only

+@cindex @code{general_regs_only} function attribute, x86
+The @code{general_regs_only} attribute on functions is used to
+inform the compiler that functions use only general purpose registers.


I'll just reiterate basically the same comment as before: it's not
clear from the very brief description above what the requirements
are for using the attribute.  I'm guessing it can be applied to
any function (inline or otherwise) but only has any effect when
the function is actually inlined and otherwise doesn't constrain
what the function can do.  (Whatever the constraints are, I think
the manual should spell them out, and likewise for its effects.)

Similarly it's not clear what should be expected when the function
does use some other register.  Ideally, I think GCC would check and
issue a nice error message whether or not the function is inlined
or called.  I suspect that might only be possible for inline
functions that are actually called and for which the back end must
emit code.

Other than that, I'd suggest to improve the phrasing a bit:

  The @code{general_regs_only} function attribute indicates that
  the function uses only general purpose registers... [text
  explaining constraints and errors follows].

Martin


[Patch] libgomp/testsuite: Fix checks for dg-excess-errors

2021-04-21 Thread Tobias Burnus

This was brought up by Richard when testing libgomp with the GCC 11 distribution
compiler, which has both nvptx and gcn enabled – but no offloading device was
available.

This led to failures for:
* testsuite/libgomp.c-c++-common/function-not-offloaded.c:
  → function (on purpose) not marked for offload
  → issue: dg-excess-errors by lto1/mkoffload not ignored
* testsuite/libgomp.c-c++-common/variable-not-offloaded.c
  → likewise for variables, except that this would succeed
  with unified-shared memory
  → issue: dg-excess-errors by lto1/mkoffload not ignored
* libgomp/testsuite/libgomp.c/pr86416-1.c + *-2.c
  → expected fail when offloading compiler does not support
  long double or float128
  → issue: error output but dg-error/dg-excess-error not active

The reason for those failures is that the effective target checks used
do not cover those cases correctly. Namely:

* offload_device - check whether an offloading device is
  available & used at run time.
  → problem: compilation was done for the device but here
  no device is available
* offload_device_nonshared_as
  → Likewise and same issue, except that also 0 is returned
  if the device has unified shared memory.

Solution: Explicitly check for nvptx/gcn offload compilation
(but ignore device availability).
[Uses 'gcc -v |grep ^OFFLOAD_TARGET_NAMES=', which honors
-foffload=, including -foffload=disable.]
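
As a rough illustration of the matching step, the check amounts to asking whether any entry of a colon-separated target list begins with a given prefix. The sketch below is written in C++ for illustration only (the testsuite itself is Tcl); `offload_target_listed` is a hypothetical name, and the sample list values used with it are assumptions about what an OFFLOAD_TARGET_NAMES value may look like.

```cpp
#include <string>

// Return true if some ':'-separated entry of NAMES begins with PREFIX.
// PREFIX is assumed not to contain ':' itself.
inline bool offload_target_listed(const std::string& names,
                                  const std::string& prefix)
{
  std::string::size_type start = 0;
  while (start <= names.size())
    {
      std::string::size_type end = names.find(':', start);
      if (end == std::string::npos)
        end = names.size();
      // Only compare when the entry is at least as long as the prefix.
      if (end - start >= prefix.size()
          && names.compare(start, prefix.size(), prefix) == 0)
        return true;
      start = end + 1;
    }
  return false;
}
```

Because only the compile-time target list is inspected, the answer does not depend on whether an offloading device is present when the tests run.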

OK for mainline and GCC 11?


Tested without offloading support & with nvptx &
offload support (with an offloading device available).

@Richard: I would be happy if you could confirm that this fixes
your issue.

Note: I don't know what's needed for *intelmic* - but I wouldn't
be surprised if the current settings were wrong as well.
(long double/float128 supported? unified shared memory? Only
needed for function-not-offloaded.c? - if the latter, a check
whether any OFFLOAD_TARGET_NAMES=* exists would be one option.)

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
libgomp/testsuite: Fix checks for dg-excess-errors

For the tests modified below, the effective target line has to be effective
when compiling for an offload target, except that variable-not-offloaded.c
would compile with unified-shared memory and pr86416-*.c if long double/float128
is supported.
The previous check used a run-time device ability check. This new variant
now enables those dg- lines when _compiling_ for nvptx or gcn.

libgomp/ChangeLog:

	* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
	New, based on check_effective_target_offload_target_nvptx.
	(check_effective_target_offload_target_nvptx): Call it.
	(check_effective_target_offload_target_amdgcn): New.
	* testsuite/libgomp.c-c++-common/function-not-offloaded.c:
	Require target offload_target_nvptx || offload_target_amdgcn.
	* testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise.
	* testsuite/libgomp.c/pr86416-1.c: Likewise.
	* testsuite/libgomp.c/pr86416-2.c: Likewise.

 libgomp/testsuite/lib/libgomp.exp| 20 
 .../libgomp.c-c++-common/function-not-offloaded.c|  2 +-
 .../libgomp.c-c++-common/variable-not-offloaded.c|  4 ++--
 libgomp/testsuite/libgomp.c/pr86416-1.c  |  4 ++--
 libgomp/testsuite/libgomp.c/pr86416-2.c  |  4 ++--
 5 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index 14dcfdfd00a..467cb3cf255 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -340,8 +340,10 @@ proc offload_target_to_openacc_device_type { offload_target } {
 }
 }
 
-# Return 1 if compiling for offload target nvptx.
-proc check_effective_target_offload_target_nvptx { } {
+# Return 1 if compiling for the specified offload target
+# Takes -foffload=... into account by checking OFFLOAD_TARGET_NAMES=
+# in the -v compiler output.
+proc libgomp_check_effective_target_offload_target { target_name } {
 # Consider all actual options, including the flags passed to
 # 'gcc-dg-runtest', or 'gfortran-dg-runtest' (see the 'libgomp.*/*.exp'
 # files; in particular, '-foffload', 'libgomp.oacc-*/*.exp'), which don't
@@ -354,13 +356,23 @@ proc check_effective_target_offload_target_nvptx { } {
 set gcc_output [libgomp_target_compile "" "" "none" $options]
 if [regexp "(?n)^OFFLOAD_TARGET_NAMES=(.*)" $gcc_output dummy offload_targets] {
 	verbose "compiling for offload targets: $offload_targets"
-	return [string match "*:nvptx*:*" ":$offload_targets:"]
+	return [string match "*:$target_name*:*" ":$offload_targets:"]
 }
 
-verbose "not compiling for any offload targets"
+verbose "not compiling for $target_name offload target"
 return 0
 }
 
+# Return 1 if compiling for offload target nvptx.
+proc check_effective_target_offload_target_nvptx { } {
+return 

Re: [PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-04-21 Thread Alexander Monakov via Gcc-patches
On Wed, 21 Apr 2021, Tom de Vries wrote:

> > I don't think implementing futex_wait is possible on nvptx.
> > 
> 
> Well, I gave it a try, attached below.  Can you explain why you think
> it's not possible, or pinpoint a problem in the implementation?

Responding only to this for now. When I said futex_wait I really meant
Linux futex wait, where the API is tied to a 32-bit futex control word
and nothing else. Your implementation works with a gomp_barrier_t that
includes more than one field. It would be confusing to call it a
"futex wait", it is not a 1:1 replacement.

(i.e. unlike a proper futex, it can work only for gomp_barrier_t objects)

Alexander


Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-21 Thread Martin Sebor via Gcc-patches

On 4/21/21 1:30 AM, Uros Bizjak wrote:

On Thu, Apr 15, 2021 at 12:39 AM H.J. Lu  wrote:


commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

 x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.

1. Add general_regs_only function attribute to inform the compiler that
functions use only general purpose registers.  When making inlining
decisions on such functions, non-GPR compiler options are excluded.
2. Add general_regs_only attribute to x86 intrinsics which use only
general purpose registers.


I'd like to ask Richard and Jakub if they agree with the approach.

On a related note, can we declare default attributes like clang does, e.g.:

/* Define the default attributes for the functions.  */
#define __DEFAULT_FN_ATTRS \
  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
#define __DEFAULT_FN_ATTRS_GRO \
  __attribute__((__gnu_inline__, __always_inline__, \
                 __general_regs_only__, __artificial__))

and use these defines throughout header files?


FWIW, the sequence of attributes contributes measurably to
the amount of time it takes to parse all the declarations.  Since
they're the same for most of the thousands of functions declared
in these headers it might be worth considering either adding
a new attribute that "expands" to all of these, or using some
other mechanism to speed things up.  (This came up recently
in pr100099.)

Martin



Uros.



gcc/

 PR target/99744
 * config/i386/i386-options.c (ix86_attribute_table): Add
 general_regs_only.
 * config/i386/i386.c (ix86_can_inline_p): Exclude non-integer
 target options if callee has general_regs_only attribute.
 * config/i386/adxintrin.h: Add general_regs_only attribute to
 intrinsics which use only general purpose registers.
 * config/i386/bmiintrin.h: Likewise.
 * config/i386/bmi2intrin.h: Likewise.
 * config/i386/cetintrin.h: Likewise.
 * config/i386/cldemoteintrin.h: Likewise.
 * config/i386/clflushoptintrin.h: Likewise.
 * config/i386/clwbintrin.h: Likewise.
 * config/i386/clzerointrin.h: Likewise.
 * config/i386/enqcmdintrin.h: Likewise.
 * config/i386/fxsrintrin.h: Likewise.
 * config/i386/hresetintrin.h: Likewise.
 * config/i386/ia32intrin.h: Likewise.
 * config/i386/lwpintrin.h: Likewise.
 * config/i386/lzcntintrin.h: Likewise.
 * config/i386/movdirintrin.h: Likewise.
 * config/i386/mwaitxintrin.h: Likewise.
 * config/i386/pconfigintrin.h: Likewise.
 * config/i386/pkuintrin.h: Likewise.
 * config/i386/popcntintrin.h: Likewise.
 * config/i386/rdseedintrin.h: Likewise.
 * config/i386/rtmintrin.h: Likewise.
 * config/i386/serializeintrin.h: Likewise.
 * config/i386/sgxintrin.h: Likewise.
 * config/i386/tbmintrin.h: Likewise.
 * config/i386/tsxldtrkintrin.h: Likewise.
 * config/i386/uintrintrin.h: Likewise.
 * config/i386/waitpkgintrin.h: Likewise.
 * config/i386/wbnoinvdintrin.h: Likewise.
 * config/i386/x86gprintrin.h: Likewise.
 * config/i386/xsavecintrin.h: Likewise.
 * config/i386/xsaveintrin.h: Likewise.
 * config/i386/xsaveoptintrin.h: Likewise.
 * config/i386/xsavesintrin.h: Likewise.
 * config/i386/xtestintrin.h: Likewise.
 * doc/extend.texi: Document general_regs_only function attribute.

gcc/testsuite/

 PR target/99744
 * gcc.target/i386/pr99744-3.c: New test.
 * gcc.target/i386/pr99744-4.c: Likewise.
---
  gcc/config/i386/adxintrin.h   |  18 +-
  gcc/config/i386/bmi2intrin.h  |  24 +-
  gcc/config/i386/bmiintrin.h   |  92 --
  gcc/config/i386/cetintrin.h   |  33 +-
  gcc/config/i386/cldemoteintrin.h  |   3 +-
  gcc/config/i386/clflushoptintrin.h|   3 +-
  gcc/config/i386/clwbintrin.h  |   3 +-
  gcc/config/i386/clzerointrin.h|   4 +-
  gcc/config/i386/enqcmdintrin.h|   6 +-
  gcc/config/i386/fxsrintrin.h  |  12 +-
  gcc/config/i386/hresetintrin.h|   3 +-
  gcc/config/i386/i386-options.c|   2 +
  gcc/config/i386/i386.c|  29 +-
  gcc/config/i386/ia32intrin.h  |  82 +++--
  gcc/config/i386/lwpintrin.h   |  24 +-
  gcc/config/i386/lzcntintrin.h |  20 +-
  gcc/config/i386/movdirintrin.h|   9 +-
  gcc/config/i386/mwaitxintrin.h|   8 +-
  gcc/config/i386/pconfigintrin.h   |   3 +-
  gcc/config/i386/pkuintrin.h   |   6 +-
  gcc/config/i386/popcntintrin.h|   8 +-
  gcc/config/i386/rdseedintrin.h|   9 +-
  gcc/config/i386/rtmintrin.h   |   9 

Re: [PATCH] Use STATIC_ASSERT for OVL_OP_MAX.

2021-04-21 Thread Martin Sebor via Gcc-patches

On 4/21/21 2:15 AM, Martin Liška wrote:

Hello.

It's addressing the following Clang warning:
cp/lex.c:170:45: warning: result of comparison of constant 64 with expression 
of type 'enum ovl_op_code' is always true 
[-Wtautological-constant-out-of-range-compare]

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/cp/ChangeLog:

* cp-tree.h (STATIC_ASSERT): Prefer static assert.
* lex.c (init_operators): Remove run-time check.
---
  gcc/cp/cp-tree.h | 3 +++
  gcc/cp/lex.c | 2 --
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 81ff375f8a5..a8f72448ea9 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5916,6 +5916,9 @@ enum ovl_op_code {
OVL_OP_MAX
  };
  
+/* Make sure it fits in lang_decl_fn::operator_code. */

+STATIC_ASSERT (OVL_OP_MAX < (1 << 6));
+


I wonder if there's a way to test this directly by something like

   static_assert (number-of-bits (ovl_op_info_t::ovl_op_code)
  <= number-of-bits (lang_decl_fn::operator_code));

Also, since we are now compiling in C++ 11 mode, would using
static_assert be appropriate?
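
A minimal sketch of the compile-time form, assuming C++11: as far as I know there is no portable way to query the width of a bit-field directly, so this version shares one named constant between the bit-field declaration and the assertion. All names here (`sketch_op_code`, `sketch_decl`, `sketch_code_bits`, `round_trips`) are hypothetical stand-ins, not the declarations from cp-tree.h.

```cpp
// Stand-in for an enumeration whose last enumerator counts the codes.
enum sketch_op_code
{
  SKETCH_OP_NONE,
  SKETCH_OP_PLUS,
  SKETCH_OP_MINUS,
  SKETCH_OP_MAX
};

// Single source of truth for the bit-field width.
constexpr unsigned sketch_code_bits = 6;

struct sketch_decl
{
  unsigned operator_code : sketch_code_bits;
};

// Fails to compile if a new enumerator no longer fits in the bit-field.
static_assert(static_cast<unsigned>(SKETCH_OP_MAX) < (1u << sketch_code_bits),
              "sketch_op_code must fit in sketch_decl::operator_code");

// Store a code in the bit-field and check that it reads back unchanged.
inline bool round_trips(sketch_op_code code)
{
  sketch_decl d;
  d.operator_code = code;
  return d.operator_code == static_cast<unsigned>(code);
}
```

The assertion costs nothing at run time, and widening the enumeration past the bit-field capacity becomes a build error rather than a checking-only failure.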

Martin



  struct GTY(()) ovl_op_info_t {
/* The IDENTIFIER_NODE for the operator.  */
tree identifier;
diff --git a/gcc/cp/lex.c b/gcc/cp/lex.c
index 73e14b8394c..43abd019e6e 100644
--- a/gcc/cp/lex.c
+++ b/gcc/cp/lex.c
@@ -166,8 +166,6 @@ init_operators (void)
  
if (op_ptr->name)

{
- /* Make sure it fits in lang_decl_fn::operator_code. */
- gcc_checking_assert (op_ptr->ovl_op_code < (1 << 6));
  tree ident = set_operator_ident (op_ptr);
  if (unsigned index = IDENTIFIER_CP_INDEX (ident))
{





Re: [PATCH][libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

2021-04-21 Thread Tom de Vries
On 4/20/21 6:11 PM, Alexander Monakov wrote:
> Hello Tom,
> 
> Thank you for the investigation and the detailed writeup. It was difficult for
> me to infer the internal API contracts here (and still is),

Hi Alexander,

thanks for the review.

Yep, same here.

> sorry about the
> mistake.
> 
> Most importantly: does GCN handle this, and if yes, how? I think the solution
> should be the same for config/gcn and config/nvptx (I guess this is a question
> for Andrew).
> 

I looked into gcn/bar.c at gomp_team_barrier_wait_end and found:
...
  int retry = 100;
  do
{
  if (retry-- == 0)
{
  /* It really shouldn't happen that barriers get out of sync, but
     if they do then this will loop until they realign, so we need
     to avoid an infinite loop where the thread just isn't there.  */
  const char msg[]
 = ("Barrier sync failed (another thread died?);"
" aborting.");
  write (2, msg, sizeof (msg)-1);
  abort();
...
which doesn't look promising.

> Some comments inline below:
> 
> On Tue, 20 Apr 2021, Tom de Vries wrote:
> 
>> Hi,
>>
>> Consider the following omp fragment.
>> ...
>>   #pragma omp target
>>   #pragma omp parallel num_threads (2)
>>   #pragma omp task
>> ;
>> ...
>>
>> This hangs at -O0 for nvptx.
>>
>> Investigating the behaviour gives us the following trace of events:
>> - both threads execute GOMP_task, where they:
>>   - deposit a task, and
>>   - execute gomp_team_barrier_wake
>> - thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
>>   proceeds to wait at the team barrier
> 
> Shouldn't it try to handle deposited tasks before suspending on the barrier?
> 
> I guess you are describing what the code does, I'm just commenting that I'm
> confused why it behaves so.
> 

Ack.  Yeah, sorry I've got no idea about how openmp internals are
supposed to function.

>> - thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
>>   calls gomp_barrier_handle_tasks, where it:
>>   - executes both tasks and marks the team barrier done
>>   - executes a gomp_team_barrier_wake which wakes up thread 1
>> - thread 1 exits the team barrier
> 
> Up to this point it looks reasonable.
> 
>> - thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
>>   the team barrier.
> 
> At this point the code should realize that the team barrier was already released
> and not attempt to wait on it again. Maybe by inspecting the generation counter?
> 

Perhaps we can indeed piece together a fix like that.

The problem for me is that writing this sort of fix requires a good
understanding of the semantics of the various fields of gomp_barrier_t,
and I don't have that.

> I may be wrong though, I don't understand the overall flow well enough yet.
> 
>> - thread 0 hangs.
>>
>> To understand why there is a hang here, it's good to understand how things
>> are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
>> a copy of the libgomp/config/linux/bar.c implementation, with uses of both
>> futex_wake and do_wait replaced with uses of nvptx insn bar.sync:
>> ...
>>   if (bar->total > 1)
>> asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
>> ...
>>
>> The point where thread 0 goes to wait at the team barrier, corresponds in
>> the linux implementation with a do_wait.  In the linux case, the call to
>> do_wait doesn't hang, because it's waiting for bar->generation to become
>> a certain value, and if bar->generation already has that value, it just
>> proceeds, without any need for coordination with other threads.
>>
>> In the nvptx case, the bar.sync waits until thread 1 joins it in the same
>> logical barrier, which never happens: thread 1 is lingering in the
>> thread pool at the thread pool barrier (using a different logical barrier),
>> waiting to join a new team.
>>
>> The easiest way to fix this is to revert to the posix implementation for
>> bar.{c,h}.
>>
>> Another way would be to revert to the linux implementation for bar.{c,h},
>> and implement the primitives futex_wait and do_wait using nvptx insns.
> 
> I don't think implementing futex_wait is possible on nvptx.
> 

Well, I gave it a try, attached below.  Can you explain why you think
it's not possible, or pinpoint a problem in the implementation?

[ The benefit of this specific approach for me is separation of
concerns: we copy a working solution as fully as possible, and isolate
the nvptx-specific code to two functions.  This requires us to
understand and provide the semantics of these two functions, and nothing
more. ]
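
To make the contrast in the quoted analysis concrete, here is a simplified model of a value-based wait. It is not libgomp's do_wait: `wait_for_generation` is a hypothetical name and the polling loop with `std::this_thread::yield` is an assumption chosen for brevity. The point it illustrates is that a wait keyed on a value returns at once when the value has already been reached, whereas a rendezvous such as bar.sync needs every participant to arrive.

```cpp
#include <atomic>
#include <thread>

// Poll until GENERATION equals TARGET.  Returns how many times the
// caller had to yield; 0 means the target had already been reached and
// no coordination with other threads was needed.
inline unsigned wait_for_generation(const std::atomic<unsigned>& generation,
                                    unsigned target)
{
  unsigned yields = 0;
  while (generation.load(std::memory_order_acquire) != target)
    {
      std::this_thread::yield();
      ++yields;
    }
  return yields;
}
```

In the trace above, thread 0 reaches its wait after the barrier has already been marked done; with a value-based wait it would fall straight through instead of hanging.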

Thanks,
- Tom
[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end

Consider the following omp fragment.
...
  #pragma omp target
  #pragma omp parallel num_threads (2)
  #pragma omp task
;
...

This hangs at -O0 for nvptx.

Investigating the behaviour gives us the following trace of events:
- both threads execute GOMP_task, 

Patch freeze request

2021-04-21 Thread David Edelsohn via Gcc-patches
I am requesting a freeze on non-bug fix patches to trunk.

In the GCC 12 announcement, Jakub stated:

"The trunk has branched for the GCC 11 release and is now open
again for general development, stage 1.  Please consider not
disrupting it too much during the RC phase of GCC 11 so it
is possible to test important fixes for 11.1 on it."

Currently people are committing disruptive patches to GCC 12,
including patches that are breaking bootstrap.  This situation is
making it more difficult to test the patches that are necessary for
the GCC 11.1 Release.

Currently GCC 11.1 Release is blocked and GCC 12 bootstrap is broken
for some targets that are affected by the problems in GCC 11.1.  All
of the GCC 12 patches not related to bug fixes are further
exacerbating the problems and delaying the release of GCC 11.1.

Thanks, David


Re: [PATCH] libstdc++: Install libstdc++*-gdb.py more robustly [PR 99453]

2021-04-21 Thread Jonathan Wakely via Gcc-patches

On 03/04/21 14:56 -0400, Philippe Blain wrote:

In order for GDB to auto-load the pretty printers, they must be installed
as "libstdc++.$ext-gdb.py", where 'libstdc++.$ext' is the name of the
object file that is loaded by GDB [1], i.e. the libstdc++ shared library.

The approach taken in libstdc++-v3/python/Makefile.am is to loop over
files matching 'libstdc++*' in $(DESTDIR)$(toolexeclibdir) and choose
the last file matching that glob that is not a symlink, the Libtool
'*.la' file or a Python file.

That works fine for ELF targets where the matching names are:

 libstdc++.a
 libstdc++.so
 libstdc++.so.6
 libstdc++.so.6.0.29

But not for macOS with:

 libstdc++.6.dylib
 libstdc++.a

Or MinGW with:

 libstdc++-6.dll
 libstdc++.dll.a

Try to do a better job of installing the pretty printers with the
correct name by copying the approach taken by isl [2], that is, using
a sed invocation on the Libtool-generated 'libstdc++.la' to read the
correct name for the current platform.

[1] https://sourceware.org/gdb/onlinedocs/gdb/objfile_002dgdbdotext-file.html
[2] https://repo.or.cz/isl.git/blob/HEAD:/Makefile.am#l611

libstdc++-v3/
PR libstdc++/99453
* python/Makefile.am: Install libstdc++*-gdb.py more robustly
* python/Makefile.in: Regenerate

Co-authored-by: Jonathan Wakely 
---


Hi Philippe,

Now that gcc-11 has branched from trunk we can go ahead with this.
The change is small enough to not require a copyright assignment, so
I've pushed your patch to trunk.

I would expect this to get backported to the release branches once
it's had some time to bake on the trunk.



Notes:
   Hello, this is my first patch to this project.

   This patch aims to install the GDB Python file for libstdc++ more robustly so
   that it is automatically loaded on more platforms. I tested that it gets
   installed with the correct name on Ubuntu (x86_64-pc-linux-gnu) and macOS
   (x86_64-apple-darwin19.6.0). I did not succeed in building GCC on Windows, I'm
   very new to this platform.


I checked my x86_64-w64-mingw32 cross-compiler and I see this in
libstdc++.la:

# The name that we can dlopen(3).
dlname='libstdc++-6.dll'

# Names of this library.
library_names='libstdc++.dll.a'

# The name of the static archive.
old_library='libstdc++.a'


That means this patch won't have any effect on Windows, because it
already installs the hook file as libstdc++.dll.a-gdb.py

I don't know if that is correct, or whether it should be installed as
libstdc++-6.dll-gdb.py, but either way this change isn't going to make
it any worse. If we need to do something different for Windows we can
improve it further later.
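
To make the effect of the sed expression concrete, the following C++ sketch performs roughly the same extraction: find the `library_names=` line, strip the quotes, and keep the first name. `first_library_name` is a hypothetical helper, it returns only the first matching line (sed would print every match), and the ELF-style input used with it is an assumed typical value rather than output captured from a real build.

```cpp
#include <sstream>
#include <string>

// Return the first entry of the library_names='...' line in the text of
// a Libtool .la file, or an empty string if there is no such line.
inline std::string first_library_name(const std::string& la_text)
{
  const std::string key = "library_names=";
  std::istringstream in(la_text);
  std::string line;
  while (std::getline(in, line))
    {
      if (line.compare(0, key.size(), key) != 0)
        continue;
      std::string value = line.substr(key.size());
      // Drop the surrounding single quotes, if present.
      if (!value.empty() && value[0] == '\'')
        value.erase(0, 1);
      if (!value.empty() && value[value.size() - 1] == '\'')
        value.erase(value.size() - 1);
      // Keep only the first space-separated name.
      std::string::size_type space = value.find(' ');
      if (space != std::string::npos)
        value.erase(space);
      return value;
    }
  return std::string();
}
```

Fed the MinGW snippet quoted above, this yields `libstdc++.dll.a`, which matches the observation that the hook would be installed as libstdc++.dll.a-gdb.py.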

Thanks for the fix!




   Here are examples of successful builds on these two platforms (you must be
   logged in on GitHub to see the details):

   Ubuntu: 
https://github.com/phil-blain/gcc/runs/2103367197?check_suite_focus=true#step:6:5
   macOS: 
https://github.com/phil-blain/gcc/runs/2103367199?check_suite_focus=true#step:6:5

   Fetch-It-Via: git fetch https://github.com/phil-blain/gcc 
libstdcxx-pretty-printers-install-filename




libstdc++-v3/python/Makefile.am | 20 
1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/python/Makefile.am b/libstdc++-v3/python/Makefile.am
index 01517a2a5..0c2b207b8 100644
--- a/libstdc++-v3/python/Makefile.am
+++ b/libstdc++-v3/python/Makefile.am
@@ -44,21 +44,9 @@ gdb.py: hook.in Makefile
install-data-local: gdb.py
@$(mkdir_p) $(DESTDIR)$(toolexeclibdir)
## We want to install gdb.py as SOMETHING-gdb.py.  SOMETHING is the
-## full name of the final library.  We want to ignore symlinks, the
-## .la file, and any previous -gdb.py file.  This is inherently
-## fragile, but there does not seem to be a better option, because
-## libtool hides the real names from us.
-   @here=`pwd`; cd $(DESTDIR)$(toolexeclibdir); \
- for file in libstdc++.*; do \
-   case $$file in \
- *-gdb.py) ;; \
- *.la) ;; \
- *) if test -h $$file; then \
-  continue; \
-fi; \
-libname=$$file;; \
-   esac; \
- done; \
-   cd $$here; \
+## full name of the final library.  We use the libtool .la file to get
+## the correct name.
+   @libname=`sed -ne "/^library_names=/{s/.*='//;s/'$$//;s/ .*//;p;}" \
+ $(DESTDIR)$(toolexeclibdir)/libstdc++.la`; \
echo " $(INSTALL_DATA) gdb.py $(DESTDIR)$(toolexeclibdir)/$$libname-gdb.py"; \
$(INSTALL_DATA) gdb.py $(DESTDIR)$(toolexeclibdir)/$$libname-gdb.py

base-commit: 6e885ad3287388192e52e9b524dbaa408507c0a4
--
2.31.1





Re: [PATCH] Use std::thread::hardware_concurrency in lto-wrapper.c.

2021-04-21 Thread David Edelsohn via Gcc-patches
This patch broke bootstrap on AIX.

std::thread is not provided in all instances.  GCC is not compiled
multi-threaded by default.

error: `std::thread' has not been declared.

Thanks, David


[PATCH][GCC 10] aarch64: Fix SVE ACLE builtins with LTO [PR99216]

2021-04-21 Thread Alex Coplan via Gcc-patches
Hi,

Here is a backport of my fix for PR99216. The only change w.r.t the
original patch is a bump of lto-streamer.h:LTO_minor_version.

Bootstrapped and regtested on aarch64-linux-gnu, no issues.

OK for GCC 10 branch?

Thanks,
Alex

---

As discussed in the PR, we currently have two different numbering
schemes for SVE builtins: one for C, and one for C++. This is
problematic for LTO, where we end up getting confused about which
intrinsic we're talking about. This patch inserts placeholders into the
registered_functions vector to ensure that there is a consistent
numbering scheme for both C and C++.

This version uses integer_zero_node as a placeholder node instead of
building a function decl. This is safe because the node is only returned
by the TARGET_BUILTIN_DECL hook, which (on AArch64) is only used for
validation when builtin decls are streamed into lto1.
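
As a rough illustration of why placeholders help, the following toy model
(not the aarch64 implementation) treats a function's code as its index in a
table. The types and names slot, register_all and code_of are hypothetical.
When one front end registers an extra entry for each overloaded alias and
the other does not, the indices drift apart; reserving a placeholder slot
keeps them aligned.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one slot of a registered-functions table.
struct slot
{
  std::string name;
  bool placeholder;
};

// Each instance is (unique name, overloaded alias).  The unique name always
// gets a real slot.  The alias gets a real slot when direct_overloads is set;
// otherwise it gets a placeholder slot if use_placeholders is set, or no
// slot at all (modelling the behaviour before the fix).
std::vector<slot>
register_all (const std::vector<std::pair<std::string, std::string>> &instances,
              bool direct_overloads, bool use_placeholders)
{
  std::vector<slot> table;
  for (const auto &inst : instances)
    {
      table.push_back ({inst.first, false});
      if (inst.first == inst.second)
        continue;
      if (direct_overloads)
        table.push_back ({inst.second, false});
      else if (use_placeholders)
        table.push_back ({inst.second, true});
    }
  return table;
}

// A function's code is its index in the table; -1 if it has no real slot.
long
code_of (const std::vector<slot> &table, const std::string &name)
{
  for (std::size_t i = 0; i < table.size (); ++i)
    if (!table[i].placeholder && table[i].name == name)
      return static_cast<long> (i);
  return -1;
}
```

With two made-up instances ("f_s32", "f") and ("g_u8", "g"), the code of
g_u8 is 2 both with direct overloads and with placeholders, but 1 when the
alias slots are simply skipped, which is the kind of mismatch the patch
describes.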

gcc/ChangeLog:

PR target/99216
* config/aarch64/aarch64-sve-builtins.cc
(function_builder::add_function): Add placeholder_p argument, use
placeholder decls if this is set.
(function_builder::add_unique_function): Instead of conditionally adding
direct overloads, unconditionally add either a direct overload or a
placeholder.
(function_builder::add_overloaded_function): Set placeholder_p if we're
using C++ overloads. Use the obstack for string storage instead
of relying on the tree nodes.
(function_builder::add_overloaded_functions): Don't return early for
m_direct_overloads: we need to add placeholders.
* config/aarch64/aarch64-sve-builtins.h
(function_builder::add_function): Add placeholder_p argument.
* lto-streamer.h (LTO_minor_version): Bump.

gcc/testsuite/ChangeLog:

PR target/99216
* g++.target/aarch64/sve/pr99216.C: New test.
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index d534ca923d9..336a1db662b 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -995,12 +995,29 @@ registered_function &
 function_builder::add_function (const function_instance &instance,
 const char *name, tree fntype, tree attrs,
 uint64_t required_extensions,
-bool overloaded_p)
+bool overloaded_p,
+bool placeholder_p)
 {
   unsigned int code = vec_safe_length (registered_functions);
   code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE;
-  tree decl = simulate_builtin_function_decl (input_location, name, fntype,
-	  code, NULL, attrs);
+
+  /* We need to be able to generate placeholders to ensure that we have a
+ consistent numbering scheme for function codes between the C and C++
+ frontends, so that everything ties up in LTO.
+
+ Currently, tree-streamer-in.c:unpack_ts_function_decl_value_fields
+ validates that tree nodes returned by TARGET_BUILTIN_DECL are non-NULL and
+ some node other than error_mark_node. This is a holdover from when builtin
+ decls were streamed by code rather than by value.
+
+ Ultimately, we should be able to remove this validation of BUILT_IN_MD
+ nodes and remove the target hook. For now, however, we need to appease the
+ validation and return a non-NULL, non-error_mark_node node, so we
+ arbitrarily choose integer_zero_node.  */
+  tree decl = placeholder_p
+? integer_zero_node
+: simulate_builtin_function_decl (input_location, name, fntype,
+  code, NULL, attrs);
 
   registered_function &rfn = *ggc_alloc <registered_function> ();
   rfn.instance = instance;
@@ -1032,7 +1049,7 @@ function_builder::add_unique_function (const function_instance &instance,
 	   argument_types.address ());
   tree attrs = get_attributes (instance);
   registered_function &rfn = add_function (instance, name, fntype, attrs,
-	   required_extensions, false);
+	   required_extensions, false, false);
 
   /* Enter the function into the hash table.  */
   hashval_t hash = instance.hash ();
@@ -1043,16 +1060,14 @@ function_builder::add_unique_function (const function_instance &instance,
 
   /* Also add the function under its overloaded alias, if we want
  a separate decl for each instance of an overloaded function.  */
-  if (m_direct_overloads || force_direct_overloads)
+  char *overload_name = get_name (instance, true);
+  if (strcmp (name, overload_name) != 0)
 {
-  char *overload_name = get_name (instance, true);
-  if (strcmp (name, overload_name) != 0)
-	{
-	  /* Attribute lists shouldn't be shared.  */
-	  tree attrs = get_attributes (instance);
-	  add_function (instance, overload_name, fntype, attrs,
-			required_extensions, false);
-	}
+  /* Attribute lists shouldn't be shared.  */
+  tree attrs = get_attributes (instance);
+  bool placeholder_p = !(m_direct_overloads || force_direct_overloads);
+  add_function (instance, overload_name, fntype, attrs,
+		required_extensions, false, placeholder_p);
 }
 
   obstack_free 

Re: [PATCH] Call toplev::finalize in CHECKING_P mode.

2021-04-21 Thread Jeff Law via Gcc-patches



On 4/21/2021 1:28 AM, Martin Liška wrote:

Hi.

The sanity check calls toplev.finalize () in order to simulate JIT mode
in normal compilation mode.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR jit/98615
* main.c (main): Call toplev::finalize in CHECKING_P mode.
* ipa-modref.c (ipa_modref_c_finalize): summaries are NULL
when incremental LTO linking happens.


OK

jeff




Re: [PATCH] LTO: fallback to -flto=N if -flto=jobserver does not work.

2021-04-21 Thread Jeff Law via Gcc-patches



On 4/21/2021 5:50 AM, Martin Liška wrote:

When -flto=jobserver is used and we cannot detect the job server, we can
still fall back to -flto=N mode.
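
The fallback decision can be sketched roughly as below. This is a
simplified illustration, not the lto-wrapper.c code: it assumes a GNU make
jobserver is advertised through MAKEFLAGS via --jobserver-auth= (or the
older --jobserver-fds=), and it leaves the choice of N to the caller. The
names lto_mode, lto_plan, jobserver_advertised and plan_lto are
hypothetical.

```cpp
#include <string>

enum class lto_mode { jobserver, fixed_jobs };

struct lto_plan
{
  lto_mode mode;
  unsigned jobs;  // Only meaningful for fixed_jobs.
};

// Assumption: a usable jobserver shows up as --jobserver-auth= or
// --jobserver-fds= inside the MAKEFLAGS value.  A real check would also
// need to verify that the advertised descriptors are actually open.
bool
jobserver_advertised (const char *makeflags)
{
  if (makeflags == nullptr)
    return false;
  const std::string flags (makeflags);
  return flags.find ("--jobserver-auth=") != std::string::npos
         || flags.find ("--jobserver-fds=") != std::string::npos;
}

// Use the jobserver when one is advertised, otherwise fall back to a fixed
// number of parallel jobs (the -flto=N mode).
lto_plan
plan_lto (const char *makeflags, unsigned fallback_jobs)
{
  if (jobserver_advertised (makeflags))
    return {lto_mode::jobserver, 0};
  return {lto_mode::fixed_jobs, fallback_jobs};
}
```

A caller would typically pass the result of getenv ("MAKEFLAGS") as the
first argument, so an unset variable naturally selects the fixed-jobs path.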

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* lto-wrapper.c (run_gcc): When -flto=jobserver is used, but the
makeserver cannot be detected, then use -flto=N fallback.


OK

jeff




Re: [PATCH] Avoid -latomic for amdgcn offloading

2021-04-21 Thread Jeff Law via Gcc-patches



On 4/21/2021 6:56 AM, Richard Biener wrote:

libatomic isn't built for amdgcn but reduction-16.c adds it
via -foffload=-latomic when offloading for nvptx is enabled.
The following avoids linker errors when offloading to amdgcn is enabled
as well.

Tested on x86_64-unknown-linux-gnu, OK for trunk and GCC 11 branch?

Thanks.
Richard.

2021-04-21  Richard Biener  

libgomp/
* testsuite/libgomp.c-c++-common/reduction-16.c: Use -latomic
only on nvptx-none.


OK


Jeff



Re: [PATCH] testsuite: Add -fchecking to dg-ice tests

2021-04-21 Thread Marek Polacek via Gcc-patches
On Wed, Apr 21, 2021 at 12:52:39PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> In --enable-checking=release builds (which is the default on release
> branches), I'm getting various extra FAILs that don't appear in
> --enable-checking=yes builds.
> 
> XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (internal compiler error)
> FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (test for excess errors)
> XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (internal compiler error)
> FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (test for excess errors)
> XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (internal compiler error)
> FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp2a/nontype-class39.C  -std=c++2a (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++14 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++17 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++2a (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++98 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++14 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++17 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++2a (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++98 (test for excess errors)
> 
> These are tests that have dg-ice and most of those ICEs are checking ICEs
> which go away in release checking when -fno-checking is the default.

Ha, I hadn't considered this.  Perhaps we should be adding -fchecking
along with dg-ice...

Thanks for fixing this.

Marek



[PATCH][pushed] Fix clang warning (-Wstring-plus-int)

2021-04-21 Thread Martin Liška
This fixes:

lto-plugin.c:642:7: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]

Pushed as obvious,
thanks,
Martin

lto-plugin/ChangeLog:

* lto-plugin.c (exec_lto_wrapper): Make a temp variable.
---
 lto-plugin/lto-plugin.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 32478f070e8..4e90ce03ade 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -636,10 +636,10 @@ exec_lto_wrapper (char *argv[])
 
   /* Write argv to a file to avoid a command line that is too long
  Save the file locally on save-temps.  */
+  const char *suffix = ".lto_wrapper_args";
+  suffix += skip_in_suffix;
   if (save_temps && link_output_name)
-arguments_file_name = concat (link_output_name,
- ".lto_wrapper_args"
- + skip_in_suffix, NULL);
+arguments_file_name = concat (link_output_name, suffix, NULL);
   else
 arguments_file_name = make_temp_file (".lto_wrapper_args");
   check (arguments_file_name, LDPL_FATAL,
-- 
2.31.1



Re: [committed] aarch64: Always use .init/.fini_array for GNU/Linux

2021-04-21 Thread Andreas Schwab
On Apr 21 2021, Richard Sandiford via Gcc-patches wrote:

> However, AArch64 postdates glibc and binutils support for
> .init/fini_array by some distance, so it's safe to hard-code the
> result to "yes" for cross compilers.

Should there be an automatism for any other new architecture added?

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[committed] aarch64: Always use .init/.fini_array for GNU/Linux

2021-04-21 Thread Richard Sandiford via Gcc-patches
I was wondering why the (now fixed) c-c++-common/attr-retain-[78].c
failures were showing up in the native results for aarch64-linux-gnu
but not in the posted cross results.  It turns out that .init/
.fini_array support is disabled by default for cross builds,
which in turn stops those tests from running.

The test for .init/fini_array support has two parts: one that builds
something with the assembler and linker, and another that compiles
C code and uses preprocessor macros to test the glibc version.
The first test would work with build=host but the second is only
safe for build=target.

However, AArch64 postdates glibc and binutils support for
.init/fini_array by some distance, so it's safe to hard-code the
result to "yes" for cross compilers.

This fixes the only material difference in auto-host.h between
a native and a cross build.

Tested on aarch64-linux-gnu, pushed to trunk.

Richard


gcc/
* acinclude.m4 (gcc_AC_INITFINI_ARRAY): When cross-compiling,
default to yes for aarch64-linux-gnu.
* configure: Regenerate.
---
 gcc/acinclude.m4 | 14 --
 gcc/configure| 14 --
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/gcc/acinclude.m4 b/gcc/acinclude.m4
index 74c18241c91..f9f6a07b040 100644
--- a/gcc/acinclude.m4
+++ b/gcc/acinclude.m4
@@ -422,8 +422,18 @@ changequote([,])dnl
 ])],, [gcc_cv_initfini_array=no]);;
 esac
   else
-AC_MSG_CHECKING(cross compile... guessing)
-gcc_cv_initfini_array=no
+case "${target}" in
+  aarch64*-linux-gnu*)
+   # AArch64 postdates glibc support for .init_array/.fini_array,
+   # so we don't need the preprocessor test above.
+   gcc_cv_initfini_array=yes
+   ;;
+
+  *)
+   AC_MSG_CHECKING(cross compile... guessing)
+   gcc_cv_initfini_array=no
+   ;;
+esac
   fi])
   enable_initfini_array=$gcc_cv_initfini_array
 ])
diff --git a/gcc/configure b/gcc/configure
index e8ecb3b0297..33eae5451ad 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24132,9 +24132,19 @@ fi
 rm -f conftest.err conftest.i conftest.$ac_ext;;
 esac
   else
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking cross compile... guessing" >&5
+case "${target}" in
+  aarch64*-linux-gnu*)
+   # AArch64 postdates glibc support for .init_array/.fini_array,
+   # so we don't need the preprocessor test above.
+   gcc_cv_initfini_array=yes
+   ;;
+
+  *)
+   { $as_echo "$as_me:${as_lineno-$LINENO}: checking cross compile... guessing" >&5
 $as_echo_n "checking cross compile... guessing... " >&6; }
-gcc_cv_initfini_array=no
+   gcc_cv_initfini_array=no
+   ;;
+esac
   fi
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_initfini_array" >&5


Re: GCC 11.1 Release Candidate available from gcc.gnu.org

2021-04-21 Thread Jonathan Wakely via Gcc-patches

On 21/04/21 13:12 +0100, Jonathan Wakely wrote:

On 21/04/21 12:38 +0100, Jonathan Wakely wrote:

On 20/04/21 22:12 -0700, Thomas Rodgers wrote:

@@ -86,6 +88,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
  }

+_GLIBCXX_ALWAYS_INLINE bool
+_M_try_acquire() noexcept
+{
+  for (;;)
+   {
+ auto __err = sem_trywait(&_M_semaphore);
+ if (__err && (errno == EINTR))
+   continue;
+ else if (__err && (errno == EAGAIN))
+   return false;
+ else if (__err)
+   std::terminate();
+ else
+   break;
+   }
+  return true;
+}
+
  _GLIBCXX_ALWAYS_INLINE void
  _M_release(std::ptrdiff_t __update) noexcept
  {


Please just commit this part to trunk and gcc-11, not the macro
renaming (as that's been fixed by Jakub already).


I think on trunk I'd prefer to do the attached. WDYT?


In fact I think something like this is needed even on gcc-11 branch,
otherwise anything that tries to include  without atomics
or sem_t gets hard errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100179

And  includes  which includes , meaning
 is unusable on those targets.

So I think removing this #error is essential:
 

-// Note: the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro can be used to force the
-// use of Posix semaphores (sem_t). Doing so however, alters the ABI.
-#if defined __cpp_lib_atomic_wait && !_GLIBCXX_REQUIRE_POSIX_SEMAPHORE
  using __semaphore_impl = __atomic_semaphore;
-#elif _GLIBCXX_HAVE_POSIX_SEMAPHORE
-  using __semaphore_impl = __platform_semaphore;
-#else
-#  error "No suitable semaphore implementation available"
-#endif
+#endif // __cpp_lib_atomic_wait




Re: GCC 11.1 Release Candidate available from gcc.gnu.org

2021-04-21 Thread David Edelsohn via Gcc-patches
Hi, Jonathan

Thanks for the further investigation.  I definitely encountered the
missing _M_try_acquire in __platform_semaphore.

Thanks, David

On Wed, Apr 21, 2021 at 8:12 AM Jonathan Wakely  wrote:
>
> On 21/04/21 12:38 +0100, Jonathan Wakely wrote:
> >On 20/04/21 22:12 -0700, Thomas Rodgers wrote:
> >>@@ -86,6 +88,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>  }
> >>}
> >>
> >>+_GLIBCXX_ALWAYS_INLINE bool
> >>+_M_try_acquire() noexcept
> >>+{
> >>+  for (;;)
> >>+ {
> >>+   auto __err = sem_trywait(&_M_semaphore);
> >>+   if (__err && (errno == EINTR))
> >>+ continue;
> >>+   else if (__err && (errno == EAGAIN))
> >>+ return false;
> >>+   else if (__err)
> >>+ std::terminate();
> >>+   else
> >>+ break;
> >>+ }
> >>+  return true;
> >>+}
> >>+
> >>_GLIBCXX_ALWAYS_INLINE void
> >>_M_release(std::ptrdiff_t __update) noexcept
> >>{
> >
> >Please just commit this part to trunk and gcc-11, not the macro
> >renaming (as that's been fixed by Jakub already).
>
> I think on trunk I'd prefer to do the attached. WDYT?
>
>


Re: [PATCH 3/4] Overhaul in isa_flags and handling it.

2021-04-21 Thread Martin Liška
On 4/21/21 3:34 PM, H.J. Lu wrote:
> On Wed, Apr 21, 2021 at 3:58 AM Uros Bizjak  wrote:
>>
>> On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/i386/i386-options.c (TARGET_EXPLICIT_NO_SAHF_P):
>>> Define.
>>> (SET_TARGET_NO_SAHF): Likewise.
>>> (TARGET_EXPLICIT_PREFETCH_SSE_P): Likewise.
>>> (SET_TARGET_PREFETCH_SSE): Likewise.
>>> (TARGET_EXPLICIT_NO_TUNE_P): Likewise.
>>> (SET_TARGET_NO_TUNE): Likewise.
>>> (TARGET_EXPLICIT_NO_80387_P): Likewise.
>>> (SET_TARGET_NO_80387): Likewise.
>>> (DEF_PTA): New.
>>> * config/i386/i386.h (TARGET_*): Remove.
>>> * opth-gen.awk: Generate new used macros.
>>
>> LGTM, but please coordinate the commit with HJ, he has a couple of
>> patches in this area.
>>
> 
> Rebase is needed since my -mwait patch has been checked in.
> 

I've just done that and pushed all patches in the series.

Thank you Uroš for review.

Martin


Re: GCC 11.1 Release Candidate available from gcc.gnu.org

2021-04-21 Thread Thomas Rodgers

On 2021-04-21 05:12, Jonathan Wakely wrote:

On 21/04/21 12:38 +0100, Jonathan Wakely wrote:

On 20/04/21 22:12 -0700, Thomas Rodgers wrote:

@@ -86,6 +88,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

}
}

+_GLIBCXX_ALWAYS_INLINE bool
+_M_try_acquire() noexcept
+{
+  for (;;)
+{
+  auto __err = sem_trywait(&_M_semaphore);
+  if (__err && (errno == EINTR))
+continue;
+  else if (__err && (errno == EAGAIN))
+return false;
+  else if (__err)
+std::terminate();
+  else
+break;
+}
+  return true;
+}
+
_GLIBCXX_ALWAYS_INLINE void
_M_release(std::ptrdiff_t __update) noexcept
{
Please just commit this part to trunk and gcc-11, not the macro
renaming (as that's been fixed by Jakub already).


I think on trunk I'd prefer to do the attached. WDYT?

Looks good to me.


Re: [PATCH 2/4] Generate PTA features from a def file.

2021-04-21 Thread Martin Liška
On 4/21/21 3:03 PM, Richard Biener wrote:
> On Wed, Apr 21, 2021 at 2:09 PM Uros Bizjak via Gcc-patches
>  wrote:
>>
>> On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/i386/i386.h (PTA_*): Remove.
>>> (enum pta_flag): New.
>>> (DEF_PTA): Generate PTA_* values from i386.def.
>>> * wide-int-bitmask.h (WIDE_INT_BITMASK_FROM_NTH): New macro.
>>> * config/i386/i386.def: New file.
>>
>> Shouldn't this file be rather named i386-isa.def?
>>
>> Otherwise LGTM, but someone has to approve the wide-int-bitmask.h change.
> 
> Hmm, such macro isn't "nice" and as it seems you could have inlined it
> at the single use in the source file?

I would use a static method, but that doesn't work with constexpr (at least not
in -std=c++11).

Let's move the macro to the usage place.

Thanks,
Martin

> 
>> Uros.
>>
>>> ---
>>>  gcc/config/i386/i386.def | 110 +++
>>>  gcc/config/i386/i386.h   | 106 ++---
>>>  gcc/wide-int-bitmask.h   |   3 ++
>>>  3 files changed, 128 insertions(+), 91 deletions(-)
>>>  create mode 100644 gcc/config/i386/i386.def
>>>



Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-21 Thread H.J. Lu via Gcc-patches
On Wed, Apr 21, 2021 at 12:30 AM Uros Bizjak  wrote:
>
> On Thu, Apr 15, 2021 at 12:39 AM H.J. Lu  wrote:
> >
> > commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
> > Author: H.J. Lu 
> > Date:   Fri Aug 21 09:42:49 2020 -0700
> >
> > x86: Add target("general-regs-only") function attribute
> >
> > is incomplete since it is impossible to call integer intrinsics from
> > a function with general-regs-only target attribute.
> >
> > 1. Add general_regs_only function attribute to inform the compiler that
> > functions use only general purpose registers.  When making inlining
> > decisions on such functions, non-GPR compiler options are excluded.
> > 2. Add general_regs_only attribute to x86 intrinsics which use only
> > general purpose registers.
>
> I'd like to ask Richard and Jakub if they agree with the approach.

Here is the v5 patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568407.html

Richard, Jakub,  do they look good to you?

> On a related note, can we declare default attributes like clang does, e.g.:
>
> /* Define the default attributes for the functions.  */
> #define __DEFAULT_FN_ATTRS __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))

This can be defined in x86intrin.h.

> #define __DEFAULT_FN_ATTRS_GRO __attribute__((__gnu_inline__,
> __always_inline__, __general_regs_only, __artificial__))

This can be defined in x86gprintrin.h.

> and use these defines throughout header files?
>
> Uros.
>

Thanks.

-- 
H.J.


[PATCH v5 2/3] x86: Add general_regs_only function attribute

2021-04-21 Thread H.J. Lu via Gcc-patches
commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.

1. Add general_regs_only function attribute to inform the compiler that
functions use only general purpose registers.  When making inlining
decisions on such functions, non-GPR compiler options are excluded.
2. Add general_regs_only attribute to x86 intrinsics which use only
general purpose registers.

gcc/

PR target/99744
* config/i386/i386-options.c (ix86_attribute_table): Add
general_regs_only.
* config/i386/i386.c (ix86_can_inline_p): Exclude non-integer
target options if callee has general_regs_only attribute.
* config/i386/adxintrin.h: Add general_regs_only attribute to
intrinsics which use only general purpose registers.
* config/i386/bmiintrin.h: Likewise.
* config/i386/bmi2intrin.h: Likewise.
* config/i386/cetintrin.h: Likewise.
* config/i386/cldemoteintrin.h: Likewise.
* config/i386/clflushoptintrin.h: Likewise.
* config/i386/clwbintrin.h: Likewise.
* config/i386/clzerointrin.h: Likewise.
* config/i386/enqcmdintrin.h: Likewise.
* config/i386/fxsrintrin.h: Likewise.
* config/i386/hresetintrin.h: Likewise.
* config/i386/ia32intrin.h: Likewise.
* config/i386/lwpintrin.h: Likewise.
* config/i386/lzcntintrin.h: Likewise.
* config/i386/movdirintrin.h: Likewise.
* config/i386/mwaitxintrin.h: Likewise.
* config/i386/pconfigintrin.h: Likewise.
* config/i386/pkuintrin.h: Likewise.
* config/i386/popcntintrin.h: Likewise.
* config/i386/rdseedintrin.h: Likewise.
* config/i386/rtmintrin.h: Likewise.
* config/i386/serializeintrin.h: Likewise.
* config/i386/sgxintrin.h: Likewise.
* config/i386/tbmintrin.h: Likewise.
* config/i386/tsxldtrkintrin.h: Likewise.
* config/i386/uintrintrin.h: Likewise.
* config/i386/waitpkgintrin.h: Likewise.
* config/i386/wbnoinvdintrin.h: Likewise.
* config/i386/x86gprintrin.h: Likewise.
* config/i386/xsavecintrin.h: Likewise.
* config/i386/xsaveintrin.h: Likewise.
* config/i386/xsaveoptintrin.h: Likewise.
* config/i386/xsavesintrin.h: Likewise.
* config/i386/xtestintrin.h: Likewise.
* doc/extend.texi: Document general_regs_only function attribute.

gcc/testsuite/

PR target/99744
* gcc.target/i386/pr99744-3.c: New test.
* gcc.target/i386/pr99744-4.c: Likewise.
---
 gcc/config/i386/adxintrin.h   |  18 +-
 gcc/config/i386/bmi2intrin.h  |  24 +-
 gcc/config/i386/bmiintrin.h   |  92 --
 gcc/config/i386/cetintrin.h   |  33 +-
 gcc/config/i386/cldemoteintrin.h  |   3 +-
 gcc/config/i386/clflushoptintrin.h|   3 +-
 gcc/config/i386/clwbintrin.h  |   3 +-
 gcc/config/i386/clzerointrin.h|   4 +-
 gcc/config/i386/enqcmdintrin.h|   6 +-
 gcc/config/i386/fxsrintrin.h  |  12 +-
 gcc/config/i386/hresetintrin.h|   3 +-
 gcc/config/i386/i386-options.c|   2 +
 gcc/config/i386/i386.c|  29 +-
 gcc/config/i386/ia32intrin.h  |  94 --
 gcc/config/i386/lwpintrin.h   |  24 +-
 gcc/config/i386/lzcntintrin.h |  20 +-
 gcc/config/i386/movdirintrin.h|   9 +-
 gcc/config/i386/mwaitxintrin.h|   8 +-
 gcc/config/i386/pconfigintrin.h   |   3 +-
 gcc/config/i386/pkuintrin.h   |   6 +-
 gcc/config/i386/popcntintrin.h|   8 +-
 gcc/config/i386/rdseedintrin.h|   9 +-
 gcc/config/i386/rtmintrin.h   |   9 +-
 gcc/config/i386/serializeintrin.h |   8 +-
 gcc/config/i386/sgxintrin.h   |   9 +-
 gcc/config/i386/tbmintrin.h   |  80 +++--
 gcc/config/i386/tsxldtrkintrin.h  |   6 +-
 gcc/config/i386/uintrintrin.h |  12 +-
 gcc/config/i386/waitpkgintrin.h   |   9 +-
 gcc/config/i386/wbnoinvdintrin.h  |   3 +-
 gcc/config/i386/x86gprintrin.h|  45 ++-
 gcc/config/i386/xsavecintrin.h|   6 +-
 gcc/config/i386/xsaveintrin.h |  18 +-
 gcc/config/i386/xsaveoptintrin.h  |   6 +-
 gcc/config/i386/xsavesintrin.h|  12 +-
 gcc/config/i386/xtestintrin.h |   3 +-
 gcc/doc/extend.texi   |   5 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++
 39 files changed, 831 insertions(+), 183 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 

[PATCH v5 1/3] x86: Move OPTION_MASK_* to i386-common.h

2021-04-21 Thread H.J. Lu via Gcc-patches
Move OPTION_MASK_* to i386-common.h so that they can be used in x86
backend.

* common/config/i386/i386-common.c (OPTION_MASK_*): Move to ...
* common/config/i386/i386-common.h: Here.  New file.
* config/i386/i386.h: Include common/config/i386/i386-common.h.
---
 gcc/common/config/i386/i386-common.c | 299 -
 gcc/common/config/i386/i386-common.h | 317 +++
 gcc/config/i386/i386.h   |   1 +
 3 files changed, 318 insertions(+), 299 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-common.h

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index af3f38755fa..ca828016711 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -29,305 +29,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "flags.h"
 
-/* Define a set of ISAs which are available when a given ISA is
-   enabled.  MMX and SSE ISAs are handled separately.  */
-
-#define OPTION_MASK_ISA_MMX_SET OPTION_MASK_ISA_MMX
-#define OPTION_MASK_ISA_3DNOW_SET \
-  (OPTION_MASK_ISA_3DNOW | OPTION_MASK_ISA_MMX_SET)
-#define OPTION_MASK_ISA_3DNOW_A_SET \
-  (OPTION_MASK_ISA_3DNOW_A | OPTION_MASK_ISA_3DNOW_SET)
-
-#define OPTION_MASK_ISA_SSE_SET OPTION_MASK_ISA_SSE
-#define OPTION_MASK_ISA_SSE2_SET \
-  (OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_SSE_SET)
-#define OPTION_MASK_ISA_SSE3_SET \
-  (OPTION_MASK_ISA_SSE3 | OPTION_MASK_ISA_SSE2_SET)
-#define OPTION_MASK_ISA_SSSE3_SET \
-  (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_SSE3_SET)
-#define OPTION_MASK_ISA_SSE4_1_SET \
-  (OPTION_MASK_ISA_SSE4_1 | OPTION_MASK_ISA_SSSE3_SET)
-#define OPTION_MASK_ISA_SSE4_2_SET \
-  (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_SSE4_1_SET)
-#define OPTION_MASK_ISA_AVX_SET \
-  (OPTION_MASK_ISA_AVX | OPTION_MASK_ISA_SSE4_2_SET \
-   | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_FMA_SET \
-  (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_AVX_SET)
-#define OPTION_MASK_ISA_AVX2_SET \
-  (OPTION_MASK_ISA_AVX2 | OPTION_MASK_ISA_AVX_SET)
-#define OPTION_MASK_ISA_FXSR_SET OPTION_MASK_ISA_FXSR
-#define OPTION_MASK_ISA_XSAVE_SET OPTION_MASK_ISA_XSAVE
-#define OPTION_MASK_ISA_XSAVEOPT_SET \
-  (OPTION_MASK_ISA_XSAVEOPT | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_AVX512F_SET \
-  (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX2_SET)
-#define OPTION_MASK_ISA_AVX512CD_SET \
-  (OPTION_MASK_ISA_AVX512CD | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512PF_SET \
-  (OPTION_MASK_ISA_AVX512PF | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512ER_SET \
-  (OPTION_MASK_ISA_AVX512ER | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512DQ_SET \
-  (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512BW_SET \
-  (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VL_SET \
-  (OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512IFMA_SET \
-  (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VBMI_SET \
-  (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
-#define OPTION_MASK_ISA2_AVX5124FMAPS_SET OPTION_MASK_ISA2_AVX5124FMAPS
-#define OPTION_MASK_ISA2_AVX5124VNNIW_SET OPTION_MASK_ISA2_AVX5124VNNIW
-#define OPTION_MASK_ISA_AVX512VBMI2_SET \
-  (OPTION_MASK_ISA_AVX512VBMI2 | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VNNI_SET \
-  (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA2_AVXVNNI_SET OPTION_MASK_ISA2_AVXVNNI
-#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET \
-  (OPTION_MASK_ISA_AVX512VPOPCNTDQ | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512BITALG_SET \
-  (OPTION_MASK_ISA_AVX512BITALG | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA2_AVX512BF16_SET OPTION_MASK_ISA2_AVX512BF16
-#define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
-#define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
-#define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
-#define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
-#define OPTION_MASK_ISA_PREFETCHWT1_SET OPTION_MASK_ISA_PREFETCHWT1
-#define OPTION_MASK_ISA_CLFLUSHOPT_SET OPTION_MASK_ISA_CLFLUSHOPT
-#define OPTION_MASK_ISA_XSAVES_SET \
-  (OPTION_MASK_ISA_XSAVES | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_XSAVEC_SET \
-  (OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
-#define OPTION_MASK_ISA2_AVX512VP2INTERSECT_SET OPTION_MASK_ISA2_AVX512VP2INTERSECT
-#define OPTION_MASK_ISA2_AMX_TILE_SET OPTION_MASK_ISA2_AMX_TILE
-#define OPTION_MASK_ISA2_AMX_INT8_SET OPTION_MASK_ISA2_AMX_INT8
-#define OPTION_MASK_ISA2_AMX_BF16_SET OPTION_MASK_ISA2_AMX_BF16
-
-/* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
-   as -msse4.2.  */
-#define OPTION_MASK_ISA_SSE4_SET OPTION_MASK_ISA_SSE4_2_SET
-
-#define OPTION_MASK_ISA_SSE4A_SET \
-  

[PATCH v5 3/3] x86: Add general_regs_only attribute to MWAIT intrinsics

2021-04-21 Thread H.J. Lu via Gcc-patches
Add general_regs_only attribute to MWAIT intrinsics for callers with
general-regs-only target attribute.

gcc/

PR target/99744
* config/i386/mwaitintrin.h (_mm_monitor): Add general_regs_only
attribute.
(_mm_mwait): Likewise.

gcc/testsuite/

PR target/99744
* gcc.target/i386/pr99744-5.c: New test.
---
 gcc/config/i386/mwaitintrin.h |  6 --
 gcc/testsuite/gcc.target/i386/pr99744-5.c | 25 +++
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c

diff --git a/gcc/config/i386/mwaitintrin.h b/gcc/config/i386/mwaitintrin.h
index 1ecbc4abb69..d5007875914 100644
--- a/gcc/config/i386/mwaitintrin.h
+++ b/gcc/config/i386/mwaitintrin.h
@@ -31,14 +31,16 @@
 #endif /* __MWAIT__ */
 
 extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__attribute__((__gnu_inline__, __always_inline__, __artificial__,
+  __general_regs_only__))
 _mm_monitor (void const * __P, unsigned int __E, unsigned int __H)
 {
   __builtin_ia32_monitor (__P, __E, __H);
 }
 
 extern __inline void
-__attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__attribute__((__gnu_inline__, __always_inline__, __artificial__,
+  __general_regs_only__))
 _mm_mwait (unsigned int __E, unsigned int __H)
 {
   __builtin_ia32_mwait (__E, __H);
diff --git a/gcc/testsuite/gcc.target/i386/pr99744-5.c b/gcc/testsuite/gcc.target/i386/pr99744-5.c
new file mode 100644
index 000..9e40e5ef428
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99744-5.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mmwait" } */
+
+/* Test calling MWAIT intrinsics from functions with general-regs-only
+   target attribute.  */
+
+#include 
+
+#define _CONCAT(x,y) x ## y
+
+#define test_2(func, type, op1_type, op2_type) \
+  __attribute__ ((target("general-regs-only")))  \
+  type _CONCAT(do_,func) (op1_type A, op2_type B)  \
+  { return func (A, B); }
+
+#define test_3(func, type, op1_type, op2_type, op3_type)   \
+  __attribute__ ((target("general-regs-only")))  \
+  type _CONCAT(do_,func) (op1_type A, op2_type B, op3_type C)  \
+  { return func (A, B, C); }
+
+#ifndef __iamcu__
+/* mwaitintrin.h */
+test_3 (_mm_monitor, void, void const *, unsigned int, unsigned int)
+test_2 (_mm_mwait, void, unsigned int, unsigned int)
+#endif
-- 
2.30.2



[PATCH v5 0/3] x86: Add general_regs_only function attribute

2021-04-21 Thread H.J. Lu via Gcc-patches
I realized that

commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.  We need to add a
general_regs_only function attribute to go with it to mark functions
which use only general purpose registers.  When making inlining
decisions on such functions, x86 backend can exclude non-GPR compiler
options.  The general_regs_only attribute should be added to all x86
intrinsics which use only general purpose registers.

H.J. Lu (3):
  x86: Move OPTION_MASK_* to i386-common.h
  x86: Add general_regs_only function attribute
  x86: Add general_regs_only attribute to MWAIT intrinsics

 gcc/common/config/i386/i386-common.c  | 299 --
 gcc/common/config/i386/i386-common.h  | 317 +++
 gcc/config/i386/adxintrin.h   |  18 +-
 gcc/config/i386/bmi2intrin.h  |  24 +-
 gcc/config/i386/bmiintrin.h   |  92 --
 gcc/config/i386/cetintrin.h   |  33 +-
 gcc/config/i386/cldemoteintrin.h  |   3 +-
 gcc/config/i386/clflushoptintrin.h|   3 +-
 gcc/config/i386/clwbintrin.h  |   3 +-
 gcc/config/i386/clzerointrin.h|   4 +-
 gcc/config/i386/enqcmdintrin.h|   6 +-
 gcc/config/i386/fxsrintrin.h  |  12 +-
 gcc/config/i386/hresetintrin.h|   3 +-
 gcc/config/i386/i386-options.c|   2 +
 gcc/config/i386/i386.c|  29 +-
 gcc/config/i386/i386.h|   1 +
 gcc/config/i386/ia32intrin.h  |  94 --
 gcc/config/i386/lwpintrin.h   |  24 +-
 gcc/config/i386/lzcntintrin.h |  20 +-
 gcc/config/i386/movdirintrin.h|   9 +-
 gcc/config/i386/mwaitintrin.h |   6 +-
 gcc/config/i386/mwaitxintrin.h|   8 +-
 gcc/config/i386/pconfigintrin.h   |   3 +-
 gcc/config/i386/pkuintrin.h   |   6 +-
 gcc/config/i386/popcntintrin.h|   8 +-
 gcc/config/i386/rdseedintrin.h|   9 +-
 gcc/config/i386/rtmintrin.h   |   9 +-
 gcc/config/i386/serializeintrin.h |   8 +-
 gcc/config/i386/sgxintrin.h   |   9 +-
 gcc/config/i386/tbmintrin.h   |  80 +++--
 gcc/config/i386/tsxldtrkintrin.h  |   6 +-
 gcc/config/i386/uintrintrin.h |  12 +-
 gcc/config/i386/waitpkgintrin.h   |   9 +-
 gcc/config/i386/wbnoinvdintrin.h  |   3 +-
 gcc/config/i386/x86gprintrin.h|  45 ++-
 gcc/config/i386/xsavecintrin.h|   6 +-
 gcc/config/i386/xsaveintrin.h |  18 +-
 gcc/config/i386/xsaveoptintrin.h  |   6 +-
 gcc/config/i386/xsavesintrin.h|  12 +-
 gcc/config/i386/xtestintrin.h |   3 +-
 gcc/doc/extend.texi   |   5 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c | 357 ++
 gcc/testsuite/gcc.target/i386/pr99744-5.c |  25 ++
 44 files changed, 1178 insertions(+), 484 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-common.h
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-5.c

-- 
2.30.2



Re: [PATCH 3/4] Overhaul in isa_flags and handling it.

2021-04-21 Thread H.J. Lu via Gcc-patches
On Wed, Apr 21, 2021 at 3:58 AM Uros Bizjak  wrote:
>
> On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
> >
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386-options.c (TARGET_EXPLICIT_NO_SAHF_P):
> > Define.
> > (SET_TARGET_NO_SAHF): Likewise.
> > (TARGET_EXPLICIT_PREFETCH_SSE_P): Likewise.
> > (SET_TARGET_PREFETCH_SSE): Likewise.
> > (TARGET_EXPLICIT_NO_TUNE_P): Likewise.
> > (SET_TARGET_NO_TUNE): Likewise.
> > (TARGET_EXPLICIT_NO_80387_P): Likewise.
> > (SET_TARGET_NO_80387): Likewise.
> > (DEF_PTA): New.
> > * config/i386/i386.h (TARGET_*): Remove.
> > * opth-gen.awk: Generate new used macros.
>
> LGTM, but please coordinate the commit with HJ, he has a couple of
> patches in this area.
>

Rebase is needed since my -mwait patch has been checked in.

-- 
H.J.


Re: [PATCH] testsuite: Add -fchecking to dg-ice tests

2021-04-21 Thread Richard Biener via Gcc-patches
On Wed, Apr 21, 2021 at 2:10 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> In --enable-checking=release builds (which is the default on release
> branches), I'm getting various extra FAILs that don't appear in
> --enable-checking=yes builds.
>
> XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (internal compiler error)
> FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (test for excess errors)
> XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (internal compiler error)
> FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (test for excess errors)
> XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (internal compiler error)
> FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++2a (test for excess errors)
> FAIL: g++.dg/cpp2a/nontype-class39.C  -std=c++2a (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++14 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++17 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++2a (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++98 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++14 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++17 (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++2a (test for excess errors)
> FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++98 (test for excess errors)
>
> These are tests that have dg-ice and most of those ICEs are checking ICEs
> which go away in release checking when -fno-checking is the default.
>
> The following patch adds -fchecking option to those.
> Tested on x86_64-linux --enable-checking=release build.
> Ok for trunk/10.2?

OK.

Richard.

> 2021-04-21  Jakub Jelinek  
>
> * g++.dg/cpp1z/constexpr-lambda26.C: Add dg-additional-options
> -fchecking.
> * g++.dg/cpp1y/auto-fn61.C: Likewise.
> * g++.dg/cpp2a/nontype-class39.C: Likewise.
> * g++.dg/cpp0x/constexpr-52830.C: Likewise.
> * g++.dg/cpp0x/vt-88982.C: Likewise.
> * c-c++-common/goacc/kernels-decompose-ice-1.c: Add -fchecking to
> dg-additional-options.
> * c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.
>
> --- gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C.jj  2020-11-10 23:29:45.646246177 +0100
> +++ gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C 2021-04-21 12:24:53.515946657 +0200
> @@ -1,5 +1,6 @@
>  // PR c++/87765
>  // { dg-do compile { target c++17 } }
> +// { dg-additional-options "-fchecking" }
>  // { dg-ice "cxx_eval_constant_expression" }
>
>  template 
> --- gcc/testsuite/g++.dg/cpp1y/auto-fn61.C.jj   2020-08-11 13:44:01.233003025 +0200
> +++ gcc/testsuite/g++.dg/cpp1y/auto-fn61.C  2021-04-21 12:24:22.079298420 +0200
> @@ -1,5 +1,6 @@
>  // PR c++/88003
>  // { dg-do compile { target c++14 } }
> +// { dg-additional-options "-fchecking" }
>  // { dg-ice "poplevel_class" }
>
>  auto test() {
> --- gcc/testsuite/g++.dg/cpp2a/nontype-class39.C.jj 2020-11-10 23:29:45.646246177 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/nontype-class39.C 2021-04-21 12:25:16.427690283 +0200
> @@ -1,5 +1,6 @@
>  // PR c++/89565
>  // { dg-do compile { target c++20 } }
> +// { dg-additional-options "-fchecking" }
>  // { dg-ice "resolve_args" }
>
>  template 
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C.jj 2021-01-14 20:00:52.868105693 +0100
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C 2021-04-21 12:22:27.474581330 +0200
> @@ -1,5 +1,6 @@
>  // PR c++/52830
>  // { dg-do compile { target c++11 } }
> +// { dg-additional-options "-fchecking" }
>  // { dg-ice "comptypes" }
>
>  template struct eif { typedef void type; };
> --- gcc/testsuite/g++.dg/cpp0x/vt-88982.C.jj 2020-11-10 23:29:45.646246177 +0100
> +++ gcc/testsuite/g++.dg/cpp0x/vt-88982.C   2021-04-21 12:23:46.824692964 +0200
> @@ -1,5 +1,6 @@
>  // PR c++/88982
>  // { dg-do compile { target c++11 } }
> +// { dg-additional-options "-fchecking" }
>  // { dg-ice "tsubst_pack_expansion" }
>
>  template struct A {
> --- gcc/testsuite/c-c++-common/goacc/kernels-decompose-ice-1.c.jj   2021-04-20 23:46:09.179190182 +0200
> +++ gcc/testsuite/c-c++-common/goacc/kernels-decompose-ice-1.c  2021-04-21 12:25:57.966225475 +0200
> 

Re: [PATCH 2/4] Generate PTA features from a def file.

2021-04-21 Thread Richard Biener via Gcc-patches
On Wed, Apr 21, 2021 at 2:09 PM Uros Bizjak via Gcc-patches
 wrote:
>
> On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
> >
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386.h (PTA_*): Remove.
> > (enum pta_flag): New.
> > (DEF_PTA): Generate PTA_* values from i386.def.
> > * wide-int-bitmask.h (WIDE_INT_BITMASK_FROM_NTH): New macro.
> > * config/i386/i386.def: New file.
>
> Shouldn't this file be rather named i386-isa.def?
>
> Otherwise LGTM, but someone has to approve the wide-int-bitmask.h change.

Hmm, such a macro isn't "nice", and it seems you could have inlined it
at its single use in the source file?

> Uros.
>
> > ---
> >  gcc/config/i386/i386.def | 110 +++
> >  gcc/config/i386/i386.h   | 106 ++---
> >  gcc/wide-int-bitmask.h   |   3 ++
> >  3 files changed, 128 insertions(+), 91 deletions(-)
> >  create mode 100644 gcc/config/i386/i386.def
> >


Re: [PATCH] Use std::thread::hardware_concurrency in lto-wrapper.c.

2021-04-21 Thread Jeff Law via Gcc-patches



On 4/21/2021 2:08 AM, Martin Liška wrote:

Now that we require C++11, we can use a handy function called
std::thread::hardware_concurrency.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* lto-wrapper.c (cpuset_popcount): Remove.
(init_num_threads): Remove and use hardware_concurrency.
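
For readers unfamiliar with the call, here is a minimal sketch of how it is typically used. The helper name default_thread_count is hypothetical and is not taken from lto-wrapper.c. The fallback to 1 is an assumption, chosen because the standard allows hardware_concurrency to return 0 when the value cannot be computed.

```cpp
#include <thread>

// Hypothetical helper (not the lto-wrapper.c code): pick a default
// number of worker threads from the hardware concurrency hint.
unsigned default_thread_count ()
{
  // The value is only a hint; 0 means "not computable or not well defined".
  unsigned n = std::thread::hardware_concurrency ();
  // Assumption for this sketch: fall back to a single thread in that case.
  return n == 0 ? 1u : n;
}
```

A caller would use the result as an upper bound on parallel jobs rather than as an exact CPU count.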


OK

jeff



Re: [PATCH] Fix clang warnings.

2021-04-21 Thread Jeff Law via Gcc-patches



On 4/21/2021 2:18 AM, Martin Liška wrote:

Hello.

The patch silences Clang warnings:
1) if (TARGET_MACHO) ...
  #if TARGET_MACHO

can be replaced with #if TARGET_MACHO

2) add_AT_vms_delta is used only if VMS_DEBUGGING_INFO is defined
3) tree.c:13454:16: warning: result of comparison of constant 42405 with 
expression of type 'enum tree_code' is always false 
[-Wtautological-constant-out-of-range-compare]

4) use the canonical (more common) form of a condition that's intentionally
disabled.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/i386/i386.c: Remove superfluous || TARGET_MACHO,
which expands to '(... || 0)' and which clang complains about.
* dwarf2out.c (AT_vms_delta): Declare conditionally.
(add_AT_vms_delta): Likewise.
* tree.c (fld_simplified_type): Use rather more common pattern
for disabling of something (#if 0).
(get_tree_code_name): Likewise.
(verify_type_variant): Likewise.


OK

jeff



[PATCH] Avoid -latomic for amdgcn offloading

2021-04-21 Thread Richard Biener
libatomic isn't built for amdgcn but reduction-16.c adds it
via -foffload=-latomic when offloading for nvptx is enabled.
The following avoids linker errors when offloading to amdgcn is enabled
as well.

Tested on x86_64-unknown-linux-gnu, OK for trunk and GCC 11 branch?

Thanks.
Richard.

2021-04-21  Richard Biener  

libgomp/
* testsuite/libgomp.c-c++-common/reduction-16.c: Use -latomic
only on nvptx-none.
---
 libgomp/testsuite/libgomp.c-c++-common/reduction-16.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c
index e60fe3664ed..0eea73b144b 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-additional-options "-foffload=-latomic" { target offload_target_nvptx } } */
+/* { dg-additional-options "-foffload=nvptx-none=-latomic" { target offload_target_nvptx } } */
 
 #include 
 
-- 
2.26.2


Re: [PATCH] testsuite: Add -fchecking to dg-ice tests

2021-04-21 Thread Jeff Law via Gcc-patches



On 4/21/2021 4:52 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

In --enable-checking=release builds (which is the default on release
branches), I'm getting various extra FAILs that don't appear in
--enable-checking=yes builds.

XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp2a/nontype-class39.C  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++98 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++98 (test for excess errors)

These are tests that have dg-ice and most of those ICEs are checking ICEs
which go away in release checking when -fno-checking is the default.

The following patch adds -fchecking option to those.
Tested on x86_64-linux --enable-checking=release build.
Ok for trunk/10.2?

2021-04-21  Jakub Jelinek  

* g++.dg/cpp1z/constexpr-lambda26.C: Add dg-additional-options
-fchecking.
* g++.dg/cpp1y/auto-fn61.C: Likewise.
* g++.dg/cpp2a/nontype-class39.C: Likewise.
* g++.dg/cpp0x/constexpr-52830.C: Likewise.
* g++.dg/cpp0x/vt-88982.C: Likewise.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Add -fchecking to
dg-additional-options.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.



OK.  One could argue this should fall under the obvious rule in the future.

Jeff



Re: [PATCH][AArch64] Leveraging the use of STP instruction for vec_duplicate

2021-04-21 Thread Richard Sandiford via Gcc-patches
Victor Do Nascimento via Gcc-patches  writes:
> The backend pattern for storing a pair of identical values in 32 and 64-bit 
> modes with the machine instruction STP was missing, and multiple instructions 
> were needed to reproduce this behavior as a result of failed RTL pattern 
> match in combine pass.
>
> For the test case :
>
> typedef long long v2di __attribute__((vector_size (16)));
> typedef int v2si __attribute__((vector_size (8)));
>
> void
> foo (v2di *x, long long a)
> {
>   v2di tmp = {a, a};
>   *x = tmp;
> }
>
> void
> foo2 (v2si *x, int a)
> {
>   v2si tmp = {a, a};
>   *x = tmp;
> }
>
> at -O2 on aarch64 gives:
>
> foo:
>   stp x1, x1, [x0]
>   ret
> foo2:
>   stp w1, w1, [x0]
>   ret
>
> instead of:
>
> foo:
>   dup v0.2d, x1
>   str q0, [x0]
>   ret
> foo2:
>   dup v0.2s, w1
>   str d0, [x0]
>   ret
>
> In preparation for the next stage 1 phase of development, I added a new RTL 
> template and a unit test, and checked for regressions on a bootstrapped 
> aarch64-none-linux-gnu.
>
> gcc/ChangeLog
>
> 2021-02-04 Victor Do Nascimento 
>
>   * config/aarch64/aarch64-simd.md: Implement RTX pattern for
>   mapping 'vec_duplicate' RTX onto 'STP' ASM insn.
>   * config/aarch64/iterators.md: Implement ldpstp_vel_sz iterator
>   to map STP/LDP vector element mode to correct suffix in
>   attribute type definition of aarch64_simd_stp pattern.

A more typical changelog entry would be:

* config/aarch64/iterators.md (ldpstp_vel_sz): New mode attribute.
* config/aarch64/aarch64-simd.md (aarch64_simd_stp): New pattern.

> gcc/testsuite/ChangeLog
>
> 2021-02-04 Victor Do Nascimento 
>
>   * gcc.target/stp_vec-dup_32_64-1.c: Added test.
>
> Regards,
> Victor
>
> ---
>  gcc/config/aarch64/aarch64-simd.md| 10 +
>  gcc/config/aarch64/iterators.md   |  3 +++
>  .../gcc.target/aarch64/stp_vec_dup_32_64-1.c  | 22 +++
>  3 files changed, 35 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 71aa77dd010..3d53bab0018 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -205,6 +205,16 @@
>[(set_attr "type" "neon_stp")]
>  )
>  
> +(define_insn "aarch64_simd_stp"
> +  [(set (match_operand:VP_2E 0 "aarch64_mem_pair_operand" "=Ump,Ump")
> + (vec_duplicate:VP_2E (match_operand: 1 "register_operand" 
> "w,r")))]

Formatting nit: should just be one tab here.

I would have just made that change locally and committed, but I think
there's a problem: aarch64_mem_pair_operand and Ump are geared for pairs
of full-vector stores, rather than for pairs of elements.  This means that
(for example) the V2SI range will be [-256,255] * 8 rather than the expected
[-256,255] * 4.

I think we need to use aarch64_mem_pair_lanes_operand and Umn instead,
as for store_pair_lanes.  In addition:

  /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the incoming mode
 corresponds to the actual size of the memory being loaded/stored and the
 mode of the corresponding addressing mode is half of that.  */
  if (type == ADDR_QUERY_LDP_STP_N
  && known_eq (GET_MODE_SIZE (mode), 16))
mode = DFmode;

only handles 128-bit vectors, whereas here we need it to handle 64-bit
vectors too.

It would be good to test the limits, e.g.:

> diff --git a/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c 
> b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
> new file mode 100644
> index 000..a37c903dfd4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
> @@ -0,1 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +typedef long long v2di __attribute__((vector_size (16)));
> +typedef int v2si __attribute__((vector_size (8)));
> +
> +void
> +foo (v2di *x, long long a)
> +{
> +  v2di tmp = {a, a};
> +  *x = tmp;
> +}
> +
> +void
> +foo2 (v2si *x, int a)
> +{
> +  v2si tmp = {a, a};
> +  *x = tmp;
> +}

We could have additional tests for:

x[-129] = tmp; // out of range
x[-128] = tmp; // in range
x[127] = tmp; // in range
x[128] = tmp; // out of range

Thanks,
Richard

> +
> +/* { dg-final { scan-assembler-times "stp\t" 2 } } */
> +/* { dg-final { scan-assembler-not "dup\t" } } */


Re: GCC 11.1 Release Candidate available from gcc.gnu.org

2021-04-21 Thread Jonathan Wakely via Gcc-patches

On 21/04/21 12:38 +0100, Jonathan Wakely wrote:

On 20/04/21 22:12 -0700, Thomas Rodgers wrote:

@@ -86,6 +88,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
   }

+_GLIBCXX_ALWAYS_INLINE bool
+_M_try_acquire() noexcept
+{
+  for (;;)
+   {
+ auto __err = sem_trywait(&_M_semaphore);
+ if (__err && (errno == EINTR))
+   continue;
+ else if (__err && (errno == EAGAIN))
+   return false;
+ else if (__err)
+   std::terminate();
+ else
+   break;
+   }
+  return true;
+}
+
   _GLIBCXX_ALWAYS_INLINE void
   _M_release(std::ptrdiff_t __update) noexcept
   {


Please just commit this part to trunk and gcc-11, not the macro
renaming (as that's been fixed by Jakub already).


I think on trunk I'd prefer to do the attached. WDYT?


commit 07241d79b6720d4f392d5a8ba6e9c21d2801c8c7
Author: Jonathan Wakely 
Date:   Wed Apr 21 13:04:05 2021

libstdc++: Streamline  implementation [PR 100164]

This adds the missing _M_try_acquire member function so that
__platform_semaphore works.

Also adjust the  implementation so that __platform_semaphore
is not defined unless we're going to use it, which avoids including
 (and polluting the global namespace) when we don't need it.

Also rename the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro to
_GLIBCXX_USE_POSIX_SEMAPHORE for consistency with the similar
_GLIBCXX_USE_CXX11_ABI macro that can be used to request an alternative
(ABI-changing) implementation.

libstdc++-v3/ChangeLog:

PR libstdc++/100164
* include/bits/semaphore_base.h: Only define at most one of
__platform_semaphore and __atomic_semaphore.
(__platform_semaphore::_M_try_acquire()): Add missing function.
* include/std/semaphore: Do not define anything unless one of
the implementations is available.
* testsuite/30_threads/semaphore/try_acquire_posix.cc: Define
macro to request POSIX semaphore implementation. Use standard
API, not private implementation.
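
The retry logic described above can be sketched outside libstdc++ as a small standalone wrapper. This is only an illustration under stated assumptions: the class name posix_semaphore is hypothetical, std::abort stands in for std::terminate, and it assumes a platform where unnamed POSIX semaphores (sem_init) are supported.

```cpp
#include <cerrno>
#include <cstdlib>
#include <semaphore.h>

// Hypothetical wrapper, not the libstdc++ __platform_semaphore.
class posix_semaphore
{
public:
  explicit posix_semaphore (unsigned initial)
  {
    // pshared == 0: the semaphore is shared between threads only.
    if (sem_init (&sem_, 0, initial) != 0)
      std::abort ();
  }

  ~posix_semaphore () { sem_destroy (&sem_); }

  posix_semaphore (const posix_semaphore &) = delete;
  posix_semaphore &operator= (const posix_semaphore &) = delete;

  // Non-blocking acquire: retry on EINTR, report failure on EAGAIN,
  // and treat any other error as fatal.
  bool try_acquire () noexcept
  {
    for (;;)
      {
	if (sem_trywait (&sem_) == 0)
	  return true;
	if (errno == EINTR)
	  continue;
	if (errno == EAGAIN)
	  return false;
	std::abort ();
      }
  }

  void release () noexcept
  {
    if (sem_post (&sem_) != 0)
      std::abort ();
  }

private:
  sem_t sem_;
};
```

With an initial count of 1, the first try_acquire succeeds, a second one returns false without blocking, and it succeeds again after a release.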

diff --git a/libstdc++-v3/include/bits/semaphore_base.h b/libstdc++-v3/include/bits/semaphore_base.h
index 7e3235d182e..80134d7fc4c 100644
--- a/libstdc++-v3/include/bits/semaphore_base.h
+++ b/libstdc++-v3/include/bits/semaphore_base.h
@@ -1,4 +1,4 @@
-// -*- C++ -*- header.
+// std::counting_semaphore implementation -*- C++ -*- header.
 
 // Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
@@ -32,25 +32,32 @@
 
 #pragma GCC system_header
 
-#include 
-#if __cpp_lib_atomic_wait
-#include 
-#include 
-#endif // __cpp_lib_atomic_wait
-
-#ifdef _GLIBCXX_HAVE_POSIX_SEMAPHORE
-# include 
-# include 
-#endif
-
 #include 
 #include 
 
+// Note: the _GLIBCXX_USE_POSIX_SEMAPHORE macro can be used to force the
+// use of Posix semaphores (sem_t). Doing so however, alters the ABI.
+
+#ifndef _GLIBCXX_HAVE_POSIX_SEMAPHORE
+# undef _GLIBCXX_USE_POSIX_SEMAPHORE
+#endif
+
+#include 
+#if __cpp_lib_atomic_wait && ! _GLIBCXX_USE_POSIX_SEMAPHORE
+# include 
+# include 
+#elif _GLIBCXX_HAVE_POSIX_SEMAPHORE
+# include 
+# include 
+# undef _GLIBCXX_USE_POSIX_SEMAPHORE
+# define _GLIBCXX_USE_POSIX_SEMAPHORE 1
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
-#ifdef _GLIBCXX_HAVE_POSIX_SEMAPHORE
+#if _GLIBCXX_USE_POSIX_SEMAPHORE
   struct __platform_semaphore
   {
 using __clock_t = chrono::system_clock;
@@ -76,23 +83,44 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   for (;;)
 	{
-	  auto __err = sem_wait(&_M_semaphore);
-	  if (__err && (errno == EINTR))
-	continue;
-	  else if (__err)
-	std::terminate();
+	  if (auto __err = sem_wait(&_M_semaphore))
+	{
+	  if (errno == EINTR)
+		continue;
+	  else
+		std::terminate();
+	}
 	  else
 	break;
 	}
 }
 
+_GLIBCXX_ALWAYS_INLINE bool
+_M_try_acquire() noexcept
+{
+  for (;;)
+	{
+	  if (auto __err = sem_trywait(&_M_semaphore))
+	{
+	  if (errno == EINTR)
+		continue;
+	  else if (errno == EAGAIN)
+		return false;
+	  else
+		std::terminate();
+	}
+	  else
+	break;
+	}
+  return true;
+}
+
 _GLIBCXX_ALWAYS_INLINE void
 _M_release(std::ptrdiff_t __update) noexcept
 {
   for(; __update != 0; --__update)
 	{
-	   auto __err = sem_post(&_M_semaphore);
-	   if (__err)
+	   if (auto __err = sem_post(&_M_semaphore))
 	 std::terminate();
 	}
 }
@@ -162,9 +190,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   private:
 sem_t _M_semaphore;
   };
-#endif // _GLIBCXX_HAVE_POSIX_SEMAPHORE
 
-#if __cpp_lib_atomic_wait
+  using __semaphore_impl = __platform_semaphore;
+
+#elif __cpp_lib_atomic_wait
+
   struct __atomic_semaphore
   {
 static constexpr ptrdiff_t _S_max = __gnu_cxx::__int_traits::__max;
@@ -245,19 +275,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   private:
 alignas(__detail::__platform_wait_alignment)
-

Re: [PATCH] aarch64: Avoid duplicating bti j insns for jump tables [PR99988]

2021-04-21 Thread Richard Sandiford via Gcc-patches
Alex Coplan  writes:
> Hi Richard,
>
> On 15/04/2021 18:45, Richard Sandiford wrote:
>> Looks good in general, but like you say, it's GCC 12 material.
>
> Thanks for the review. The attached patch addresses these comments and
> bootstraps/regtests OK on aarch64-linux-gnu. OK for trunk?

OK, thanks.

Richard


Re: [committed] libstdc++: Fix whitespace in license boilerplate

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 01:01:44PM +0100, Jonathan Wakely via Gcc-patches wrote:
> libstdc++-v3/ChangeLog:
> 
>   * include/std/latch: Replace tab characters in license text.
>   * include/std/semaphore: Likewise.
> 
> Tested x86_64-linux. Committed to trunk.
> 
> OK for gcc-11 too?

Yes.

> commit 88202c883c07da1c03dbb1ad440f1b70189c4399
> Author: Jonathan Wakely 
> Date:   Wed Apr 21 12:59:04 2021
> 
> libstdc++: Fix whitespace in license boilerplate
> 
> libstdc++-v3/ChangeLog:
> 
> * include/std/latch: Replace tab characters in license text.
> * include/std/semaphore: Likewise.

Jakub



[committed] libstdc++: Fix whitespace in license boilerplate

2021-04-21 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/std/latch: Replace tab characters in license text.
* include/std/semaphore: Likewise.

Tested x86_64-linux. Committed to trunk.

OK for gcc-11 too?


commit 88202c883c07da1c03dbb1ad440f1b70189c4399
Author: Jonathan Wakely 
Date:   Wed Apr 21 12:59:04 2021

libstdc++: Fix whitespace in license boilerplate

libstdc++-v3/ChangeLog:

* include/std/latch: Replace tab characters in license text.
* include/std/semaphore: Likewise.

diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
index 20b75f8181a..f00fcf34ed6 100644
--- a/libstdc++-v3/include/std/latch
+++ b/libstdc++-v3/include/std/latch
@@ -2,7 +2,7 @@
 
 // Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
-// This file is part of the GNU ISO C++ Library.   This library is free
+// This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
 // terms of the GNU General Public License as published by the
 // Free Software Foundation; either version 3, or (at your option)
@@ -10,7 +10,7 @@
 
 // This library is distributed in the hope that it will be useful,
 // but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 // GNU General Public License for more details.
 
 // Under Section 7 of GPL version 3, you are granted additional
@@ -19,7 +19,7 @@
 
 // You should have received a copy of the GNU General Public License and
 // a copy of the GCC Runtime Library Exception along with this program;
-// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
 /** @file include/latch
diff --git a/libstdc++-v3/include/std/semaphore 
b/libstdc++-v3/include/std/semaphore
index 02a8214e569..a1560915d83 100644
--- a/libstdc++-v3/include/std/semaphore
+++ b/libstdc++-v3/include/std/semaphore
@@ -2,7 +2,7 @@
 
 // Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
-// This file is part of the GNU ISO C++ Library.   This library is free
+// This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
 // terms of the GNU General Public License as published by the
 // Free Software Foundation; either version 3, or (at your option)
@@ -10,7 +10,7 @@
 
 // This library is distributed in the hope that it will be useful,
 // but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 // GNU General Public License for more details.
 
 // Under Section 7 of GPL version 3, you are granted additional
@@ -19,7 +19,7 @@
 
 // You should have received a copy of the GNU General Public License and
 // a copy of the GCC Runtime Library Exception along with this program;
-// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
 /** @file include/semaphore


Re: [RFC] Run pass_sink_code once more after ivopts/fre

2021-04-21 Thread Richard Biener
On Tue, 20 Apr 2021, Xionghu Luo wrote:

> 
> 
> On 2021/4/15 19:34, Richard Biener wrote:
> > On Thu, 15 Apr 2021, Xionghu Luo wrote:
> > 
> >> Thanks,
> >>
> >> On 2021/4/14 14:41, Richard Biener wrote:
>  "#538,#235,#234,#233" will all be sunk from bb 35 to bb 37 by rtl-sink,
>  but it moves #538 first, then #235; there is a strong dependency here. It
>  doesn't seem to be like the LCM framework, which could solve it all and do the
>  delete-insert in one iteration.
> >>> So my question was whether we want to do both within the LCM store
> >>> sinking framework.  The LCM dataflow is also used by RTL PRE which
> >>> handles both loads and non-loads so in principle it should be able
> >>> to handle stores and non-stores for the sinking case (PRE on the
> >>> reverse CFG).
> >>>
> >>> A global dataflow is more powerful than any local ad-hoc method.
> >>
> >> My biggest concern is whether the LCM DF framework could support sinking
> >> *multiple* reverse-dependent non-store instructions together with *one*
> >> call of LCM DF.  If this is not supported, we would need to run LCM multiple
> >> times until there are no new changes, which would obviously be time-consuming
> >> (unless compile time is not important here).
> > 
> > As said it is used for PRE and there it most definitely can do that.
> 
> I did some investigation into PRE and attached a case to show how it
> works.  It is quite like store-motion, and there is also an rtl-hoist
> pass in gcse.c which only works for code size.  All of them
> leverage the LCM framework to move instructions upward or downward.
> 
> PRE and rtl-hoist move instructions upward: they analyze/hash the SOURCE
> exprs and call pre_edge_lcm.  Store-motion and rtl-sink move instructions
> downward, so they analyze/hash the DEST exprs and call pre_edge_rev_lcm.
> All four problems are converted to the LCM DF problem, with four
> n_basic_blocks * m_exprs matrices (antic, transp, avail, kill) as input
> and two outputs saying where to insert/delete.
> 
> PRE scans each instruction and hashes the SRC into the table without
> *checking the relationship between instructions*.  For the attached case,
> BB 37, BB 38 and BB 41 all contain the SOURCE expr "r262:DI+r139:DI", but
> BB 37 and BB 41 save it to index 106 while BB 38 saves it to index 110.
> After this pass finishes, "r262:DI+r139:DI" in BB 41 is replaced with
> "r194:DI=r452:DI", then the expr is inserted into BB 75~BB 80 to create
> full redundancies from partial redundancies, and finally the instruction
> in BB 37 is updated.

I'm not familiar with the actual PRE code but reading the toplevel comment
it seems that indeed it can only handle expressions contained in a single
insn unless a REG_EQUAL note provides a short-hand for the larger one.

That of course means it would need to mark things as not transparent
for correctness where they'd be if moved together.  Now, nothing
prevents you changing the granularity of what you feed LCM.
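
To make the transparency point concrete, here is a minimal sketch (not GCC
code; Insn, Expr and is_transparent are made-up names for illustration) of
the kind of local property that gets fed to LCM: an expression is treated as
transparent in a block when no insn in that block writes one of its operands.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified insn of the form: dest = src1 + src2.
struct Insn {
  std::string dest;
  std::string src1;
  std::string src2;
};

// Hypothetical expression, identified only by its two operands.
struct Expr {
  std::string op1;
  std::string op2;
};

// An expression is transparent in a block if no insn in the block
// writes one of its operands.
bool is_transparent(const Expr &e, const std::vector<Insn> &block) {
  for (const Insn &insn : block)
    if (insn.dest == e.op1 || insn.dest == e.op2)
      return false;
  return true;
}
```

With single-insn granularity, a block that first sets "a" and then computes
"a + c" kills transparency for "a + c", even though the pair could be moved
as a unit; that is the conservatism a coarser granularity would avoid.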

So originally we arrived at looking into LCM because there's already
a (store) sinking pass on RTL (using LCM), so adding another (loop-special)
one didn't look like the obviously best solution.

That said, LCM would work for single-instruction expressions.  
Alternatively a greedy algorithm like you prototyped could be used.
Another pass to look at would be RTL invariant motion which seems to
compute some kind of dependency graph - not sure if that would be
adaptable for the reverse CFG problem.

> Issues witnessed in the PRE run:
> 1) "r262:DI+r139:DI" in BLOCK 38 is not replaced;
> 2) PRE also uses a pseudo register to store the result, like store-motion
> and even rtl-hoist.  For rtl-sink we actually need a real "move" instead of
> a "replace" to improve performance, though might there also be a potential
> register pressure issue as with rtl-hoist?
> 3) The SRC instruction is scanned WITHOUT a back-chain check across
> instructions in hash_scan_set, so it couldn't handle the case below even
> though a+c is the same as b+d.  So I am afraid a single run couldn't solve
> the instruction dependency issue and sink multiple instructions out as
> rtl-sink did before.
> 
> BB1:
> a = b;
> c = d;
> s = a + c;
> BB2:
> s = b + d;
> 
> 
> gcse.c:
> changed = one_pre_gcse_pass ()
> alloc_hash_table (&expr_hash_table);
> compute_hash_table (&expr_hash_table);
> compute_hash_table_work (table);
>   FOR_EACH_BB_FN (current_bb, cfun)
>FOR_BB_INSNS (current_bb, insn)
> hash_scan_insn (insn, table);
>   hash_scan_set (pat, insn, table);
>if REG_P (dest)
>  insert_expr_in_table (src, GET_MODE (dest), insn, antic_p, avail_p,
>  max_distance, table);
>hash = hash_expr (x, mode, &do_not_record_p, table->size);
> dump_hash_table (dump_file, "Expression", &expr_hash_table);
> edge_list = compute_pre_data ();
> compute_local_properties (transp, comp, antloc, &expr_hash_table);
> edge_list = pre_edge_lcm (expr_hash_table.n_elems, transp, comp, antloc,
> ae_kill, &pre_insert_map, &pre_delete_map);
> changed |= pre_gcse (edge_list);
> changed = pre_delete ();  /* Create a pseudo-reg to store the result of
> reaching 

[PATCH] LTO: fallback to -flto=N if -flto=jobserver does not work.

2021-04-21 Thread Martin Liška
When -flto=jobserver is used and we cannot detect the job server, we can
still fall back to -flto=N mode.
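
As a rough, standalone sketch of the intended behaviour (LtoParallelism and
resolve_parallelism are hypothetical names, not lto-wrapper.c code; only the
jobserver / auto_parallel flags mirror the variables touched by the patch):

```cpp
// Hypothetical stand-in for the parallelism state kept by lto-wrapper.
struct LtoParallelism {
  bool jobserver = false;      // -flto=jobserver requested and usable
  bool auto_parallel = false;  // pick a job count automatically
};

// jobserver_error is non-null when the make jobserver could not be detected.
LtoParallelism resolve_parallelism(bool jobserver_requested,
                                   const char *jobserver_error) {
  LtoParallelism p;
  p.jobserver = jobserver_requested;
  if (p.jobserver && jobserver_error != nullptr) {
    // Jobserver unusable: fall back to automatic parallelism
    // instead of dropping to a serial link.
    p.jobserver = false;
    p.auto_parallel = true;
  }
  return p;
}
```

The point of the change is only the failure branch: previously that branch
cleared the parallel flag, so the link step ran serially.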

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* lto-wrapper.c (run_gcc): When -flto=jobserver is used, but the
makeserver cannot be detected, then use -flto=N fallback.
---
 gcc/lto-wrapper.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 03a5922f8ea..0b626d7c811 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1585,8 +1585,9 @@ run_gcc (unsigned argc, char *argv[])
   if (jobserver && jobserver_error != NULL)
{
  warning (0, jobserver_error);
- parallel = 0;
+ /* Fall back to auto parallelism.  */
  jobserver = 0;
+ auto_parallel = 1;
}
   else if (!jobserver && jobserver_error == NULL)
{
-- 
2.31.1



Re: [PATCH] aarch64: Avoid duplicating bti j insns for jump tables [PR99988]

2021-04-21 Thread Alex Coplan via Gcc-patches
Hi Richard,

On 15/04/2021 18:45, Richard Sandiford wrote:
> Looks good in general, but like you say, it's GCC 12 material.

Thanks for the review. The attached patch addresses these comments and
bootstraps/regtests OK on aarch64-linux-gnu. OK for trunk?

Thanks,
Alex

> 
> Alex Coplan  writes:
> > diff --git a/gcc/config/aarch64/aarch64-bti-insert.c 
> > b/gcc/config/aarch64/aarch64-bti-insert.c
> > index 936649769c7..943fa3c1097 100644
> > --- a/gcc/config/aarch64/aarch64-bti-insert.c
> > +++ b/gcc/config/aarch64/aarch64-bti-insert.c
> > @@ -120,6 +120,13 @@ aarch64_pac_insn_p (rtx x)
> >return false;
> >  }
> >  
> > +static bool
> > +aarch64_bti_j_insn_p (rtx_insn *insn)
> > +{
> > +  rtx pat = PATTERN (insn);
> > +  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) == 
> > UNSPECV_BTI_J;
> > +}
> > +
> 
> Nit, but even a simple function like this should have a comment. :-)
> 
> >  /* Insert the BTI instruction.  */
> >  /* This is implemented as a late RTL pass that runs before branch
> > shortening and does the following.  */
> > @@ -165,6 +172,9 @@ rest_of_insert_bti (void)
> >   for (j = GET_NUM_ELEM (vec) - 1; j >= 0; --j)
> > {
> >   label = as_a <rtx_insn *> (XEXP (RTVEC_ELT (vec, j), 0));
> > + if (aarch64_bti_j_insn_p (next_nonnote_insn (label)))
> > +   continue;
> > +
> 
> This should be next_nonnote_nondebug_insn (quite the mouthful),
> otherwise debug instructions could affect the choice.
> 
> The thing returned by next_nonnote_nondebug_insn isn't in general
> guaranteed to be an insn (unlike next_real_nondebug_insn).  It might
> also be null in very odd cases.  I think we should therefore check
> for null and INSN_P before checking PATTERN.
> 
> Thanks,
> Richard
> 
> >   bti_insn = gen_bti_j ();
> >   emit_insn_after (bti_insn, label);
> > }
diff --git a/gcc/config/aarch64/aarch64-bti-insert.c 
b/gcc/config/aarch64/aarch64-bti-insert.c
index 936649769c7..5d6bc169d6b 100644
--- a/gcc/config/aarch64/aarch64-bti-insert.c
+++ b/gcc/config/aarch64/aarch64-bti-insert.c
@@ -120,6 +120,17 @@ aarch64_pac_insn_p (rtx x)
   return false;
 }
 
+/* Check if INSN is a BTI J insn.  */
+static bool
+aarch64_bti_j_insn_p (rtx_insn *insn)
+{
+  if (!insn || !INSN_P (insn))
+return false;
+
+  rtx pat = PATTERN (insn);
+  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) == UNSPECV_BTI_J;
+}
+
 /* Insert the BTI instruction.  */
 /* This is implemented as a late RTL pass that runs before branch
shortening and does the following.  */
@@ -165,6 +176,10 @@ rest_of_insert_bti (void)
  for (j = GET_NUM_ELEM (vec) - 1; j >= 0; --j)
{
   label = as_a <rtx_insn *> (XEXP (RTVEC_ELT (vec, j), 0));
+ rtx_insn *next = next_nonnote_nondebug_insn (label);
+ if (aarch64_bti_j_insn_p (next))
+   continue;
+
  bti_insn = gen_bti_j ();
  emit_insn_after (bti_insn, label);
}
diff --git a/gcc/testsuite/gcc.target/aarch64/pr99988.c 
b/gcc/testsuite/gcc.target/aarch64/pr99988.c
new file mode 100644
index 000..2d87f41a717
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr99988.c
@@ -0,0 +1,66 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbranch-protection=standard" } */
+/* { dg-final { scan-assembler-times {bti j} 13 } } */
+int a;
+int c();
+int d();
+int e();
+int f();
+int g();
+void h() {
+  switch (a) {
+  case 0:
+  case 56:
+  case 57:
+break;
+  case 58:
+  case 59:
+  case 61:
+  case 62:
+c();
+  case 64:
+  case 63:
+d();
+  case 66:
+  case 65:
+d();
+  case 68:
+  case 67:
+d();
+  case 69:
+  case 70:
+d();
+  case 71:
+  case 72:
+  case 88:
+  case 87:
+d();
+  case 90:
+  case 89:
+d();
+  case 92:
+  case 1:
+d();
+  case 93:
+  case 73:
+  case 4:
+e();
+  case 76:
+  case 5:
+f();
+  case 7:
+  case 8:
+  case 84:
+  case 85:
+break;
+  case 6:
+  case 299:
+  case 9:
+  case 80:
+  case 2:
+  case 3:
+e();
+  default:
+g();
+  }
+}


Re: GCC 11.1 Release Candidate available from gcc.gnu.org

2021-04-21 Thread Jonathan Wakely via Gcc-patches

On 20/04/21 22:12 -0700, Thomas Rodgers wrote:

@@ -86,6 +88,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
}

+_GLIBCXX_ALWAYS_INLINE bool
+_M_try_acquire() noexcept
+{
+  for (;;)
+   {
+ auto __err = sem_trywait(&_M_semaphore);
+ if (__err && (errno == EINTR))
+   continue;
+ else if (__err && (errno == EAGAIN))
+   return false;
+ else if (__err)
+   std::terminate();
+ else
+   break;
+   }
+  return true;
+}
+
_GLIBCXX_ALWAYS_INLINE void
_M_release(std::ptrdiff_t __update) noexcept
{


Please just commit this part to trunk and gcc-11, not the macro
renaming (as that's been fixed by Jakub already).




Re: [PATCH] Support LABEL_DECL in %qD directive.

2021-04-21 Thread Martin Liška
On 4/21/21 1:26 PM, Martin Liška wrote:
> On 4/21/21 12:56 PM, Jakub Jelinek wrote:
>> On Wed, Apr 21, 2021 at 12:52:42PM +0200, Martin Liška wrote:
>>> On 4/21/21 11:04 AM, Jakub Jelinek wrote:
>>>> Wouldn't it be better to be consistent with tree-pretty-print.c on this
>>>> or perhaps just call dump_generic_node or whatever is used to dump
>>>> those e.g. for C?
>>>
>>> Yes, I'm going to install patch that does:
>>>
>>> +  if (DECL_NAME (t))
>>>
>>> +   pp_cxx_tree_identifier (pp, DECL_NAME (t));
>>>
>>> +  else
>>>
>>> +   dump_generic_node (pp, t, 0, TDF_SLIM, false);
>>
>> Wouldn't flags | TDF_SLIM be better, so that it honors nouid etc.?
>>
>>  Jakub
>>
> 
> Good point, fixed that.

Oh, it leads to:

/home/marxin/Programming/gcc/gcc/cp/error.c: In function ‘void 
dump_decl(cxx_pretty_printer*, tree, int)’:

/home/marxin/Programming/gcc/gcc/cp/error.c:1368:37: error: invalid conversion 
from ‘int’ to ‘dump_flags_t’ {aka ‘dump_flag’} [-fpermissive]

 1368 |  dump_generic_node (pp, t, 0, flags | TDF_SLIM, false);

  |   ~~^~

  | |

  | int


Thus I reverted the change. It uses flags defined in cp/cp-tree.h:
#define TFF_PLAIN_IDENTIFIER    (0)
#define TFF_SCOPE               (1)
#define TFF_CHASE_TYPEDEF       (1 << 1)
#define TFF_DECL_SPECIFIERS     (1 << 2)
...
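
For reference, the type issue behind that error can be reproduced in
isolation.  This is a minimal sketch with a made-up enum (sketch_flag and
takes_flags are hypothetical, not the real dump_flags_t machinery): OR-ing
an int with an enumerator yields a plain int, which C++ will not convert
back to the enum implicitly.

```cpp
#include <type_traits>

// Hypothetical stand-in for an enum-based flags type with an overloaded '|'.
enum sketch_flag { SK_NONE = 0, SK_SLIM = 1 << 0, SK_NOUID = 1 << 1 };

inline sketch_flag operator|(sketch_flag a, sketch_flag b) {
  return static_cast<sketch_flag>(static_cast<int>(a) | static_cast<int>(b));
}

// Hypothetical consumer that only accepts the enum type.
inline int takes_flags(sketch_flag f) { return static_cast<int>(f); }

// enum | enum picks the overload above and stays an enum.
static_assert(std::is_same<decltype(SK_SLIM | SK_NOUID), sketch_flag>::value,
              "enum | enum yields the enum type");

// int | enum promotes the enumerator to int, so the result is a plain int
// and passing it to takes_flags would need an explicit cast.
static_assert(std::is_same<decltype(0 | SK_SLIM), int>::value,
              "int | enum yields int");
```

A cast would make such a call compile, but since the int flags here hold
TFF_* values rather than TDF_* ones, it would not make the combination
meaningful.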

Martin

> 
> Thanks,
> Martin
> 



Re: [PATCH] Support LABEL_DECL in %qD directive.

2021-04-21 Thread Martin Liška
On 4/21/21 12:56 PM, Jakub Jelinek wrote:
> On Wed, Apr 21, 2021 at 12:52:42PM +0200, Martin Liška wrote:
>> On 4/21/21 11:04 AM, Jakub Jelinek wrote:
>>> Wouldn't it be better to be consistent with tree-pretty-print.c on this
>>> or perhaps just call dump_generic_node or whatever is used to dump
>>> those e.g. for C?
>>
>> Yes, I'm going to install patch that does:
>>
>> +  if (DECL_NAME (t))
>>
>> +   pp_cxx_tree_identifier (pp, DECL_NAME (t));
>>
>> +  else
>>
>> +   dump_generic_node (pp, t, 0, TDF_SLIM, false);
> 
> Wouldn't flags | TDF_SLIM be better, so that it honors nouid etc.?
> 
>   Jakub
> 

Good point, fixed that.

Thanks,
Martin


Re: [PATCH] Support the new ("v0") mangling scheme in rust-demangle.

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Tue, Apr 20, 2021 at 05:55:46PM +0200, Andreas Schwab wrote:
> On Nov 13 2020, Nikhil Benesch via Gcc-patches wrote:
> 
> > +static void
> > +demangle_const_char (struct rust_demangler *rdm)
> > +{
> > +  size_t hex_len;
> > +  uint64_t value;
> > +
> > +  hex_len = parse_hex_nibbles (rdm, &value);
> > +
> > +  if (hex_len == 0 || hex_len > 8)
> > +{
> > +  rdm->errored = 1;
> > +  return;
> > +}
> > +
> > +  /* Match Rust's character "debug" output as best as we can. */
> > +  PRINT ("'");
> > +  if (value == '\t')
> > +PRINT ("\\t");
> > +  else if (value == '\r')
> > +PRINT ("\\r");
> > +  else if (value == '\n')
> > +PRINT ("\\n");
> > +  else if (value > ' ' && value < '~')
> > +/* Rust also considers many non-ASCII codepoints to be printable, but
> > +   that logic is not easily ported to C. */
> > +print_str (rdm, (char *) &value, 1);
> 
>   * rust-demangle.c (demangle_const_char): Properly print the
>   character value.

Ok for trunk/11.2/10.4 .
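
One plausible reading of the problem (an assumption, since the fix itself is
not quoted in this message): taking the first byte of a uint64_t through a
char pointer depends on host byte order.  A small standalone sketch, with
hypothetical helper names:

```cpp
#include <cstdint>
#include <cstring>

// Portable: narrow the value itself, independent of byte order.
inline char low_byte_by_value(std::uint64_t value) {
  return static_cast<char>(value & 0xff);
}

// Endian-dependent: read the first byte of the object representation.
// This is the low byte on little-endian hosts, the high byte on big-endian.
inline char first_byte_in_memory(std::uint64_t value) {
  char c;
  std::memcpy(&c, &value, 1);
  return c;
}
```

Copying the value into a char first and printing that char sidesteps the
byte-order question entirely.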

Jakub



Re: [PATCH 3/4] Overhaul in isa_flags and handling it.

2021-04-21 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
>
>
> gcc/ChangeLog:
>
> * config/i386/i386-options.c (TARGET_EXPLICIT_NO_SAHF_P):
> Define.
> (SET_TARGET_NO_SAHF): Likewise.
> (TARGET_EXPLICIT_PREFETCH_SSE_P): Likewise.
> (SET_TARGET_PREFETCH_SSE): Likewise.
> (TARGET_EXPLICIT_NO_TUNE_P): Likewise.
> (SET_TARGET_NO_TUNE): Likewise.
> (TARGET_EXPLICIT_NO_80387_P): Likewise.
> (SET_TARGET_NO_80387): Likewise.
> (DEF_PTA): New.
> * config/i386/i386.h (TARGET_*): Remove.
> * opth-gen.awk: Generate new used macros.

LGTM, but please coordinate the commit with HJ, he has a couple of
patches in this area.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-options.c | 303 -
>  gcc/config/i386/i386.h | 192 +
>  gcc/opth-gen.awk   |  11 +-
>  3 files changed, 45 insertions(+), 461 deletions(-)
>


Re: [PATCH] Support LABEL_DECL in %qD directive.

2021-04-21 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 21, 2021 at 12:52:42PM +0200, Martin Liška wrote:
> On 4/21/21 11:04 AM, Jakub Jelinek wrote:
> > Wouldn't it be better to be consistent with tree-pretty-print.c on this
> > or perhaps just call dump_generic_node or whatever is used to dump
> > those e.g. for C?
> 
> Yes, I'm going to install patch that does:
> 
> +  if (DECL_NAME (t))
> 
> +   pp_cxx_tree_identifier (pp, DECL_NAME (t));
> 
> +  else
> 
> +   dump_generic_node (pp, t, 0, TDF_SLIM, false);

Wouldn't flags | TDF_SLIM be better, so that it honors nouid etc.?

Jakub



Re: [PATCH 4/4] Remove TARGET_foo (ix86_tune == PROCESSOR_foo) macros.

2021-04-21 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
>
>
> gcc/ChangeLog:
>
> * config/i386/i386-expand.c (decide_alignment): Use newly named
> macro TARGET_CPU_P.
> * config/i386/i386.c (ix86_decompose_address): Likewise.
> (ix86_address_cost): Likewise.
> (ix86_lea_outperforms): Likewise.
> (ix86_avoid_lea_for_addr): Likewise.
> (ix86_add_stmt_cost): Likewise.
> * config/i386/i386.h (TARGET_*): Remove.
> (TARGET_CPU_P): New macro.
> * config/i386/i386.md: Use newly named macro TARGET_CPU_P.
> * config/i386/x86-tune-sched-atom.c (do_reorder_for_imul): Likewise.
> (swap_top_of_ready_list): Likewise.
> (ix86_atom_sched_reorder): Likewise.
> * config/i386/x86-tune-sched-bd.c (ix86_bd_has_dispatch): Likewise.
> * config/i386/x86-tune-sched.c (ix86_adjust_cost): Likewise.

OK.
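
For readers skimming the thread, the shape of the replacement is roughly as
follows.  This is a sketch with made-up names (sketch_processor, sketch_tune
and SKETCH_CPU_P are hypothetical, and the real TARGET_CPU_P definition is
assumed rather than quoted): one parameterised macro takes over from the
per-CPU TARGET_foo macros.

```cpp
// Hypothetical processor enumeration and current tuning target.
enum sketch_processor {
  SKETCH_PROCESSOR_GENERIC,
  SKETCH_PROCESSOR_ATOM,
  SKETCH_PROCESSOR_BDVER1
};

static sketch_processor sketch_tune = SKETCH_PROCESSOR_GENERIC;

// One macro instead of one TARGET_<cpu> macro per processor:
// token pasting builds the enumerator name from the argument.
#define SKETCH_CPU_P(CPU) (sketch_tune == SKETCH_PROCESSOR_##CPU)
```

Call sites then read SKETCH_CPU_P (ATOM) rather than a dedicated
per-processor macro.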

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-expand.c |  2 +-
>  gcc/config/i386/i386.c| 16 ++
>  gcc/config/i386/i386.h| 46 +--
>  gcc/config/i386/i386.md   |  8 ++---
>  gcc/config/i386/x86-tune-sched-atom.c |  7 ++--
>  gcc/config/i386/x86-tune-sched-bd.c   |  5 +--
>  gcc/config/i386/x86-tune-sched.c  |  2 +-
>  7 files changed, 23 insertions(+), 63 deletions(-)
>


[PATCH] testsuite: Add -fchecking to dg-ice tests

2021-04-21 Thread Jakub Jelinek via Gcc-patches
Hi!

In --enable-checking=release builds (which is the default on release
branches), I'm getting various extra FAILs that don't appear in
--enable-checking=yes builds.

XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp2a/nontype-class39.C  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++14 (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++17 (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++2a (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++98 (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++14 (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++17 (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++2a (test for excess 
errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++98 (test for excess 
errors)

These are tests that have dg-ice and most of those ICEs are checking ICEs
which go away in release checking when -fno-checking is the default.

The following patch adds -fchecking option to those.
Tested on x86_64-linux --enable-checking=release build.
Ok for trunk/10.2?

2021-04-21  Jakub Jelinek  

* g++.dg/cpp1z/constexpr-lambda26.C: Add dg-additional-options
-fchecking.
* g++.dg/cpp1y/auto-fn61.C: Likewise.
* g++.dg/cpp2a/nontype-class39.C: Likewise.
* g++.dg/cpp0x/constexpr-52830.C: Likewise.
* g++.dg/cpp0x/vt-88982.C: Likewise.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Add -fchecking to
dg-additional-options.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.

--- gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C.jj  2020-11-10 
23:29:45.646246177 +0100
+++ gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C 2021-04-21 
12:24:53.515946657 +0200
@@ -1,5 +1,6 @@
 // PR c++/87765
 // { dg-do compile { target c++17 } }
+// { dg-additional-options "-fchecking" }
 // { dg-ice "cxx_eval_constant_expression" }
 
 template 
--- gcc/testsuite/g++.dg/cpp1y/auto-fn61.C.jj   2020-08-11 13:44:01.233003025 
+0200
+++ gcc/testsuite/g++.dg/cpp1y/auto-fn61.C  2021-04-21 12:24:22.079298420 
+0200
@@ -1,5 +1,6 @@
 // PR c++/88003
 // { dg-do compile { target c++14 } }
+// { dg-additional-options "-fchecking" }
 // { dg-ice "poplevel_class" }
 
 auto test() {
--- gcc/testsuite/g++.dg/cpp2a/nontype-class39.C.jj 2020-11-10 
23:29:45.646246177 +0100
+++ gcc/testsuite/g++.dg/cpp2a/nontype-class39.C2021-04-21 
12:25:16.427690283 +0200
@@ -1,5 +1,6 @@
 // PR c++/89565
 // { dg-do compile { target c++20 } }
+// { dg-additional-options "-fchecking" }
 // { dg-ice "resolve_args" }
 
 template 
--- gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C.jj 2021-01-14 
20:00:52.868105693 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C2021-04-21 
12:22:27.474581330 +0200
@@ -1,5 +1,6 @@
 // PR c++/52830
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-fchecking" }
 // { dg-ice "comptypes" }
 
 template struct eif { typedef void type; };
--- gcc/testsuite/g++.dg/cpp0x/vt-88982.C.jj2020-11-10 23:29:45.646246177 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/vt-88982.C   2021-04-21 12:23:46.824692964 
+0200
@@ -1,5 +1,6 @@
 // PR c++/88982
 // { dg-do compile { target c++11 } }
+// { dg-additional-options "-fchecking" }
 // { dg-ice "tsubst_pack_expansion" }
 
 template struct A {
--- gcc/testsuite/c-c++-common/goacc/kernels-decompose-ice-1.c.jj   
2021-04-20 23:46:09.179190182 +0200
+++ gcc/testsuite/c-c++-common/goacc/kernels-decompose-ice-1.c  2021-04-21 
12:25:57.966225475 +0200
@@ -1,7 +1,7 @@
 /* Test OpenACC 'kernels' construct decomposition.  */
 
 /* { dg-additional-options "-fopt-info-omp-all" } */
-/* { dg-additional-options "--param=openacc-kernels=decompose" } */
+/* { dg-additional-options "-fchecking --param=openacc-kernels=decompose" } */
 /* { dg-ice "TODO" }

Re: [PATCH] Support LABEL_DECL in %qD directive.

2021-04-21 Thread Martin Liška
On 4/21/21 11:04 AM, Jakub Jelinek wrote:
> Wouldn't it be better to be consistent with tree-pretty-print.c on this
> or perhaps just call dump_generic_node or whatever is used to dump
> those e.g. for C?

Yes, I'm going to install patch that does:

+  if (DECL_NAME (t))

+   pp_cxx_tree_identifier (pp, DECL_NAME (t));

+  else

+   dump_generic_node (pp, t, 0, TDF_SLIM, false);


Martin


Re: [PATCH 2/4] Generate PTA features from a def file.

2021-04-21 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 21, 2021 at 10:28 AM Martin Liska  wrote:
>
>
> gcc/ChangeLog:
>
> * config/i386/i386.h (PTA_*): Remove.
> (enum pta_flag): New.
> (DEF_PTA): Generate PTA_* values from i386.def.
> * wide-int-bitmask.h (WIDE_INT_BITMASK_FROM_NTH): New macro.
> * config/i386/i386.def: New file.

Shouldn't this file rather be named i386-isa.def?

Otherwise LGTM, but someone has to approve the wide-int-bitmask.h change.
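
For context, the def-file approach is the usual X-macro pattern.  A
standalone sketch, under stated assumptions: the list lives in a macro
instead of an included .def file, plain 64-bit masks stand in for the
wide-int bitmask type, and all SKETCH_* names are invented for illustration.

```cpp
#include <cstdint>

// Hypothetical feature list; in the patch this role is played by a .def
// file that is included with DEF_PTA defined differently each time.
#define SKETCH_PTA_LIST(DEF_PTA) \
  DEF_PTA(MMX)                   \
  DEF_PTA(SSE)                   \
  DEF_PTA(SSE2)

// First expansion: number the features 0, 1, 2, ...
enum sketch_pta_index {
#define SKETCH_INDEX(NAME) SKETCH_IDX_##NAME,
  SKETCH_PTA_LIST(SKETCH_INDEX)
#undef SKETCH_INDEX
  SKETCH_IDX_END
};

// Second expansion: one single-bit mask per feature, derived from its index.
#define SKETCH_MASK(NAME)                    \
  constexpr std::uint64_t SKETCH_PTA_##NAME = \
      std::uint64_t{1} << SKETCH_IDX_##NAME;
SKETCH_PTA_LIST(SKETCH_MASK)
#undef SKETCH_MASK
```

Adding a feature then means adding one line to the list; the index and the
mask stay in sync automatically.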

Uros.

> ---
>  gcc/config/i386/i386.def | 110 +++
>  gcc/config/i386/i386.h   | 106 ++---
>  gcc/wide-int-bitmask.h   |   3 ++
>  3 files changed, 128 insertions(+), 91 deletions(-)
>  create mode 100644 gcc/config/i386/i386.def
>

