[PATCH] Fortran : ICE in gfc_validate_kind PR96099

2020-10-01 Thread Mark Eggleston
This is a follow up to PR95586 which fixed only the ICE that occurred 
when using derived types in an implicit statement.  The ICE occurred 
because an attempt was made to determine kind for types that do not have 
kinds.


This patch ensures that kind is only determined for types that support kind.
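
As a rough illustration of the change in condition (a simplified model, not
the code in decl.c; the enum and the function names are hypothetical
stand-ins for the BT_* type codes and gfc_numeric_ts), the old and new
checks differ for CLASS types like this:

```cpp
// Simplified model of the test that guards the kind-spec match.
// All names here are hypothetical; they only mirror the BT_* codes and
// gfc_numeric_ts() mentioned in the patch.
enum class BasicType { Integer, Real, Complex, Logical, Character, Derived, Class };

// Stand-in for gfc_numeric_ts(): true for INTEGER, REAL and COMPLEX.
constexpr bool is_numeric(BasicType t) {
    return t == BasicType::Integer || t == BasicType::Real || t == BasicType::Complex;
}

// Old condition: everything except CHARACTER (handled separately) and
// derived types went on to the kind-spec match, including CLASS.
constexpr bool old_tries_kind_spec(BasicType t) {
    return t != BasicType::Character && t != BasicType::Derived;
}

// New condition: only types that actually have a kind are considered.
constexpr bool new_tries_kind_spec(BasicType t) {
    return is_numeric(t) || t == BasicType::Logical;
}
```

In this model `old_tries_kind_spec(BasicType::Class)` is true while
`new_tries_kind_spec(BasicType::Class)` is false, which corresponds to the
`implicit class(t) (1)` test cases.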

OK for master?

Is it worth backporting?

[PATCH] Fortran  : ICE in gfc_validate_kind PR96099

Only check for kind if the type supports kind.

2020-10-02  Mark Eggleston 

/gcc/fortran

    PR fortran/96099
    * decl.c (gfc_match_implicit): Check for numeric and logical
    types.

2020-10-02  Mark Eggleston 

/gcc/testsuite

    PR fortran/96099
    * gfortran.dg/pr96099_1.f90: New test.
    * gfortran.dg/pr96099_2.f90: New test.


From 8770d2c3f599f8e758747b606613ae53f0b26bc9 Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Thu, 1 Oct 2020 11:14:09 +0100
Subject: [PATCH] Fortran  : ICE in gfc_validate_kind PR96099

Only check for kind if the type supports kind.

2020-10-02  Mark Eggleston  

/gcc/fortran

	PR fortran/96099
	* decl.c (gfc_match_implicit): Check for numeric and logical
	types.

2020-10-02  Mark Eggleston  

/gcc/testsuite

	PR fortran/96099
	* gfortran.dg/pr96099_1.f90: New test.
	* gfortran.dg/pr96099_2.f90: New test.
---
 gcc/fortran/decl.c  | 2 +-
 gcc/testsuite/gfortran.dg/pr96099_1.f90 | 8 
 gcc/testsuite/gfortran.dg/pr96099_2.f90 | 9 +
 3 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr96099_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr96099_2.f90

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 326e6f5db7a..bddf69cce19 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -4835,7 +4835,7 @@ gfc_match_implicit (void)
   /* Last chance -- check <TYPE> <SELECTOR> (<RANGE>).  */
   if (ts.type == BT_CHARACTER)
 	m = gfc_match_char_spec (&ts);
-  else if (ts.type != BT_DERIVED)
+  else if (gfc_numeric_ts(&ts) || ts.type == BT_LOGICAL)
 	{
 	  m = gfc_match_kind_spec (&ts, false);
 	  if (m == MATCH_NO)
diff --git a/gcc/testsuite/gfortran.dg/pr96099_1.f90 b/gcc/testsuite/gfortran.dg/pr96099_1.f90
new file mode 100644
index 00000000000..9754bd39dfc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96099_1.f90
@@ -0,0 +1,8 @@
+! { dg-do compile }
+
+program pr96099_1
+   implicit class(t) (1) ! { dg-error "Syntax error in IMPLICIT" }
+   type t
+   end type
+end
+
diff --git a/gcc/testsuite/gfortran.dg/pr96099_2.f90 b/gcc/testsuite/gfortran.dg/pr96099_2.f90
new file mode 100644
index 00000000000..3136d2ef377
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96099_2.f90
@@ -0,0 +1,9 @@
+! { dg-do compile }
+
+program pr96099_2
+   integer n1
+   parameter (n1 = 1)
+   implicit class(t) (n1) ! { dg-error "Syntax error in IMPLICIT" }
+   type t
+   end type
+end
-- 
2.11.0



Re: [committed][testsuite] Enable pr94600-{1,3}.c tests for nvptx

2020-10-01 Thread Hans-Peter Nilsson
On Thu, 1 Oct 2020, Tom de Vries wrote:

> [ was: Re: [committed][testsuite] Re-enable pr94600-{1,3}.c tests for arm ]
>
> On 10/1/20 7:38 AM, Hans-Peter Nilsson wrote:
> > On Wed, 30 Sep 2020, Tom de Vries wrote:
> >> I've analyzed the compilation on strict-alignment target arm-eabi, and
> >
> > An analysis should result in more than that statement.
> >
>
> Well, it refers to the analysis in the commit log of the patch, sorry if
> that was not obvious.

Aha, I think I only saw your first commit, thanks.  Yes, that
looked more appropriate, but I would have preferred a proper
review to a commit-as-obvious (assuming that was the track when
nothing else was stated), since this was more than just target
gating or *trivial* missing predicates.

> Thanks for the pointer to pr94600-2.c.  I've compared the behaviour
> between pr94600-1.c and pr94600-2.c and figured out why in one case we
> get the load/store pair, and in the other the memcpy.  See rationale in
> commit below.

You may be on the right track judging from the commit log, and I
see my hunch about MOVE_RATIO wasn't far off either.  I guess I
should look into it too with rested eyes, but I'm happy with
this direction; not disabling the test for many targets.

Thanks for looking into it again.

brgds, H-P


Re: [PATCH] tree-optimization/97151 - improve PTA for C++ operator delete

2020-10-01 Thread Jason Merrill via Gcc-patches

On 10/1/20 5:26 AM, Richard Biener wrote:

On Wed, 30 Sep 2020, Jason Merrill wrote:


On 9/28/20 3:09 PM, Jason Merrill wrote:

On 9/28/20 3:56 AM, Richard Biener wrote:

On Fri, 25 Sep 2020, Jason Merrill wrote:


On 9/25/20 2:30 AM, Richard Biener wrote:

On Thu, 24 Sep 2020, Jason Merrill wrote:


On 9/24/20 3:43 AM, Richard Biener wrote:

On Wed, 23 Sep 2020, Jason Merrill wrote:


On 9/23/20 2:42 PM, Richard Biener wrote:

On September 23, 2020 7:53:18 PM GMT+02:00, Jason Merrill wrote:

On 9/23/20 4:14 AM, Richard Biener wrote:

C++ operator delete, when DECL_IS_REPLACEABLE_OPERATOR_DELETE_P,
does not cause the deleted object to be escaped.  It also has no
other interesting side-effects for PTA so skip it like we do
for BUILT_IN_FREE.


Hmm, this is true of the default implementation, but since the function
is replaceable, we don't know what a user definition might do with the
pointer.


But can the object still be 'used' after delete? Can delete fail / throw?

What guarantee does the predicate give us?


The deallocation function is called as part of a delete expression in order
to release the storage for an object, ending its lifetime (if it was not ended
by a destructor), so no, the object can't be used afterward.


OK, but the delete operator can access the object contents if there
wasn't a destructor ...



A deallocation function that throws has undefined behavior.


OK, so it seems the 'replaceable' operators are the global ones
(for user-defined/class-specific placement variants I see arbitrary
extra arguments that we'd possibly need to handle).

I'm happy to revert but I'd like to have a testcase that FAILs
with the patch ;)

Now, the following aborts:

struct X {
 static struct X saved;
 int *p;
 X() { __builtin_memcpy (this, &saved, sizeof (X)); }
};
void operator delete (void *p)
{
 __builtin_memcpy (&X::saved, p, sizeof (X));
}
int main()
{
 int y = 1;
 X *p = new X;
 p->p = &y;
 delete p;
 X *q = new X;
 *(q->p) = 2;
 if (y != 2)
   __builtin_abort ();
}

and I could fix this by not making *p but what *p points to escape.
The testcase is of course maximally awkward, but hey ... ;)

Now this would all be moot if operator delete may not access
the object (or if the object contents are undefined at that point).

Oh, and the testcase segfaults when compiled with GCC 10 because
there we elide the new X / delete p pair ... which is invalid then?
Hmm, we emit

 MEM[(struct X *)_8] ={v} {CLOBBER};
 operator delete (_8, 8);

so the object contents are undefined _before_ calling delete
even when I do not have a DTOR?  That is, the above,
w/o -fno-lifetime-dse, makes the PTA patch OK for the testcase.


Yes, all classes have a destructor, even if it's trivial, so the object's
lifetime definitely ends before the call to operator delete. This is less
clear for scalar objects, but treating them similarly would be consistent with
other recent changes, so I think it's fine for us to assume that scalar
objects are also invalidated before the call to operator delete.  But of
course this doesn't apply to explicit calls to operator delete outside of a
delete expression.


OK, so change the testcase main slightly to

int main()
{
    int y = 1;
    X *p = new X;
    p->p = &y;
    ::operator delete(p);
    X *q = new X;
    *(q->p) = 2;
    if (y != 2)
      __builtin_abort ();
}

in this case the lifetime of *p does not end before calling
::operator delete() and delete can stash the object contents
somewhere before ending its lifetime.  For the very same reason
we may not elide a new/delete pair like in

int main()
{
    int *p = new int;
    *p = 1;
    ::operator delete (p);
}


Correct; the permission to elide new/delete pairs is for the expressions, not
the functions.
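
A small sketch of that distinction, assuming nothing beyond standard C++
(the `Tracked` type, the counter and the function names are invented for the
illustration): a delete expression runs the destructor before releasing the
storage, while a direct call to ::operator delete only releases the storage.

```cpp
#include <cassert>
#include <new>

// Counts destructor invocations so the two paths can be compared.
int destructor_runs = 0;

struct Tracked {
    int value = 0;
    ~Tracked() { ++destructor_runs; }
};

// Delete expression: the destructor runs, then operator delete is called.
int destructor_runs_for_delete_expression() {
    destructor_runs = 0;
    Tracked *p = new Tracked;
    delete p;
    return destructor_runs;
}

// Direct call to the deallocation function: storage is released,
// but no destructor call is made on the way.
int destructor_runs_for_direct_call() {
    destructor_runs = 0;
    Tracked *p = new Tracked;
    ::operator delete(p);
    return destructor_runs;
}

// Self-check of the expected difference between the two paths.
void check_lifetime_difference() {
    assert(destructor_runs_for_delete_expression() == 1);
    assert(destructor_runs_for_direct_call() == 0);
}
```

This only shows the language-level difference; it says nothing about how
the call would be marked inside the compiler.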


which we before the change did not do only because calling
operator delete made p escape.  Unfortunately points-to analysis
cannot really reconstruct whether delete was called as part of
a delete expression or directly (and thus whether object lifetime
ended already), neither can DCE.  So I guess we need to mark
the operator delete call in some way to make those transforms
safe.  At least currently any operator delete call makes the
alias guarantee of an operator new call moot by forcing the object
to be aliased with all global and escaped memory ...

Looks like there are some unallocated flags for CALL_EXPR we could
pick but I wonder if we can recycle protected_flag which is

 CALL_FROM_THUNK_P and
 CALL_ALLOCA_FOR_VAR_P in
 CALL_EXPR

for calls to DECL_IS_OPERATOR_{NEW,DELETE}_P, thus whether
we have CALL_FROM_THUNK_P for those operators.  Guess picking
a new flag is safer.


We won't ever call those operators from a thunk, so it should be OK to reuse
it.


But, does it seem correct that we need to distinguish
delete expressions from plain calls to operator delete?


A reason for that distinction came up in the context of omitting new/delete
pairs: we want to

Re: [RISC-V] Add support for AddressSanitizer on RISC-V GCC

2020-10-01 Thread Jim Wilson
On Tue, Aug 25, 2020 at 12:39 PM Jim Wilson  wrote:
> On Wed, Aug 19, 2020 at 1:02 AM Joshua via Gcc-patches
>  wrote:
> > * config/riscv/riscv.c (asan_shadow_offset): Implement the offset 
> > of asan shadow memory for risc-v.
> > (asan_shadow_offset): new macro definition.
>
> When I try the patch, I get asan errors complaining about memory mappings.

I tried looking at this again today.  I spent a few hours debugging
sanitizer code to see what is going on, and managed to convince myself
that the patch can't possibly work for riscv64-linux.  There are at
least two changes missing.  It also can't possibly work for
riscv32-linux.  There is at least one change missing.
libsanitizer/configure.tgt only supports riscv64-linux, so nothing
gets built for riscv32-linux.  I have no idea how you got this
working.  Maybe you forgot to send the entire patch?  If there is a
way to get it working, it would be good if you could describe in
detail how you got it working.

For riscv64-linux with these additional patches

hifiveu017:1192$ more tmp.file
diff --git a/libsanitizer/asan/asan_mapping.h b/libsanitizer/asan/asan_mapping.h
index 09be904270c..906fb1eebc8 100644
--- a/libsanitizer/asan/asan_mapping.h
+++ b/libsanitizer/asan/asan_mapping.h
@@ -164,6 +164,7 @@ static const u64 kAArch64_ShadowOffset64 = 1ULL << 36;
 static const u64 kMIPS32_ShadowOffset32 = 0x0aaa0000;
 static const u64 kMIPS64_ShadowOffset64 = 1ULL << 37;
 static const u64 kPPC64_ShadowOffset64 = 1ULL << 41;
+static const u64 kRISCV64_ShadowOffset64 = 1ULL << 36;
 static const u64 kSystemZ_ShadowOffset64 = 1ULL << 52;
 static const u64 kSPARC64_ShadowOffset64 = 1ULL << 43;  // 0x80000000000
 static const u64 kFreeBSD_ShadowOffset32 = 1ULL << 30;  // 0x40000000
@@ -210,6 +211,8 @@ static const u64 kMyriadCacheBitMask32 = 0x40000000ULL;
 #define SHADOW_OFFSET kAArch64_ShadowOffset64
 #  elif defined(__powerpc64__)
 #define SHADOW_OFFSET kPPC64_ShadowOffset64
+#  elif defined(__riscv) && (__riscv_xlen == 64)
+#define SHADOW_OFFSET kRISCV64_ShadowOffset64
 #  elif defined(__s390x__)
 #define SHADOW_OFFSET kSystemZ_ShadowOffset64
 #  elif SANITIZER_FREEBSD
diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cpp b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
index 11c03e286dc..962df07772e 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
@@ -1048,6 +1048,8 @@ uptr GetMaxVirtualAddress() {
   return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
 # elif defined(__mips64)
   return (1ULL << 40) - 1;  // 0x000000ffffffffffUL;
+# elif defined(__riscv) && (__riscv_xlen == 64)
+  return (1ULL << 39) - 1;  // 0x0080UL;
 # elif defined(__s390x__)
   return (1ULL << 53) - 1;  // 0x001fffffffffffffUL;
 #elif defined(__sparc__)
hifiveu017:1193$

I now get
==2936470==AddressSanitizer CHECK failed:
../../../../gcc-git/libsanitizer/sanitizer_common/sanitizer_allocator_primary64.h:76
"((kSpaceBeg)) == ((address_range.Init(TotalSpaceSize,
PrimaryAllocatorName, kSpaceBeg)))" (0x6000,
0x)

so there is still something missing but I think I am closer.
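
For reference, a sketch of how the two constants in the diff above relate,
assuming ASan's usual mapping Shadow = (Mem >> 3) + Offset. The scale of 3
is ASan's default shadow granularity and is an assumption here, as are the
identifier names; only the two numeric values come from the diff.

```cpp
#include <cstdint>

// Assumed default shadow scale: one shadow byte covers 8 application bytes.
constexpr unsigned kShadowScale = 3;
// Values proposed in the diff above for riscv64.
constexpr std::uint64_t kRiscv64ShadowOffset = 1ULL << 36;
constexpr std::uint64_t kRiscv64MaxVirtualAddress = (1ULL << 39) - 1;

// Maps an application address to the address of its shadow byte.
constexpr std::uint64_t mem_to_shadow(std::uint64_t addr) {
    return (addr >> kShadowScale) + kRiscv64ShadowOffset;
}

// Shadow address of the highest application address.
constexpr std::uint64_t kHighestShadowAddress =
    mem_to_shadow(kRiscv64MaxVirtualAddress);
```

With these values the shadow region spans 2^36 up to 2^37 - 1, which lies
inside the 39-bit address space. That says nothing about the allocator
CHECK failure above, which concerns a different address range.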

Jim


Re: [PATCH] Fix build of ppc64 target.

2020-10-01 Thread David Edelsohn via Gcc-patches
On Thu, Oct 1, 2020 at 8:02 PM Andrew MacLeod  wrote:
>
> On 10/1/20 5:30 PM, David Edelsohn wrote:
> >>> * config/rs6000/rs6000-call.c: Include value-range.h.
> >>> * config/rs6000/rs6000.c: Likewise.
> >> This is okay for trunk, thanks!  (It is trivial and obvious as well, so
> >> please just commit things like this without prior approval.)
> > This patch is not the correct long-term solution, as I explained to
> > Martin on IRC.  If it is approved as a work-around, it should be
> > stated that it is a band-aid.  Equally obvious is including
> > value-range.h in tree-ssa-propagate.h.
> >
> > The tree-ssa-propagate.h, value-query.h and value-range.h headers
> > currently are in an inconsistent state.
> >
> > GCC has worked to move towards a "flat" header files model to detangle
> > the header dependencies in GCC.  Most headers don't include other
> > headers.  In fact Andrew worked on the header reduction effort.
> >
> > As part of the recent Ranger infrastructure patches, Aldy included
> > value-query.h in tree-ssa-propagate.h, but value-query.h depends on
> > the irange type defined in value-range.h.  I presume that the other
> > uses of tree-ssa-propagate.h that refer to the irange methods also
> > include value-range.h from some other dependency.
> >
> > I don't know which solution Aldy and Andrew prefer.
> > tree-ssa-propagate.h could include value-range.h or users of
> > tree-ssa-propagate.h that need Ranger could include tree-query.h.  Or
> > tree-query.h needs to be self-contained and provide the irange type.
> > Or tree-query.h and tree-range.h need to be combined.  The current
> > interdependency of the headers does not seem like a wise choice.
> >
> > Thanks, David
> >
> Sorry David, I'm guessing it's fixed for the moment, via a workaround.
>
> It's not a problem in any other file because the other 29 files all
> include the core "ssa.h" header file, which includes tree-vrp.h, which
> is where value-range.h comes from.
>
> In fact, I see rs6000-call.c has:
>
> #include "tree-ssa-propagate.h"
> #include "tree-vrp.h"
> #include "tree-ssanames.h"
>
> which is out of order.  If we swap them to be:
>
> #include "tree-vrp.h"
> #include "tree-ssa-propagate.h"
> #include "tree-ssanames.h"
>
> then it compiles just fine.   That's probably the right fix for the moment.
>
>   I see it includes a number of things that ssa.h brings in, so it could
> strategically be changed to:
>
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index a8b520834c7..1ae4df61af3 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -57,7 +57,7 @@
>   #include "gimplify.h"
>   #include "gimple-fold.h"
>   #include "gimple-iterator.h"
> -#include "gimple-ssa.h"
> +#include "ssa.h"
>   #include "builtins.h"
>   #include "tree-vector-builder.h"
>   #if TARGET_XCOFF
> @@ -65,8 +65,6 @@
>   #endif
>   #include "ppc-auxv.h"
>   #include "tree-ssa-propagate.h"
> -#include "tree-vrp.h"
> -#include "tree-ssanames.h"
>   #include "targhooks.h"
>   #include "opts.h"
>
>
> I think tree-vrp.h really needs to be flattened too. It shouldn't be
> including value-range.h, and that should be pushed into ssa.h as well.
> I think we'll probably be including value-query.h from the ssa.h header
> long term as well...   ssa.h should be a prerequisite for any file which
> uses tree-ssa-propagate.h...
>
> The rest of that won't be sorted out for a bit; I'm seeing tree-vrp.h is
> included in a few places where ssa.h is not, so there is a bit of
> shuffling to do to make the long term solution "work" :-P
>
> I guess it's time to fix any bit-rot and rerun the tools to co-ordinate this.
>
> Sorry for the hassle.

Hi, Andrew

Thanks for investigating.  And I definitely am not suggesting that you
delay the great progress on Ranger to flatten and compact tree-vrp.h
and ssa.h immediately.

Inclusion of any header file in tree-ssa-propagate.h is new, which
surprised me because of the GCC strategy for headers.  As you and Aldy
continue to develop Ranger, I wanted to alert you to the fragility of
the current header design.  The rs6000 port is a very effective
canary!

Thanks, David


Re: [PATCH] Fix build of ppc64 target.

2020-10-01 Thread Andrew MacLeod via Gcc-patches

On 10/1/20 5:30 PM, David Edelsohn wrote:

* config/rs6000/rs6000-call.c: Include value-range.h.
* config/rs6000/rs6000.c: Likewise.

This is okay for trunk, thanks!  (It is trivial and obvious as well, so
please just commit things like this without prior approval.)

This patch is not the correct long-term solution, as I explained to
Martin on IRC.  If it is approved as a work-around, it should be
stated that it is a band-aid.  Equally obvious is including
value-range.h in tree-ssa-propagate.h.

The tree-ssa-propagate.h, value-query.h and value-range.h headers
currently are in an inconsistent state.

GCC has worked to move towards a "flat" header files model to detangle
the header dependencies in GCC.  Most headers don't include other
headers.  In fact Andrew worked on the header reduction effort.

As part of the recent Ranger infrastructure patches, Aldy included
value-query.h in tree-ssa-propagate.h, but value-query.h depends on
the irange type defined in value-range.h.  I presume that the other
uses of tree-ssa-propagate.h that refer to the irange methods also
include value-range.h from some other dependency.

I don't know which solution Aldy and Andrew prefer.
tree-ssa-propagate.h could include value-range.h or users of
tree-ssa-propagate.h that need Ranger could include tree-query.h.  Or
tree-query.h needs to be self-contained and provide the irange type.
Or tree-query.h and tree-range.h need to be combined.  The current
interdependency of the headers does not seem like a wise choice.

Thanks, David


Sorry David, I'm guessing it's fixed for the moment, via a workaround.

It's not a problem in any other file because the other 29 files all 
include the core "ssa.h" header file, which includes tree-vrp.h, which 
is where value-range.h comes from.


In fact, I see rs6000-call.c has:

#include "tree-ssa-propagate.h"
#include "tree-vrp.h"
#include "tree-ssanames.h"

which is out of order.  If we swap them to be:

#include "tree-vrp.h"
#include "tree-ssa-propagate.h"
#include "tree-ssanames.h"

then it compiles just fine.   That's probably the right fix for the moment.

 I see it includes a number of things that ssa.h brings in, so it could 
strategically be changed to:


diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index a8b520834c7..1ae4df61af3 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -57,7 +57,7 @@
 #include "gimplify.h"
 #include "gimple-fold.h"
 #include "gimple-iterator.h"
-#include "gimple-ssa.h"
+#include "ssa.h"
 #include "builtins.h"
 #include "tree-vector-builder.h"
 #if TARGET_XCOFF
@@ -65,8 +65,6 @@
 #endif
 #include "ppc-auxv.h"
 #include "tree-ssa-propagate.h"
-#include "tree-vrp.h"
-#include "tree-ssanames.h"
 #include "targhooks.h"
 #include "opts.h"


I think tree-vrp.h really needs to be flattened too. It shouldn't be 
including value-range.h, and that should be pushed into ssa.h as well.
I think we'll probably be including value-query.h from the ssa.h header 
long term as well...   ssa.h should be a prerequisite for any file which 
uses tree-ssa-propagate.h...


The rest of that won't be sorted out for a bit; I'm seeing tree-vrp.h is 
included in a few places where ssa.h is not, so there is a bit of 
shuffling to do to make the long term solution "work" :-P


I guess it's time to fix any bit-rot and rerun the tools to co-ordinate this.

Sorry for the hassle.

Andrew





[PATCH] libstdc++: Add C++2a synchronization support

2020-10-01 Thread Thomas Rodgers
From: Thomas Rodgers 

Updated patch incorporating latest feedback.

Add support for -
  * atomic_flag::wait/notify_one/notify_all
  * atomic::wait/notify_one/notify_all
  * counting_semaphore
  * binary_semaphore
  * latch
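
For orientation, the facilities listed above can be exercised roughly as
follows. This is a sketch against the C++20 standard interfaces
(std::atomic wait/notify, std::binary_semaphore, std::latch), not code from
the patch. The HAVE_CXX20_SYNC macro and the function name are made up for
the example, and everything runs on a single thread so that no call blocks.

```cpp
#include <atomic>
#include <cassert>
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>   // feature-test macros such as __cpp_lib_latch
#  endif
#endif
#if defined(__cpp_lib_latch) && defined(__cpp_lib_semaphore) \
    && defined(__cpp_lib_atomic_wait)
#  include <latch>
#  include <semaphore>
#  define HAVE_CXX20_SYNC 1
#else
#  define HAVE_CXX20_SYNC 0
#endif

// Returns the number of primitives exercised: 3 when the C++20 library
// support is present, 0 otherwise.
int exercise_sync_primitives() {
    int exercised = 0;
#if HAVE_CXX20_SYNC
    // atomic wait/notify: wait(old) returns once the value differs from old.
    std::atomic<int> flag{0};
    flag.store(1);
    flag.notify_one();   // no waiters here, so this is harmless
    flag.wait(0);        // value is already 1, so this returns at once
    ++exercised;

    // binary_semaphore: release() makes the following acquire() succeed.
    std::binary_semaphore sem(0);
    sem.release();
    sem.acquire();
    ++exercised;

    // latch: wait() returns once the counter has reached zero.
    std::latch done(2);
    done.count_down();
    done.count_down();
    done.wait();
    ++exercised;

    assert(exercised == 3);
#endif
    return exercised;
}
```

In real use the wait and acquire calls would sit on a different thread from
the notify, release and count_down calls.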

libstdc++-v3/ChangeLog:

* include/Makefile.am (bits_headers): Add new header.
* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h (__atomic_flag::wait): Define.
(__atomic_flag::notify_one): Likewise.
(__atomic_flag::notify_all): Likewise.
(__atomic_base<_Itp>::wait): Likewise.
(__atomic_base<_Itp>::notify_one): Likewise.
(__atomic_base<_Itp>::notify_all): Likewise.
(__atomic_base<_Ptp*>::wait): Likewise.
(__atomic_base<_Ptp*>::notify_one): Likewise.
(__atomic_base<_Ptp*>::notify_all): Likewise.
(__atomic_impl::wait): Likewise.
(__atomic_impl::notify_one): Likewise.
(__atomic_impl::notify_all): Likewise.
(__atomic_float<_Fp>::wait): Likewise.
(__atomic_float<_Fp>::notify_one): Likewise.
(__atomic_float<_Fp>::notify_all): Likewise.
(__atomic_ref<_Tp>::wait): Likewise.
(__atomic_ref<_Tp>::notify_one): Likewise.
(__atomic_ref<_Tp>::notify_all): Likewise.
(atomic_wait<_Tp>): Likewise.
(atomic_wait_explicit<_Tp>): Likewise.
(atomic_notify_one<_Tp>): Likewise.
(atomic_notify_all<_Tp>): Likewise.
* include/bits/atomic_wait.h: New file.
* include/bits/atomic_timed_wait.h: New file.
* include/bits/semaphore_base.h: New file.
* include/std/atomic (atomic::wait): Define.
(atomic::wait_one): Likewise.
(atomic::wait_all): Likewise.
(atomic<_Tp>::wait): Likewise.
(atomic<_Tp>::wait_one): Likewise.
(atomic<_Tp>::wait_all): Likewise.
(atomic<_Tp*>::wait): Likewise.
(atomic<_Tp*>::wait_one): Likewise.
(atomic<_Tp*>::wait_all): Likewise.
* include/std/latch: New file.
* include/std/semaphore: New file.
* include/std/version: Add __cpp_lib_semaphore and
__cpp_lib_latch defines.
* testsuite/29_atomic/atomic/wait_notify/atomic_refs.cc: New test.
* testsuite/29_atomic/atomic/wait_notify/bool.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/integrals.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/floats.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomic/atomic/wait_notify/generic.h: New file.
* testsuite/29_atomics/atomic_flag/wait_notify/1.cc: New test.
* testsuite/30_thread/semaphore/1.cc: New test.
* testsuite/30_thread/semaphore/2.cc: Likewise.
* testsuite/30_thread/semaphore/least_max_value_neg.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_for.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_posix.cc: Likewise.
* testsuite/30_thread/semaphore/try_acquire_until.cc: Likewise.
* testsuite/30_thread/latch/1.cc: New test.
* testsuite/30_thread/latch/2.cc: New test.
* testsuite/30_thread/latch/3.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |   5 +
 libstdc++-v3/include/Makefile.in  |   5 +
 libstdc++-v3/include/bits/atomic_base.h   | 195 +++-
 libstdc++-v3/include/bits/atomic_timed_wait.h | 281 
 libstdc++-v3/include/bits/atomic_wait.h   | 301 ++
 libstdc++-v3/include/bits/semaphore_base.h| 283 
 libstdc++-v3/include/std/atomic   |  73 +
 libstdc++-v3/include/std/latch|  90 ++
 libstdc++-v3/include/std/semaphore|  92 ++
 libstdc++-v3/include/std/version  |   2 +
 .../atomic/wait_notify/atomic_refs.cc | 103 ++
 .../29_atomics/atomic/wait_notify/bool.cc |  59 
 .../29_atomics/atomic/wait_notify/floats.cc   |  32 ++
 .../29_atomics/atomic/wait_notify/generic.cc  |  31 ++
 .../29_atomics/atomic/wait_notify/generic.h   | 160 ++
 .../atomic/wait_notify/integrals.cc   |  65 
 .../29_atomics/atomic/wait_notify/pointers.cc |  59 
 .../29_atomics/atomic_flag/wait_notify/1.cc   |  61 
 libstdc++-v3/testsuite/30_threads/latch/1.cc  |  27 ++
 libstdc++-v3/testsuite/30_threads/latch/2.cc  |  27 ++
 libstdc++-v3/testsuite/30_threads/latch/3.cc  |  50 +++
 .../testsuite/30_threads/semaphore/1.cc   |  27 ++
 .../testsuite/30_threads/semaphore/2.cc   |  27 ++
 .../semaphore/least_max_value_neg.cc  |  30 ++
 .../30_threads/semaphore/try_acquire.cc   |  55 
 .../30_threads/semaphore/try_acquire_for.cc   |  85 +
 .../30_threads/semaphore/try_acquire_posix.cc | 153 +
 .../30_threads/semaphore/try_acquire_until.cc |  94 

Go patch committed: Set varargs correctly for type of method expression

2020-10-01 Thread Ian Lance Taylor via Gcc-patches
This Go frontend patch sets varargs correctly for the type of a method
expression.  This fixes https://golang.org/issue/41737.  Bootstrapped
and ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline
and GCC 10 branch.

Ian
8e23cd3a2d23ad851938bf7015fc97539d65a8c6
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 8d9fda54619..94827406df1 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-c9c084bce713e258721e12041a351ec8ad33ad17
+801c458a562d22260ff176c26d65639dd32c8a90
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc
index 7f65b4a5db2..e7a742f6366 100644
--- a/gcc/go/gofrontend/types.cc
+++ b/gcc/go/gofrontend/types.cc
@@ -5350,8 +5350,12 @@ Function_type::copy_with_receiver_as_param(bool want_pointer_receiver) const
   ++p)
new_params->push_back(*p);
 }
-  return Type::make_function_type(NULL, new_params, this->results_,
- this->location_);
+  Function_type* ret = Type::make_function_type(NULL, new_params,
+   this->results_,
+   this->location_);
+  if (this->is_varargs_)
+ret->set_is_varargs();
+  return ret;
 }
 
 // Make a copy of a function type ignoring any receiver and adding a
diff --git a/libgo/go/reflect/all_test.go b/libgo/go/reflect/all_test.go
index ee37359814b..68efab6e145 100644
--- a/libgo/go/reflect/all_test.go
+++ b/libgo/go/reflect/all_test.go
@@ -2396,8 +2396,14 @@ func TestVariadicMethodValue(t *testing.T) {
points := []Point{{20, 21}, {22, 23}, {24, 25}}
want := int64(p.TotalDist(points[0], points[1], points[2]))
 
+   // Variadic method of type.
+   tfunc := TypeOf((func(Point, ...Point) int)(nil))
+   if tt := TypeOf(p).Method(4).Type; tt != tfunc {
+   t.Errorf("Variadic Method Type from TypeOf is %s; want %s", tt, tfunc)
+   }
+
// Curried method of value.
-   tfunc := TypeOf((func(...Point) int)(nil))
+   tfunc = TypeOf((func(...Point) int)(nil))
v := ValueOf(p).Method(4)
if tt := v.Type(); tt != tfunc {
t.Errorf("Variadic Method Type is %s; want %s", tt, tfunc)
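
The shape of the fix can be modelled outside the frontend. The sketch below
is a heavily simplified, hypothetical stand-in for Function_type (the struct
and its fields are invented for illustration); it only shows that the copy
which turns the receiver into a leading parameter has to carry the varargs
flag over.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the frontend's Function_type.
struct FunctionType {
    std::vector<std::string> params;  // parameter types as plain strings
    bool is_varargs = false;          // true if the last parameter is "...T"
};

// Builds the type of a method expression: the receiver becomes the first
// parameter and the method's own parameters follow.
FunctionType copy_with_receiver_as_param(const FunctionType& method,
                                         const std::string& receiver) {
    assert(!receiver.empty());  // a receiver type is required
    FunctionType ret;
    ret.params.reserve(method.params.size() + 1);
    ret.params.push_back(receiver);
    ret.params.insert(ret.params.end(), method.params.begin(),
                      method.params.end());
    // The step the patch adds: without it the copy loses its variadic-ness.
    ret.is_varargs = method.is_varargs;
    return ret;
}
```

For a method modelled as `(...Point)`, the copy with receiver `Point` then
has two parameters and is still variadic, matching the
`func(Point, ...Point) int` type checked in the new test.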


Re: [PATCH 1/9] PowerPC: Map long double built-in functions if IEEE 128-bit long double.

2020-10-01 Thread Joseph Myers
On Thu, 24 Sep 2020, Michael Meissner via Gcc-patches wrote:

> To map the math functions, typically this patch changes l to f128.
> However there are some exceptions that are handled with this patch.

glibc 2.32 added __*ieee128 names for the *f128 functions, to allow the 
long double functions to be called in a namespace-clean way (when the 
defined feature test macros do not enable the TS 18661-3 function names). 
So I think GCC should also prefer to map to those names where possible, 
rather than the *f128 names.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RFC] Offloading and automatic linking of libraries

2020-10-01 Thread Joseph Myers
On Thu, 24 Sep 2020, Tobias Burnus wrote:

> Hi all,
> 
> we got the user comment that it is far from obvious to
> use  -foffload=-latomic if the following error shows up:
> 
> unresolved symbol __atomic_compare_exchange_16
> collect2: error: ld returned 1 exit status
> mkoffload: fatal error: /powerpc64le-none-linux-gnu-accel-nvptx-none-gcc
> returned 1 exit status
> 
> In principle, the same issue pops up with -lm and -lgfortran,
> which on the host are automatically linked for g++ (-lm) and
> for gfortran (both) but not gcc.

As discussed in bug 81358, I think --as-needed -latomic --no-as-needed 
should be used by the driver by default (when the compiler is configured 
with libatomic supported).  The same ought to apply for the offloading 
libatomic as well: it should be linked in automatically when needed.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [RS6000] ICE in decompose, at rtl.h:2282

2020-10-01 Thread Segher Boessenkool
Hi!

On Thu, Oct 01, 2020 at 11:03:37PM +0930, Alan Modra wrote:
> during RTL pass: fwprop1
> gcc.dg/pr82596.c: In function 'test_cststring':
> gcc.dg/pr82596.c:27:1: internal compiler error: in decompose, at rtl.h:2282
> 
> -m32 gcc/testsuite/gcc.dg/pr82596.c fails along with other tests after
> applying rtx_cost patches, which exposed a backend bug.
> 
> legitimize_address when presented with the following address
>   (plus (reg) (const_int 0x7fffffff))
> attempts to rewrite it as a high/low sum.  The low part is 0xffff, or
> -1, making the high part 0x80000000.  But this is no longer canonical
> for SImode.

Yes, you can in general not just do GEN_INT on stuff you did arithmetic
on.  Nice catch :-)
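
A sketch of the arithmetic involved, outside of RTL and under the assumption
of a 16-bit signed low part and a 32-bit mode (the helper names are invented
for this example and 0x7fffffff is used as the sample input): after
computing high = addr - low, the result has to be sign-extended to the mode
width again, which is what gen_int_mode does and plain GEN_INT does not.

```cpp
#include <cstdint>

struct HighLow {
    std::int64_t high;
    std::int64_t low;
};

// Keeps the low `bits` bits of value and sign-extends them to 64 bits.
constexpr std::int64_t sign_extend(std::int64_t value, unsigned bits) {
    const std::int64_t sign = std::int64_t{1} << (bits - 1);
    const std::int64_t mask = (std::int64_t{1} << bits) - 1;
    return ((value & mask) ^ sign) - sign;
}

// Splits a 32-bit address constant into a signed 16-bit low part and a
// high part that is canonical (sign-extended) for a 32-bit mode.
constexpr HighLow split_address_32(std::int64_t addr) {
    const std::int64_t low = sign_extend(addr, 16);
    // addr - low can leave the signed 32-bit range, so extend it again.
    const std::int64_t high = sign_extend(addr - low, 32);
    return HighLow{high, low};
}
```

For 0x7fffffff this gives low = -1 and high = -0x80000000 rather than
+0x80000000.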

Okay for trunk, and backports if possible after a day or two.  Thanks!

>   * config/rs6000/rs6000.c (rs6000_legitimize_address): Properly
>   sign extend high part of address constant.

I would just say "use gen_int_mode" ;-)  But this is fine, sure.


Segher


Re: [RS6000] -mno-minimal-toc vs. power10 pcrelative

2020-10-01 Thread Segher Boessenkool
Hi Alan,

On Fri, Oct 02, 2020 at 07:06:46AM +0930, Alan Modra wrote:
> > > I was looking at it again today
> > > with the aim of converting this ugly macro to a function, and spotted
> > > the duplication in freebsd64.h.  Which has some bit-rot.
> > > 
> > > Do you like the following?  rs6000_linux64_override_options is
> > > functionally the same as what was in linux64.h.  I built various
> > > configurations to test the change, powerpc64-linux, powerpc64le-linux
> > > without any 32-bit targets enabled, powerpc64-freebsd12.0.
> > 
> > Please do this as two patches?  One the refactoring without any
> > functional changes (which is pre-approved -- the name "linux64" isn't
> > great if you use it in other OSes as well, but I cannot think of a
> > better name either),
> 
> The patch as posted has no functional changes.

Ah.  You said the freebsd one had some bitrot, and I couldn't spot that
easily.  But in the actual patch you are just throwing away all of the
freebsd stuff here, and the new, shared implementation is exactly what
the linux one was?  That is fine (I hope :-) ).

> I do have a followup patch..  Commit c6be439b37 wrongly left a block
> of code inside an "else" block, which changed the default for power10
> TARGET_NO_FP_IN_TOC accidentally.  We don't want FP constants in the
> TOC when -mcmodel=medium can address them just as efficiently outside
> the TOC.
> 
>   * config/rs6000/rs6000.c (rs6000_linux64_override_options):
>   Formatting.  Correct setting of TARGET_NO_FP_IN_TOC and
>   TARGET_NO_SUM_IN_TOC.

Okay for trunk.  Thanks!


Segher


[PATCH] c++: Fix printing of C++20 template parameter object [PR97014]

2020-10-01 Thread Marek Polacek via Gcc-patches
No one is interested in the mangled name of the C++20 template parameter
object for a class NTTP.  So instead of printing

  required for the satisfaction of ‘positive’ [with T = X<::_ZTAXtl5ratioLin1ELi2EEE>]

let's print

  required for the satisfaction of ‘positive’ [with T = X<{-1, 2}>]

I don't think adding a test is necessary for this.

gcc/cp/ChangeLog:

PR c++/97014
* cxx-pretty-print.c (pp_cxx_template_argument_list): If the
argument is template_parm_object_p, print its DECL_INITIAL.
---
 gcc/cp/cxx-pretty-print.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index d10c18db039..8bea79b93a2 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -1910,6 +1910,8 @@ pp_cxx_template_argument_list (cxx_pretty_printer *pp, tree t)
 	  if (TYPE_P (arg) || (TREE_CODE (arg) == TEMPLATE_DECL
 			       && TYPE_P (DECL_TEMPLATE_RESULT (arg))))
 	    pp->type_id (arg);
+	  else if (template_parm_object_p (arg))
+	    pp->expression (DECL_INITIAL (arg));
 	  else
 	    pp->expression (arg);
 	}

base-commit: dfaa24c974bab4bc1bd3840d67ca1701acc0010c
-- 
2.26.2



Re: [RS6000] -mno-minimal-toc vs. power10 pcrelative

2020-10-01 Thread Alan Modra via Gcc-patches
Hi Segher,
On Thu, Oct 01, 2020 at 01:22:07PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Oct 01, 2020 at 10:57:48PM +0930, Alan Modra wrote:
> > On Wed, Sep 30, 2020 at 03:56:32PM -0500, Segher Boessenkool wrote:
> > > On Wed, Sep 30, 2020 at 05:01:45PM +0930, Alan Modra wrote:
> > > > * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
> > > > set -mcmodel=small for -mno-minimal-toc when pcrel.
> > > 
> > > > - SET_CMODEL (CMODEL_SMALL);\
> > > > + if (TARGET_MINIMAL_TOC\
> > > > + || !(TARGET_PCREL \
> > > > +  || (PCREL_SUPPORTED_BY_OS\
> > > > +  && (rs6000_isa_flags_explicit\
> > > > +  & OPTION_MASK_PCREL) == 0))) \
> > > > +   SET_CMODEL (CMODEL_SMALL);  \
> > > 
> > > Please write this in a more readable way?  With some "else" statements,
> > > perhaps.
> > > 
> > > It is also fine to SET_CMODEL twice if that makes for simpler code.
> > 
> > Committed as per your suggestion.
> 
> Thanks.
> 
> > I was looking at it again today
> > with the aim of converting this ugly macro to a function, and spotted
> > the duplication in freebsd64.h.  Which has some bit-rot.
> > 
> > Do you like the following?  rs6000_linux64_override_options is
> > functionally the same as what was in linux64.h.  I built various
> > configurations to test the change, powerpc64-linux, powerpc64le-linux
> > without any 32-bit targets enabled, powerpc64-freebsd12.0.
> 
> Please do this as two patches?  One the refactoring without any
> functional changes (which is pre-approved -- the name "linux64" isn't
> great if you use it in other OSes as well, but I cannot think of a
> better name either),

The patch as posted has no functional changes.  I even avoided
formatting changes as much as possible.  The only changes were those
necessary to use the code from linux64.h in a powerpc64-freebsd
compiler, where

#define TARGET_PROFILE_KERNEL 0
..
TARGET_PROFILE_KERNEL = 0;
doesn't work, nor does

if (!RS6000_BI_ARCH_P)
  error (INVALID_32BIT, "32");
when RS6000_BI_ARCH_P is undefined.

> and the other the actual change (which probably is
> fine as well, but it is hard to see with the patch like this).

I do have a followup patch..  Commit c6be439b37 wrongly left a block
of code inside an "else" block, which changed the default for power10
TARGET_NO_FP_IN_TOC accidentally.  We don't want FP constants in the
TOC when -mcmodel=medium can address them just as efficiently outside
the TOC.

* config/rs6000/rs6000.c (rs6000_linux64_override_options):
Formatting.  Correct setting of TARGET_NO_FP_IN_TOC and
TARGET_NO_SUM_IN_TOC.
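
As a rough illustration of the problem described above, here is a much-reduced, stand-alone model of the control flow.  It is only a sketch: Options, pick_cmodel and the two no_fp_in_toc_* functions are made-up names, the three booleans stand in for the real option machinery, and the explicit -mcmodel handling is left out entirely.

```cpp
#include <cassert>

enum class CModel { Small, Medium };

// Hypothetical, much-reduced stand-in for the option state.
struct Options
{
  bool minimal_toc_explicit;  // -mminimal-toc or -mno-minimal-toc was given
  bool minimal_toc;           // the value that was requested
  bool pcrel;                 // pc-relative addressing is in effect
};

// Code model chosen when the user did not pass -mcmodel.
CModel
pick_cmodel (const Options &o)
{
  if (!o.minimal_toc_explicit)
    return CModel::Medium;
  if (o.minimal_toc)
    return CModel::Small;
  if (o.pcrel)
    return CModel::Medium;  // -mno-minimal-toc is ignored
  return CModel::Small;
}

// Shape before the fix: the TOC default is only set in the "else" arm.
bool
no_fp_in_toc_before (const Options &o)
{
  bool no_fp_in_toc = false;
  CModel cmodel = pick_cmodel (o);
  if (o.minimal_toc_explicit)
    ;  // only the code model was adjusted on this path
  else if (cmodel != CModel::Small)
    no_fp_in_toc = cmodel == CModel::Medium;
  return no_fp_in_toc;
}

// Shape after the fix: the default is set whenever cmodel is not small.
bool
no_fp_in_toc_after (const Options &o)
{
  bool no_fp_in_toc = false;
  CModel cmodel = pick_cmodel (o);
  if (cmodel != CModel::Small)
    no_fp_in_toc = cmodel == CModel::Medium;
  return no_fp_in_toc;
}

void
demo ()
{
  // -mno-minimal-toc with pcrel: the code model stays medium.
  Options o = { true, false, true };
  assert (!no_fp_in_toc_before (o));  // default was skipped
  assert (no_fp_in_toc_after (o));    // default is applied
}
```

In this model the two versions only differ when a minimal-toc option was given explicitly and the code model nevertheless ends up medium.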

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 48f3cdec440..a1651551ff2 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3493,8 +3493,7 @@ rs6000_linux64_override_options ()
}
   if (!global_options_set.x_rs6000_current_cmodel)
SET_CMODEL (CMODEL_MEDIUM);
-  if ((rs6000_isa_flags_explicit
-  & OPTION_MASK_MINIMAL_TOC) != 0)
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_MINIMAL_TOC) != 0)
{
  if (global_options_set.x_rs6000_current_cmodel
  && rs6000_current_cmodel != CMODEL_SMALL)
@@ -3503,23 +3502,18 @@ rs6000_linux64_override_options ()
SET_CMODEL (CMODEL_SMALL);
  else if (TARGET_PCREL
   || (PCREL_SUPPORTED_BY_OS
-  && (rs6000_isa_flags_explicit
-  & OPTION_MASK_PCREL) == 0))
+  && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) == 0))
/* Ignore -mno-minimal-toc.  */
;
  else
SET_CMODEL (CMODEL_SMALL);
}
-  else
+  if (rs6000_current_cmodel != CMODEL_SMALL)
{
- if (rs6000_current_cmodel != CMODEL_SMALL)
-   {
- if (!global_options_set.x_TARGET_NO_FP_IN_TOC)
-   TARGET_NO_FP_IN_TOC
- = rs6000_current_cmodel == CMODEL_MEDIUM;
- if (!global_options_set.x_TARGET_NO_SUM_IN_TOC)
-   TARGET_NO_SUM_IN_TOC = 0;
-   }
+ if (!global_options_set.x_TARGET_NO_FP_IN_TOC)
+   TARGET_NO_FP_IN_TOC = rs6000_current_cmodel == CMODEL_MEDIUM;
+ if (!global_options_set.x_TARGET_NO_SUM_IN_TOC)
+   TARGET_NO_SUM_IN_TOC = 0;
}
   if (TARGET_PLTSEQ && DEFAULT_ABI != ABI_ELFv2)
{


-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix build of ppc64 target.

2020-10-01 Thread David Edelsohn via Gcc-patches
> > * config/rs6000/rs6000-call.c: Include value-range.h.
> > * config/rs6000/rs6000.c: Likewise.
>
> This is okay for trunk, thanks!  (It is trivial and obvious as well, so
> please just commit things like this without prior approval.)

This patch is not the correct long-term solution, as I explained to
Martin on IRC.  If it is approved as a work-around, it should be
stated that it is a band-aid.  Equally obvious is including
value-range.h in tree-ssa-propagate.h.

The tree-ssa-propagate.h, value-query.h and value-range.h headers
currently are in an inconsistent state.

GCC has worked to move towards a "flat" header files model to detangle
the header dependencies in GCC.  Most headers don't include other
headers.  In fact Andrew worked on the header reduction effort.

As part of the recent Ranger infrastructure patches, Aldy included
value-query.h in tree-ssa-propagate.h, but value-query.h depends on
the irange type defined in value-range.h.  I presume that the other
uses of tree-ssa-propagate.h that refer to the irange methods also
include value-range.h from some other dependency.

I don't know which solution Aldy and Andrew prefer.
tree-ssa-propagate.h could include value-range.h or users of
tree-ssa-propagate.h that need Ranger could include tree-query.h.  Or
tree-query.h needs to be self-contained and provide the irange type.
Or tree-query.h and tree-range.h need to be combined.  The current
interdependency of the headers does not seem like a wise choice.
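
For reference, one generic C++ way for a header to stop depending on include order is a forward declaration, which is enough when a type only appears as a reference or pointer parameter.  The sketch below uses invented names (irange_like, range_query_like, constant_query) and puts everything in one file; whether this is appropriate for the real headers is exactly the open question above, so treat it as an illustration rather than a proposal.

```cpp
#include <cassert>

// What a self-contained "query" header would need: a forward
// declaration suffices for reference parameters.
class irange_like;

struct range_query_like
{
  virtual ~range_query_like () = default;
  virtual bool range_of_expr (irange_like &r, int expr) = 0;
};

// The full definition, which would normally live in the "range" header.
class irange_like
{
public:
  long lo = 0;
  long hi = 0;
};

// A user that needs the complete type includes both headers.
struct constant_query : range_query_like
{
  bool range_of_expr (irange_like &r, int expr) override
  {
    r.lo = r.hi = expr;
    return true;
  }
};

void
demo ()
{
  constant_query q;
  irange_like r;
  assert (q.range_of_expr (r, 7));
  assert (r.lo == 7 && r.hi == 7);
}
```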

Thanks, David


Re: [PATCH v2] builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-01 Thread Segher Boessenkool
On Thu, Oct 01, 2020 at 08:08:01AM +0200, Richard Biener wrote:
> On Wed, 30 Sep 2020, Segher Boessenkool wrote:
> 
> > On Wed, Sep 30, 2020 at 09:02:34AM +0200, Richard Biener wrote:
> > > On Tue, 29 Sep 2020, Segher Boessenkool wrote:
> > > > I don't see much about optabs in the docs either.  Add some text to
> > > > optabs.def itself then?
> > > 
> > > All optabs are documented in doc/md.texi as 'instruction patterns'
> > 
> > Except for what seems to be the majority that isn't.
> 
> > Really?  Every time I looked for one I found it.


Yeah, all the obvious ones are documented of course.  I only looked at
the non-obvious ones now, to see where these fe* should go, and maybe I
just had some unlucky picks, but many of those are not documented it
seems.  "The majority are not" is a gross exaggeration ;-)

> > > This is where new optabs need to be documented.
> > 
> > It's going to be challenging to find a reasonable spot in there.
> > Oh well.
> 
> Put it next to fmin/fmax docs or sin, etc. - at least the section
> should be clear ;)  But yeah, patterns seem to be quite randomly
> "sorted"...

And no chapter toc etc.

Yeah, this should just go with the other fp things, of course.  Duh.
Thanks!


Segher


c++: Kill DECL_HIDDEN_P

2020-10-01 Thread Nathan Sidwell

There are only a couple of asserts remaining using this macro, and
nothing using TYPE_HIDDEN_P.  Killed thusly.

gcc/cp/
* cp-tree.h (DECL_ANTICIPATED): Adjust comment.
(DECL_HIDDEN_P, TYPE_HIDDEN_P): Delete.
* tree.c (ovl_insert): Delete DECL_HIDDEN_P assert.
(ovl_skip_hidden): Likewise.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 48a4074b370..3ccd54ce24b 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -4045,22 +4045,11 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
 
 /* Nonzero if NODE is a DECL which we know about but which has not
been explicitly declared, such as a built-in function or a friend
-   declared inside a class.  In the latter case DECL_HIDDEN_FRIEND_P
-   will be set.  */
+   declared inside a class.  */
 #define DECL_ANTICIPATED(NODE) \
   (DECL_LANG_SPECIFIC (TYPE_FUNCTION_OR_TEMPLATE_DECL_CHECK (NODE)) \
->u.base.anticipated_p)
 
-/* Is DECL NODE a hidden name?  */
-#define DECL_HIDDEN_P(NODE) \
-  (DECL_LANG_SPECIFIC (NODE) && TYPE_FUNCTION_OR_TEMPLATE_DECL_P (NODE) \
-   && DECL_ANTICIPATED (NODE))
-
-/* True if this is a hidden class type.*/
-#define TYPE_HIDDEN_P(NODE) \
-  (DECL_LANG_SPECIFIC (TYPE_NAME (NODE)) \
-   && DECL_ANTICIPATED (TYPE_NAME (NODE)))
-
 /* True for artificial decls added for OpenMP privatized non-static
data members.  */
 #define DECL_OMP_PRIVATIZED_MEMBER(NODE) \
diff --git i/gcc/cp/tree.c w/gcc/cp/tree.c
index 0b80d8ed408..8b7c6798ee9 100644
--- i/gcc/cp/tree.c
+++ w/gcc/cp/tree.c
@@ -2261,8 +2261,6 @@ ovl_insert (tree fn, tree maybe_ovl, int using_or_hidden)
 {
   maybe_ovl = ovl_make (fn, maybe_ovl);
 
-  gcc_checking_assert ((using_or_hidden < 0) == DECL_HIDDEN_P (fn));
-
   if (using_or_hidden < 0)
 	OVL_HIDDEN_P (maybe_ovl) = true;
   if (using_or_hidden > 0)
@@ -2287,14 +2285,8 @@ ovl_insert (tree fn, tree maybe_ovl, int using_or_hidden)
 tree
 ovl_skip_hidden (tree ovl)
 {
-  for (;
-   ovl && TREE_CODE (ovl) == OVERLOAD && OVL_HIDDEN_P (ovl);
-   ovl = OVL_CHAIN (ovl))
-gcc_checking_assert (DECL_HIDDEN_P (OVL_FUNCTION (ovl)));
-
-  /* We should not see a naked hidden decl.  */
-  gcc_checking_assert (!(ovl && TREE_CODE (ovl) != OVERLOAD
-			 && DECL_HIDDEN_P (ovl)));
+  while (ovl && TREE_CODE (ovl) == OVERLOAD && OVL_HIDDEN_P (ovl))
+ovl = OVL_CHAIN (ovl);
 
   return ovl;
 }


[committed][nvptx] Emit mov.u32 instead of cvt.u32.u32 for truncsiqi2

2020-10-01 Thread Tom de Vries
Hi,

When running:
...
$ gcc.sh src/gcc/testsuite/gcc.target/nvptx/abi-complex-arg.c -S -dP
...
we have in abi-complex-arg.s:
...
//(insn 3 5 4 2
//  (set
//(reg:QI 23)
//(truncate:QI (reg:SI 22))) "abi-complex-arg.c":38:1 29 {truncsiqi2}
//  (nil))
cvt.u32.u32 %r23, %r22; // 3[c=4]  truncsiqi2/0
...

The cvt.u32.u32 can be written shorter and clearer as mov.u32.

Fix this in define_insn "truncsi2".

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx] Emit mov.u32 instead of cvt.u32.u32 for truncsiqi2

gcc/ChangeLog:

2020-10-01  Tom de Vries  

PR target/80845
* config/nvptx/nvptx.md (define_insn "truncsi2"): Emit mov.u32
instead of cvt.u32.u32.

---
 gcc/config/nvptx/nvptx.md | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 035f6e0151b..ccbcd096fd1 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -383,9 +383,13 @@ (define_insn "truncsi2"
   [(set (match_operand:QHIM 0 "nvptx_nonimmediate_operand" "=R,m")
(truncate:QHIM (match_operand:SI 1 "nvptx_register_operand" "R,R")))]
   ""
-  "@
-   %.\\tcvt%t0.u32\\t%0, %1;
-   %.\\tst%A0.u%T0\\t%0, %1;"
+  {
+if (which_alternative == 1)
+  return "%.\\tst%A0.u%T0\\t%0, %1;";
+if (GET_MODE (operands[0]) == QImode)
+  return "%.\\tmov%t0\\t%0, %1;";
+return "%.\\tcvt%t0.u32\\t%0, %1;";
+  }
   [(set_attr "subregs_ok" "true")])
 
 (define_insn "truncdi2"
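
The selection logic in the new output block can be modelled outside the machine description roughly as follows.  This is a sketch only: Mode and trunc_mnemonic are invented for illustration and return just the opcode mnemonic, not the full output template.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the destination modes of the pattern.
enum class Mode { QI, HI };

// Mirrors the order of the checks in the C block above.
std::string
trunc_mnemonic (int which_alternative, Mode dest_mode)
{
  if (which_alternative == 1)
    return "st";   // memory destination: store
  if (dest_mode == Mode::QI)
    return "mov";  // register destination, QImode: plain move
  return "cvt";    // register destination, other modes: convert
}

void
demo ()
{
  assert (trunc_mnemonic (0, Mode::QI) == "mov");
  assert (trunc_mnemonic (0, Mode::HI) == "cvt");
  assert (trunc_mnemonic (1, Mode::QI) == "st");
}
```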


Re: [PATCH] Fix build of ppc64 target.

2020-10-01 Thread Segher Boessenkool
On Thu, Oct 01, 2020 at 08:59:12PM +0200, Martin Liška wrote:
> Since a889e06ac68 the following fails.
> 
> In file included from ../../gcc/tree-ssa-propagate.h:25:0,
>  from ../../gcc/config/rs6000/rs6000.c:78:
> ../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
>virtual bool range_of_expr (irange , tree name, gimple * = NULL) = 0;
>^~
> ../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
>virtual bool range_on_edge (irange , edge, tree name);
>^~
> ../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
>virtual bool range_of_stmt (irange , gimple *, tree name = NULL);
>^~
> In file included from ../../gcc/tree-ssa-propagate.h:25:0,
>  from ../../gcc/config/rs6000/rs6000-call.c:67:
> ../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
>virtual bool range_of_expr (irange , tree name, gimple * = NULL) = 0;
>^~
> ../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
>virtual bool range_on_edge (irange , edge, tree name);
>^~
> ../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
>virtual bool range_of_stmt (irange , gimple *, tree name = NULL);
> 
> Ready for master?

This is okay for trunk, thanks!  (It is trivial and obvious as well, so
please just commit things like this without prior approval.)


Segher


>   * config/rs6000/rs6000-call.c: Include value-range.h.
>   * config/rs6000/rs6000.c: Likewise.


GCC 10 backports

2020-10-01 Thread Martin Liška

I'm going to install the following 3 tested backports.

Martin
From 0d91a9613ca1c4b8b11d668a1b8e1a6a37c41b7a Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 25 Sep 2020 16:21:34 +0200
Subject: [PATCH 3/3] gcov: fix streaming of HIST_TYPE_IOR histogram type.

gcc/ChangeLog:

	PR gcov-profile/64636
	* value-prof.c (stream_out_histogram_value): Allow negative
	values for HIST_TYPE_IOR.

(cherry picked from commit 1921ebcaf6467996aede69e1bbe32400d8a20fe7)
---
 gcc/value-prof.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 45677be46b1..9d60b56c43a 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -332,7 +332,10 @@ stream_out_histogram_value (struct output_block *ob, histogram_value hist)
   /* When user uses an unsigned type with a big value, constant converted
 	 to gcov_type (a signed type) can be negative.  */
   gcov_type value = hist->hvalue.counters[i];
-  if (hist->type == HIST_TYPE_TOPN_VALUES)
+  if (hist->type == HIST_TYPE_TOPN_VALUES
+	  || hist->type == HIST_TYPE_IOR)
+	/* Note that the IOR counter tracks pointer values and these can have
+	   sign bit set.  */
 	;
   else
 	gcc_assert (value >= 0);
-- 
2.28.0
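
To illustrate why the assert had to be relaxed, here is a small stand-alone sketch.  counter_t is an assumed stand-in for gcov_type, and to_counter / counter_is_valid are invented helpers, not functions from value-prof.c.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-in for gcov_type: a signed 64-bit counter.
using counter_t = std::int64_t;

// Store a raw unsigned value (for example a pointer-derived bit pattern)
// in the signed counter.  The conversion is modular since C++20 and
// implementation-defined before that; GCC wraps.
counter_t
to_counter (std::uint64_t raw)
{
  return static_cast<counter_t> (raw);
}

// The relaxed check: kinds that may carry the sign bit skip the
// non-negativity test.
bool
counter_is_valid (counter_t value, bool may_have_sign_bit)
{
  return may_have_sign_bit || value >= 0;
}

void
demo ()
{
  counter_t c = to_counter (0xffff800000000000ULL);
  assert (c < 0);
  assert (!counter_is_valid (c, false));
  assert (counter_is_valid (c, true));
}
```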

From 4c2be1627f6b78dff9209b979b26080f5b929d89 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 24 Sep 2020 13:34:13 +0200
Subject: [PATCH 2/3] switch conversion: make a rapid speed up

gcc/ChangeLog:

	PR tree-optimization/96979
	* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
	Make a fast bail out.
	(bit_test_cluster::can_be_handled): Likewise here.
	* tree-switch-conversion.h (get_range): Use wi::to_wide instead
	of a folding.

gcc/testsuite/ChangeLog:

	PR tree-optimization/96979
	* g++.dg/tree-ssa/pr96979.C: New test.

(cherry picked from commit e46858e445d35ca4a7df1996186fe884879b)
---
 gcc/testsuite/g++.dg/tree-ssa/pr96979.C | 48 +
 gcc/tree-switch-conversion.c| 37 ++-
 gcc/tree-switch-conversion.h|  7 ++--
 3 files changed, 79 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr96979.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr96979.C b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
new file mode 100644
index 000..ec0f57a8548
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr96979.C
@@ -0,0 +1,48 @@
+/* PR tree-optimization/96979 */
+/* { dg-do compile } */
+/* { dg-options "-std=c++17 -O2" } */
+
+using u64 = unsigned long long;
+
+constexpr inline u64
+foo (const char *str) noexcept
+{
+  u64 value = 0xcbf29ce484222325ULL;
+  for (u64 i = 0; str[i]; i++)
+value = (value ^ u64(str[i])) * 0x10001b3ULL;
+  return value;
+}
+
+struct V
+{
+  enum W
+  {
+#define A(n) n,
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
+#define E D(foo1) D(foo2) D(foo3)
+E
+last
+  };
+
+  constexpr static W
+  bar (const u64 h) noexcept
+  {
+switch (h)
+  {
+#undef A
+#define F(n) #n
+#define A(n) case foo (F(n)): return n;
+E
+  }
+return last;
+  }
+};
+
+int
+baz (const char *s)
+{
+  const u64 h = foo (s);
+  return V::bar (h);
+}
diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index bf910dd62b5..8da1be1cd99 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1271,6 +1271,18 @@ jump_table_cluster::can_be_handled (const vec ,
   if (range == 0)
 return false;
 
+  if (range > HOST_WIDE_INT_M1U / 100)
+return false;
+
+  unsigned HOST_WIDE_INT lhs = 100 * range;
+  if (lhs < range)
+return false;
+
+  /* First make quick guess as each cluster
+ can add at maximum 2 to the comparison_count.  */
+  if (lhs > 2 * max_ratio * (end - start + 1))
+return false;
+
   unsigned HOST_WIDE_INT comparison_count = 0;
   for (unsigned i = start; i <= end; i++)
 {
@@ -1278,10 +1290,6 @@ jump_table_cluster::can_be_handled (const vec ,
   comparison_count += sc->m_range_p ? 2 : 1;
 }
 
-  unsigned HOST_WIDE_INT lhs = 100 * range;
-  if (lhs < range)
-return false;
-
   return lhs <= max_ratio * comparison_count;
 }
 
@@ -1367,12 +1375,12 @@ bit_test_cluster::can_be_handled (unsigned HOST_WIDE_INT range,
 {
   /* Check overflow.  */
   if (range == 0)
-return 0;
+return false;
 
   if (range >= GET_MODE_BITSIZE (word_mode))
 return false;
 
-  return uniq <= 3;
+  return uniq <= m_max_case_bit_tests;
 }
 
 /* Return true when cluster starting at START and ending at END (inclusive)
@@ -1382,6 +1390,7 @@ bool
 bit_test_cluster::can_be_handled (const vec ,
   unsigned start, unsigned end)
 {
+  auto_vec dest_bbs;
   /* For algorithm correctness, bit test for a single case must return
  true.  We bail out in 
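
The fast bail-out added to jump_table_cluster::can_be_handled above can be sketched in isolation like this.  It is a simplified model: jump_table_ok is an invented name, the clusters are reduced to a vector of "is this a range case" flags, and max_ratio is passed in as a plain parameter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// is_range_case[i] is true when case i covers a range (costs two
// comparisons) and false when it is a single value (costs one).
bool
jump_table_ok (std::uint64_t range, const std::vector<bool> &is_range_case,
               std::uint64_t max_ratio)
{
  if (range == 0)
    return false;

  // 100 * range would overflow: reject before multiplying.
  if (range > UINT64_MAX / 100)
    return false;
  std::uint64_t lhs = 100 * range;

  // Quick guess: each case adds at most 2 to the comparison count, so
  // if even that upper bound is too small the loop can be skipped.
  if (lhs > 2 * max_ratio * is_range_case.size ())
    return false;

  std::uint64_t comparison_count = 0;
  for (bool is_range : is_range_case)
    comparison_count += is_range ? 2 : 1;

  return lhs <= max_ratio * comparison_count;
}

void
demo ()
{
  std::vector<bool> singles (10, false);
  assert (jump_table_ok (10, singles, 800));     // dense enough
  assert (!jump_table_ok (1000, singles, 800));  // rejected by quick guess
  assert (!jump_table_ok (UINT64_MAX, singles, 800));
}
```

The value 800 used in the checks is just an example ratio chosen for the sketch.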

[PATCH] Fix build of ppc64 target.

2020-10-01 Thread Martin Liška

Since a889e06ac68 the following fails.

In file included from ../../gcc/tree-ssa-propagate.h:25:0,
 from ../../gcc/config/rs6000/rs6000.c:78:
../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
   virtual bool range_of_expr (irange , tree name, gimple * = NULL) = 0;
   ^~
../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
   virtual bool range_on_edge (irange , edge, tree name);
   ^~
../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
   virtual bool range_of_stmt (irange , gimple *, tree name = NULL);
   ^~
In file included from ../../gcc/tree-ssa-propagate.h:25:0,
 from ../../gcc/config/rs6000/rs6000-call.c:67:
../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
   virtual bool range_of_expr (irange , tree name, gimple * = NULL) = 0;
   ^~
../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
   virtual bool range_on_edge (irange , edge, tree name);
   ^~
../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
   virtual bool range_of_stmt (irange , gimple *, tree name = NULL);

Ready for master?

Thanks,
Martin

gcc/ChangeLog:

* config/rs6000/rs6000-call.c: Include value-range.h.
* config/rs6000/rs6000.c: Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 1 +
 gcc/config/rs6000/rs6000.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index a8b520834c7..d10119bd6bf 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -64,6 +64,7 @@
 #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
 #endif
 #include "ppc-auxv.h"
+#include "value-range.h"
 #include "tree-ssa-propagate.h"
 #include "tree-vrp.h"
 #include "tree-ssanames.h"
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 375fff59928..6a05f84a021 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -75,6 +75,7 @@
 #endif
 #include "case-cfn-macros.h"
 #include "ppc-auxv.h"
+#include "value-range.h"
 #include "tree-ssa-propagate.h"
 #include "tree-vrp.h"
 #include "tree-ssanames.h"
--
2.28.0



Re: [RS6000] -mno-minimal-toc vs. power10 pcrelative

2020-10-01 Thread Segher Boessenkool
Hi!

On Thu, Oct 01, 2020 at 10:57:48PM +0930, Alan Modra wrote:
> On Wed, Sep 30, 2020 at 03:56:32PM -0500, Segher Boessenkool wrote:
> > On Wed, Sep 30, 2020 at 05:01:45PM +0930, Alan Modra wrote:
> > >   * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
> > >   set -mcmodel=small for -mno-minimal-toc when pcrel.
> > 
> > > -   SET_CMODEL (CMODEL_SMALL);\
> > > +   if (TARGET_MINIMAL_TOC\
> > > +   || !(TARGET_PCREL \
> > > +|| (PCREL_SUPPORTED_BY_OS\
> > > +&& (rs6000_isa_flags_explicit\
> > > +& OPTION_MASK_PCREL) == 0))) \
> > > + SET_CMODEL (CMODEL_SMALL);  \
> > 
> > Please write this in a more readable way?  With some "else" statements,
> > perhaps.
> > 
> > It is also fine to SET_CMODEL twice if that makes for simpler code.
> 
> Committed as per your suggestion.

Thanks.

> I was looking at it again today
> with the aim of converting this ugly macro to a function, and spotted
> the duplication in freebsd64.h.  Which has some bit-rot.
> 
> Do you like the following?  rs6000_linux64_override_options is
> functionally the same as what was in linux64.h.  I built various
> configurations to test the change, powerpc64-linux, powerpc64le-linux
> without any 32-bit targets enabled, powerpc64-freebsd12.0.

Please do this as two patches?  One the refactoring without any
functional changes (which is pre-approved -- the name "linux64" isn't
great if you use it in other OSes as well, but I cannot think of a
better name either), and the other the actual change (which probably is
fine as well, but it is hard to see with the patch like this).

Thanks,


Segher


Re: [PATCH] Put absolute address jump table in data.rel.ro.local if targets support relocations

2020-10-01 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review.

HAO CHEN GUI via Gcc-patches  writes:
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index 513fc5fe295..6f5bf8d7d73 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -9315,10 +9315,10 @@ mips_select_rtx_section (machine_mode mode, rtx x,
> default_function_rodata_section.  */
>  
>  static section *
> -mips_function_rodata_section (tree decl)
> +mips_function_rodata_section (tree decl, bool relocatable ATTRIBUTE_UNUSED)

Now that we're C++, it's more idiomatic to leave off the parameter name:

  mips_function_rodata_section (tree decl, bool)

Same for the rest of the patch.
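
For anyone unfamiliar with the idiom, a minimal stand-alone sketch follows.  The names (default_section, target_section) and the use of int and std::string in place of tree and section * are made up for the example.

```cpp
#include <cassert>
#include <string>

// Default implementation: uses both parameters.
std::string
default_section (int decl, bool relocatable)
{
  std::string base = relocatable ? ".data.rel.ro.local" : ".rodata";
  return base + "." + std::to_string (decl);
}

// Target override that ignores the second argument.  Leaving the
// parameter unnamed documents that it is unused and avoids an
// unused-parameter warning without any attribute.
std::string
target_section (int decl, bool)
{
  return ".rodata." + std::to_string (decl);
}

void
demo ()
{
  assert (default_section (1, true) == ".data.rel.ro.local.1");
  assert (target_section (1, true) == ".rodata.1");
}
```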

> @@ -2491,9 +2491,19 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
> if (! JUMP_TABLES_IN_TEXT_SECTION)
>   {
> int log_align;
> +   bool relocatable;
> +
> +   relocatable = 0;

Very minor, but simpler as:

   bool relocatable = false;

Same for the later hunk.

> @@ -549,16 +549,17 @@ Whatever the actual target object format, this is often good enough.",
>   void, (tree decl, int reloc),
>   default_unique_section)
>  
> -/* Return the readonly data section associated with function DECL.  */
> +/* Return the readonly or relocated readonly data section
> +   associated with function DECL.  */
>  DEFHOOK
>  (function_rodata_section,
> - "Return the readonly data section associated with\n\
> + "Return the readonly or reloc readonly data section associated with\n\
>  @samp{DECL_SECTION_NAME (@var{decl})}.\n\

Maybe add “; @var{relocatable} selects the latter over the former.”

>  The default version of this function selects @code{.gnu.linkonce.r.name} if\n\
>  the function's section is @code{.gnu.linkonce.t.name}, @code{.rodata.name}\n\
> -if function is in @code{.text.name}, and the normal readonly-data section\n\
> -otherwise.",
> - section *, (tree decl),
> +or @code{.data.rel.ro.name} if function is in @code{.text.name}, and\n\
> +the normal readonly-data or reloc readonly data section otherwise.",
> + section *, (tree decl, bool relocatable),
>   default_function_rodata_section)
>  
>  /* Nonnull if the target wants to override the default ".rodata" prefix
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index 4070f9c17e8..91ab75aed06 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -726,12 +726,26 @@ switch_to_other_text_partition (void)
>switch_to_section (current_function_section ());
>  }
>  
> -/* Return the read-only data section associated with function DECL.  */
> +/* Return the read-only or relocated read-only data section
> +   associated with function DECL.  */
>  
>  section *
> -default_function_rodata_section (tree decl)
> +default_function_rodata_section (tree decl, bool relocatable)
>  {
> -  if (decl != NULL_TREE && DECL_SECTION_NAME (decl))
> +  const char* sname;
> +  unsigned int flags;
> +
> +  flags = 0;
> +
> +  if (relocatable)
> +{
> +  sname = ".data.rel.ro.local";
> +  flags = (SECTION_WRITE | SECTION_RELRO);
> +}
> +  else
> +sname = ".rodata";
> +
> +  if (decl && DECL_SECTION_NAME (decl))
>  {
>const char *name = DECL_SECTION_NAME (decl);
>  
> @@ -744,12 +758,12 @@ default_function_rodata_section (tree decl)
> dot = strchr (name + 1, '.');
> if (!dot)
>   dot = name;
> -   len = strlen (dot) + 8;
> +   len = strlen (dot) + strlen (sname) + 1;
> rname = (char *) alloca (len);
>  
> -   strcpy (rname, ".rodata");
> +   strcpy (rname, sname);
> strcat (rname, dot);
> -   return get_section (rname, SECTION_LINKONCE, decl);
> +   return get_section (rname, (SECTION_LINKONCE | flags), decl);
>   }
>/* For .gnu.linkonce.t.foo we want to use .gnu.linkonce.r.foo.  */
>else if (DECL_COMDAT_GROUP (decl)
> @@ -767,15 +781,18 @@ default_function_rodata_section (tree decl)
>  && strncmp (name, ".text.", 6) == 0)
>   {
> size_t len = strlen (name) + 1;
> -   char *rname = (char *) alloca (len + 2);
> +   char *rname = (char *) alloca (len + strlen (sname) - 5);
>  
> -   memcpy (rname, ".rodata", 7);
> -   memcpy (rname + 7, name + 5, len - 5);
> -   return get_section (rname, 0, decl);
> +   memcpy (rname, sname, strlen (sname));
> +   memcpy (rname + strlen (sname), name + 5, len - 5);
> +   return get_section (rname, flags, decl);
>   }
>  }

Don't we need to handle the .gnu.linkonce.t. case too?  I believe
the suffix there is “.d.rel.ro.local” (replacing “.t”).
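
For reference, the ".text.name" mapping done by the memcpy calls in the last hunk can be written with std::string as below.  This is a sketch with an invented function name; it covers only the ".text." case from the patch, and the fallback of returning the bare prefix is a simplification.  The .gnu.linkonce.t. case is deliberately left out because the thread has not settled it.

```cpp
#include <cassert>
#include <string>

// Map a function's text section name to the matching read-only (or
// relocatable read-only) data section name.
std::string
rodata_section_name (const std::string &text_name, bool relocatable)
{
  const std::string prefix = relocatable ? ".data.rel.ro.local" : ".rodata";
  const std::string text = ".text.";

  // ".text.foo" -> prefix + ".foo": drop ".text" but keep the dot
  // that introduces the function-specific suffix.
  if (text_name.compare (0, text.size (), text) == 0)
    return prefix + text_name.substr (text.size () - 1);

  return prefix;
}

void
demo ()
{
  assert (rodata_section_name (".text.foo", false) == ".rodata.foo");
  assert (rodata_section_name (".text.foo", true)
          == ".data.rel.ro.local.foo");
}
```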

My main concern is how this interacts with non-ELF targets.
It looks like AIX/XCOFF, Darwin and Cygwin already pick
default_no_function_rodata_section, so they should be fine.
But at the moment, all the fancy stuff in default_function_rodata_section
is indirectly guarded by targetm_common.have_named_sections, with the
hook falling back to readonly_data_section if the function isn't in a
named section.

So I 

[PATCH] ipa-prop: Fix multiple-target speculation resolution

2020-10-01 Thread Martin Jambor
Hi,

as the FIXME which this patch removes states, the current code does
not work when a call with multiple speculative targets gets resolved
through parameter tracking during inlining - it feeds the inliner an
edge it has already dealt with.  The patch makes the code that should
prevent this aware that a speculation can now have more than one
target.
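
The idea can be sketched outside the call graph code like this.  It is an illustration only: Edge, collect_targets and already_speculated are invented names, callees are plain integers, and std::vector with std::find stands in for the vector type and its contains method used in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a speculative direct edge: the callee it
// targets and a link to the next speculative target of the same call.
struct Edge
{
  int callee;
  const Edge *next_speculative;
};

// Walk the whole chain instead of looking at the first target only.
std::vector<int>
collect_targets (const Edge *first)
{
  std::vector<int> targets;
  for (const Edge *e = first; e; e = e->next_speculative)
    targets.push_back (e->callee);
  return targets;
}

// True when a newly resolved callee matches any speculative target.
bool
already_speculated (const std::vector<int> &targets, int callee)
{
  return std::find (targets.begin (), targets.end (), callee)
         != targets.end ();
}

void
demo ()
{
  Edge c = { 3, nullptr };
  Edge b = { 2, &c };
  Edge a = { 1, &b };
  std::vector<int> targets = collect_targets (&a);
  // Comparing against the first target alone would miss callee 3.
  assert (a.callee != 3);
  assert (already_speculated (targets, 3));
}
```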

Bootstrapped and tested and LTO bootstrapped on x86_64-linux.  I did not
try profiled LTO bootstrap because it fails even without the patch (even
without Ada, just C, C++ and Fortran, at least commit 92f0d3d03a7 does).
OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2020-09-30  Martin Jambor  

PR ipa/96394
* ipa-prop.c (update_indirect_edges_after_inlining): Do not add
resolved speculation edges to vector of new direct edges even in
presence of multiple speculative direct edges for a single call.

gcc/testsuite/ChangeLog:

2020-09-30  Martin Jambor  

PR ipa/96394
* gcc.dg/tree-prof/pr96394.c: New test.
---
 gcc/ipa-prop.c   | 10 ++--
 gcc/testsuite/gcc.dg/tree-prof/pr96394.c | 64 
 2 files changed, 70 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/pr96394.c

diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index b28c78eeab4..2ef7a48c5d2 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -3787,11 +3787,13 @@ update_indirect_edges_after_inlining (struct cgraph_edge *cs,
 
   param_index = ici->param_index;
   jfunc = ipa_get_ith_jump_func (top, param_index);
-  cgraph_node *spec_target = NULL;
 
-  /* FIXME: This may need updating for multiple calls.  */
+  auto_vec spec_targets;
   if (ie->speculative)
-   spec_target = ie->first_speculative_call_target ()->callee;
+   for (cgraph_edge *direct = ie->first_speculative_call_target ();
+direct;
+direct = direct->next_speculative_call_target ())
+ spec_targets.safe_push (direct->callee);
 
   if (!opt_for_fn (node->decl, flag_indirect_inlining))
new_direct_edge = NULL;
@@ -3814,7 +3816,7 @@ update_indirect_edges_after_inlining (struct cgraph_edge *cs,
 
   /* If speculation was removed, then we need to do nothing.  */
   if (new_direct_edge && new_direct_edge != ie
- && new_direct_edge->callee == spec_target)
+ && spec_targets.contains (new_direct_edge->callee))
{
  new_direct_edge->indirect_inlining_edge = 1;
  top = IPA_EDGE_REF (cs);
diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr96394.c b/gcc/testsuite/gcc.dg/tree-prof/pr96394.c
new file mode 100644
index 000..4280182a7c3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr96394.c
@@ -0,0 +1,64 @@
+/* PR ipa/96394 */
+/* { dg-options "-O2" } */
+
+typedef struct _entry {
+int has_next;
+int next_ix;
+int count;
+} entry;
+
+extern entry table[];
+
+void *
+__attribute__((noipa))
+PyErr_Format(entry * e){ return 0; }
+
+void ae(entry *);
+int h(entry *);
+int ap(entry *);
+int ag(entry *);
+
+int ag(entry *j) {
+  if (j->has_next)
+    h(&table[j->next_ix]);
+  return 0;
+}
+static int ai(entry *j, int k(entry *), int l, int m) {
+  int am = 1;
+  int ab;
+
+  /* k is either 'h' or 'ap': 50%/50% */
+  ab = k(j);
+
+  /* loop never gets executed on real data */
+  for (; j->count >= 2; am += 2)
+if (l) {
+      entry *i = &table[am + m];
+  PyErr_Format(i);
+}
+  return ab;
+}
+void
+__attribute__((noipa))
+bug() {
+  h(table);
+  h(table);
+}
+int h(entry *j) { return ai(j, ap, 4, 5); }
+int ap(entry *j) { return ai(j, ag, 14, 4); }
+
+int main(void)
+{
+bug();
+}
+
+entry table[2] = {
+{ .has_next = 1
+, .next_ix  = 1
+, .count= 0
+},
+{ .has_next = 0
+, .next_ix  = 0
+, .count= 0
+},
+};
-- 
2.28.0



Re: [PATCH] generalized range_query class for multiple contexts

2020-10-01 Thread Martin Sebor via Gcc-patches

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand
 >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take
 >>> a directive argument, as dir->info.callstmt.  There should be no need
 >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on
 >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf
 >>> pass (and maybe also strlen which also seems to use the same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant?  (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a constant has been
 >>> -    substituted earlier for VAL by this pass that hasn't been
 >>> -    propagated through the CFG.  This shoud be fixed by the new
 >>> -    on-demand VRP if/when it becomes available (hopefully in
 >>> -    GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not clear whether Andrew approved all 3 parts of the patchset
 >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe Aldy has all
 > the code for converting to a ranger, but we have not pursued that any
 > further yet since there is a regression due to our lack of equivalence
 > processing I think?  That should be resolved in the coming month, but at
 > the moment is a holdback/concern for converting these passes...  iirc.

Yes.  Martin, the takeaway here is that the strlen/sprintf pass has 
been converted to the new API, but ranger is still not up and running on 
it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger.  That's it.
Below is an untested patch that would convert you to a ranger once it's 
contributed.


IIRC when I enabled the ranger for your pass a while back, there was one 
or two regressions due to missing equivalences, and the rest were 
because the tests were expecting an actual specific range, and the 
ranger returned a slightly different/better one.  You'll need to adjust 
your tests.


Ack.  I'll be on the lookout for the ranger commit (if you happen
to remember and CC me on it just in case I might miss it that would
be great).

Thanks
Martin



Aldy

diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index f4d1c5ca256..9f9e95b7155 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -58,8 +58,8 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-ssa-loop.h"
  #include "tree-scalar-evolution.h"
  #include "vr-values.h"
-#include "gimple-ssa-evrp-analyze.h"
  #include "tree-ssa.h"
+#include "gimple-range.h"

  /* A vector indexed by SSA_NAME_VERSION.  0 means unknown, positive value
     is an index into strinfo vector, negative value stands for
@@ -5755,16 

[PATCH] c++: Verify 'this' of NS member functions in constexpr [PR97230]

2020-10-01 Thread Marek Polacek via Gcc-patches
This PR points out that when we're invoking a non-static member function
on a null instance during constant evaluation, we should reject.
cxx_eval_call_expression calls cxx_bind_parameters_in_call which
evaluates function arguments, but it won't detect problems like these.

Well, ok, so use integer_zerop to detect a null 'this'.  This also
detects member calls on a variable whose lifetime has ended, because
check_return_expr creates an artificial nullptr:
  else if (!processing_template_decl
           && maybe_warn_about_returning_address_of_local (retval, loc)
           && INDIRECT_TYPE_P (valtype))
    retval = build2 (COMPOUND_EXPR, TREE_TYPE (retval), retval,
                     build_zero_cst (TREE_TYPE (retval)));
It would be great if we could somehow distinguish between those two
cases, but experiments with setting TREE_THIS_VOLATILE on the zero
didn't work, so I left it be.

But by the same token, we should detect out-of-bounds accesses.  For
this I'm (ab)using eval_and_check_array_index so that I don't have
to reimplement bounds checking yet again.  But this only works for
ARRAY_REFs, so won't detect

  X x;
  (&x)[0].foo(); // ok
  (&x)[1].foo(); // bad

so I've added a special handling of POINTER_PLUS_EXPRs.

While here, we should also detect using an inactive union member.  For
that, I'm using cxx_eval_component_reference.
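
For readers following along, the three situations above can be illustrated
with a small standalone sketch.  X, U and foo are hypothetical names chosen
for the illustration and are not taken from the new test; the commented-out
lines are the kinds of member calls the patch is meant to reject.

```cpp
// Illustrative only: X, U and foo are hypothetical names.
struct X
{
  constexpr int foo () const { return 42; }
};

union U
{
  X a;
  X b;
  constexpr U () : a () {}   // 'a' is the active member
};

constexpr X x{};
constexpr U u;

// Valid member calls: accepted in a constant expression.
constexpr int on_object = x.foo ();
constexpr int on_elt0 = (&x)[0].foo ();
constexpr int on_active = u.a.foo ();

static_assert (on_object == 42 && on_elt0 == 42 && on_active == 42,
               "valid member calls evaluate normally");

// Each of these should be rejected when used as a constant expression:
//   constexpr int on_null = static_cast<const X *> (nullptr)->foo ();
//   constexpr int on_elt1 = (&x)[1].foo ();   // one past the end
//   constexpr int on_inactive = u.b.foo ();   // 'b' is not the active member
```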

Does this approach seem sensible?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97230
* constexpr.c (eval_and_check_array_index): Forward declare.
(cxx_eval_component_reference): Likewise.
(cxx_eval_call_expression): Verify the 'this' pointer for
non-static member functions.

gcc/testsuite/ChangeLog:

PR c++/97230
* g++.dg/cpp0x/constexpr-member-fn1.C: New test.
---
 gcc/cp/constexpr.c| 72 ++-
 .../g++.dg/cpp0x/constexpr-member-fn1.C   | 44 
 2 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-member-fn1.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index a118f8a810b..f62f37ce384 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2181,6 +2181,11 @@ cxx_eval_thunk_call (const constexpr_ctx *ctx, tree t, tree thunk_fndecl,
   non_constant_p, overflow_p);
 }
 
+static tree eval_and_check_array_index (const constexpr_ctx *, tree, bool,
+   bool *, bool *);
+static tree cxx_eval_component_reference (const constexpr_ctx *, tree,
+ bool, bool *, bool *);
+
 /* Subroutine of cxx_eval_constant_expression.
Evaluate the call expression tree T in the context of OLD_CALL expression
evaluation.  */
@@ -2467,6 +2472,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
   if (*non_constant_p)
 return t;
 
+  tree result = NULL_TREE;
   depth_ok = push_cx_call_context (t);
 
   /* Remember the object we are constructing.  */
@@ -2496,8 +2502,72 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
new_obj = NULL_TREE;
}
 }
+   /* Verify that the object we're invoking the function on is sane.  */
+  else if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fun)
+  /* maybe_add_lambda_conv_op creates a null 'this' pointer.  */
+  && !LAMBDA_TYPE_P (CP_DECL_CONTEXT (fun)))
+{
+  tree thisarg = TREE_VEC_ELT (new_call.bindings, 0);
+  if (integer_zerop (thisarg))
+   {
+ if (!ctx->quiet)
+   error_at (loc, "member call on null pointer is not allowed "
+ "in a constant expression");
+ *non_constant_p = true;
+ result = error_mark_node;
+   }
+  else
+   {
+ STRIP_NOPS (thisarg);
+ if (TREE_CODE (thisarg) == ADDR_EXPR)
+   thisarg = TREE_OPERAND (thisarg, 0);
+ /* Detect out-of-bounds accesses.  */
+ if (TREE_CODE (thisarg) == ARRAY_REF)
+   {
+ eval_and_check_array_index (ctx, thisarg, /*allow_one_past*/false,
+ non_constant_p, overflow_p);
+ if (*non_constant_p)
+   result = error_mark_node;
+   }
+ /* Detect using an inactive member of a union.  */
+ else if (TREE_CODE (thisarg) == COMPONENT_REF)
+   {
+ cxx_eval_component_reference (ctx, thisarg, /*lval*/false,
+   non_constant_p, overflow_p);
+ if (*non_constant_p)
+   result = error_mark_node;
+   }
+ /* Detect other invalid accesses like
 
-  tree result = NULL_TREE;
+  X x;
+  (&x)[1].foo();
+
+where we'll end up with &x p+ 1.  */
+ else if (TREE_CODE (thisarg) == POINTER_PLUS_EXPR)
+   {
+ tree op0 = 

Re: [committed] aarch64: Tweak movti and movtf patterns

2020-10-01 Thread Richard Sandiford via Gcc-patches
Christophe Lyon  writes:
> On Wed, 30 Sep 2020 at 12:53, Richard Sandiford via Gcc-patches
>  wrote:
>>
>> movti lacked a way of zeroing an FPR, meaning that we'd do:
>>
>> mov x0, 0
>> mov x1, 0
>> fmov d0, x0
>> fmov v0.d[1], x1
>>
>> instead of just:
>>
>> movi v0.2d, #0
>>
>> movtf had the opposite problem for GPRs: we'd generate:
>>
>> movi v0.2d, #0
>> fmov x0, d0
>> fmov x1, v0.d[1]
>>
>> instead of just:
>>
>> mov x0, 0
>> mov x1, 0
>>
>> Also, there was an unnecessary earlyclobber on the GPR<-GPR movtf
>> alternative (but not the movti one).  The splitter handles overlap
>> correctly.
>>
>> The TF splitter used aarch64_reg_or_imm, but the _imm part only
>> accepts integer constants, not floating-point ones.  The patch
>> changes it to nonmemory_operand instead.
>>
>> Tested on aarch64-linux-gnu, pushed.
>>
>> Richard
>>
>>
>> gcc/
>> * config/aarch64/aarch64.c (aarch64_split_128bit_move_p): Add a
>> function comment.  Tighten check for FP moves.
>> * config/aarch64/aarch64.md (*movti_aarch64): Add a w<-Z alternative.
>> (*movtf_aarch64): Handle r<-Y like r<-r.  Remove unnecessary
>> earlyclobber.  Change splitter predicate from aarch64_reg_or_imm
>> to nonmemory_operand.
>>
>> gcc/testsuite/
>> * gcc.target/aarch64/movtf_1.c: New test.
>> * gcc.target/aarch64/movti_1.c: Likewise.
>
> Sorry to bother you, the new tests fail with -mabi=ilp32 :-(
> gcc.target/aarch64/movtf_1.c check-function-bodies load_q
> gcc.target/aarch64/movtf_1.c check-function-bodies load_x
> gcc.target/aarch64/movtf_1.c check-function-bodies store_q
> gcc.target/aarch64/movtf_1.c check-function-bodies store_x
> gcc.target/aarch64/movti_1.c check-function-bodies load_q
> gcc.target/aarch64/movti_1.c check-function-bodies load_x
> gcc.target/aarch64/movti_1.c check-function-bodies store_q
> gcc.target/aarch64/movti_1.c check-function-bodies store_x

Gah, should know by now that check-function-bodies and ilp32 don't
go well together.

I've applied the below, sorry for the breakage.  Hope I don't
introduce another slew of failures with the arm vcond stuff…

Richard


gcc/testsuite/
* gcc.target/aarch64/movtf_1.c: Restrict the asm matching to lp64.
* gcc.target/aarch64/movti_1.c: Likewise.
---
 gcc/testsuite/gcc.target/aarch64/movtf_1.c | 2 +-
 gcc/testsuite/gcc.target/aarch64/movti_1.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/movtf_1.c b/gcc/testsuite/gcc.target/aarch64/movtf_1.c
index 570de931389..b975b208019 100644
--- a/gcc/testsuite/gcc.target/aarch64/movtf_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/movtf_1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O" } */
-/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
 
 /*
 ** zero_q:
diff --git a/gcc/testsuite/gcc.target/aarch64/movti_1.c b/gcc/testsuite/gcc.target/aarch64/movti_1.c
index 160e1acd281..5595b3e6f02 100644
--- a/gcc/testsuite/gcc.target/aarch64/movti_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/movti_1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O" } */
-/* { dg-final { check-function-bodies "**" "" } } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
 
 /*
 ** zero_q:


[PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-01 Thread Alejandro Colomar via Gcc-patches
Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 40 
 1 file changed, 40 insertions(+)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index e545aa1a0..5f9aa648f 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -40,6 +40,8 @@ system_data_types \- overview of system data types
 .\"* Description (no "Description" header)
 .\"A few lines describing the type.
 .\"
+.\"* Versions (optional)
+.\"
 .\"* Conforming to (see NOTES)
 .\"Format: CXY and later; POSIX.1- and later.
 .\"
@@ -48,6 +50,44 @@ system_data_types \- overview of system data types
 .\"* Bugs (if any)
 .\"
 .\"* See also
+.\"- __int128 -/
+.TP
+.I __int128
+.RS
+.RI [ signed ]
+.I __int128
+.PP
+A signed integer type
+of a fixed width of exactly 128 bits.
+.PP
+When using GCC,
+it is supported only for targets where
+the compiler is able to generate efficient code for 128-bit arithmetic.
+.PP
+Versions:
+GCC 4.6.0 and later.
+.PP
+Conforming to:
+This is a non-standard extension, present in GCC.
+It is not standardized by the C language standard nor POSIX.
+.PP
+Notes:
+This type is available without including any header.
+.PP
+Bugs:
+It is not possible to express an integer constant of type
+.I __int128
+in implementations where
+.I long long
+is less than 128 bits wide.
+.PP
+See also the
+.IR intmax_t ,
+.IR int N _t
+and
+.I unsigned __int128
+types in this page.
+.RE
 .\"- aiocb /
 .TP
 .I aiocb
-- 
2.28.0
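
Note on the "Bugs" paragraph in the patch above: since no integer-constant
syntax yields a 128-bit value when long long is narrower, a common workaround
is to compose the value from two 64-bit halves.  The following is a minimal
sketch, assuming GCC (or a compatible compiler) on a target where __int128 is
available; make_u128 and two_pow_100 are hypothetical names used only for
illustration.

```cpp
#include <cstdint>

// Hypothetical helper: build an unsigned 128-bit value from two 64-bit
// halves, because a literal such as 2^100 cannot be written directly.
constexpr unsigned __int128
make_u128 (std::uint64_t hi, std::uint64_t lo)
{
  return (static_cast<unsigned __int128> (hi) << 64) | lo;
}

// 2^100 = 2^36 * 2^64, so the high half is 1 << 36 and the low half is 0.
constexpr unsigned __int128 two_pow_100 = make_u128 (UINT64_C (1) << 36, 0);

static_assert (sizeof (unsigned __int128) == 16, "exactly 128 bits wide");
static_assert (static_cast<std::uint64_t> (two_pow_100 >> 64)
               == (UINT64_C (1) << 36), "high half holds 2^36");
static_assert (static_cast<std::uint64_t> (two_pow_100) == 0,
               "low half is zero");
```

The same composition works for the signed type by converting the result,
provided the value fits.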



[PATCH 2/4] __int128.3: New link to system_data_types(7)

2020-10-01 Thread Alejandro Colomar via Gcc-patches
Signed-off-by: Alejandro Colomar 
---
 man3/__int128.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/__int128.3

diff --git a/man3/__int128.3 b/man3/__int128.3
new file mode 100644
index 0..db50c0f09
--- /dev/null
+++ b/man3/__int128.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0



[PATCH 3/4] system_data_types.7: Add 'unsigned __int128'

2020-10-01 Thread Alejandro Colomar via Gcc-patches
Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index 5f9aa648f..3cf3f0ec9 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -1822,6 +1822,41 @@ and
 .I void *
 types in this page.
 .RE
+.\"- unsigned __int128 /
+.TP
+.I unsigned __int128
+.RS
+An unsigned integer type
+of a fixed width of exactly 128 bits.
+.PP
+When using GCC,
+it is supported only for targets where
+the compiler is able to generate efficient code for 128-bit arithmetic.
+.PP
+Versions:
+GCC 4.6.0 and later.
+.PP
+Conforming to:
+This is a non-standard extension, present in GCC.
+It is not standardized by the C language standard nor POSIX.
+.PP
+Notes:
+This type is available without including any header.
+.PP
+Bugs:
+It is not possible to express an integer constant of type
+.I unsigned __int128
+in implementations where
+.I unsigned long long
+is less than 128 bits wide.
+.PP
+See also the
+.IR __int128 ,
+.I uintmax_t
+and
+.IR uint N _t
+types in this page.
+.RE
 .\"- va_list --/
 .TP
 .I va_list
-- 
2.28.0



[PATCH 0/4] Document 128-bit types

2020-10-01 Thread Alejandro Colomar via Gcc-patches
Hi Michael,

I think this might be ready for a patch.

I'm done for today :-)


Cheers,

Alex

Alejandro Colomar (4):
  system_data_types.7: Add '__int128'
  __int128.3: New link to system_data_types(7)
  system_data_types.7: Add 'unsigned __int128'
  unsigned-__int128.3: New link to system_data_types(7)

 man3/__int128.3  |  1 +
 man3/unsigned-__int128.3 |  1 +
 man7/system_data_types.7 | 75 
 3 files changed, 77 insertions(+)
 create mode 100644 man3/__int128.3
 create mode 100644 man3/unsigned-__int128.3

-- 
2.28.0



[PATCH 4/4] unsigned-__int128.3: New link to system_data_types(7)

2020-10-01 Thread Alejandro Colomar via Gcc-patches
Signed-off-by: Alejandro Colomar 
---
 man3/unsigned-__int128.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/unsigned-__int128.3

diff --git a/man3/unsigned-__int128.3 b/man3/unsigned-__int128.3
new file mode 100644
index 0..db50c0f09
--- /dev/null
+++ b/man3/unsigned-__int128.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0



[PATCH][GCC 9] AArch64: Add prefer_advsimd_autovec internal tune_flag

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

I'd like to add a prefer_advsimd_autovec internal tune_flag that makes GCC pick 
Advanced SIMD over SVE for autovectorisation.
No CPU tuning uses it yet, but I'd like to add this to the GCC 8 and 9 branches 
only as SVE autovectorisation is less mature there and CPUs
may want to prefer Advanced SIMD over SVE when tuning for performance.
This patch provides a minimally invasive way of achieving that.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing on the GCC 9 branch and an equivalent GCC 8 branch patch too.

gcc/
* config/aarch64/aarch64-tuning-flags.def (PREFER_ADVSIMD_AUTOVEC):
Define.
* config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Use it.
(aarch64_autovectorize_vector_sizes): Likewise.


prefer-asimd-9.patch
Description: prefer-asimd-9.patch


Re: [patch] Add an if-exists-then-else spec function

2020-10-01 Thread Armin Brauns via Gcc-patches
On 01/10/2020 18.04, Olivier Hainque wrote:
> Hello,
>
> This patch is a proposal to add an if-exists-then-else
> builtin spec function, which tests for the existence of
> a file and returns one or the other of the following
> arguments depending on the result of the test.
>
Hello,

could you please make sure to update the documentation around 
gcc/doc/invoke.texi:31574 accordingly? There's already a pending patch to make 
it more complete at 
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553321.html, but there 
shouldn't be any major conflicts between the two.

Regards,

Armin




Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-01 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
> Hi, Richard,
>
> To answer the question of which registers should be included in “ALL”,
> I studied the x86 hard register set in more detail, consulted with
> H.J. Lu, and found:
>
> In the current x86 implementation, mask registers, MM0-MM7 registers, and 
> ST0-ST7 registers are not zeroed.
>
> The reasons actually are:
> 1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the 
> following place, reginfo.c, init_reg_sets_1)
>
>   /* If a register is too limited to be treated as a register operand,
>  then it should never be allocated to a pseudo.  */
>   if (!TEST_HARD_REG_BIT (operand_reg_set, i))
> fixed_regs[i] = 1;

But isn't that only true when AVX512F is disabled?

The question is more why the registers shouldn't be zeroed when
they're available.

> 2. MM0-MM7 registers and ST0-ST7 registers are aliased with each other (i.e.,
> the same set of registers has two different names), so zeroing an individual
> mm or st register would be impractical.  However, we can zero them together
> with emms.

Ah, OK.

> So, my conclusion is, 
>
> 1. For “ALL”, we should include all call_used_regs that are not fixed_regs. 
> 2. For the x86 implementation, I added more comments and also added clearing
> of all mm and st registers with emms.
>
> In general, “ALL” should include all call_used_regs that are not fixed_regs. 

Right.  I thought the original implementation already excluded fixed
registers, but perhaps I'm misremembering.  I agree that that's the
sensible default behaviour.

Going back to the default hook, I guess one option is:

   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
   if (!valid_insn_p (insn))
 sorry (…);

but with some mechanism to avoid spewing the user with messages
for the same problem.
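
One possible shape for such a mechanism, as a rough standalone sketch rather
than anything taken from GCC: remember which problems have already been
reported and only emit the diagnostic the first time.  The key type (register
number plus a mode id) and all names below are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical key identifying "the same problem": a hard register number
// paired with an id for the mode that could not be zeroed.
using problem_key = std::pair<int, int>;

class once_only_reporter
{
public:
  // Return true only the first time KEY is seen, i.e. when the caller
  // should actually emit the diagnostic.
  bool should_report (const problem_key &key)
  {
    return m_seen.insert (key).second;
  }

private:
  std::set<problem_key> m_seen;
};

// Small self-check of the intended behaviour.
void
check_once_only ()
{
  once_only_reporter r;
  assert (r.should_report (problem_key (1, 2)));   // first occurrence
  assert (!r.should_report (problem_key (1, 2)));  // repeat is suppressed
  assert (r.should_report (problem_key (3, 2)));   // different register
}
```

Whether the key should be per register, per mode, or per function is a design
choice this sketch leaves open.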

Thanks,
Richard


Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend syntax

2020-10-01 Thread Richard Sandiford via Gcc-patches
Alex Coplan  writes:
> Hi Christophe,
>
> On 08/09/2020 10:14, Christophe Lyon wrote:
>> On Mon, 17 Aug 2020 at 11:00, Alex Coplan  wrote:
>> >
>> > gcc/ChangeLog:
>> >
>> > * config/aarch64/aarch64.md
>> > (*adds__): Ensure extended operand
>> > agrees with width of extension specifier.
>> > (*subs__): Likewise.
>> > (*adds__shift_): Likewise.
>> > (*subs__shift_): Likewise.
>> > (*add__): Likewise.
>> > (*add__shft_): Likewise.
>> > (*add_uxt_shift2): Likewise.
>> > (*sub__): Likewise.
>> > (*sub__shft_): Likewise.
>> > (*sub_uxt_shift2): Likewise.
>> > (*cmp_swp__reg): Likewise.
>> > (*cmp_swp__shft_): Likewise.
>> >
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> > * gcc.target/aarch64/adds3.c: Fix test w.r.t. new syntax.
>> > * gcc.target/aarch64/cmp.c: Likewise.
>> > * gcc.target/aarch64/subs3.c: Likewise.
>> > * gcc.target/aarch64/subsp.c: Likewise.
>> > * gcc.target/aarch64/extend-syntax.c: New test.
>> >
>> 
>> Hi,
>> 
>> I've noticed some of the new tests fail with -mabi=ilp32:
>> gcc.target/aarch64/extend-syntax.c check-function-bodies add1
>> gcc.target/aarch64/extend-syntax.c check-function-bodies add3
>> gcc.target/aarch64/extend-syntax.c check-function-bodies sub2
>> gcc.target/aarch64/extend-syntax.c check-function-bodies sub3
>> gcc.target/aarch64/extend-syntax.c scan-assembler-times
>> subs\tx[0-9]+, x[0-9]+, w[0-9]+, sxtw 3 1
>> gcc.target/aarch64/subsp.c scan-assembler sub\tsp, sp, w[0-9]*, sxtw 4\n
>> 
>> Christophe
>
> AFAICT the second scan-assembler in that subsp test failed on ILP32
> before my commit. This is because we generate slightly suboptimal code
> here. On LP64 with -O, we get:
>
> f2:
> stp x29, x30, [sp, -16]!
> mov x29, sp
> add w1, w1, 1
> sub sp, sp, x1, sxtw 4
> mov x0, sp
> bl  foo
> mov sp, x29
> ldp x29, x30, [sp], 16
> ret
>
> On ILP32, we get:
>
> f2:
> stp x29, x30, [sp, -16]!
> mov x29, sp
> add w1, w1, 1
> lsl w1, w1, 4
> sub sp, sp, x1
> mov w0, wsp
> bl  foo
> mov sp, x29
> ldp x29, x30, [sp], 16
> ret
>
> And we see similar results all the way back to GCC 6. So AFAICT this
> scan-assembler has never worked. The attached patch disables it on ILP32
> since this isn't a code quality regression.
>
> This patch also fixes up the DejaGnu directives in extend-syntax.c to
> work on ILP32: we change the check-function-bodies directive to only run
> on LP64, adding scan-assembler directives for ILP32 where required.
>
> OK for trunk?

OK, thanks.  Sorry for the slow review.

Richard


[patch] Add an if-exists-then-else spec function

2020-10-01 Thread Olivier Hainque
Hello,

This patch is a proposal to add an if-exists-then-else
builtin spec function, which tests for the existence of
a file and returns one or the other of the following
arguments depending on the result of the test.

This differs from the existing if-exists and
if-exists-else functions, which return the name of the
tested file if it exists.

This new function is of help to a forthcoming change for
VxWorks where we check for the presence of a specific header
file to decide the name of a library to include in the link
closure, like:

  #define VXWORKS_NET_LIBS_RTP "-l%:if-exists-then-else(%:getenv(VSB_DIR /usr/h/public/rtnetStackLib.h) rtnet net)"
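
The semantics described above can be modelled outside the driver as follows.
This is only a sketch of the behaviour, not the gcc.c implementation: the
function names are hypothetical, and "exists" is approximated here by "can be
opened for reading".

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the driver's file test: "exists" here simply
// means "can be opened for reading".
static bool
file_readable (const std::string &path)
{
  if (std::FILE *f = std::fopen (path.c_str (), "r"))
    {
      std::fclose (f);
      return true;
    }
  return false;
}

// Model of if-exists-then-else(FILE THEN ELSE): unlike if-exists, the
// result is never FILE itself, only one of the two trailing arguments.
std::string
if_exists_then_else (const std::string &file, const std::string &then_arg,
                     const std::string &else_arg)
{
  return file_readable (file) ? then_arg : else_arg;
}

// Mirrors the VxWorks use: choose the library name from the presence of a
// header.  The header path is a parameter because it is target-specific.
std::string
net_lib_option (const std::string &header)
{
  return "-l" + if_exists_then_else (header, "rtnet", "net");
}

// Small self-check: a path that should not exist selects the fallback.
void
check_fallback ()
{
  assert (net_lib_option ("/nonexistent/rtnetStackLib.h") == "-lnet");
}
```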


We have been using this for months in nightly gcc-9 based
compilers for numerous targets. It passes a build + local test sequence
with gcc-10 for powerpc-vxworks7r2 and a sanity check build with
a recent mainline.

Is this OK to commit?

Thanks a lot in advance,

With Kind Regards,

Olivier


2020-10-01  Douglas Rupp  

* gcc.c (if-exists-then-else): New built-in spec function.



0013-Add-a-if-exists-then-else-built-in-spec-function.diff
Description: Binary data


Re: [PATCH PR96375] arm: Fix testcase selection for Low Overhead Loop tests

2020-10-01 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: 01 October 2020 15:36
>> To: gcc-patches@gcc.gnu.org
>> Cc: Richard Earnshaw ; Kyrylo Tkachov
>> ; Christophe Lyon 
>> Subject: Re: [PATCH PR96375] arm: Fix testcase selection for Low Overhead
>> Loop tests
>> 
>> Andrea Corallo  writes:
>> 
>> > Hi all,
>> >
>> > I'd like to submit the following patch to fix PR96375 ([11 regression]
>> > arm/lob[2-5].c fail on some configurations).
>> >
>> > It fixes the observed regression by making sure -mthumb is always used and
>> > allowing Low Overhead Loop tests to be executed only on cortex-M profile
>> > targets.
>> >
>> > It does not introduce regressions in my testing and fixes the reported one
>> > according to Christophe (in Cc).
>> >
>> > Okay for trunk?
>
> Ok.
> Thanks,
> Kyrill

Hi Kyrill,

Installed as 968ec08efef.

Thanks!

  Andrea


Re: [PATCH] generalized range_query class for multiple contexts

2020-10-01 Thread Aldy Hernandez via Gcc-patches




On 10/1/20 3:22 PM, Andrew MacLeod wrote:
> On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
>>> Thanks for doing all this!  There isn't anything I don't understand
>>> in the sprintf changes so no questions from me (well, almost none).
>>> Just some comments:
>> Thanks for your comments on the sprintf/strlen API conversion.
>>
>>> The current call statement is available in all functions that take
>>> a directive argument, as dir->info.callstmt.  There should be no need
>>> to also add it as a new argument to the functions that now need it.
>> Fixed.
>>
>>> The change adds code along these lines in a bunch of places:
>>>
>>> + value_range vr;
>>> + if (!query->range_of_expr (vr, arg, stmt))
>>> +   vr.set_varying (TREE_TYPE (arg));
>>>
>>> I thought under the new Ranger APIs when a range couldn't be
>>> determined it would be automatically set to the maximum for
>>> the type.  I like that and have been moving in that direction
>>> with my code myself (rather than having an API fail, have it
>>> set the max range and succeed).
>> I went through all the above idioms and noticed all are being used on
>> supported types (integers or pointers).  So range_of_expr will always
>> return true.  I've removed the if() and the set_varying.
>>
>>> Since that isn't so in this case, I think it would still be nice
>>> if the added code could be written as if the range were set to
>>> varying in this case and (ideally) reduced to just initialization:
>>>
>>> value_range vr = some-function (query, stmt, arg);
>>>
>>> some-function could be an inline helper defined just for the sprintf
>>> pass (and maybe also strlen which also seems to use the same pattern),
>>> or it could be a value_range AKA irange ctor, or it could be a member
>>> of range_query, whatever is the most appropriate.
>>>
>>> (If assigning/copying a value_range is thought to be too expensive,
>>> declaring it first and then passing it to that helper to set it
>>> would work too).
>>>
>>> In strlen, is the removed comment no longer relevant?  (I.e., does
>>> the ranger solve the problem?)
>>>
>>> -  /* The range below may be "inaccurate" if a constant has been
>>> -substituted earlier for VAL by this pass that hasn't been
>>> -propagated through the CFG.  This shoud be fixed by the new
>>> -on-demand VRP if/when it becomes available (hopefully in
>>> -GCC 11).  */
>> It should.
>>
>>> I'm wondering about the comment added to get_range_strlen_dynamic
>>> and other places:
>>>
>>> + // FIXME: Use range_query instead of global ranges.
>>>
>>> Is that something you're planning to do in a followup or should
>>> I remember to do it at some point?
>> I'm not planning on doing it.  It's just a reminder that it would be
>> beneficial to do so.
>>
>>> Otherwise I have no concern with the changes.
>> It's not clear whether Andrew approved all 3 parts of the patchset
>> or just the valuation part.  I'll wait for his nod before committing
>> this chunk.
>>
>> Aldy
>>
> I have no issue with it, so OK.

Pushed all 3 patches.

>
> Just an observation that should be pointed out, I believe Aldy has all
> the code for converting to a ranger, but we have not pursued that any
> further yet since there is a regression due to our lack of equivalence
> processing I think?  That should be resolved in the coming month, but at
> the moment is a holdback/concern for converting these passes...  iirc.

Yes.  Martin, the takeaway here is that the strlen/sprintf pass has 
been converted to the new API, but ranger is still not up and running on 
it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger.  That's it.
Below is an untested patch that would convert you to a ranger once it's 
contributed.


IIRC when I enabled the ranger for your pass a while back, there was one 
or two regressions due to missing equivalences, and the rest were 
because the tests were expecting an actual specific range, and the 
ranger returned a slightly different/better one.  You'll need to adjust 
your tests.


Aldy

diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index f4d1c5ca256..9f9e95b7155 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c
@@ -58,8 +58,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-loop.h"
 #include "tree-scalar-evolution.h"
 #include "vr-values.h"
-#include "gimple-ssa-evrp-analyze.h"
 #include "tree-ssa.h"
+#include "gimple-range.h"

 /* A vector indexed by SSA_NAME_VERSION.  0 means unknown, positive value
is an index into strinfo vector, negative value stands for
@@ -5755,16 +5755,13 @@ class strlen_dom_walker : public dom_walker
 public:
   strlen_dom_walker (cdi_direction direction)
 : dom_walker (direction),
-evrp (false),
 m_cleanup_cfg (false)
   {}

   virtual edge before_dom_children (basic_block);
   virtual void after_dom_children 

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-01 Thread Qing Zhao via Gcc-patches
Hi, Richard,

To answer the question of which registers should be included in “ALL”,
I studied the x86 hard register set in more detail, consulted with
H.J. Lu, and found:

In the current x86 implementation, mask registers, MM0-MM7 registers, and 
ST0-ST7 registers are not zeroed.

The reasons are:
1. Mask registers are marked as “FIXED_REGS” by the middle end (in the following
place: reginfo.c, init_reg_sets_1):

  /* If a register is too limited to be treated as a register operand,
 then it should never be allocated to a pseudo.  */
  if (!TEST_HARD_REG_BIT (operand_reg_set, i))
fixed_regs[i] = 1;

2. MM0-MM7 registers and ST0-ST7 registers are aliased with each other (i.e.,
the same set of registers has two different names), so zeroing an individual
mm or st register would be impractical.  However, we can zero them together
with emms.

So, my conclusion is:

1. For “ALL”, we should include all call_used_regs that are not fixed_regs.
2. For the x86 implementation, I added more comments and also added clearing
of all mm and st registers with emms.

In general, “ALL” should include all call_used_regs that are not fixed_regs. 
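
As a set operation, that rule amounts to "call-used and not fixed".  A
minimal sketch, using a plain bit mask as a stand-in for GCC's hard register
sets (reg_mask and regs_to_zero_for_all are hypothetical names, and the
register numbers are made up for the example):

```cpp
#include <cstdint>

// Hypothetical model: bit N set means hard register N is in the set.
using reg_mask = std::uint64_t;

// "ALL" = every call-used register that is not also a fixed register.
constexpr reg_mask
regs_to_zero_for_all (reg_mask call_used, reg_mask fixed)
{
  return call_used & ~fixed;
}

// Registers 2-5 are call-used; registers 0, 4 and 5 are fixed.
static_assert (regs_to_zero_for_all (0x3c, 0x31) == 0x0c,
               "only registers 2 and 3 are left to zero");
```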

thanks.

Qing

> 
> If we have common, written-down ground rules for deciding which kinds
> of register should be included and which shouldn't, it will be easier
> to apply those rules to a given target.  And it will be easier to decide
> on the right behaviour for the default implementation of the hook.
> 
> It feels like the pushback against defining a default implementation
> of the hook is also a pushback against being specific about how targets
> are supposed to select which kinds of register need to be zeroed.
> 
> Thanks,
> Richard



RE: [PATCH] arm: Add missing vec_cmp and vcond patterns

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches
Hi Richard,

> -Original Message-
> From: Richard Sandiford 
> Sent: 01 October 2020 15:10
> To: gcc-patches@gcc.gnu.org
> Cc: ni...@redhat.com; Richard Earnshaw ;
> Ramana Radhakrishnan ; Kyrylo
> Tkachov 
> Subject: [PATCH] arm: Add missing vec_cmp and vcond patterns
> 
> This patch does several things at once:
> 
> (1) Add vector compare patterns (vec_cmp and vec_cmpu).
> 
> (2) Add vector selects between floating-point modes when the
> values being compared are integers (affects vcond and vcondu).
> 
> (3) Add vector selects between integer modes when the values being
> compared are floating-point (affects vcond).
> 
> (4) Add standalone vector select patterns (vcond_mask).
> 
> (5) Tweak the handling of compound comparisons with zeros.
> 
> Unfortunately it proved too difficult (for me) to separate this
> out into a series of smaller patches, since everything is so
> inter-related.  Defining only some of the new patterns does
> not leave things in a happy state.
> 
> The handling of comparisons is mostly taken from the vcond patterns.
> This means that it remains non-compliant with IEEE: “quiet” comparisons
> use signalling instructions.  But that shouldn't matter for floats,
> since we require -funsafe-math-optimizations to vectorize for them
> anyway.
> 
> It remains the case that comparisons and selects aren't implemented
> at all for HF vectors.  Implementing those feels like separate work.
> 
> Tested on arm-linux-gnueabihf and arm-eabi (for MVE).  OK to install?
> 

Ok.
Thanks, this looks good to me.
Kyrill

> Richard
> 
> 
> gcc/
>   PR target/96528
>   PR target/97288
>   * config/arm/arm-protos.h (arm_expand_vector_compare): Declare.
>   (arm_expand_vcond): Likewise.
>   * config/arm/arm.c (arm_expand_vector_compare): New function.
>   (arm_expand_vcond): Likewise.
>   * config/arm/neon.md (vec_cmp):
> New pattern.
>   (vec_cmpu): Likewise.
>   (vcond): Require operand 5 to be a
> register
>   or zero.  Use arm_expand_vcond.
>   (vcond): New pattern.
>   (vcondu): Generalize to...
>   (vcondu   to be a register or zero.  Use arm_expand_vcond.
>   (vcond_mask_): New pattern.
>   (neon_vc, neon_vc_insn): Add
> "@" marker.
>   (neon_vbsl): Likewise.
>   (neon_vcu): Reexpress as...
>   (@neon_vc): ...this.
> 
> gcc/testsuite/
>   * lib/target-supports.exp (check_effective_target_vect_cond_mixed):
> Add
>   arm neon targets.
>   * gcc.target/arm/neon-compare-1.c: New test.
>   * gcc.target/arm/neon-compare-2.c: Likewise.
>   * gcc.target/arm/neon-compare-3.c: Likewise.
>   * gcc.target/arm/neon-compare-4.c: Likewise.
>   * gcc.target/arm/neon-compare-5.c: Likewise.
>   * gcc.target/arm/neon-vcond-gt.c: Expect comparisons with zero.
>   * gcc.target/arm/neon-vcond-ltgt.c: Likewise.
>   * gcc.target/arm/neon-vcond-unordered.c: Likewise.
> ---
>  gcc/config/arm/arm-protos.h   |   2 +
>  gcc/config/arm/arm.c  | 121 
>  gcc/config/arm/neon.md| 281 --
>  gcc/testsuite/gcc.target/arm/neon-compare-1.c |  84 ++
>  gcc/testsuite/gcc.target/arm/neon-compare-2.c |  45 +++
>  gcc/testsuite/gcc.target/arm/neon-compare-3.c |  44 +++
>  gcc/testsuite/gcc.target/arm/neon-compare-4.c |  38 +++
>  gcc/testsuite/gcc.target/arm/neon-compare-5.c |  37 +++
>  gcc/testsuite/gcc.target/arm/neon-vcond-gt.c  |   2 +-
>  .../gcc.target/arm/neon-vcond-ltgt.c  |   3 +-
>  .../gcc.target/arm/neon-vcond-unordered.c |   4 +-
>  gcc/testsuite/lib/target-supports.exp |   2 +
>  12 files changed, 442 insertions(+), 221 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-2.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-3.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-4.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-5.c
> 
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 9bb9c61967b..703d6160c24 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -372,9 +372,11 @@ extern void arm_emit_coreregs_64bit_shift (enum
> rtx_code, rtx, rtx, rtx, rtx,
>  extern bool arm_fusion_enabled_p (tune_params::fuse_ops);
>  extern bool arm_valid_symbolic_address_p (rtx);
>  extern bool arm_validize_comparison (rtx *, rtx *, rtx *);
> +extern bool arm_expand_vector_compare (rtx, rtx_code, rtx, rtx, bool);
>  #endif /* RTX_CODE */
> 
>  extern bool arm_gen_setmem (rtx *);
> +extern void arm_expand_vcond (rtx *, machine_mode);
>  extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
> 
>  extern bool arm_autoinc_modes_ok_p (machine_mode, enum
> arm_auto_incmodes);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 8105b39e7a4..0e23246c27b 100644
> --- 

RE: [PATCH PR96375] arm: Fix testcase selection for Low Overhead Loop tests

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Andrea Corallo 
> Sent: 01 October 2020 15:36
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Kyrylo Tkachov
> ; Christophe Lyon 
> Subject: Re: [PATCH PR96375] arm: Fix testcase selection for Low Overhead
> Loop tests
> 
> Andrea Corallo  writes:
> 
> > Hi all,
> >
> > I'd like to submit the following patch to fix PR96375 ([11 regression]
> > arm/lob[2-5].c fail on some configurations).
> >
> > It fixes the observed regression by making sure -mthumb is always used
> > and by allowing Low Overhead Loop tests to be executed only on Cortex-M
> > profile targets.
> >
> > It does not introduce regressions in my testing and fixes the reported
> > one according to Christophe (in Cc).
> >
> > Okay for trunk?

Ok.
Thanks,
Kyrill

> >
> > Thanks
> >
> >   Andrea
> >
> > 2020-07-31  Andrea Corallo  
> >
> > * gcc.target/arm/lob1.c: Fix missing flag.
> > * gcc.target/arm/lob2.c: Likewise.
> > * gcc.target/arm/lob3.c: Likewise.
> > * gcc.target/arm/lob4.c: Likewise.
> > * gcc.target/arm/lob5.c: Likewise.
> > * gcc.target/arm/lob6.c: Likewise.
> > * lib/target-supports.exp
> > (check_effective_target_arm_v8_1_lob_ok): Return 1 only for
> > cortex-m targets, add '-mthumb' flag.
> 
> I really forgot to ping this patch (I'm reattaching it for convenience).
> 
> It still applies cleanly so... ping :)



Re: [PATCH PR96375] arm: Fix testcase selection for Low Overhead Loop tests

2020-10-01 Thread Andrea Corallo via Gcc-patches
Andrea Corallo  writes:

> Hi all,
>
> I'd like to submit the following patch to fix PR96375 ([11 regression]
> arm/lob[2-5].c fail on some configurations).
>
> It fixes the observed regression by making sure -mthumb is always used
> and by allowing Low Overhead Loop tests to be executed only on Cortex-M
> profile targets.
>
> It does not introduce regressions in my testing and fixes the reported
> one according to Christophe (in Cc).
>
> Okay for trunk?
>
> Thanks
>
>   Andrea
>
> 2020-07-31  Andrea Corallo  
>
>   * gcc.target/arm/lob1.c: Fix missing flag.
>   * gcc.target/arm/lob2.c: Likewise.
>   * gcc.target/arm/lob3.c: Likewise.
>   * gcc.target/arm/lob4.c: Likewise.
>   * gcc.target/arm/lob5.c: Likewise.
>   * gcc.target/arm/lob6.c: Likewise.
>   * lib/target-supports.exp
>   (check_effective_target_arm_v8_1_lob_ok): Return 1 only for
>   cortex-m targets, add '-mthumb' flag.

I really forgot to ping this patch (I'm reattaching it for convenience).

It still applies cleanly so... ping :)

From 7056cfbde6ccf43eaf8651af2b4a09a31c9276de Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Fri, 31 Jul 2020 14:52:24 +0100
Subject: [PATCH] arm: Fix testcase selection for Low Overhead Loop tests
 [PR96375]

gcc/testsuite/ChangeLog

2020-07-31  Andrea Corallo  

* gcc.target/arm/lob1.c: Fix missing flag.
* gcc.target/arm/lob2.c: Likewise.
* gcc.target/arm/lob3.c: Likewise.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Likewise.
* gcc.target/arm/lob6.c: Likewise.
* lib/target-supports.exp
(check_effective_target_arm_v8_1_lob_ok): Return 1 only for
cortex-m targets, add '-mthumb' flag.
---
 gcc/testsuite/gcc.target/arm/lob1.c   | 2 +-
 gcc/testsuite/gcc.target/arm/lob2.c   | 2 +-
 gcc/testsuite/gcc.target/arm/lob3.c   | 2 +-
 gcc/testsuite/gcc.target/arm/lob4.c   | 2 +-
 gcc/testsuite/gcc.target/arm/lob5.c   | 2 +-
 gcc/testsuite/gcc.target/arm/lob6.c   | 2 +-
 gcc/testsuite/lib/target-supports.exp | 4 ++--
 7 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/lob1.c 
b/gcc/testsuite/gcc.target/arm/lob1.c
index b92dc551d50..ba5c82cd55c 100644
--- a/gcc/testsuite/gcc.target/arm/lob1.c
+++ b/gcc/testsuite/gcc.target/arm/lob1.c
@@ -3,7 +3,7 @@
 /* { dg-do run } */
 /* { dg-require-effective-target arm_v8_1_lob_ok } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main -O3 --save-temps" } */
+/* { dg-options "-march=armv8.1-m.main -mthumb -O3 --save-temps" } */
 #include 
 #include "lob.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/lob2.c 
b/gcc/testsuite/gcc.target/arm/lob2.c
index 1fe9a9d82bb..fdeb2686f51 100644
--- a/gcc/testsuite/gcc.target/arm/lob2.c
+++ b/gcc/testsuite/gcc.target/arm/lob2.c
@@ -2,7 +2,7 @@
if a non-inlineable function call takes place inside the loop.  */
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main -O3 --save-temps" } */
+/* { dg-options "-march=armv8.1-m.main -mthumb -O3 --save-temps" } */
 #include 
 #include "lob.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/lob3.c 
b/gcc/testsuite/gcc.target/arm/lob3.c
index 17cba007ccb..70314ea84b3 100644
--- a/gcc/testsuite/gcc.target/arm/lob3.c
+++ b/gcc/testsuite/gcc.target/arm/lob3.c
@@ -2,7 +2,7 @@
if causes VFP emulation library calls to happen inside the loop.  */
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main -O3 --save-temps -mfloat-abi=soft" } */
+/* { dg-options "-march=armv8.1-m.main -mthumb -O3 --save-temps 
-mfloat-abi=soft" } */
 /* { dg-require-effective-target arm_softfloat } */
 #include 
 #include "lob.h"
diff --git a/gcc/testsuite/gcc.target/arm/lob4.c 
b/gcc/testsuite/gcc.target/arm/lob4.c
index 444a2c7b4bf..792f352d682 100644
--- a/gcc/testsuite/gcc.target/arm/lob4.c
+++ b/gcc/testsuite/gcc.target/arm/lob4.c
@@ -2,7 +2,7 @@
if LR is modified within the loop.  */
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main -O3 --save-temps -mfloat-abi=soft" } */
+/* { dg-options "-march=armv8.1-m.main -mthumb -O3 --save-temps 
-mfloat-abi=soft" } */
 /* { dg-require-effective-target arm_softfloat } */
 #include 
 #include "lob.h"
diff --git a/gcc/testsuite/gcc.target/arm/lob5.c 
b/gcc/testsuite/gcc.target/arm/lob5.c
index c4f46e41532..1a6adf1e28e 100644
--- a/gcc/testsuite/gcc.target/arm/lob5.c
+++ b/gcc/testsuite/gcc.target/arm/lob5.c
@@ -3,7 +3,7 @@
therefore is not optimizable.  Outer loops are not optimized.  */
 /* { dg-do compile } */
 /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main -O3 

[PATCH] libstdc++: Add missing P0896 changes to

2020-10-01 Thread Patrick Palka via Gcc-patches
I noticed that the following changes from this paper were not yet
implemented.

OK to commit after testing on x86_64-pc-linux-gnu finishes successfully?

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (reverse_iterator::iter_move):
Define for C++20 as per P0896.
(reverse_iterator::iter_swap): Likewise.
(move_iterator::operator*): Apply P0896 changes for C++20.
(move_iterator::operator[]): Likewise.
* testsuite/24_iterators/reverse_iterator/cust.cc: New test.
---
 libstdc++-v3/include/bits/stl_iterator.h  | 33 
 .../24_iterators/reverse_iterator/cust.cc | 51 +++
 2 files changed, 84 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/24_iterators/reverse_iterator/cust.cc

diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index f29bae92706..ca3c4cda329 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -362,6 +362,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   operator[](difference_type __n) const
   { return *(*this + __n); }
 
+#if __cplusplus > 201703L && __cpp_lib_concepts
+  friend constexpr iter_rvalue_reference_t<_Iterator>
+  iter_move(const reverse_iterator& __i)
+  noexcept(is_nothrow_copy_constructible_v<_Iterator>
+  && noexcept(ranges::iter_move(--declval<_Iterator&>(
+  {
+   auto __tmp = __i.base();
+   return ranges::iter_move(--__tmp);
+  }
+
+  template _Iter2>
+   friend constexpr void
+   iter_swap(const reverse_iterator& __x,
+ const reverse_iterator<_Iter2>& __y)
+   noexcept(is_nothrow_copy_constructible_v<_Iterator>
+&& is_nothrow_copy_constructible_v<_Iter2>
+&& noexcept(ranges::iter_swap(--declval<_Iterator&>(),
+  --declval<_Iter2&>(
+   {
+ auto __xtmp = __x.base();
+ auto __ytmp = __y.base();
+ ranges::iter_swap(--__xtmp, --__ytmp);
+   }
+#endif
+
 private:
   template
static _GLIBCXX17_CONSTEXPR _Tp*
@@ -1379,7 +1404,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   _GLIBCXX17_CONSTEXPR reference
   operator*() const
+#if __cplusplus > 201703L && __cpp_lib_concepts
+  { return ranges::iter_move(_M_current); }
+#else
   { return static_cast(*_M_current); }
+#endif
 
   _GLIBCXX17_CONSTEXPR pointer
   operator->() const
@@ -1445,7 +1474,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   _GLIBCXX17_CONSTEXPR reference
   operator[](difference_type __n) const
+#if __cplusplus > 201703L && __cpp_lib_concepts
+  { return ranges::iter_move(_M_current + __n); }
+#else
   { return std::move(_M_current[__n]); }
+#endif
 
 #if __cplusplus > 201703L && __cpp_lib_concepts
   template _Sent>
diff --git a/libstdc++-v3/testsuite/24_iterators/reverse_iterator/cust.cc 
b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/cust.cc
new file mode 100644
index 000..3476780e34c
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/reverse_iterator/cust.cc
@@ -0,0 +1,51 @@
+// Copyright (C) 2019-2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include 
+#include 
+
+// This test is an adaption of 24_iterators/move_iterator/cust.cc.
+
+constexpr bool
+test01()
+{
+  struct X
+  {
+constexpr X(int i) noexcept : i(i) { }
+constexpr X(X&& x) noexcept : i(x.i) { x.i = -1; }
+constexpr X& operator=(X&& x) noexcept { i = x.i; x.i = 0; return *this; }
+int i;
+  };
+
+  X arr[] = { 1, 2 };
+  std::reverse_iterator i(arr + 1), j(arr + 2);
+  static_assert(noexcept(std::ranges::iter_swap(i, j)));
+  std::ranges::iter_swap(i, j);
+  VERIFY( arr[0].i == 2 );
+  VERIFY( arr[1].i == 1 );
+
+  static_assert(noexcept(std::ranges::iter_move(i)));
+  X x = std::ranges::iter_move(i);
+  VERIFY( arr[0].i == -1 );
+  VERIFY( x.i == 2 );
+  return true;
+}
+
+static_assert(test01());
-- 
2.28.0.651.g306ee63a70
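
For readers without a C++20 toolchain, the effect of the two new
reverse_iterator customization points can be sketched in plain C++14.
This is only an illustration of the semantics visible in the patch (copy
the base iterator, decrement it, then move from or swap the element it
points to); the helper names rev_iter_move, rev_iter_swap and
reverse_iterator_demo are hypothetical and are not part of libstdc++.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>

// Hypothetical stand-in for the iter_move hidden friend: a reverse
// iterator refers to the element just before its base(), so step a copy
// of the base back by one and move from that element.
template <typename Iter>
decltype(auto) rev_iter_move(const std::reverse_iterator<Iter>& i)
{
  auto tmp = i.base();
  --tmp;
  return std::move(*tmp);
}

// Hypothetical stand-in for the iter_swap hidden friend: swap the two
// elements that the reverse iterators refer to.
template <typename Iter1, typename Iter2>
void rev_iter_swap(const std::reverse_iterator<Iter1>& x,
                   const std::reverse_iterator<Iter2>& y)
{
  auto xtmp = x.base();
  auto ytmp = y.base();
  --xtmp;
  --ytmp;
  std::iter_swap(xtmp, ytmp);
}

// Mirrors the shape of the new test, using plain ints instead of X.
bool reverse_iterator_demo()
{
  int arr[] = { 1, 2 };
  std::reverse_iterator<int*> i(arr + 1), j(arr + 2);  // *i is arr[0], *j is arr[1]
  rev_iter_swap(i, j);
  assert(arr[0] == 2 && arr[1] == 1);
  return rev_iter_move(i) == 2;  // reads the element now stored in arr[0]
}
```

The real implementation goes through ranges::iter_move and
ranges::iter_swap so that customizations of the underlying iterator are
honoured; the sketch above deliberately ignores that dispatch.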



Re: Commonize handling of attr-fnspec

2020-10-01 Thread Jan Hubicka
Hi
> > +  if (!fnspec.arg_specified_p (arg))
> > +;
> > +  else if (!fnspec.arg_used_p (arg))
> > +flags = EAF_UNUSED;
> > +  else
> > +{
> > +  if (!fnspec.arg_direct_p (arg))
> 
> negated test
> 
> > +   flags |= EAF_DIRECT;
> > +  if (!fnspec.arg_noescape_p (arg))
> > +   flags |= EAF_NOESCAPE;
> 
> likewise.
> 
> > +  if (!fnspec.arg_readonly_p (arg))
> > +   flags |= EAF_NOCLOBBER;
> 
> 
> likewise.

Oops, sorry for that.  I got carried away trying to make sense of
Fortran specifiers. Thanks for catching this. I wonder how that could
have passed testing.
> 
> >  }
> > +  return flags;
> >  }
> >  
> >  /* Detects return flags for the call STMT.  */
> > @@ -1546,24 +1542,16 @@ gimple_call_return_flags (const gcall *stmt)
> >  return ERF_NOALIAS;
> >  
> >attr = gimple_call_fnspec (stmt);
> > -  if (!attr || TREE_STRING_LENGTH (attr) < 1)
> > +  if (!attr)
> >  return 0;
> > +  attr_fnspec fnspec (attr);
> >  
> > -  switch (TREE_STRING_POINTER (attr)[0])
> > -{
> > -case '1':
> > -case '2':
> > -case '3':
> > -case '4':
> > -  return ERF_RETURNS_ARG | (TREE_STRING_POINTER (attr)[0] - '1');
> > -
> > -case 'm':
> > -  return ERF_NOALIAS;
> > +  if (fnspec.returns_arg () >= 0)
> > +return ERF_RETURNS_ARG | fnspec.returns_arg ();
> 
> hmm, maybe
> 
>  if (fnspec.returns_arg_p ())
>return ERF_RETURNS_ARG | arg;

I added an arg variable, but I think returning -1 for an unknown arg is
kind of more consistent with what we do elsewhere (plus references are
not cool)
> > +  /* True if memory reached is only written into (but not read).  */
> > +  bool
> > +  arg_writeonly_p (unsigned int i)
> 
> This is actually arg_readwrite_p (), there's no flag for write-only.
> wW are merely for noescape & only direct read/write.

I dropped this for now, but for tree-ssa-alias we will need to have a way
to specify that a parameter is only written into.
> 
> Currently all specified args imply noescape btw.

Yep, but I think we may want to change this, so I think it is safer to
list those that do.

This is the updated patch I am re-testing. Does it look OK?

Next I would like to proceed by blowing up all specifiers to double the
size (without functional changes) and then add the extra letters.

I was wondering if we want to use ' ' instead of '.' for the second
char.  It may make it easier to read ". . . R " than "..R." but it
may also be a bit misleading, in that there must be precisely one
space.

Honza

gcc/ChangeLog:

2020-10-01  Jan Hubicka  

* attr-fnspec.h: New file.
* calls.c (decl_return_flags): Implement using attr_fnspec.
* gimple.c (gimple_call_arg_flags): Likewise
(gimple_call_return_flags): Likewise
* tree-into-ssa.c (pass_build_ssa::execute): Likewise.
* tree-ssa-alias.c (attr_fnspec::verify): New

diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
new file mode 100644
index 000..4ad4b8758e0
--- /dev/null
+++ b/gcc/attr-fnspec.h
@@ -0,0 +1,141 @@
+/* Handling of fnspec attribute specifiers
+   Copyright (C) 2008-2020 Free Software Foundation, Inc.
+   Contributed by Richard Guenther  
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+/* Parse string of attribute "fn spec".  This is an internal attribute
+   describing side effects of a function as follows:
+
+   character 0  specifies properties of return values as follows:
+ '1'...'4'  specifies number of argument function returns (as in memset)
+ 'm'   specifies that returned value is noalias (as in malloc)
+ '.'   specifies that nothing is known.
+
+   character 1+i specifies properties of argument number i as follows:
+ 'x' or 'X' specifies that parameter is unused.
+ 'r' or 'R' specifies that parameter is only read and memory pointed to is
+   never dereferenced.
+ 'w' or 'W' specifies that parameter is only written to.
+ '.'   specifies that nothing is known.
+   The uppercase letter in addition specifies that parameter
+   is non-escaping.  */
+
+#ifndef ATTR_FNSPEC_H
+#define ATTR_FNSPEC_H
+
+class attr_fnspec
+{
+private:
+  /* fn spec attribute string.  */
+  const char *str;
+  /* length of the fn spec string.  */
+  const unsigned len;
+  /* Number of characters specifying return value.  */
+  const unsigned int return_desc_size 
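
As a reading aid for the specifier format documented in the header
comment above, here is a small standalone sketch of the decoding. It
follows the switch statements that the patch removes from gimple.c, not
the new attr_fnspec class, whose body is cut off in this message. The
flag values and the free functions returns_arg, returns_noalias and
arg_flags are assumptions made for the example; GCC has its own EAF_*
definitions.

```cpp
#include <cassert>
#include <string>

// Illustrative flag values only (assumed for this sketch).
enum : int {
  EAF_DIRECT = 1,
  EAF_NOCLOBBER = 2,
  EAF_NOESCAPE = 4,
  EAF_UNUSED = 8
};

// Character 0: '1'...'4' means the function returns that argument.
// Returns the zero-based argument index, or -1 if nothing is known.
int returns_arg(const std::string& spec)
{
  if (spec.empty())
    return -1;
  const char c = spec[0];
  return (c >= '1' && c <= '4') ? c - '1' : -1;
}

// Character 0: 'm' means the returned value is noalias (as in malloc).
bool returns_noalias(const std::string& spec)
{
  return !spec.empty() && spec[0] == 'm';
}

// Character 1 + arg describes argument number arg.
int arg_flags(const std::string& spec, unsigned arg)
{
  if (1 + arg >= spec.size())
    return 0;  // argument not described by the string
  switch (spec[1 + arg])
    {
    case 'x': case 'X': return EAF_UNUSED;
    case 'R': return EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE;
    case 'r': return EAF_NOCLOBBER | EAF_NOESCAPE;
    case 'W': return EAF_DIRECT | EAF_NOESCAPE;
    case 'w': return EAF_NOESCAPE;
    default:  return 0;  // '.' or anything else: nothing is known
    }
}

// Usage: "1.R" says the first argument is returned, nothing is known
// about argument 0, and argument 1 is read directly and does not escape.
inline void fnspec_example()
{
  assert(returns_arg("1.R") == 0);
  assert(arg_flags("1.R", 0) == 0);
  assert(arg_flags("1.R", 1) == (EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE));
}
```

Note that each flag is set when the corresponding property holds, which
is the point of the "negated test" review comments quoted earlier.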

[PATCH][GCC 8] AArch64: Add rng feature to Neoverse V1

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch adds the +rng feature to the Neoverse V1 entry. It exists in the GCC 
11 and 10 branches, but was missed out on GCC 9 and 8 as those didn't support 
the rng intrinsic then, but they do now.

Bootstrapped and tested on aarch64-none-linux-gnu.
Committing to GCC 8.

Thanks,
Kyrill

gcc/
* config/aarch64/aarch64-cores.def (zeus): Add AARCH64_FL_RNG to 
features.
(neoverse-v1): Likewise.


rng-8.patch
Description: rng-8.patch


[PATCH][GCC 9] AArch64: Add rng feature to Neoverse V1

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch adds the +rng feature to the Neoverse V1 entry. It exists in the GCC 
11 and 10 branches, but was missed out on GCC 9 and 8 as those didn't support 
the rng intrinsic then, but they do now.

Bootstrapped and tested on aarch64-none-linux-gnu.
Committing to GCC 9 and an appropriate backport to GCC 8 later.

Thanks,
Kyrill

gcc/
* config/aarch64/aarch64-cores.def (zeus): Add AARCH64_FL_RNG to 
features.
(neoverse-v1): Likewise.


rng-9.patch
Description: rng-9.patch


[PATCH] arm: Add missing vec_cmp and vcond patterns

2020-10-01 Thread Richard Sandiford via Gcc-patches
This patch does several things at once:

(1) Add vector compare patterns (vec_cmp and vec_cmpu).

(2) Add vector selects between floating-point modes when the
values being compared are integers (affects vcond and vcondu).

(3) Add vector selects between integer modes when the values being
compared are floating-point (affects vcond).

(4) Add standalone vector select patterns (vcond_mask).

(5) Tweak the handling of compound comparisons with zeros.

Unfortunately it proved too difficult (for me) to separate this
out into a series of smaller patches, since everything is so
inter-related.  Defining only some of the new patterns does
not leave things in a happy state.

The handling of comparisons is mostly taken from the vcond patterns.
This means that it remains non-compliant with IEEE: “quiet” comparisons
use signalling instructions.  But that shouldn't matter for floats,
since we require -funsafe-math-optimizations to vectorize for them
anyway.

It remains the case that comparisons and selects aren't implemented
at all for HF vectors.  Implementing those feels like separate work.

Tested on arm-linux-gnueabihf and arm-eabi (for MVE).  OK to install?

Richard


gcc/
PR target/96528
PR target/97288
* config/arm/arm-protos.h (arm_expand_vector_compare): Declare.
(arm_expand_vcond): Likewise.
* config/arm/arm.c (arm_expand_vector_compare): New function.
(arm_expand_vcond): Likewise.
* config/arm/neon.md (vec_cmp): New pattern.
(vec_cmpu): Likewise.
(vcond): Require operand 5 to be a register
or zero.  Use arm_expand_vcond.
(vcond): New pattern.
(vcondu): Generalize to...
(vcondu): New pattern.
(neon_vc, neon_vc_insn): Add "@" marker.
(neon_vbsl): Likewise.
(neon_vcu): Reexpress as...
(@neon_vc): ...this.

gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_cond_mixed): Add
arm neon targets.
* gcc.target/arm/neon-compare-1.c: New test.
* gcc.target/arm/neon-compare-2.c: Likewise.
* gcc.target/arm/neon-compare-3.c: Likewise.
* gcc.target/arm/neon-compare-4.c: Likewise.
* gcc.target/arm/neon-compare-5.c: Likewise.
* gcc.target/arm/neon-vcond-gt.c: Expect comparisons with zero.
* gcc.target/arm/neon-vcond-ltgt.c: Likewise.
* gcc.target/arm/neon-vcond-unordered.c: Likewise.
---
 gcc/config/arm/arm-protos.h   |   2 +
 gcc/config/arm/arm.c  | 121 
 gcc/config/arm/neon.md| 281 --
 gcc/testsuite/gcc.target/arm/neon-compare-1.c |  84 ++
 gcc/testsuite/gcc.target/arm/neon-compare-2.c |  45 +++
 gcc/testsuite/gcc.target/arm/neon-compare-3.c |  44 +++
 gcc/testsuite/gcc.target/arm/neon-compare-4.c |  38 +++
 gcc/testsuite/gcc.target/arm/neon-compare-5.c |  37 +++
 gcc/testsuite/gcc.target/arm/neon-vcond-gt.c  |   2 +-
 .../gcc.target/arm/neon-vcond-ltgt.c  |   3 +-
 .../gcc.target/arm/neon-vcond-unordered.c |   4 +-
 gcc/testsuite/lib/target-supports.exp |   2 +
 12 files changed, 442 insertions(+), 221 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/neon-compare-5.c

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 9bb9c61967b..703d6160c24 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -372,9 +372,11 @@ extern void arm_emit_coreregs_64bit_shift (enum rtx_code, 
rtx, rtx, rtx, rtx,
 extern bool arm_fusion_enabled_p (tune_params::fuse_ops);
 extern bool arm_valid_symbolic_address_p (rtx);
 extern bool arm_validize_comparison (rtx *, rtx *, rtx *);
+extern bool arm_expand_vector_compare (rtx, rtx_code, rtx, rtx, bool);
 #endif /* RTX_CODE */
 
 extern bool arm_gen_setmem (rtx *);
+extern void arm_expand_vcond (rtx *, machine_mode);
 extern void arm_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel);
 
 extern bool arm_autoinc_modes_ok_p (machine_mode, enum arm_auto_incmodes);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8105b39e7a4..0e23246c27b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -30634,6 +30634,127 @@ arm_split_atomic_op (enum rtx_code code, rtx old_out, 
rtx new_out, rtx mem,
 arm_post_atomic_barrier (model);
 }

+/* Expand code to compare vectors OP0 and OP1 using condition CODE.
+   If CAN_INVERT, store either the result or its inverse in TARGET
+   and return true if TARGET contains the inverse.  If !CAN_INVERT,
+   always store the result in TARGET, never its inverse.
+
+   Note that the handling of floating-point comparisons is not
+   IEEE compliant.  */
+
+bool

Re: Commonize handling of attr-fnspec

2020-10-01 Thread Richard Biener
On Thu, 1 Oct 2020, Jan Hubicka wrote:

> Hi,
> this patch adds a simple class for parsing the fnspec attribute.  I plan
> to add support for generating and modifying it too (it is used by
> Fortran and I plan to make modref detect noclobbers and stuff).
> Verification is disabled until we fix the remaining Fortran specifiers
> (I was promised help on Monday)
> 
> Bootstrapped/regtested x86_64-linux, OK?

Well, besides many errors, see below...

> Honza
> 
> gcc/ChangeLog:
> 
> 2020-10-01  Jan Hubicka  
> 
>   * calls.c (decl_return_flags): Implement using attr_fnspec.
>   * gimple.c (gimple_call_arg_flags): Likewise
>   (gimple_call_return_flags): Likewise
>   * tree-into-ssa.c (pass_build_ssa::execute): Likewise.
>   * tree-ssa-alias.c (attr_fnspec::verify): New
>   * attr-fnspec.h: New file.
> 
> diff --git a/gcc/calls.c b/gcc/calls.c
> index ed4363811c8..94f433685b9 100644
> --- a/gcc/calls.c
> +++ b/gcc/calls.c
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "builtins.h"
>  #include "gimple-fold.h"
> +#include "attr-fnspec.h"
>  
>  #include "tree-pretty-print.h"
>  
> @@ -642,25 +643,14 @@ decl_return_flags (tree fndecl)
>if (!attr)
>  return 0;
>  
> -  attr = TREE_VALUE (TREE_VALUE (attr));
> -  if (!attr || TREE_STRING_LENGTH (attr) < 1)
> -return 0;
> -
> -  switch (TREE_STRING_POINTER (attr)[0])
> -{
> -case '1':
> -case '2':
> -case '3':
> -case '4':
> -  return ERF_RETURNS_ARG | (TREE_STRING_POINTER (attr)[0] - '1');
> +  attr_fnspec fnspec (TREE_VALUE (TREE_VALUE (attr)));
>  
> -case 'm':
> -  return ERF_NOALIAS;
> +  if (fnspec.returns_arg () >= 0)
> +return ERF_RETURNS_ARG | fnspec.returns_arg ();
>  
> -case '.':
> -default:
> -  return 0;
> -}
> +  if (fnspec.returns_noalias_p ())
> +return ERF_NOALIAS;
> +  return 0;
>  }
>  
>  /* Return nonzero when FNDECL represents a call to setjmp.  */
> diff --git a/gcc/gimple.c b/gcc/gimple.c
> index fd4e0fac0d4..2f2db309df5 100644
> --- a/gcc/gimple.c
> +++ b/gcc/gimple.c
> @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "asan.h"
>  #include "langhooks.h"
> +#include "attr-fnspec.h"
>  
>  
>  /* All the tuples have their operand vector (if present) at the very bottom
> @@ -1508,31 +1509,26 @@ gimple_call_arg_flags (const gcall *stmt, unsigned 
> arg)
>  {
>const_tree attr = gimple_call_fnspec (stmt);
>  
> -  if (!attr || 1 + arg >= (unsigned) TREE_STRING_LENGTH (attr))
> +  if (!attr)
>  return 0;
>  
> -  switch (TREE_STRING_POINTER (attr)[1 + arg])
> -{
> -case 'x':
> -case 'X':
> -  return EAF_UNUSED;
> -
> -case 'R':
> -  return EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE;
> -
> -case 'r':
> -  return EAF_NOCLOBBER | EAF_NOESCAPE;
> -
> -case 'W':
> -  return EAF_DIRECT | EAF_NOESCAPE;
> -
> -case 'w':
> -  return EAF_NOESCAPE;
> +  int flags = 0;
> +  attr_fnspec fnspec (attr);
>  
> -case '.':
> -default:
> -  return 0;
> +  if (!fnspec.arg_specified_p (arg))
> +;
> +  else if (!fnspec.arg_used_p (arg))
> +flags = EAF_UNUSED;
> +  else
> +{
> +  if (!fnspec.arg_direct_p (arg))

negated test

> + flags |= EAF_DIRECT;
> +  if (!fnspec.arg_noescape_p (arg))
> + flags |= EAF_NOESCAPE;

likewise.

> +  if (!fnspec.arg_readonly_p (arg))
> + flags |= EAF_NOCLOBBER;


likewise.

>  }
> +  return flags;
>  }
>  
>  /* Detects return flags for the call STMT.  */
> @@ -1546,24 +1542,16 @@ gimple_call_return_flags (const gcall *stmt)
>  return ERF_NOALIAS;
>  
>attr = gimple_call_fnspec (stmt);
> -  if (!attr || TREE_STRING_LENGTH (attr) < 1)
> +  if (!attr)
>  return 0;
> +  attr_fnspec fnspec (attr);
>  
> -  switch (TREE_STRING_POINTER (attr)[0])
> -{
> -case '1':
> -case '2':
> -case '3':
> -case '4':
> -  return ERF_RETURNS_ARG | (TREE_STRING_POINTER (attr)[0] - '1');
> -
> -case 'm':
> -  return ERF_NOALIAS;
> +  if (fnspec.returns_arg () >= 0)
> +return ERF_RETURNS_ARG | fnspec.returns_arg ();

hmm, maybe

 if (fnspec.returns_arg_p ())
   return ERF_RETURNS_ARG | arg;

?

>  
> -case '.':
> -default:
> -  return 0;
> -}
> +  if (fnspec.returns_noalias_p ())
> +return ERF_NOALIAS;
> +  return 0;
>  }
>  
>  
> diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
> index 0d016134774..1493b323956 100644
> --- a/gcc/tree-into-ssa.c
> +++ b/gcc/tree-into-ssa.c
> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "stringpool.h"
>  #include "attribs.h"
>  #include "asan.h"
> +#include "attr-fnspec.h"
>  
>  #define PERCENT(x,y) ((float)(x) * 100.0 / (float)(y))
>  
> @@ -2492,19 +2493,19 @@ pass_build_ssa::execute (function *fun)
>  }
>  
>/* Initialize SSA_NAME_POINTS_TO_READONLY_MEMORY.  */
> -  tree fnspec = lookup_attribute 

c++: pushdecl_top_level must set context

2020-10-01 Thread Nathan Sidwell

I discovered pushdecl_top_level was not setting the decl's context,
and we ended up with namespace-scope decls with NULL context.  That
broke modules.  Then I discovered a couple of places where we set the
context to a FUNCTION_DECL, which is also wrong.  AFAICT the literals
in question belong in global scope, as they're comdatable entities.
But create_temporary would use current_scope for the context before we
pushed it into namespace scope.

This patch asserts the context is NULL and then sets it to the frobbed
global_namespace.
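
The invariant can be pictured with a toy model. Nothing below is GCC
code: Decl, create_temporary, push_top_level and context_demo are
hypothetical names, and the model ignores what FROB_CONTEXT really does
to global_namespace. It only shows the new contract, where the caller
clears any context picked up from the current scope and the push
function asserts that before installing the global one.

```cpp
#include <cassert>

// Toy declaration node: only the context link matters here.
struct Decl
{
  const Decl* context;
};

static const Decl global_namespace{nullptr};

// Like the temporaries in the description, the new decl inherits
// whatever scope is current, which may be a function.
Decl create_temporary(const Decl* current_scope)
{
  return Decl{current_scope};
}

// New contract: the incoming context must be null; the push sets it.
void push_top_level(Decl& d)
{
  assert(d.context == nullptr && "caller must clear the context first");
  d.context = &global_namespace;
}

bool context_demo()
{
  Decl function{&global_namespace};
  Decl tmp = create_temporary(&function);  // context is the function
  tmp.context = nullptr;                   // cleared by the caller
  push_top_level(tmp);
  return tmp.context == &global_namespace;
}
```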

gcc/cp/
* name-lookup.c (pushdecl_top_level): Assert incoming context is
null, add global_namespace context.
(pushdecl_top_level_and_finish): Likewise.
* pt.c (get_template_parm_object): Clear decl context before
pushing.
* semantics.c (finish_compound_literal): Likewise.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 8cd6fe38271..620f3a6 100644
--- i/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -7404,6 +7404,8 @@ pushdecl_top_level (tree x)
 {
   bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
   do_push_to_top_level ();
+  gcc_checking_assert (!DECL_CONTEXT (x));
+  DECL_CONTEXT (x) = FROB_CONTEXT (global_namespace);
   x = pushdecl_namespace_level (x);
   do_pop_from_top_level ();
   timevar_cond_stop (TV_NAME_LOOKUP, subtime);
@@ -7418,6 +7420,8 @@ pushdecl_top_level_and_finish (tree x, tree init)
 {
   bool subtime = timevar_cond_start (TV_NAME_LOOKUP);
   do_push_to_top_level ();
+  gcc_checking_assert (!DECL_CONTEXT (x));
+  DECL_CONTEXT (x) = FROB_CONTEXT (global_namespace);
   x = pushdecl_namespace_level (x);
   cp_finish_decl (x, init, false, NULL_TREE, 0);
   do_pop_from_top_level ();
diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 869477f2c2e..45b18f6a5ad 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -7094,12 +7094,12 @@ get_template_parm_object (tree expr, tsubst_flags_t complain)
 
   tree type = cp_build_qualified_type (TREE_TYPE (expr), TYPE_QUAL_CONST);
   decl = create_temporary_var (type);
+  DECL_CONTEXT (decl) = NULL_TREE;
   TREE_STATIC (decl) = true;
   DECL_DECLARED_CONSTEXPR_P (decl) = true;
   TREE_READONLY (decl) = true;
   DECL_NAME (decl) = name;
   SET_DECL_ASSEMBLER_NAME (decl, name);
-  DECL_CONTEXT (decl) = global_namespace;
   comdat_linkage (decl);
 
   if (!zero_init_p (type))
diff --git i/gcc/cp/semantics.c w/gcc/cp/semantics.c
index b0930442bda..1e42cd799c2 100644
--- i/gcc/cp/semantics.c
+++ w/gcc/cp/semantics.c
@@ -3030,6 +3030,7 @@ finish_compound_literal (tree type, tree compound_literal,
   && initializer_constant_valid_p (compound_literal, type))
 {
   tree decl = create_temporary_var (type);
+  DECL_CONTEXT (decl) = NULL_TREE;
   DECL_INITIAL (decl) = compound_literal;
   TREE_STATIC (decl) = 1;
   if (literal_type_p (type) && CP_TYPE_CONST_NON_VOLATILE_P (type))


[RS6000] ICE in decompose, at rtl.h:2282

2020-10-01 Thread Alan Modra via Gcc-patches
during RTL pass: fwprop1
gcc.dg/pr82596.c: In function 'test_cststring':
gcc.dg/pr82596.c:27:1: internal compiler error: in decompose, at rtl.h:2282

-m32 gcc/testsuite/gcc.dg/pr82596.c fails along with other tests after
applying rtx_cost patches, which exposed a backend bug.

legitimize_address when presented with the following address
(plus (reg) (const_int 0x7))
attempts to rewrite it as a high/low sum.  The low part is 0x, or
-1, making the high part 0x8000.  But this is no longer canonical
for SImode.
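The arithmetic behind this can be sketched with plain host integers standing in for RTL constants. The helper names `split_high_low` and `sign_extend_32` are hypothetical and are not code from rs6000.c, and the sample constant 0x7fffffff used in the note below is chosen only for illustration. The sketch shows that a high part obtained by subtraction can fall outside the sign-extended 32-bit range, so it needs truncating to the mode before it is used as a constant.

```cpp
#include <cstdint>

// Hypothetical helper type: a displacement split into a sign-extended
// 16-bit low part and whatever remains as the high part.
struct HighLow
{
  std::int64_t high;
  std::int64_t low;
};

// Split VALUE so that high + low == value and low fits in a signed
// 16-bit field.
HighLow
split_high_low (std::int64_t value)
{
  std::int64_t low = ((value & 0xffff) ^ 0x8000) - 0x8000;
  return { value - low, low };
}

// Keep only the low 32 bits of VALUE and sign-extend them, which is the
// form a 32-bit constant must have when stored in a wider host integer.
std::int64_t
sign_extend_32 (std::int64_t value)
{
  std::uint64_t bits = static_cast<std::uint64_t> (value) & 0xffffffffu;
  return static_cast<std::int64_t> (bits ^ 0x80000000u) - 0x80000000LL;
}
```

With the illustrative input 0x7fffffff, `split_high_low` yields a low part of -1 and a positive high part that no longer fits in a signed 32-bit value; `sign_extend_32` maps that high part back into range.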

Bootstrapped and regression tested powerpc64-linux biarch and
powerpc64le-linux.  OK?

* config/rs6000/rs6000.c (rs6000_legitimize_address): Properly
sign extend high part of address constant.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 375fff59928..d0924d59a65 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8364,7 +8364,7 @@ rs6000_legitimize_address (rtx x, rtx oldx 
ATTRIBUTE_UNUSED,
low_int = 0;
   high_int = INTVAL (XEXP (x, 1)) - low_int;
   sum = force_operand (gen_rtx_PLUS (Pmode, XEXP (x, 0),
-GEN_INT (high_int)), 0);
+gen_int_mode (high_int, Pmode)), 0);
   return plus_constant (Pmode, sum, low_int);
 }
   else if (GET_CODE (x) == PLUS

-- 
Alan Modra
Australia Development Lab, IBM


Re: [RS6000] -mno-minimal-toc vs. power10 pcrelative

2020-10-01 Thread Alan Modra via Gcc-patches
On Wed, Sep 30, 2020 at 03:56:32PM -0500, Segher Boessenkool wrote:
> On Wed, Sep 30, 2020 at 05:01:45PM +0930, Alan Modra wrote:
> > * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
> > set -mcmodel=small for -mno-minimal-toc when pcrel.
> 
> > - SET_CMODEL (CMODEL_SMALL);\
> > + if (TARGET_MINIMAL_TOC\
> > + || !(TARGET_PCREL \
> > +  || (PCREL_SUPPORTED_BY_OS\
> > +  && (rs6000_isa_flags_explicit\
> > +  & OPTION_MASK_PCREL) == 0))) \
> > +   SET_CMODEL (CMODEL_SMALL);  \
> 
> Please write this in a more readable way?  With some "else" statements,
> perhaps.
> 
> It is also fine to SET_CMODEL twice if that makes for simpler code.

Committed as per your suggestion.  I was looking at it again today
with the aim of converting this ugly macro to a function, and spotted
the duplication in freebsd64.h.  Which has some bit-rot.

Do you like the following?  rs6000_linux64_override_options is
functionally the same as what was in linux64.h.  I built various
configurations to test the change, powerpc64-linux, powerpc64le-linux
without any 32-bit targets enabled, powerpc64-freebsd12.0.

* config/rs6000/freebsd64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Use
rs6000_linux64_override_options.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Break
out to..
* config/rs6000/rs6000.c (rs6000_linux64_override_options): ..this,
new function.  Tweak non-biarch test and clearing of
profile_kernel to work with freebsd64.h.

diff --git a/gcc/config/rs6000/freebsd64.h b/gcc/config/rs6000/freebsd64.h
index c9913638ffb..6984ca5a107 100644
--- a/gcc/config/rs6000/freebsd64.h
+++ b/gcc/config/rs6000/freebsd64.h
@@ -78,65 +78,7 @@ extern int dot_symbols;
 
 #undef  SUBSUBTARGET_OVERRIDE_OPTIONS
 #define SUBSUBTARGET_OVERRIDE_OPTIONS  \
-  do   \
-{  \
-  if (!global_options_set.x_rs6000_alignment_flags)\
-   rs6000_alignment_flags = MASK_ALIGN_NATURAL;\
-  if (TARGET_64BIT)\
-   {   \
- if (DEFAULT_ABI != ABI_AIX)   \
-   {   \
- rs6000_current_abi = ABI_AIX; \
- error (INVALID_64BIT, "call");\
-   }   \
- dot_symbols = !strcmp (rs6000_abi_name, "aixdesc");   \
- if (rs6000_isa_flags & OPTION_MASK_RELOCATABLE)   \
-   {   \
- rs6000_isa_flags &= ~OPTION_MASK_RELOCATABLE; \
- error (INVALID_64BIT, "relocatable"); \
-   }   \
- if (ELFv2_ABI_CHECK)  \
-   {   \
- rs6000_current_abi = ABI_ELFv2;   \
- if (dot_symbols)  \
-   error ("%<-mcall-aixdesc%> incompatible with %<-mabi=elfv2%>"); 
\
-   }   \
- if (rs6000_isa_flags & OPTION_MASK_EABI)  \
-   {   \
- rs6000_isa_flags &= ~OPTION_MASK_EABI;\
- error (INVALID_64BIT, "eabi");\
-   }   \
- if (TARGET_PROTOTYPE) \
-   {   \
- target_prototype = 0; \
- error (INVALID_64BIT, "prototype");   \
-   }   \
- if ((rs6000_isa_flags & OPTION_MASK_POWERPC64) == 0)  \
-   {   \
- rs6000_isa_flags |= OPTION_MASK_POWERPC64;\
- error ("%<-m64%> requires a PowerPC64 cpu");  \
-   }   \
-  if ((rs6000_isa_flags_explicit   \
-   & OPTION_MASK_MINIMAL_TOC) != 0)\
-   {   \
- if (global_options_set.x_rs6000_current_cmodel\
- && rs6000_current_cmodel != CMODEL_SMALL) \
-   error ("%<-mcmodel%> incompatible 

Re: [PATCH] generalized range_query class for multiple contexts

2020-10-01 Thread Andrew MacLeod via Gcc-patches

On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:

Thanks for doing all this!  There isn't anything I don't understand
in the sprintf changes so no questions from me (well, almost none).
Just some comments:

Thanks for your comments on the sprintf/strlen API conversion.


The current call statement is available in all functions that take
a directive argument, as dir->info.callstmt.  There should be no need
to also add it as a new argument to the functions that now need it.

Fixed.


The change adds code along these lines in a bunch of places:

+ value_range vr;
+ if (!query->range_of_expr (vr, arg, stmt))
+   vr.set_varying (TREE_TYPE (arg));

I thought under the new Ranger APIs when a range couldn't be
determined it would be automatically set to the maximum for
the type.  I like that and have been moving in that direction
with my code myself (rather than having an API fail, have it
set the max range and succeed).

I went through all the above idioms and noticed all are being used on
supported types (integers or pointers).  So range_of_expr will always
return true.  I've removed the if() and the set_varying.


Since that isn't so in this case, I think it would still be nice
if the added code could be written as if the range were set to
varying in this case and (ideally) reduced to just initialization:

value_range vr = some-function (query, stmt, arg);

some-function could be an inline helper defined just for the sprintf
pass (and maybe also strlen which also seems to use the same pattern),
or it could be a value_range AKA irange ctor, or it could be a member
of range_query, whatever is the most appropriate.

(If assigning/copying a value_range is thought to be too expensive,
declaring it first and then passing it to that helper to set it
would work too).
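A minimal sketch of the helper shape suggested above, assuming simplified stand-in types. `range_sketch`, `query_sketch` and `get_range_or_varying` are hypothetical names invented for this illustration; they are not the GCC `value_range` or `range_query` classes and do not reflect their real signatures.

```cpp
#include <climits>

// Hypothetical stand-in for value_range: a closed interval of longs.
struct range_sketch
{
  long lo;
  long hi;
  void set_varying () { lo = LONG_MIN; hi = LONG_MAX; }
};

// Hypothetical stand-in for range_query: either knows a range or fails.
struct query_sketch
{
  bool known;	// whether this query can compute a range
  long lo;
  long hi;

  bool range_of_expr (range_sketch &r) const
  {
    if (!known)
      return false;
    r.lo = lo;
    r.hi = hi;
    return true;
  }
};

// Fold the "query, else set varying" idiom into a single call so that
// callers can write a plain initialization.
range_sketch
get_range_or_varying (const query_sketch &query)
{
  range_sketch r;
  if (!query.range_of_expr (r))
    r.set_varying ();
  return r;
}
```

A caller would then write `range_sketch vr = get_range_or_varying (query);` and never see the failure path.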

In strlen, is the removed comment no longer relevant?  (I.e., does
the ranger solve the problem?)

-  /* The range below may be "inaccurate" if a constant has been
-substituted earlier for VAL by this pass that hasn't been
-propagated through the CFG.  This shoud be fixed by the new
-on-demand VRP if/when it becomes available (hopefully in
-GCC 11).  */

It should.


I'm wondering about the comment added to get_range_strlen_dynamic
and other places:

+ // FIXME: Use range_query instead of global ranges.

Is that something you're planning to do in a followup or should
I remember to do it at some point?

I'm not planning on doing it.  It's just a reminder that it would be
beneficial to do so.


Otherwise I have no concern with the changes.

It's not clear whether Andrew approved all 3 parts of the patchset
or just the valuation part.  I'll wait for his nod before committing
this chunk.

Aldy


I have no issue with it, so OK.

Just an observation that should be pointed out, I believe Aldy has all 
the code for converting to a ranger, but we have not pursued that any 
further yet since there is a regression due to our lack of equivalence 
processing I think?  That should be resolved in the coming month, but at 
the moment is a holdback/concern for converting these passes...  iirc.


Andrew




[PATCH] tree-optimization/97236 - fix bad use of VMAT_CONTIGUOUS

2020-10-01 Thread Richard Biener
This avoids using VMAT_CONTIGUOUS with single-element interleaving
when using V1mode vectors.  Instead keep VMAT_ELEMENTWISE but
continue to avoid load-lanes and gathers.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2020-10-01  Richard Biener  

PR tree-optimization/97236
* tree-vect-stmts.c (get_group_load_store_type): Keep
VMAT_ELEMENTWISE for single-element vectors.

* gcc.dg/vect/pr97236.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr97236.c | 43 +
 gcc/tree-vect-stmts.c   | 20 ++
 2 files changed, 52 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97236.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr97236.c 
b/gcc/testsuite/gcc.dg/vect/pr97236.c
new file mode 100644
index 000..03e0cc38984
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97236.c
@@ -0,0 +1,43 @@
+typedef unsigned char __uint8_t;
+typedef __uint8_t uint8_t;
+typedef struct plane_t {
+  uint8_t *p_pixels;
+  int i_lines;
+  int i_pitch;
+} plane_t;
+
+typedef struct {
+  plane_t p[5];
+} picture_t;
+
+#define N 4
+
+void __attribute__((noipa))
+picture_Clone(picture_t *picture, picture_t *res)
+{
+  for (int i = 0; i < N; i++) {
+res->p[i].p_pixels = picture->p[i].p_pixels;
+res->p[i].i_lines = picture->p[i].i_lines;
+res->p[i].i_pitch = picture->p[i].i_pitch;
+  }
+}
+
+int
+main()
+{
+  picture_t aaa, bbb;
+  uint8_t pixels[10] = {1, 1, 1, 1, 1, 1, 1, 1};
+
+  for (unsigned i = 0; i < N; i++)
+aaa.p[i].p_pixels = pixels;
+
+  picture_Clone (&aaa, &bbb);
+
+  uint8_t c;
+  for (unsigned i = 0; i < N; i++)
+c += bbb.p[i].p_pixels[0];
+
+  if (c != N)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 191957c3543..3575f25241f 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2235,25 +2235,23 @@ get_group_load_store_type (vec_info *vinfo, 
stmt_vec_info stmt_info,
  /* First cope with the degenerate case of a single-element
 vector.  */
  if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 1U))
-   *memory_access_type = VMAT_CONTIGUOUS;
+   ;
 
  /* Otherwise try using LOAD/STORE_LANES.  */
- if (*memory_access_type == VMAT_ELEMENTWISE
- && (vls_type == VLS_LOAD
- ? vect_load_lanes_supported (vectype, group_size, masked_p)
- : vect_store_lanes_supported (vectype, group_size,
-   masked_p)))
+ else if (vls_type == VLS_LOAD
+  ? vect_load_lanes_supported (vectype, group_size, masked_p)
+  : vect_store_lanes_supported (vectype, group_size,
+masked_p))
{
  *memory_access_type = VMAT_LOAD_STORE_LANES;
  overrun_p = would_overrun_p;
}
 
  /* If that fails, try using permuting loads.  */
- if (*memory_access_type == VMAT_ELEMENTWISE
- && (vls_type == VLS_LOAD
- ? vect_grouped_load_supported (vectype, single_element_p,
-group_size)
- : vect_grouped_store_supported (vectype, group_size)))
+ else if (vls_type == VLS_LOAD
+  ? vect_grouped_load_supported (vectype, single_element_p,
+ group_size)
+  : vect_grouped_store_supported (vectype, group_size))
{
  *memory_access_type = VMAT_CONTIGUOUS_PERMUTE;
  overrun_p = would_overrun_p;
-- 
2.26.2


Fix ICE in ipa_edge_args_sum_t::duplicate

2020-10-01 Thread Jan Hubicka
Hi,
the ICE with -fno-ipa-modref is caused by the fact that the execute method
has an early exit for a non-existent summary that forgets to free the
ipa-prop edge summaries.  I moved the freeing code to the free-fnsummary
pass, which is a more robust way of handling this.

Bootstrapped/regtested x86_64-linux, committed.

gcc/ChangeLog:

2020-10-01  Jan Hubicka  

PR ipa/97244
* ipa-fnsummary.c (pass_free_fnsummary::execute): Free
also indirect inlining datastructure.
* ipa-modref.c (pass_ipa_modref::execute): Do not free them here.
* ipa-prop.c (ipa_free_all_node_params): Do not crash when info does
not exist.
(ipa_unregister_cgraph_hooks): Likewise.

gcc/testsuite/ChangeLog:

2020-10-01  Jan Hubicka  

PR ipa/97244
* gcc.dg/ipa/remref-2a.c: Add -fno-ipa-modref

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 4c1c1f91482..8285cc00d33 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -4680,6 +4680,8 @@ public:
   virtual unsigned int execute (function *)
 {
   ipa_free_fn_summary ();
+  /* Free ipa-prop structures if they are no longer needed.  */
+  ipa_free_all_structures_after_iinln ();
   if (!flag_wpa)
ipa_free_size_summary ();
   return 0;
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 6225552e41a..2f4da8f2a14 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -1681,8 +1681,6 @@ pass_ipa_modref::execute (function *)
 }
   ((modref_summaries *)summaries)->ipa = false;
   ipa_free_postorder_info ();
-  /* Free ipa-prop structures if they are no longer needed.  */
-  ipa_free_all_structures_after_iinln ();
   return 0;
 }
 
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index b28c78eeab4..ea88fd3fd95 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -4124,7 +4124,8 @@ ipa_free_all_edge_args (void)
 void
 ipa_free_all_node_params (void)
 {
-  ggc_delete (ipa_node_params_sum);
+  if (ipa_node_params_sum)
+ggc_delete (ipa_node_params_sum);
   ipa_node_params_sum = NULL;
 }
 
@@ -4368,7 +4369,8 @@ ipa_register_cgraph_hooks (void)
 static void
 ipa_unregister_cgraph_hooks (void)
 {
-  symtab->remove_cgraph_insertion_hook (function_insertion_hook_holder);
+  if (function_insertion_hook_holder)
+symtab->remove_cgraph_insertion_hook (function_insertion_hook_holder);
   function_insertion_hook_holder = NULL;
 }
 
diff --git a/gcc/testsuite/gcc.dg/ipa/remref-2a.c 
b/gcc/testsuite/gcc.dg/ipa/remref-2a.c
index 34a6188249f..c2f3eac98a9 100644
--- a/gcc/testsuite/gcc.dg/ipa/remref-2a.c
+++ b/gcc/testsuite/gcc.dg/ipa/remref-2a.c
@@ -1,7 +1,7 @@
 /* Verify that indirect inlining can also remove references of the functions it
discovers calls for.  */
 /* { dg-do compile } */
-/* { dg-options "-O3 -fno-early-inlining -fno-ipa-cp -fdump-ipa-inline 
-fdump-tree-optimized -fno-ipa-icf"  } */
+/* { dg-options "-O3 -fno-early-inlining -fno-ipa-cp -fdump-ipa-inline 
-fdump-tree-optimized -fno-ipa-icf -fno-ipa-modref"  } */
 
 int global;
 


Re: [committed] s390: Fix up s390_atomic_assign_expand_fenv

2020-10-01 Thread Andreas Krebbel via Gcc-patches
On 01.10.20 11:13, Jakub Jelinek wrote:
> Hi!
> 
> The following patch fixes
> -FAIL: gcc.dg/pr94780.c (internal compiler error)
> -FAIL: gcc.dg/pr94780.c (test for excess errors)
> -FAIL: gcc.dg/pr94842.c (internal compiler error)
> -FAIL: gcc.dg/pr94842.c (test for excess errors)
> on s390x-linux.  The fix is essentially the same as has been applied to many
> other targets (i386, aarch64, arm, rs6000, alpha, riscv).
> 
> Bootstrapped/regtested on s390x-linux, committed to trunk and release
> branches as obvious.
> 
> 2020-10-01  Jakub Jelinek  
> 
>   * config/s390/s390.c (s390_atomic_assign_expand_fenv): Use
>   TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
>   fenv_var and old_fpc.  Formatting fixes.

Thanks!

Andreas

> 
> --- gcc/config/s390/s390.c.jj 2020-09-14 09:04:36.086851054 +0200
> +++ gcc/config/s390/s390.c2020-09-30 10:22:50.579603271 +0200
> @@ -16082,12 +16082,13 @@ s390_atomic_assign_expand_fenv (tree *ho
>  
>   fenv_var = __builtin_s390_efpc ();
>   __builtin_s390_sfpc (fenv_var & mask) */
> -  tree old_fpc = build2 (MODIFY_EXPR, unsigned_type_node, fenv_var, 
> call_efpc);
> -  tree new_fpc =
> -build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var,
> - build_int_cst (unsigned_type_node,
> -~(FPC_DXC_MASK | FPC_FLAGS_MASK |
> -  FPC_EXCEPTION_MASK)));
> +  tree old_fpc = build4 (TARGET_EXPR, unsigned_type_node, fenv_var, 
> call_efpc,
> +  NULL_TREE, NULL_TREE);
> +  tree new_fpc
> += build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var,
> +   build_int_cst (unsigned_type_node,
> +  ~(FPC_DXC_MASK | FPC_FLAGS_MASK
> +| FPC_EXCEPTION_MASK)));
>tree set_new_fpc = build_call_expr (sfpc, 1, new_fpc);
>*hold = build2 (COMPOUND_EXPR, void_type_node, old_fpc, set_new_fpc);
>  
> @@ -16106,8 +16107,8 @@ s390_atomic_assign_expand_fenv (tree *ho
>__atomic_feraiseexcept ((old_fpc & FPC_FLAGS_MASK) >> FPC_FLAGS_SHIFT);  */
>  
>old_fpc = create_tmp_var_raw (unsigned_type_node);
> -  tree store_old_fpc = build2 (MODIFY_EXPR, void_type_node,
> -old_fpc, call_efpc);
> +  tree store_old_fpc = build4 (TARGET_EXPR, void_type_node, old_fpc, 
> call_efpc,
> +NULL_TREE, NULL_TREE);
>  
>set_new_fpc = build_call_expr (sfpc, 1, fenv_var);
>  
> 
> 
>   Jakub
> 



Commonize handling of attr-fnspec

2020-10-01 Thread Jan Hubicka
Hi,
this patch adds a simple class for parsing the fnspec attribute.  I plan
to add support for generating and modifying it too (it is used by
Fortran, and I plan to make modref detect noclobbers and similar).
Verification is disabled until we fix the remaining Fortran specifier
(I was promised help with that on Monday).
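As a rough illustration of what such a parsing class wraps, here is a minimal sketch based only on the character meanings visible in the switch statements this patch removes. The class name `fnspec_sketch`, its member `arg_flags` and the `SK_*` flag values are hypothetical stand-ins; the real `attr_fnspec` interface and GCC's EAF_* constants may differ.

```cpp
#include <string>
#include <utility>

// Hypothetical flag values for illustration only; GCC defines its own
// EAF_* constants.
enum : int { SK_DIRECT = 1, SK_NOCLOBBER = 2, SK_NOESCAPE = 4, SK_UNUSED = 8 };

class fnspec_sketch
{
  std::string str;

public:
  explicit fnspec_sketch (std::string s) : str (std::move (s)) {}

  // Zero-based index of the argument the function returns, or -1.
  // Character 0 describes the return value: '1'..'4' name an argument.
  int returns_arg () const
  {
    if (!str.empty () && str[0] >= '1' && str[0] <= '4')
      return str[0] - '1';
    return -1;
  }

  // 'm' in position 0 marks a return value that aliases nothing.
  bool returns_noalias_p () const
  {
    return !str.empty () && str[0] == 'm';
  }

  // Character 1 + ARG describes argument ARG; anything unspecified or
  // unrecognized yields no flags.
  int arg_flags (unsigned arg) const
  {
    if (1 + arg >= str.size ())
      return 0;
    switch (str[1 + arg])
      {
      case 'x':
      case 'X':
	return SK_UNUSED;
      case 'R':
	return SK_DIRECT | SK_NOCLOBBER | SK_NOESCAPE;
      case 'r':
	return SK_NOCLOBBER | SK_NOESCAPE;
      case 'W':
	return SK_DIRECT | SK_NOESCAPE;
      case 'w':
	return SK_NOESCAPE;
      default:
	return 0;
      }
  }
};
```

Keeping the string decoding in one place means callers ask questions such as "is this argument read-only?" instead of repeating the character switch.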

Bootstrapped/regtested x86_64-linux, OK?

Honza

gcc/ChangeLog:

2020-10-01  Jan Hubicka  

* calls.c (decl_return_flags): Implement using attr_fnspec.
* gimple.c (gimple_call_arg_flags): Likewise
(gimple_call_return_flags): Likewise
* tree-into-ssa.c (pass_build_ssa::execute): Likewise.
* tree-ssa-alias.c (attr_fnspec::verify): New
* attr-fnspec.h: New file.

diff --git a/gcc/calls.c b/gcc/calls.c
index ed4363811c8..94f433685b9 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "builtins.h"
 #include "gimple-fold.h"
+#include "attr-fnspec.h"
 
 #include "tree-pretty-print.h"
 
@@ -642,25 +643,14 @@ decl_return_flags (tree fndecl)
   if (!attr)
 return 0;
 
-  attr = TREE_VALUE (TREE_VALUE (attr));
-  if (!attr || TREE_STRING_LENGTH (attr) < 1)
-return 0;
-
-  switch (TREE_STRING_POINTER (attr)[0])
-{
-case '1':
-case '2':
-case '3':
-case '4':
-  return ERF_RETURNS_ARG | (TREE_STRING_POINTER (attr)[0] - '1');
+  attr_fnspec fnspec (TREE_VALUE (TREE_VALUE (attr)));
 
-case 'm':
-  return ERF_NOALIAS;
+  if (fnspec.returns_arg () >= 0)
+return ERF_RETURNS_ARG | fnspec.returns_arg ();
 
-case '.':
-default:
-  return 0;
-}
+  if (fnspec.returns_noalias_p ())
+return ERF_NOALIAS;
+  return 0;
 }
 
 /* Return nonzero when FNDECL represents a call to setjmp.  */
diff --git a/gcc/gimple.c b/gcc/gimple.c
index fd4e0fac0d4..2f2db309df5 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "asan.h"
 #include "langhooks.h"
+#include "attr-fnspec.h"
 
 
 /* All the tuples have their operand vector (if present) at the very bottom
@@ -1508,31 +1509,26 @@ gimple_call_arg_flags (const gcall *stmt, unsigned arg)
 {
   const_tree attr = gimple_call_fnspec (stmt);
 
-  if (!attr || 1 + arg >= (unsigned) TREE_STRING_LENGTH (attr))
+  if (!attr)
 return 0;
 
-  switch (TREE_STRING_POINTER (attr)[1 + arg])
-{
-case 'x':
-case 'X':
-  return EAF_UNUSED;
-
-case 'R':
-  return EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE;
-
-case 'r':
-  return EAF_NOCLOBBER | EAF_NOESCAPE;
-
-case 'W':
-  return EAF_DIRECT | EAF_NOESCAPE;
-
-case 'w':
-  return EAF_NOESCAPE;
+  int flags = 0;
+  attr_fnspec fnspec (attr);
 
-case '.':
-default:
-  return 0;
+  if (!fnspec.arg_specified_p (arg))
+;
+  else if (!fnspec.arg_used_p (arg))
+flags = EAF_UNUSED;
+  else
+{
+  if (fnspec.arg_direct_p (arg))
+   flags |= EAF_DIRECT;
+  if (fnspec.arg_noescape_p (arg))
+   flags |= EAF_NOESCAPE;
+  if (fnspec.arg_readonly_p (arg))
+   flags |= EAF_NOCLOBBER;
 }
+  return flags;
 }
 
 /* Detects return flags for the call STMT.  */
@@ -1546,24 +1542,16 @@ gimple_call_return_flags (const gcall *stmt)
 return ERF_NOALIAS;
 
   attr = gimple_call_fnspec (stmt);
-  if (!attr || TREE_STRING_LENGTH (attr) < 1)
+  if (!attr)
 return 0;
+  attr_fnspec fnspec (attr);
 
-  switch (TREE_STRING_POINTER (attr)[0])
-{
-case '1':
-case '2':
-case '3':
-case '4':
-  return ERF_RETURNS_ARG | (TREE_STRING_POINTER (attr)[0] - '1');
-
-case 'm':
-  return ERF_NOALIAS;
+  if (fnspec.returns_arg () >= 0)
+return ERF_RETURNS_ARG | fnspec.returns_arg ();
 
-case '.':
-default:
-  return 0;
-}
+  if (fnspec.returns_noalias_p ())
+return ERF_NOALIAS;
+  return 0;
 }
 
 
diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
index 0d016134774..1493b323956 100644
--- a/gcc/tree-into-ssa.c
+++ b/gcc/tree-into-ssa.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "asan.h"
+#include "attr-fnspec.h"
 
 #define PERCENT(x,y) ((float)(x) * 100.0 / (float)(y))
 
@@ -2492,19 +2493,19 @@ pass_build_ssa::execute (function *fun)
 }
 
   /* Initialize SSA_NAME_POINTS_TO_READONLY_MEMORY.  */
-  tree fnspec = lookup_attribute ("fn spec",
- TYPE_ATTRIBUTES (TREE_TYPE (fun->decl)));
-  if (fnspec)
+  tree fnspec_tree
+= lookup_attribute ("fn spec",
+TYPE_ATTRIBUTES (TREE_TYPE (fun->decl)));
+  if (fnspec_tree)
 {
-  fnspec = TREE_VALUE (TREE_VALUE (fnspec));
-  unsigned i = 1;
+  attr_fnspec fnspec (TREE_VALUE (TREE_VALUE (fnspec_tree)));
+  unsigned i = 0;
   for (tree arg = DECL_ARGUMENTS (cfun->decl);
   arg; arg = DECL_CHAIN (arg), ++i)
{
- 

[patch] Rework the condition variables support for VxWorks

2020-10-01 Thread Olivier Hainque
This change reworks the condition variables support for VxWorks
to address the very legit points raised by Rasmus in

  https://gcc.gnu.org/pipermail/gcc/2020-May/232524.html

While some of the issues were taken care of by the use of semFlush,
a few others were indeed calling for adjustments.

We first considered resorting to the condvarLib library available
in VxWorks7. Unfortunately, it is vx7 only and we wanted something working
for at least vx 6.9 as well. It also turned out requiring the use of
recursive mutexes for condVarWait, which seemed unnecessarily constraining.

Instead, this change corrects the sequencing logic in a few places and
leverages the semExchange API to ensure the key atomicity requirement on
cond_wait operations.

Thanks to Alex for coming up with the correction! 

And thanks again to Rasmus and Jonathan for raising the point
and clarifying the libstdc++ expectations.

As for the other patches in the series I'm sending, this has been
used for a few months in gcc-9 based production compilers targeting
vxworks 7. It also passed a build & test sequence for powerpc-vxworks
7 and 6.9 with gcc-10 and a sanity check build with a recent mainline.

Olivier

2020-10-01  Alexandre Oliva  

libgcc/
* config/gthr-vxworks-thread.c: Include stdlib.h.
(tls_delete_hook): Prototype it.
(__gthread_cond_signal): Return early if no waiters.  Consume
signal in case the semaphore got full.  Use semInfoGet instead
of kernel-mode-only semInfo.
(__gthread_cond_timedwait): Use semExchange.  Always take the
mutex again before returning.
* config/gthr-vxworks-cond.c (__gthread_cond_wait): Likewise.



0001-Rework-the-gthr-cond-variables-support-for-VxWorks.patch
Description: Binary data


c++: Refactor lookup_and_check_tag

2020-10-01 Thread Nathan Sidwell


It turns out I'd already found lookup_and_check_tag's control flow
confusing, and had refactored it on the modules branch.  For instance,
it continually checks 'if (decl && $condition)' before finally getting
to 'else if (!decl)'.  Why not just check !decl first and be done?
Well, it is done thusly.

gcc/cp/
* decl.c (lookup_and_check_tag): Refactor.

pushed to trunk

nathan
--
Nathan Sidwell
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 14742c115ad..d2a8d4012ab 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -14885,71 +14885,73 @@ lookup_and_check_tag (enum tag_types tag_code, tree name,
   else
 decl = lookup_elaborated_type (name, how);
 
-  if (decl
-  && (DECL_CLASS_TEMPLATE_P (decl)
-	  /* If scope is TAG_how::CURRENT_ONLY we're defining a class,
-	 so ignore a template template parameter.  */
-	  || (how != TAG_how::CURRENT_ONLY
-	  && DECL_TEMPLATE_TEMPLATE_PARM_P (decl))))
-decl = DECL_TEMPLATE_RESULT (decl);
-
-  if (decl && TREE_CODE (decl) == TYPE_DECL)
-{
-  /* Look for invalid nested type:
-	   class C {
-	 class C {};
-	   };  */
-  if (how == TAG_how::CURRENT_ONLY && DECL_SELF_REFERENCE_P (decl))
-	{
-	  error ("%qD has the same name as the class in which it is "
-		 "declared", decl);
-	  return error_mark_node;
-	}
-
-  /* Two cases we need to consider when deciding if a class
-	 template is allowed as an elaborated type specifier:
-	 1. It is a self reference to its own class.
-	 2. It comes with a template header.
 
-	 For example:
-
-	   template <class T> class C {
-	 class C *c1;		// DECL_SELF_REFERENCE_P is true
-	 class D;
-	   };
-	   template <class T> class C; // template_header_p is true
-	   template <class T> class C<T>::D {
-	 class C *c2;		// DECL_SELF_REFERENCE_P is true
-	   };  */
-
-  tree t = check_elaborated_type_specifier (tag_code,
-		decl,
-		template_header_p
-		| DECL_SELF_REFERENCE_P (decl));
-  if (template_header_p && t && CLASS_TYPE_P (t)
-	  && (!CLASSTYPE_TEMPLATE_INFO (t)
-	  || (!PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (t)))))
-	{
-	  error ("%qT is not a template", t);
-	  inform (location_of (t), "previous declaration here");
-	  if (TYPE_CLASS_SCOPE_P (t)
-	  && CLASSTYPE_TEMPLATE_INFO (TYPE_CONTEXT (t)))
-	inform (input_location,
-		"perhaps you want to explicitly add %<%T::%>",
-		TYPE_CONTEXT (t));
-	  t = error_mark_node;
-	}
+  if (!decl)
+/* We found nothing.  */
+return NULL_TREE;
 
-  return t;
-}
-  else if (decl && TREE_CODE (decl) == TREE_LIST)
+  if (TREE_CODE (decl) == TREE_LIST)
 {
   error ("reference to %qD is ambiguous", name);
   print_candidates (decl);
   return error_mark_node;
 }
-  else
+
+  if (DECL_CLASS_TEMPLATE_P (decl)
+  /* If scope is TAG_how::CURRENT_ONLY we're defining a class,
+	 so ignore a template template parameter.  */
+  || (how != TAG_how::CURRENT_ONLY && DECL_TEMPLATE_TEMPLATE_PARM_P (decl)))
+decl = DECL_TEMPLATE_RESULT (decl);
+
+  if (TREE_CODE (decl) != TYPE_DECL)
+/* Found not-a-type.  */
 return NULL_TREE;
+
+/* Look for invalid nested type:
+ class C {
+ class C {};
+ };  */
+  if (how == TAG_how::CURRENT_ONLY && DECL_SELF_REFERENCE_P (decl))
+{
+  error ("%qD has the same name as the class in which it is "
+	 "declared", decl);
+  return error_mark_node;
+}
+
+  /* Two cases we need to consider when deciding if a class
+ template is allowed as an elaborated type specifier:
+ 1. It is a self reference to its own class.
+ 2. It comes with a template header.
+
+ For example:
+
+ template <class T> class C {
+   class C *c1;		// DECL_SELF_REFERENCE_P is true
+   class D;
+ };
+ template <class T> class C; // template_header_p is true
+ template <class T> class C<T>::D {
+   class C *c2;		// DECL_SELF_REFERENCE_P is true
+ };  */
+
+  tree t = check_elaborated_type_specifier (tag_code, decl,
+	template_header_p
+	| DECL_SELF_REFERENCE_P (decl));
+  if (template_header_p && t && CLASS_TYPE_P (t)
+  && (!CLASSTYPE_TEMPLATE_INFO (t)
+	  || (!PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (t)))))
+{
+  error ("%qT is not a template", t);
+  inform (location_of (t), "previous declaration here");
+  if (TYPE_CLASS_SCOPE_P (t)
+	  && CLASSTYPE_TEMPLATE_INFO (TYPE_CONTEXT (t)))
+	inform (input_location,
+		"perhaps you want to explicitly add %<%T::%>",
+		TYPE_CONTEXT (t));
+  return error_mark_node;
+}
+
+  return t;
 }
 
 /* Get the struct, enum or union (TAG_CODE says which) with tag NAME.


Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]

2020-10-01 Thread Jonathan Wakely via Gcc-patches

On 01/10/20 08:50 +0100, Jonathan Wakely wrote:

On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:

On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely  wrote:


On 30/09/20 16:03 +0100, Jonathan Wakely wrote:

On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:

On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
 wrote:


Glibc 2.32 adds a global variable that says whether the process is
single-threaded. We can use this to decide whether to elide atomic
operations, as a more precise and reliable indicator than
__gthread_active_p.

This means that guard variables for statics and reference counting in
shared_ptr can use less expensive, non-atomic ops even in processes that
are linked to libpthread, as long as no threads have been created yet.
It also means that we switch to using atomics if libpthread gets loaded
later via dlopen (this still isn't supported in general, for other
reasons).
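The dispatch idea described above can be sketched as follows. The flag `single_threaded_sketch` is a hypothetical stand-in for glibc's `__libc_single_threaded`, and `exchange_and_add_sketch` is an illustrative name rather than the libstdc++ function; the sketch assumes a compiler that provides the `__atomic_fetch_add` builtin, as GCC and Clang do.

```cpp
// Hypothetical stand-in for glibc's __libc_single_threaded; a real
// implementation would read the variable provided by the C library.
static bool single_threaded_sketch = true;

// Add VAL to *MEM and return the previous value, using a plain
// read-modify-write while the process is known to be single-threaded
// and an atomic operation otherwise.
int
exchange_and_add_sketch (int *mem, int val)
{
  if (single_threaded_sketch)
    {
      int old = *mem;
      *mem += val;
      return old;
    }
  return __atomic_fetch_add (mem, val, __ATOMIC_ACQ_REL);
}
```

Both paths return the same result; only the cost differs, which is why the check is safe for reference counts but not for paired operations such as lock and unlock.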

We can't use __libc_single_threaded to replace __gthread_active_p
everywhere. If we replaced the uses of __gthread_active_p in std::mutex
then we would elide the pthread_mutex_lock in the code below, but not
the pthread_mutex_unlock:

 std::mutex m;
 m.lock();// pthread_mutex_lock
 std::thread t([]{}); // __libc_single_threaded = false
 t.join();
 m.unlock();  // pthread_mutex_unlock

We need the lock and unlock to use the same "is threading enabled"
predicate, and similarly for init/destroy pairs for mutexes and
condition variables, so that we don't try to release resources that were
never acquired.

There are other places that could use __libc_single_threaded, such as
_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
they can be changed later.

libstdc++-v3/ChangeLog:

   PR libstdc++/96817
   * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
   New function wrapping __libc_single_threaded if available.
   (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
   * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
   (__cxa_guard_release): Likewise.
   * testsuite/18_support/96817.cc: New test.

Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.


Hi,

This patch introduced regressions on armeb-linux-gnueabihf:
--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
  g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
  g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
  g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
  g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
  g++.dg/init/init-ref2.C  -std=c++14 execution test
  g++.dg/init/init-ref2.C  -std=c++17 execution test
  g++.dg/init/init-ref2.C  -std=c++2a execution test
  g++.dg/init/init-ref2.C  -std=c++98 execution test
  g++.dg/init/ref15.C  -std=c++14 execution test
  g++.dg/init/ref15.C  -std=c++17 execution test
  g++.dg/init/ref15.C  -std=c++2a execution test
  g++.dg/init/ref15.C  -std=c++98 execution test
  g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
  g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
  g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
  g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
  g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
  g++.old-deja/g++.other/init19.C  -std=c++14 execution test
  g++.old-deja/g++.other/init19.C  -std=c++17 execution test
  g++.old-deja/g++.other/init19.C  -std=c++2a execution test
  g++.old-deja/g++.other/init19.C  -std=c++98 execution test

and probably some (280) in libstdc++ tests: (I didn't bisect those):
  19_diagnostics/error_category/generic_category.cc execution test
  19_diagnostics/error_category/system_category.cc execution test
  20_util/scoped_allocator/1.cc execution test
  20_util/scoped_allocator/2.cc execution test
  20_util/scoped_allocator/construct_pair_c++2a.cc execution test
  20_util/to_address/debug.cc execution test
  20_util/variant/run.cc execution test


I think this is a latent bug in the static initialization code for
EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
have:

# ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE

// Test the guard variable with a memory load with
// acquire semantics.

inline bool
__test_and_acquire (__cxxabiv1::__guard *g)
{
 unsigned char __c;
 unsigned char *__p = reinterpret_cast<unsigned char *>(g);
 __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
 (void) __p;
 return _GLIBCXX_GUARD_TEST(&__c);
}
#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
# endif

That inspects the first byte of the guard variable. But for EABI the
"is initialized" bit is the least significant bit of the guard
variable. For little endian that's fine, the least significant bit is
in the first byte. But for big endian, it's not in the first byte, so
we are looking in the wrong place. This means that the initial check
in __cxa_guard_acquire is wrong:

 extern "C"
 int __cxa_guard_acquire (__guard *g)
 {
#ifdef __GTHREADS
   // If the target can 
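
The byte-order point above can be checked with a small stand-alone model. This is only a sketch: guard_bytes, first_byte_test and whole_word_test are hypothetical helpers invented for illustration, not libstdc++ code, and the guard is assumed to be a 32-bit word whose least significant bit means "is initialized".

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of libstdc++): lay out a 32-bit guard word
// in memory order for the requested endianness.
std::array<unsigned char, 4> guard_bytes(std::uint32_t guard, bool big_endian)
{
  std::array<unsigned char, 4> bytes{};
  for (int i = 0; i < 4; ++i)
    {
      int shift = big_endian ? 8 * (3 - i) : 8 * i;
      bytes[i] = static_cast<unsigned char>((guard >> shift) & 0xff);
    }
  return bytes;
}

// Models a test that only inspects the first byte in memory.
bool first_byte_test(const std::array<unsigned char, 4> &bytes)
{
  return (bytes[0] & 1) != 0;
}

// Models a test of the least significant bit of the whole word.
bool whole_word_test(std::uint32_t guard)
{
  return (guard & 1) != 0;
}

// Small self-check: the two tests agree on little endian only.
void check_guard_layouts()
{
  const std::uint32_t initialized = 1;
  assert(whole_word_test(initialized));
  assert(first_byte_test(guard_bytes(initialized, false)));
  assert(!first_byte_test(guard_bytes(initialized, true)));
}
```

In this model the first-byte test and the whole-word test disagree only for the big-endian layout, which matches the failure mode described in the message.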

[committed] arm: Fix ordering in arm-cpus.in

2020-10-01 Thread Alex Coplan via Gcc-patches
This patch moves the recent entry for Neoverse N2 down and adds a
comment in order to preserve the existing order/structure in
arm-cpus.in.

Bootstrapped and tested on arm-linux-gnueabihf.

Committing as obvious.

Alex

---

gcc/ChangeLog:

* config/arm/arm-cpus.in: Fix ordering, move Neoverse N2 down.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index d47f9439ed1..9abb59a00ba 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1492,17 +1492,6 @@ begin cpu neoverse-n1
  part d0c
 end cpu neoverse-n1
 
-begin cpu neoverse-n2
-  cname neoversen2
-  tune for cortex-a57
-  tune flags LDSCHED
-  architecture armv8.5-a+fp16+bf16+i8mm
-  option crypto add FP_ARMv8 CRYPTO
-  costs cortex_a57
-  vendor 41
-  part 0xd49
-end cpu neoverse-n2
-
 # ARMv8.2 A-profile ARM DynamIQ big.LITTLE implementations
 begin cpu cortex-a75.cortex-a55
  cname cortexa75cortexa55
@@ -1532,6 +1521,18 @@ begin cpu neoverse-v1
   costs cortex_a57
 end cpu neoverse-v1
 
+# Armv8.5 A-profile Architecture Processors
+begin cpu neoverse-n2
+  cname neoversen2
+  tune for cortex-a57
+  tune flags LDSCHED
+  architecture armv8.5-a+fp16+bf16+i8mm
+  option crypto add FP_ARMv8 CRYPTO
+  costs cortex_a57
+  vendor 41
+  part 0xd49
+end cpu neoverse-n2
+
 # V8 M-profile implementations.
 begin cpu cortex-m23
  cname cortexm23
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 9f658244053..05f5c08400b 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -252,9 +252,6 @@ Enum(processor_type) String(cortex-x1) Value( 
TARGET_CPU_cortexx1)
 EnumValue
 Enum(processor_type) String(neoverse-n1) Value( TARGET_CPU_neoversen1)
 
-EnumValue
-Enum(processor_type) String(neoverse-n2) Value( TARGET_CPU_neoversen2)
-
 EnumValue
 Enum(processor_type) String(cortex-a75.cortex-a55) Value( 
TARGET_CPU_cortexa75cortexa55)
 
@@ -264,6 +261,9 @@ Enum(processor_type) String(cortex-a76.cortex-a55) Value( 
TARGET_CPU_cortexa76co
 EnumValue
 Enum(processor_type) String(neoverse-v1) Value( TARGET_CPU_neoversev1)
 
+EnumValue
+Enum(processor_type) String(neoverse-n2) Value( TARGET_CPU_neoversen2)
+
 EnumValue
 Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 269e627626a..32657da48a5 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -46,8 +46,8 @@ (define_attr "tune"
cortexa73cortexa53,cortexa55,cortexa75,
cortexa76,cortexa76ae,cortexa77,
cortexa78,cortexa78ae,cortexx1,
-   neoversen1,neoversen2,cortexa75cortexa55,
-   cortexa76cortexa55,neoversev1,cortexm23,
+   neoversen1,cortexa75cortexa55,cortexa76cortexa55,
+   neoversev1,neoversen2,cortexm23,
cortexm33,cortexm35p,cortexm55,
cortexr52"
(const (symbol_ref "((enum attr_tune) arm_tune)")))


[committed][testsuite] Enable pr94600-{1,3}.c tests for nvptx

2020-10-01 Thread Tom de Vries
[ was: Re: [committed][testsuite] Re-enable pr94600-{1,3}.c tests for arm ]

On 10/1/20 7:38 AM, Hans-Peter Nilsson wrote:
> On Wed, 30 Sep 2020, Tom de Vries wrote:
> 
>> [ was: Re: [committed][testsuite] Require non_strict_align in
>> pr94600-{1,3}.c ]
>>
>> On 9/30/20 4:53 AM, Hans-Peter Nilsson wrote:
>>> On Thu, 24 Sep 2020, Tom de Vries wrote:
>>>
>>>> Hi,
>>>>
>>>> With the nvptx target, we run into:
>>>> ...
>>>> FAIL: gcc.dg/pr94600-1.c scan-rtl-dump-times final "\\(mem/v" 6
>>>> FAIL: gcc.dg/pr94600-1.c scan-rtl-dump-times final "\\(set \\(mem/v" 6
>>>> FAIL: gcc.dg/pr94600-3.c scan-rtl-dump-times final "\\(mem/v" 1
>>>> FAIL: gcc.dg/pr94600-3.c scan-rtl-dump-times final "\\(set \\(mem/v" 1
>>>> ...
>>>> The scans attempt to check for volatile stores, but on nvptx we have memcpy
>>>> instead.
>>>>
>>>> This is due to nvptx being a STRICT_ALIGNMENT target, which has the effect
>>>> that the TYPE_MODE for the store target is set to BLKmode in
>>>> compute_record_mode.
>>>>
>>>> Fix the FAILs by requiring effective target non_strict_align.
>>>
>>> No, that's wrong.  There's more than that at play; it worked for
>>> the strict-alignment targets where it was tested at the time.
>>>
>>
>> Hi,
>>
>> thanks for letting me know.
>>
>>> The test is a valuable canary for this kind of bug.  You now
>>> disabled it for strict-alignment targets.
>>>
>>> Please revert and add your target specifier instead, if you
>>> don't feel like investigating further.
>>
>> I've analyzed the compilation on strict-alignment target arm-eabi, and
> 
> An analysis should result in more than that statement.
> 

Well, it refers to the analysis in the commit log of the patch, sorry if
that was not obvious.

>> broadened the effective target to (non_strict_align ||
>> pcc_bitfield_type_matters).
> 
> That's *also* not right.  I'm guessing your nvptx fails because
> it has 64-bit alignment requirement, but no 32-bit writes.
> ...um, no that can't be it, nvptx seems to have them.  Costs?
> Yes, probably your #define MOVE_RATIO(SPEED) 4.
> 
> The writes are to 32-bit aligned addresses which gcc can deduce
> (also for strict-alignment targets) because it's a literal,
> where it isn't explicitly declared to be attribute-aligned
> 
> You should have noticed the weirdness in that you "only" needed
> to tweak pr94600-1.c and -3.c, not even pr94600-2.c, which
> should be the case if it was just the test-case getting the
> predicates wrong.  This points at your MOVE_RATIO, together with
> middle-end not applying it consistently for -2.c.
> 
> Again, please just skip for nvptx (don't mix-n-match general
> predicates) unless you really look into the reason you don't get
> 6 single 32-bit-writes only in *some* of the cases.

Thanks for the pointer to pr94600-2.c.  I've compared the behaviour
between pr94600-1.c and pr94600-2.c and figured out why in one case we
get the load/store pair, and in the other the memcpy.  See rationale in
commit below.

Committed to trunk.

Thanks,
- Tom

[testsuite] Enable pr94600-{1,3}.c tests for nvptx

When compiling test-case pr94600-1.c for nvptx, this gimple mem move:
...
  MEM[(volatile struct t0 *)655404B] ={v} a0[0];
...
is expanded into a memcpy, but when compiling pr94600-2.c instead, this similar
gimple mem move:
...
  MEM[(volatile struct t0 *)655404B] ={v} a00;
...
is expanded into a 32-bit load/store pair.

In both cases, emit_block_move is called.

In the latter case, can_move_by_pieces (4 /* byte-size */, 32 /* bit-align */)
is called, which returns true (because by_pieces_ninsns returns 1, which is
smaller than the MOVE_RATIO of 4).

In the former case, can_move_by_pieces (4 /* byte-size */, 8 /* bit-align */)
is called, which returns false (because by_pieces_ninsns returns 4, which is
not smaller than the MOVE_RATIO of 4).

So the difference in code generation is explained by the alignment.  The
difference in alignment comes from the move sources: a0[0] vs. a00.  Both
have the same type with 8-bit alignment, but a00 is on stack, which based on
the base stack align and stack variable placement happens to result in a
32-bit alignment.
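
The arithmetic behind the two can_move_by_pieces results can be sketched with a toy model. This is not the GCC implementation: pieces_needed and can_move_by_pieces_model are hypothetical names, and the model assumes a strict-alignment target where each access can be no wider than the known alignment, with alignments that are powers of two.

```cpp
#include <algorithm>
#include <cassert>

// Toy model (not GCC's by_pieces_ninsns): count the accesses needed to move
// SIZE_BYTES bytes when no access may be wider than the known alignment.
unsigned pieces_needed(unsigned size_bytes, unsigned align_bits,
                       unsigned max_piece_bytes = 8)
{
  unsigned width = std::max(1u, std::min(align_bits / 8, max_piece_bytes));
  unsigned count = 0;
  while (size_bytes > 0)
    {
      // Shrink the access width until it fits the remaining bytes.
      while (width > size_bytes)
        width /= 2;
      count += size_bytes / width;
      size_bytes %= width;
    }
  return count;
}

// The move is done by pieces only if the count is below the move ratio.
bool can_move_by_pieces_model(unsigned size_bytes, unsigned align_bits,
                              unsigned move_ratio = 4)
{
  return pieces_needed(size_bytes, align_bits) < move_ratio;
}
```

With a move ratio of 4, a 4-byte move at 32-bit alignment needs one piece and is accepted, while the same move at 8-bit alignment needs four pieces and is rejected, which is the split between the load/store pair and the memcpy described above.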

Enable test-cases pr94600-{1,3}.c for nvptx by forcing the currently 8-bit
aligned variables to have a 32-bit alignment for STRICT_ALIGNMENT targets.

Tested on nvptx.

gcc/testsuite/ChangeLog:

2020-10-01  Tom de Vries  

	* gcc.dg/pr94600-1.c: Force 32-bit alignment for a0 for !non_strict_align
	targets.  Remove target clauses from scan tests.
	* gcc.dg/pr94600-3.c: Same.

---
 gcc/testsuite/gcc.dg/pr94600-1.c | 11 ---
 gcc/testsuite/gcc.dg/pr94600-3.c | 11 ---
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr94600-1.c b/gcc/testsuite/gcc.dg/pr94600-1.c
index c9a7bb9007e..149e4f35dbe 100644
--- a/gcc/testsuite/gcc.dg/pr94600-1.c
+++ b/gcc/testsuite/gcc.dg/pr94600-1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target size32plus } */
 /* { dg-options "-fdump-rtl-final -O2" } */
+/* { 

[COMMITTED][GCC-10 backport] arm: Fix MVE intrinsics polymorphic variants wrongly generating __ARM_undef type (pr96795).

2020-10-01 Thread Srinath Parvathaneni via Gcc-patches
Hello,

This patch fixes (PR96795) the MVE intrinsic polymorphic variants vaddq,
vaddq_m, vaddq_x, vcmpeqq_m, vcmpeqq, vcmpgeq_m, vcmpgeq, vcmpgtq_m, vcmpgtq,
vcmpleq_m, vcmpleq, vcmpltq_m, vcmpltq, vcmpneq_m, vcmpneq, vfmaq_m, vfmaq,
vfmasq_m, vfmasq, vmaxnmavq, vmaxnmavq_p, vmaxnmvq, vmaxnmvq_p, vminnmavq,
vminnmavq_p, vminnmvq, vminnmvq_p, vmulq_m, vmulq, vmulq_x, vsetq_lane,
vsubq_m, vsubq and vsubq_x, which incorrectly generate __ARM_undef and
mismatch the passed floating point scalar arguments.

This patch applies cleanly to the releases/gcc-10 branch.
Bootstrapped on arm-none-linux-gnueabihf and regression tested on
arm-none-eabi; no regressions found.

Patch already approved in 
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/555185.html ,
so committed this patch to releases/gcc-10 branch.

Regards,
Srinath.

gcc/ChangeLog:

2020-09-30  Srinath Parvathaneni  

PR target/96795
* config/arm/arm_mve.h (__ARM_mve_coerce2): Define.
(__arm_vaddq): Correct the scalar argument.
(__arm_vaddq_m): Likewise.
(__arm_vaddq_x): Likewise.
(__arm_vcmpeqq_m): Likewise.
(__arm_vcmpeqq): Likewise.
(__arm_vcmpgeq_m): Likewise.
(__arm_vcmpgeq): Likewise.
(__arm_vcmpgtq_m): Likewise.
(__arm_vcmpgtq): Likewise.
(__arm_vcmpleq_m): Likewise.
(__arm_vcmpleq): Likewise.
(__arm_vcmpltq_m): Likewise.
(__arm_vcmpltq): Likewise.
(__arm_vcmpneq_m): Likewise.
(__arm_vcmpneq): Likewise.
(__arm_vfmaq_m): Likewise.
(__arm_vfmaq): Likewise.
(__arm_vfmasq_m): Likewise.
(__arm_vfmasq): Likewise.
(__arm_vmaxnmavq): Likewise.
(__arm_vmaxnmavq_p): Likewise.
(__arm_vmaxnmvq): Likewise.
(__arm_vmaxnmvq_p): Likewise.
(__arm_vminnmavq): Likewise.
(__arm_vminnmavq_p): Likewise.
(__arm_vminnmvq): Likewise.
(__arm_vminnmvq_p): Likewise.
(__arm_vmulq_m): Likewise.
(__arm_vmulq): Likewise.
(__arm_vmulq_x): Likewise.
(__arm_vsetq_lane): Likewise.
(__arm_vsubq_m): Likewise.
(__arm_vsubq): Likewise.
(__arm_vsubq_x): Likewise.

gcc/testsuite/ChangeLog:

PR target/96795
* gcc.target/arm/mve/intrinsics/mve_fp_vaddq_n.c: New Test.
* gcc.target/arm/mve/intrinsics/mve_vaddq_n.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f16-1.c: Likewise.
* 

[patch] Rework CPP_BUILTINS_SPEC for powerpc-vxworks

2020-10-01 Thread Olivier Hainque
This change reworks CPP_BUILTINS_SPEC for powerpc-vxworks to
prepare for the upcoming addition of 32 and 64 bit ports for
VxWorks 7r2.

This has been used in gcc-9 based production compilers for a
year on both vxworks 7 and 6.9. Also passed a build & test sequence
for powerpc-vxworks 7 and 6.9 with gcc-10 and a sanity check build
with a recent mainline.

Will commit to mainline shortly.

Olivier

2020-10-01  Olivier Hainque  

* config/rs6000/vxworks.h (TARGET_OS_CPP_BUILTINS): Accommodate
expectations from different versions of VxWorks, for 32 or 64bit
configurations.



0006-Augment-CPP_BUILTINS_SPEC-for-powerpc-vxworks.diff
Description: Binary data







Re: [PATCH] tree-optimization/97151 - improve PTA for C++ operator delete

2020-10-01 Thread Richard Biener
On Wed, 30 Sep 2020, Jason Merrill wrote:

> On 9/28/20 3:09 PM, Jason Merrill wrote:
> > On 9/28/20 3:56 AM, Richard Biener wrote:
> >> On Fri, 25 Sep 2020, Jason Merrill wrote:
> >>
> >>> On 9/25/20 2:30 AM, Richard Biener wrote:
>  On Thu, 24 Sep 2020, Jason Merrill wrote:
> 
> > On 9/24/20 3:43 AM, Richard Biener wrote:
> >> On Wed, 23 Sep 2020, Jason Merrill wrote:
> >>
> >>> On 9/23/20 2:42 PM, Richard Biener wrote:
>  On September 23, 2020 7:53:18 PM GMT+02:00, Jason Merrill
>  
>  wrote:
> > On 9/23/20 4:14 AM, Richard Biener wrote:
> >> C++ operator delete, when DECL_IS_REPLACEABLE_OPERATOR_DELETE_P,
> >> does not cause the deleted object to be escaped.  It also has no
> >> other interesting side-effects for PTA so skip it like we do
> >> for BUILT_IN_FREE.
> >
> > Hmm, this is true of the default implementation, but since the
> > function
> >
> > is replaceable, we don't know what a user definition might do with
> > the
> > pointer.
> 
>  But can the object still be 'used' after delete? Can delete fail /
>  throw?
> 
>  What guarantee does the predicate give us?
> >>>
> >>> The deallocation function is called as part of a delete expression in
> >>> order
> >>> to
> >>> release the storage for an object, ending its lifetime (if it was not
> >>> ended
> >>> by
> >>> a destructor), so no, the object can't be used afterward.
> >>
> >> OK, but the delete operator can access the object contents if there
> >> wasn't a destructor ...
> >
> >>> A deallocation function that throws has undefined behavior.
> >>
> >> OK, so it seems the 'replaceable' operators are the global ones
> >> (for user-defined/class-specific placement variants I see arbitrary
> >> extra arguments that we'd possibly need to handle).
> >>
> >> I'm happy to revert but I'd like to have a testcase that FAILs
> >> with the patch ;)
> >>
> >> Now, the following aborts:
> >>
> >> struct X {
> >>  static struct X saved;
> >>  int *p;
> >>  X() { __builtin_memcpy (this, &saved, sizeof (X)); }
> >> };
> >> void operator delete (void *p)
> >> {
> >>  __builtin_memcpy (&X::saved, p, sizeof (X));
> >> }
> >> int main()
> >> {
> >>  int y = 1;
> >>  X *p = new X;
> >>  p->p = &y;
> >>  delete p;
> >>  X *q = new X;
> >>  *(q->p) = 2;
> >>  if (y != 2)
> >>    __builtin_abort ();
> >> }
> >>
> >> and I could fix this by not making *p but what *p points to escape.
> >> The testcase is of course maximally awkward, but hey ... ;)
> >>
> >> Now this would all be moot if operator delete may not access
> >> the object (or if the object contents are undefined at that point).
> >>
> >> Oh, and the testcase segfaults when compiled with GCC 10 because
> >> there we elide the new X / delete p pair ... which is invalid then?
> >> Hmm, we emit
> >>
> >>  MEM[(struct X *)_8] ={v} {CLOBBER};
> >>  operator delete (_8, 8);
> >>
> >> so the object contents are undefined _before_ calling delete
> >> even when I do not have a DTOR?  That is, the above,
> >> w/o -fno-lifetime-dse, makes the PTA patch OK for the testcase.
> >
> > Yes, all classes have a destructor, even if it's trivial, so the
> > object's
> > lifetime definitely ends before the call to operator delete. This is
> > less
> > clear for scalar objects, but treating them similarly would be
> > consistent
> > with
> > other recent changes, so I think it's fine for us to assume that scalar
> > objects are also invalidated before the call to operator delete.  But of
> > course this doesn't apply to explicit calls to operator delete outside
> > of a
> > delete expression.
> 
>  OK, so change the testcase main slightly to
> 
>  int main()
>  {
>    int y = 1;
>    X *p = new X;
>    p->p = &y;
>    ::operator delete(p);
>    X *q = new X;
>    *(q->p) = 2;
>    if (y != 2)
>      __builtin_abort ();
>  }
> 
>  in this case the lifetime of *p does not end before calling
>  ::operator delete() and delete can stash the object contents
>  somewhere before ending its lifetime.  For the very same reason
>  we may not elide a new/delete pair like in
> 
>  int main()
>  {
>    int *p = new int;
>    *p = 1;
>    ::operator delete (p);
>  }
> >>>
> >>> Correct; the permission to elide new/delete pairs are for the expressions,
> >>> not
> >>> the functions.
> >>>
>  which we before the change did not do only because calling
>  operator delete made p escape.  Unfortunately points-to 

[committed] s390: Fix up s390_atomic_assign_expand_fenv

2020-10-01 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch fixes
-FAIL: gcc.dg/pr94780.c (internal compiler error)
-FAIL: gcc.dg/pr94780.c (test for excess errors)
-FAIL: gcc.dg/pr94842.c (internal compiler error)
-FAIL: gcc.dg/pr94842.c (test for excess errors)
on s390x-linux.  The fix is essentially the same as has been applied to many
other targets (i386, aarch64, arm, rs6000, alpha, riscv).

Bootstrapped/regtested on s390x-linux, committed to trunk and release
branches as obvious.

2020-10-01  Jakub Jelinek  

* config/s390/s390.c (s390_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and old_fpc.  Formatting fixes.

--- gcc/config/s390/s390.c.jj   2020-09-14 09:04:36.086851054 +0200
+++ gcc/config/s390/s390.c  2020-09-30 10:22:50.579603271 +0200
@@ -16082,12 +16082,13 @@ s390_atomic_assign_expand_fenv (tree *ho
 
  fenv_var = __builtin_s390_efpc ();
  __builtin_s390_sfpc (fenv_var & mask) */
-  tree old_fpc = build2 (MODIFY_EXPR, unsigned_type_node, fenv_var, call_efpc);
-  tree new_fpc =
-build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var,
-   build_int_cst (unsigned_type_node,
-  ~(FPC_DXC_MASK | FPC_FLAGS_MASK |
-FPC_EXCEPTION_MASK)));
+  tree old_fpc = build4 (TARGET_EXPR, unsigned_type_node, fenv_var, call_efpc,
+NULL_TREE, NULL_TREE);
+  tree new_fpc
+= build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var,
+ build_int_cst (unsigned_type_node,
+~(FPC_DXC_MASK | FPC_FLAGS_MASK
+  | FPC_EXCEPTION_MASK)));
   tree set_new_fpc = build_call_expr (sfpc, 1, new_fpc);
   *hold = build2 (COMPOUND_EXPR, void_type_node, old_fpc, set_new_fpc);
 
@@ -16106,8 +16107,8 @@ s390_atomic_assign_expand_fenv (tree *ho
   __atomic_feraiseexcept ((old_fpc & FPC_FLAGS_MASK) >> FPC_FLAGS_SHIFT);  */
 
   old_fpc = create_tmp_var_raw (unsigned_type_node);
-  tree store_old_fpc = build2 (MODIFY_EXPR, void_type_node,
-  old_fpc, call_efpc);
+  tree store_old_fpc = build4 (TARGET_EXPR, void_type_node, old_fpc, call_efpc,
+  NULL_TREE, NULL_TREE);
 
   set_new_fpc = build_call_expr (sfpc, 1, fenv_var);
 


Jakub



Re: [PATCH] generalized range_query class for multiple contexts

2020-10-01 Thread Aldy Hernandez via Gcc-patches
> Thanks for doing all this!  There isn't anything I don't understand
> in the sprintf changes so no questions from me (well, almost none).
> Just some comments:

Thanks for your comments on the sprintf/strlen API conversion.

>
> The current call statement is available in all functions that take
> a directive argument, as dir->info.callstmt.  There should be no need
> to also add it as a new argument to the functions that now need it.

Fixed.

>
> The change adds code along these lines in a bunch of places:
>
> + value_range vr;
> + if (!query->range_of_expr (vr, arg, stmt))
> +   vr.set_varying (TREE_TYPE (arg));
>
> I thought under the new Ranger APIs when a range couldn't be
> determined it would be automatically set to the maximum for
> the type.  I like that and have been moving in that direction
> with my code myself (rather than having an API fail, have it
> set the max range and succeed).

I went through all the above idioms and noticed all are being used on
supported types (integers or pointers).  So range_of_expr will always
return true.  I've removed the if() and the set_varying.
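
For readers following the idiom under discussion, the "query, then fall back to varying on failure" pattern can be folded into a single initializing helper. The sketch below is a toy model only: toy_range, toy_query, range_or_varying and known_only_for_zero are hypothetical stand-ins, not the value_range or range_query API in GCC, and "varying" is modelled as the full range of long.

```cpp
#include <cassert>
#include <climits>

// Toy stand-in for a range; not GCC's value_range.
struct toy_range
{
  long lo;
  long hi;
};

// Toy query: returns false when no range can be computed for EXPR.
typedef bool (*toy_query) (toy_range &out, int expr);

// Hypothetical helper: always yields a range, falling back to the
// widest possible one ("varying") when the query fails.
toy_range range_or_varying (toy_query query, int expr)
{
  toy_range r;
  if (!query (r, expr))
    {
      r.lo = LONG_MIN;
      r.hi = LONG_MAX;
    }
  return r;
}

// Example query that only knows a range for expression 0.
bool known_only_for_zero (toy_range &out, int expr)
{
  if (expr != 0)
    return false;
  out.lo = 1;
  out.hi = 10;
  return true;
}
```

A caller can then write a single initialization such as `toy_range vr = range_or_varying (known_only_for_zero, expr);` and never has to test a return value.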

>
> Since that isn't so in this case, I think it would still be nice
> if the added code could be written as if the range were set to
> varying in this case and (ideally) reduced to just initialization:
>
>value_range vr = some-function (query, stmt, arg);
>
> some-function could be an inline helper defined just for the sprintf
> pass (and maybe also strlen which also seems to use the same pattern),
> or it could be a value_range AKA irange ctor, or it could be a member
> of range_query, whatever is the most appropriate.
>
> (If assigning/copying a value_range is thought to be too expensive,
> declaring it first and then passing it to that helper to set it
> would work too).
>
> In strlen, is the removed comment no longer relevant?  (I.e., does
> the ranger solve the problem?)
>
> -  /* The range below may be "inaccurate" if a constant has been
> -substituted earlier for VAL by this pass that hasn't been
> -propagated through the CFG.  This shoud be fixed by the new
> -on-demand VRP if/when it becomes available (hopefully in
> -GCC 11).  */

It should.

>
> I'm wondering about the comment added to get_range_strlen_dynamic
> and other places:
>
> + // FIXME: Use range_query instead of global ranges.
>
> Is that something you're planning to do in a followup or should
> I remember to do it at some point?

I'm not planning on doing it.  It's just a reminder that it would be
beneficial to do so.

>
> Otherwise I have no concern with the changes.

It's not clear whether Andrew approved all 3 parts of the patchset
or just the valuation part.  I'll wait for his nod before committing
this chunk.

Aldy



RE: [PATCH][GCC 8] aarch64: Add support for Neoverse N2 CPU

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Alex Coplan 
> Sent: 01 October 2020 09:28
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Richard Sandiford
> ; Kyrylo Tkachov 
> Subject: [PATCH][GCC 8] aarch64: Add support for Neoverse N2 CPU
> 
> This patch backports the AArch64 support for Arm's Neoverse N2 CPU to
> GCC 8.
> 
> Testing:
>  * Bootstrapped and regtested on aarch64-none-linux-gnu.
> 
> OK for GCC 8 branch?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Alex
> 
> ---
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-cores.def: Add Neoverse N2.
>   * config/aarch64/aarch64-tune.md: Regenerate.
>   * doc/invoke.texi: Document AArch64 support for Neoverse N2.


RE: [PATCH][GCC 9] aarch64: Add support for Neoverse N2 CPU

2020-10-01 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Alex Coplan 
> Sent: 01 October 2020 09:25
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Richard Sandiford
> ; Kyrylo Tkachov 
> Subject: [PATCH][GCC 9] aarch64: Add support for Neoverse N2 CPU
> 
> This patch backports the AArch64 support for Arm's Neoverse N2 CPU to
> GCC 9.
> 
> Testing:
>  * Bootstrapped and regtested on aarch64-none-linux-gnu.
> 
> OK for GCC 9 branch?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Alex
> 
> ---
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-cores.def: Add Neoverse N2.
>   * config/aarch64/aarch64-tune.md: Regenerate.
>   * doc/invoke.texi: Document AArch64 support for Neoverse N2.


[PATCH][GCC 8] aarch64: Add support for Neoverse N2 CPU

2020-10-01 Thread Alex Coplan via Gcc-patches
This patch backports the AArch64 support for Arm's Neoverse N2 CPU to
GCC 8.

Testing:
 * Bootstrapped and regtested on aarch64-none-linux-gnu.

OK for GCC 8 branch?

Thanks,
Alex

---

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def: Add Neoverse N2.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document AArch64 support for Neoverse N2.
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 35ce68ad077..c6c1e3739de 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -100,6 +100,9 @@ AARCH64_CORE("saphira", saphira,falkor,8_3A,  
AARCH64_FL_FOR_ARCH8_3
 AARCH64_CORE("zeus", zeus, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | 
AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_SVE, cortexa72, 0x41, 0xd40, -1)
 AARCH64_CORE("neoverse-v1", neoversev1, cortexa57, 8_4A,  
AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_SVE, 
cortexa72, 0x41, 0xd40, -1)
 
+/* Armv8.5-A Architecture Processors.  */
+AARCH64_CORE("neoverse-n2", neoversen2, cortexa57, 8_4A,  
AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_F16 | AARCH64_FL_SVE | AARCH64_FL_RNG, 
cortexa72, 0x41, 0xd49, -1)
+
 /* ARMv8-A big.LITTLE implementations.  */
 
 AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE 
(0xd07, 0xd03), -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index e8894ee4a9d..2d7c9aa4740 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,neoversen1,saphira,zeus,neoversev1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
+   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,neoversen1,saphira,zeus,neoversev1,neoversen2,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6b40362e412..b91366daafd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14771,9 +14771,9 @@ Specify the name of the target processor for which GCC 
should tune the
 performance of the code.  Permissible values for this option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
-@samp{cortex-a76}, @samp{ares}, @samp{neoverse-n1}, @samp{neoverse-v1},
-@samp{zeus}, @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx}, @samp{saphira},
-@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
+@samp{cortex-a76}, @samp{ares}, @samp{neoverse-n1}, @samp{neoverse-n2},
+@samp{neoverse-v1}, @samp{zeus}, @samp{exynos-m1}, @samp{falkor},
+@samp{qdf24xx}, @samp{saphira}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
 @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
 @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
 @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},


Re: [committed] aarch64: Tweak movti and movtf patterns

2020-10-01 Thread Christophe Lyon via Gcc-patches
On Wed, 30 Sep 2020 at 12:53, Richard Sandiford via Gcc-patches
 wrote:
>
> movti lacked a way of zeroing an FPR, meaning that we'd do:
>
> mov     x0, 0
> mov     x1, 0
> fmov    d0, x0
> fmov    v0.d[1], x1
>
> instead of just:
>
> movi    v0.2d, #0
>
> movtf had the opposite problem for GPRs: we'd generate:
>
> movi    v0.2d, #0
> fmov    x0, d0
> fmov    x1, v0.d[1]
>
> instead of just:
>
> mov     x0, 0
> mov     x1, 0
>
> Also, there was an unnecessary earlyclobber on the GPR<-GPR movtf
> alternative (but not the movti one).  The splitter handles overlap
> correctly.
>
> The TF splitter used aarch64_reg_or_imm, but the _imm part only
> accepts integer constants, not floating-point ones.  The patch
> changes it to nonmemory_operand instead.
>
> Tested on aarch64-linux-gnu, pushed.
>
> Richard
>
>
> gcc/
> * config/aarch64/aarch64.c (aarch64_split_128bit_move_p): Add a
> function comment.  Tighten check for FP moves.
> * config/aarch64/aarch64.md (*movti_aarch64): Add a w<-Z alternative.
> (*movtf_aarch64): Handle r<-Y like r<-r.  Remove unnecessary
> earlyclobber.  Change splitter predicate from aarch64_reg_or_imm
> to nonmemory_operand.
>
> gcc/testsuite/
> * gcc.target/aarch64/movtf_1.c: New test.
> * gcc.target/aarch64/movti_1.c: Likewise.

Sorry to bother you, the new tests fail with -mabi=ilp32 :-(
gcc.target/aarch64/movtf_1.c check-function-bodies load_q
gcc.target/aarch64/movtf_1.c check-function-bodies load_x
gcc.target/aarch64/movtf_1.c check-function-bodies store_q
gcc.target/aarch64/movtf_1.c check-function-bodies store_x
gcc.target/aarch64/movti_1.c check-function-bodies load_q
gcc.target/aarch64/movti_1.c check-function-bodies load_x
gcc.target/aarch64/movti_1.c check-function-bodies store_q
gcc.target/aarch64/movti_1.c check-function-bodies store_x

I don't think that's high priority though.

Christophe

> ---
>  gcc/config/aarch64/aarch64.c   |  9 ++-
>  gcc/config/aarch64/aarch64.md  | 17 +++--
>  gcc/testsuite/gcc.target/aarch64/movtf_1.c | 87 ++
>  gcc/testsuite/gcc.target/aarch64/movti_1.c | 87 ++
>  4 files changed, 190 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/movtf_1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/movti_1.c
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 491fc582dab..9e88438b3c3 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3422,11 +3422,16 @@ aarch64_split_128bit_move (rtx dst, rtx src)
>  }
>  }
>
> +/* Return true if we should split a move from 128-bit value SRC
> +   to 128-bit register DEST.  */
> +
>  bool
>  aarch64_split_128bit_move_p (rtx dst, rtx src)
>  {
> -  return (! REG_P (src)
> - || ! (FP_REGNUM_P (REGNO (dst)) && FP_REGNUM_P (REGNO (src))));
> +  if (FP_REGNUM_P (REGNO (dst)))
> +return REG_P (src) && !FP_REGNUM_P (REGNO (src));
> +  /* All moves to GPRs need to be split.  */
> +  return true;
>  }
>
>  /* Split a complex SIMD combine.  */
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 19ec9e33f9f..78fe7c43a00 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1361,13 +1361,14 @@ (define_expand "movti"
>
>  (define_insn "*movti_aarch64"
>[(set (match_operand:TI 0
> -"nonimmediate_operand"  "=   r,w, r,w,r,m,m,w,m")
> +"nonimmediate_operand"  "=   r,w,w, r,w,r,m,m,w,m")
> (match_operand:TI 1
> -"aarch64_movti_operand" " rUti,r, w,w,m,r,Z,m,w"))]
> +"aarch64_movti_operand" " rUti,Z,r, w,w,m,r,Z,m,w"))]
>"(register_operand (operands[0], TImode)
>  || aarch64_reg_or_zero (operands[1], TImode))"
>"@
> #
> +   movi\\t%0.2d, #0
> #
> #
> mov\\t%0.16b, %1.16b
> @@ -1376,11 +1377,11 @@ (define_insn "*movti_aarch64"
> stp\\txzr, xzr, %0
> ldr\\t%q0, %1
> str\\t%q1, %0"
> -  [(set_attr "type" "multiple,f_mcr,f_mrc,neon_logic_q, \
> +  [(set_attr "type" "multiple,neon_move,f_mcr,f_mrc,neon_logic_q, \
>  load_16,store_16,store_16,\
>   load_16,store_16")
> -   (set_attr "length" "8,8,8,4,4,4,4,4,4")
> -   (set_attr "arch" "*,*,*,simd,*,*,*,fp,fp")]
> +   (set_attr "length" "8,4,8,8,4,4,4,4,4,4")
> +   (set_attr "arch" "*,simd,*,*,simd,*,*,*,fp,fp")]
>  )
>
>  ;; Split a TImode register-register or register-immediate move into
> @@ -1511,9 +1512,9 @@ (define_split
>
>  (define_insn "*movtf_aarch64"
>[(set (match_operand:TF 0
> -"nonimmediate_operand" "=w,?&r,w ,?r,w,?w,w,m,?r,m ,m")
> +"nonimmediate_operand" "=w,?r ,w ,?r,w,?w,w,m,?r,m ,m")
> (match_operand:TF 1
> -"general_operand"  " w,?r, ?r,w ,Y,Y ,m,w,m ,?r,Y"))]
> +
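
The rewritten aarch64_split_128bit_move_p above can be read as a small decision table. The following is only a toy model of that logic, not the GCC code: the Operand enum and the function name are hypothetical, and the real predicate inspects rtx operands with REG_P and FP_REGNUM_P. It assumes every register is either a general register or an FP register.

```cpp
// Hypothetical operand kinds; the real code inspects rtx operands.
enum class Operand { GeneralReg, FpReg, NonReg };

// Toy model of the patched predicate.  DST is always a register here.
inline bool split_128bit_move_p(Operand dst, Operand src) {
  if (dst == Operand::FpReg)
    // Moves into an FP register are split only when the source is a
    // register that is not itself an FP register (i.e. a GPR).
    return src == Operand::GeneralReg;
  // All moves to GPRs need to be split.
  return true;
}
```

In this model a GPR-to-FPR move is split, while FPR-to-FPR moves and non-register sources moving into an FPR are not.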

[PATCH][GCC 9] aarch64: Add support for Neoverse N2 CPU

2020-10-01 Thread Alex Coplan via Gcc-patches
This patch backports the AArch64 support for Arm's Neoverse N2 CPU to
GCC 9.

Testing:
 * Bootstrapped and regtested on aarch64-none-linux-gnu.

OK for GCC 9 branch?

Thanks,
Alex

---

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def: Add Neoverse N2.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Document AArch64 support for Neoverse N2.
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 48f1ac3ecf1..99198e1e538 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -120,6 +120,9 @@ AARCH64_CORE("neoverse-v1", neoversev1, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4
 /* Qualcomm ('Q') cores. */
 AARCH64_CORE("saphira", saphira,saphira,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
 
+/* Armv8.5-A Architecture Processors.  */
+AARCH64_CORE("neoverse-n2", neoversen2, cortexa57, 8_5A, AARCH64_FL_FOR_ARCH8_5 | AARCH64_FL_F16 | AARCH64_FL_SVE | AARCH64_FL_RNG | AARCH64_FL_MEMTAG, neoversen1, 0x41, 0xd49, -1)
+
 /* ARMv8-A big.LITTLE implementations.  */
 
 AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 0xd03), -1)
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index f5d62de5940..0a73e105e08 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,neoversen1,neoversee1,a64fx,tsv110,zeus,neoversev1,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+   "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,neoversen1,neoversee1,a64fx,tsv110,zeus,neoversev1,saphira,neoversen2,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c85e31fb02c..e4cc83ba5cb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15851,9 +15851,9 @@ performance of the code.  Permissible values for this option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
 @samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor},
-@samp{neoverse-e1},@samp{neoverse-n1},@samp{neoverse-v1},@samp{qdf24xx},
-@samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{octeontx},
-@samp{octeontx81},  @samp{octeontx83},
+@samp{neoverse-e1}, @samp{neoverse-n1}, @samp{neoverse-n2}, @samp{neoverse-v1},
+@samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
+@samp{octeontx}, @samp{octeontx81},  @samp{octeontx83},
 @samp{a64fx},
 @samp{thunderx}, @samp{thunderxt88},
 @samp{thunderxt88p1}, @samp{thunderxt81}, @samp{tsv110},


[PATCH] tree-optimization/97255 - missing vector bool pattern of SRAed bool

2020-10-01 Thread Richard Biener
SRA tends to use VIEW_CONVERT_EXPR when replacing bool fields with
unsigned char fields.  Those are not handled in vector bool pattern
detection causing vector true values to leak.  The following fixes
this by turning those into b ? 1 : 0 as well.
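
As a scalar illustration of why the b ? 1 : 0 form matters: on targets where a vector comparison yields an all-ones lane for true, reinterpreting that lane as an unsigned char would give 0xFF rather than 1. The sketch below is only a model of a single 8-bit lane under that assumption; the helper names are hypothetical and none of this is GCC code.

```cpp
#include <cstdint>

// Hypothetical model of one 8-bit vector lane: a vector comparison
// yields all-ones (0xFF) for true and 0 for false.
inline std::uint8_t lane_mask(bool b) { return b ? 0xFF : 0x00; }

// What a bool stored as unsigned char must contain: 0 or 1.
// This corresponds to the "b ? 1 : 0" form mentioned above.
inline std::uint8_t lane_as_uchar(std::uint8_t mask) { return mask ? 1 : 0; }
```

Here lane_mask(true) is 0xFF, which is the "vector true value" that must not leak, and lane_as_uchar maps it back to 1.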

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.

2020-10-01  Richard Biener  

* tree-vect-patterns.c (vect_recog_bool_pattern): Also handle
VIEW_CONVERT_EXPR.

* g++.dg/vect/pr97255.cc: New testcase.
---
 gcc/testsuite/g++.dg/vect/pr97255.cc | 44 
 gcc/tree-vect-patterns.c |  8 +++--
 2 files changed, 50 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr97255.cc

diff --git a/gcc/testsuite/g++.dg/vect/pr97255.cc b/gcc/testsuite/g++.dg/vect/pr97255.cc
new file mode 100644
index 000..efb7f53fd27
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr97255.cc
@@ -0,0 +1,44 @@
+// { dg-require-effective-target c++11 }
+// { dg-additional-options "-O3" }
+
+template<class T, unsigned N>
+class Array{
+public:
+T& operator[](unsigned x) {return m_arr[x];}
+private:
+T m_arr[N];
+};
+
+int
+__attribute__((noipa))
+logicalOr(Array< char, 4 > in1[60],
+  Array< bool, 4 > out[60])
+{
+  for (unsigned k0 = 0u; k0 < 60u; ++k0) {
+  Array< char, 4 > in1m = in1[k0];
+  Array< bool, 4 > x;
+  for (unsigned k1 = 0u; k1 < 4u; ++k1) {
+  char in1s = in1m[k1];
+  x[k1] = in1s != char(0) || in1s != char(0);
+  }
+  out[k0] = x;
+  }
+  return out[0][0];
+}
+
+
+int main()
+{
+  Array< char, 4 > In1[60]{};
+  Array< bool, 4 > Out7[60]{};
+
+  for( int i = 0; i < 60; ++i){
+  for( int j = 0; j < 4; ++j){
+  In1[i][j] = 240 - i*4 - j;
+  }
+  }
+
+  if (logicalOr(In1, Out7) != 1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index db45740da3c..d626c5f7362 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -4028,14 +4028,18 @@ vect_recog_bool_pattern (vec_info *vinfo,
 
   var = gimple_assign_rhs1 (last_stmt);
   lhs = gimple_assign_lhs (last_stmt);
+  rhs_code = gimple_assign_rhs_code (last_stmt);
+
+  if (rhs_code == VIEW_CONVERT_EXPR)
+var = TREE_OPERAND (var, 0);
 
   if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (var)))
 return NULL;
 
   hash_set<gimple *> bool_stmts;
 
-  rhs_code = gimple_assign_rhs_code (last_stmt);
-  if (CONVERT_EXPR_CODE_P (rhs_code))
+  if (CONVERT_EXPR_CODE_P (rhs_code)
+  || rhs_code == VIEW_CONVERT_EXPR)
 {
   if (! INTEGRAL_TYPE_P (TREE_TYPE (lhs))
  || TYPE_PRECISION (TREE_TYPE (lhs)) == 1)
-- 
2.26.2


Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]

2020-10-01 Thread Jonathan Wakely via Gcc-patches

On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:

On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely  wrote:


On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
>On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
>>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
>> wrote:
>>>
>>>Glibc 2.32 adds a global variable that says whether the process is
>>>single-threaded. We can use this to decide whether to elide atomic
>>>operations, as a more precise and reliable indicator than
>>>__gthread_active_p.
>>>
>>>This means that guard variables for statics and reference counting in
>>>shared_ptr can use less expensive, non-atomic ops even in processes that
>>>are linked to libpthread, as long as no threads have been created yet.
>>>It also means that we switch to using atomics if libpthread gets loaded
>>>later via dlopen (this still isn't supported in general, for other
>>>reasons).
>>>
>>>We can't use __libc_single_threaded to replace __gthread_active_p
>>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
>>>then we would elide the pthread_mutex_lock in the code below, but not
>>>the pthread_mutex_unlock:
>>>
>>>  std::mutex m;
>>>  m.lock();// pthread_mutex_lock
>>>  std::thread t([]{}); // __libc_single_threaded = false
>>>  t.join();
>>>  m.unlock();  // pthread_mutex_unlock
>>>
>>>We need the lock and unlock to use the same "is threading enabled"
>>>predicate, and similarly for init/destroy pairs for mutexes and
>>>condition variables, so that we don't try to release resources that were
>>>never acquired.
>>>
>>>There are other places that could use __libc_single_threaded, such as
>>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
>>>they can be changed later.
>>>
>>>libstdc++-v3/ChangeLog:
>>>
>>>PR libstdc++/96817
>>>* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>>>New function wrapping __libc_single_threaded if available.
>>>(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>>>* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>>>(__cxa_guard_release): Likewise.
>>>* testsuite/18_support/96817.cc: New test.
>>>
>>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
>>
>>Hi,
>>
>>This patch introduced regressions on armeb-linux-gnueabihf:
>>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
>>   g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
>>   g++.dg/init/init-ref2.C  -std=c++14 execution test
>>   g++.dg/init/init-ref2.C  -std=c++17 execution test
>>   g++.dg/init/init-ref2.C  -std=c++2a execution test
>>   g++.dg/init/init-ref2.C  -std=c++98 execution test
>>   g++.dg/init/ref15.C  -std=c++14 execution test
>>   g++.dg/init/ref15.C  -std=c++17 execution test
>>   g++.dg/init/ref15.C  -std=c++2a execution test
>>   g++.dg/init/ref15.C  -std=c++98 execution test
>>   g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++14 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++17 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++2a execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++98 execution test
>>
>>and probably some (280) in libstdc++ tests: (I didn't bisect those):
>>   19_diagnostics/error_category/generic_category.cc execution test
>>   19_diagnostics/error_category/system_category.cc execution test
>>   20_util/scoped_allocator/1.cc execution test
>>   20_util/scoped_allocator/2.cc execution test
>>   20_util/scoped_allocator/construct_pair_c++2a.cc execution test
>>   20_util/to_address/debug.cc execution test
>>   20_util/variant/run.cc execution test
>
>I think this is a latent bug in the static initialization code for
>EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
>have:
>
># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
>
>// Test the guard variable with a memory load with
>// acquire semantics.
>
>inline bool
>__test_and_acquire (__cxxabiv1::__guard *g)
>{
>  unsigned char __c;
>  unsigned char *__p = reinterpret_cast<unsigned char *>(g);
>  __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
>  (void) __p;
>  return _GLIBCXX_GUARD_TEST(&__c);
>}
>#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
># endif
>
>That inspects the first byte of the guard variable. But for EABI the
>"is initialized" bit is the least significant bit of the guard
>variable. For little endian that's fine, the least significant bit is
>in the first byte. But for big endian, it's not 
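
The endianness point can be sketched in isolation. The helpers below are hypothetical and are not the libstdc++ code; they assume a 32-bit guard word purely for illustration. They show that the byte at the lowest address holds the least significant bit only on a little-endian host, whereas testing the whole word is endian-independent.

```cpp
#include <cstdint>
#include <cstring>

// Returns the byte stored at the lowest address of a 32-bit word,
// which is what a load through an unsigned char pointer observes.
inline unsigned char first_byte(std::uint32_t word) {
  unsigned char b;
  std::memcpy(&b, &word, 1);
  return b;
}

// True when the host stores the least significant byte first.
inline bool host_is_little_endian() { return first_byte(1u) == 1; }

// Endian-independent test of the least significant bit: read the
// whole word and mask, instead of inspecting only the first byte.
inline bool lsb_set(std::uint32_t guard) { return (guard & 1u) != 0; }
```

On a big-endian host first_byte(1u) is 0 even though lsb_set(1u) is true, which matches the mismatch described above.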

Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]

2020-10-01 Thread Christophe Lyon via Gcc-patches
On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely  wrote:
>
> On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
> >On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
> >>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
> >> wrote:
> >>>
> >>>Glibc 2.32 adds a global variable that says whether the process is
> >>>single-threaded. We can use this to decide whether to elide atomic
> >>>operations, as a more precise and reliable indicator than
> >>>__gthread_active_p.
> >>>
> >>>This means that guard variables for statics and reference counting in
> >>>shared_ptr can use less expensive, non-atomic ops even in processes that
> >>>are linked to libpthread, as long as no threads have been created yet.
> >>>It also means that we switch to using atomics if libpthread gets loaded
> >>>later via dlopen (this still isn't supported in general, for other
> >>>reasons).
> >>>
> >>>We can't use __libc_single_threaded to replace __gthread_active_p
> >>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
> >>>then we would elide the pthread_mutex_lock in the code below, but not
> >>>the pthread_mutex_unlock:
> >>>
> >>>  std::mutex m;
> >>>  m.lock();// pthread_mutex_lock
> >>>  std::thread t([]{}); // __libc_single_threaded = false
> >>>  t.join();
> >>>  m.unlock();  // pthread_mutex_unlock
> >>>
> >>>We need the lock and unlock to use the same "is threading enabled"
> >>>predicate, and similarly for init/destroy pairs for mutexes and
> >>>condition variables, so that we don't try to release resources that were
> >>>never acquired.
> >>>
> >>>There are other places that could use __libc_single_threaded, such as
> >>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
> >>>they can be changed later.
> >>>
> >>>libstdc++-v3/ChangeLog:
> >>>
> >>>PR libstdc++/96817
> >>>* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
> >>>New function wrapping __libc_single_threaded if available.
> >>>(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
> >>>* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
> >>>(__cxa_guard_release): Likewise.
> >>>* testsuite/18_support/96817.cc: New test.
> >>>
> >>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
> >>
> >>Hi,
> >>
> >>This patch introduced regressions on armeb-linux-gnueabihf:
> >>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
> >>   g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
> >>   g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
> >>   g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
> >>   g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
> >>   g++.dg/init/init-ref2.C  -std=c++14 execution test
> >>   g++.dg/init/init-ref2.C  -std=c++17 execution test
> >>   g++.dg/init/init-ref2.C  -std=c++2a execution test
> >>   g++.dg/init/init-ref2.C  -std=c++98 execution test
> >>   g++.dg/init/ref15.C  -std=c++14 execution test
> >>   g++.dg/init/ref15.C  -std=c++17 execution test
> >>   g++.dg/init/ref15.C  -std=c++2a execution test
> >>   g++.dg/init/ref15.C  -std=c++98 execution test
> >>   g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
> >>   g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++14 execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++17 execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++2a execution test
> >>   g++.old-deja/g++.other/init19.C  -std=c++98 execution test
> >>
> >>and probably some (280) in libstdc++ tests: (I didn't bisect those):
> >>   19_diagnostics/error_category/generic_category.cc execution test
> >>   19_diagnostics/error_category/system_category.cc execution test
> >>   20_util/scoped_allocator/1.cc execution test
> >>   20_util/scoped_allocator/2.cc execution test
> >>   20_util/scoped_allocator/construct_pair_c++2a.cc execution test
> >>   20_util/to_address/debug.cc execution test
> >>   20_util/variant/run.cc execution test
> >
> >I think this is a latent bug in the static initialization code for
> >EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
> >have:
> >
> ># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
> >
> >// Test the guard variable with a memory load with
> >// acquire semantics.
> >
> >inline bool
> >__test_and_acquire (__cxxabiv1::__guard *g)
> >{
> >  unsigned char __c;
> >  unsigned char *__p = reinterpret_cast<unsigned char *>(g);
> >  __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
> >  (void) __p;
> >  return _GLIBCXX_GUARD_TEST(&__c);
> >}
> >#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
> ># endif
> >
> >That inspects the first byte of the guard variable. But for EABI the
> >"is initialized" bit is the least 
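
The dispatch idea from the quoted patch description can be sketched roughly as follows. This is a minimal model, not the libstdc++ implementation: g_single_threaded is a hypothetical stand-in for glibc's __libc_single_threaded, and exchange_and_add is a hypothetical name. The only real API used is the GCC builtin __atomic_fetch_add.

```cpp
// Hypothetical stand-in for glibc's __libc_single_threaded flag.
static bool g_single_threaded = true;

// Adds DELTA to *COUNTER and returns the previous value.  Uses a
// plain read-modify-write while the process is single-threaded and
// an atomic read-modify-write otherwise.
inline int exchange_and_add(int* counter, int delta) {
  if (g_single_threaded) {
    int old = *counter;
    *counter += delta;
    return old;
  }
  return __atomic_fetch_add(counter, delta, __ATOMIC_ACQ_REL);
}
```

Each call here is self-contained, so it is safe for the flag to change between calls. That is not true of paired operations such as lock and unlock, which is why the description keeps __gthread_active_p for std::mutex.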

Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-10-01 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 30, 2020 at 6:28 PM Jakub Jelinek  wrote:
>
> On Wed, Sep 30, 2020 at 06:06:31PM +0200, Florian Weimer wrote:
> > This is what I came up with.  It is not valid to set ix86_arch to
> > PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.
>
> Ok, LGTM, but would prefer Uros to have final voice.

OK from my side.

Thanks,
Uros.


Re: [PATCH] avoid modifying type in place (PR 97206)

2020-10-01 Thread Richard Biener via Gcc-patches
On Wed, Sep 30, 2020 at 9:20 PM Martin Sebor via Gcc-patches
 wrote:
>
> On 9/30/20 3:57 AM, Jakub Jelinek wrote:
> > On Tue, Sep 29, 2020 at 03:40:40PM -0600, Martin Sebor via Gcc-patches 
> > wrote:
> >> I will commit this patch later this week unless I hear concerns
> >> or suggestions for changes.
> >
> > That is not how the patch review process works.
>
> The review process hasn't been working well for me, but thankfully,
> the commit policy lets me make these types of "obvious" fixes on
> my own, without waiting for approval.

Guess it would help if you'd simply say

"I will commit this patch as obvious later this week."

which makes clear you view the change as obvious.

Richard.

>  But if I could get simple
> changes reviewed in a few days instead of having to ping them for
> weeks there would be no reason for me to take advantage of this
> latitude (and for us to rehash this topic yet again).
> >> +  arat = tree_cons (get_identifier ("array"), flag, NULL_TREE);
> >
> > Better
> > arat = build_tree_list (get_identifier ("array"), flag);
> > then, tree_cons is when you have a meaningful TREE_CHAIN you want to supply
> > too.
>
> Okay.  I checked to make sure they both do the same thing and
> create a tree with the size and committed the updated patch in
> r11-3570.
>
> Martin
>
> >>  }
> >>
> >> -  TYPE_ATOMIC (artype) = TYPE_ATOMIC (type);
> >> -  TYPE_READONLY (artype) = TYPE_READONLY (type);
> >> -  TYPE_RESTRICT (artype) = TYPE_RESTRICT (type);
> >> -  TYPE_VOLATILE (artype) = TYPE_VOLATILE (type);
> >> -  type = artype;
> >> +  const int quals = TYPE_QUALS (type);
> >> +  type = build_array_type (eltype, index_type);
> >> +  type = build_type_attribute_qual_variant (type, arat, quals);
> >>   }
> >>
> >> /* Format the type using the current pretty printer.  The generic tree
> >> @@ -2309,10 +2304,6 @@ attr_access::array_as_string (tree type) const
> >> typstr = pp_formatted_text (pp);
> >> delete pp;
> >>
> >> -  if (this->str)
> >> -/* Remove the attribute that wasn't installed by decl_attributes.  */
> >> -TYPE_ATTRIBUTES (type) = NULL_TREE;
> >> -
> >> return typstr;
> >>   }
> >
> > Otherwise LGTM.
> >
> >   Jakub
> >
>
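
The relationship between the two list builders discussed above can be modelled with a toy list node. This is only a sketch: ListNode and both function names are hypothetical, and the real GCC functions operate on tree nodes. It illustrates that a single-element list is the cons form with a null chain.

```cpp
// Toy stand-in for a TREE_LIST node: purpose, value, and chain.
struct ListNode {
  const char* purpose;
  int value;
  const ListNode* chain;
};

// Models tree_cons: prepend a (purpose, value) pair to CHAIN.
inline ListNode toy_tree_cons(const char* purpose, int value,
                              const ListNode* chain) {
  return ListNode{purpose, value, chain};
}

// Models build_tree_list: a one-element list, i.e. a null chain.
inline ListNode toy_build_tree_list(const char* purpose, int value) {
  return toy_tree_cons(purpose, value, nullptr);
}
```

With no existing chain to extend, the one-element form states the intent more directly.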


Re: [PATCH v2] builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-01 Thread Richard Biener
On Wed, 30 Sep 2020, Segher Boessenkool wrote:

> On Wed, Sep 30, 2020 at 09:02:34AM +0200, Richard Biener wrote:
> > On Tue, 29 Sep 2020, Segher Boessenkool wrote:
> > > I don't see much about optabs in the docs either.  Add some text to
> > > optabs.def itself then?
> > 
> > All optabs are documented in doc/md.texi as 'instruction patterns'
> 
> Except for what seems to be the majority that isn't.

Really?  Every time I looked for one I found it.

> > This is where new optabs need to be documented.
> 
> It's going to be challenging to find a reasonable spot in there.
> Oh well.

Put it next to fmin/fmax docs or sin, etc. - at least the section
should be clear ;)  But yeah, patterns seem to be quite randomly
"sorted"...

Richard.

> Thanks,
> 
> 
> Segher
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend