date:20201002

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Paul Eggert


On 10/2/20 4:44 PM, Alejandro Colomar wrote:


I know, they aren't perfect.
But they are still very useful,
and don't see a good reason to not document them.


"aren't perfect" is an understatement

More important, lots of things in GNU C are useful but shouldn't be documented 
in the man pages, because they're out of scope. (The syntax of GNU C strings, 
for example.) The man pages are not intended to be a guide to every feature of 
GNU C. There is the GNU C manual for that, and people can read that.

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Paul,

On 2020-10-02 22:19, Paul Eggert wrote:
> On 10/2/20 1:03 PM, Alejandro Colomar wrote:
>> I know it's not in stdint,
>> but I mean that it behaves as any other stdint type.

With caveats, of course.

>
> It doesn't. There's no portable way to use scanf and printf on it.

I didn't need to.  Yes that's a problem.
It may be possible to write a custom specifier for printf,
but I didn't try.  I wrote one for printing binary,
and it's not that difficult.

If you really need it, this might help:

https://github.com/alejandro-colomar/libalx/blob/d193b5648834c135824a5ba68d0ffcd2d38155a8/src/base/stdio/printf/b.c

> You can't reliably convert it to intmax_t.

Well, intmax_t isn't really that useful.
I see it more like a generic type, than an actual type.

I guess you could have

typedef __int128 intwidest_t;

if you find it's useful to you.

> It doesn't have the associated _MIN and _MAX macros
> that the stdint types do. It's a completely different animal.

Those are really easy to write.
For my use cases, they have been enough.
These might be useful to you:

#define UINT128_C(c) ((uint128_t)(c))
#define INT128_C(c)  (( int128_t)(c))
#define UINT128_MAX ((uint128_t)~UINT128_C(0))
#define INT128_MAX  (( int128_t)(UINT128_MAX >> 1))
#define INT128_MIN  (( int128_t)(-INT128_MAX - 1))

>
> If all you need are a few bit-twiddling tricks on x86-64, it should
> work. But watch out if you try to do something fancy, like multiply or
> divide or read or print or atomics. There's a good reason it's excluded
> from intmax_t.

I know, they aren't perfect.
But they are still very useful,
and don't see a good reason to not document them.

Cheers,

Alex
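
Since printf has no conversion specifier for __int128, one workaround is to format the value by hand. The following is a minimal C++ sketch, assuming GCC or Clang on a target that provides __int128. The names uint128, int128, to_decimal_u128 and to_decimal_i128 are illustrative only; they do not come from the thread or from libalx.

```cpp
#include <algorithm>
#include <string>

// Local aliases for the GNU extension types (not standard names).
using uint128 = unsigned __int128;
using int128 = __int128;

// Convert an unsigned 128-bit value to decimal text, one digit at a time.
std::string to_decimal_u128(uint128 value)
{
    if (value == 0)
        return "0";
    std::string digits;
    while (value != 0) {
        // Least significant digit first; the string is reversed afterwards.
        digits.push_back(static_cast<char>('0' + static_cast<int>(value % 10)));
        value /= 10;
    }
    std::reverse(digits.begin(), digits.end());
    return digits;
}

// Signed variant: negate in the unsigned domain so that the most negative
// value does not overflow.
std::string to_decimal_i128(int128 value)
{
    if (value >= 0)
        return to_decimal_u128(static_cast<uint128>(value));
    uint128 magnitude = static_cast<uint128>(0) - static_cast<uint128>(value);
    return "-" + to_decimal_u128(magnitude);
}
```

The division and modulo here are exactly the operations Paul warns about: on targets without native 128-bit division they are compiled to library calls, so this is a convenience sketch rather than a fast path.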

[r11-3633 Regression] FAIL: c-c++-common/spellcheck-reserved.c -std=gnu++98 (test for excess errors) on Linux/x86_64 (-m64 -march=cascadelake)

2020-10-02 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

7ee1c0413e251ff0b6a6d526209ef038b9835320 is the first bad commit
commit 7ee1c0413e251ff0b6a6d526209ef038b9835320
Author: Nathan Sidwell 
Date:   Fri Oct 2 11:13:26 2020 -0700

c++: Hash table iteration for namespace-member spelling suggestions

caused

FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++14  (test for errors, line 31)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++14 (test for excess errors)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++17  (test for errors, line 31)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++17 (test for excess errors)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++2a  (test for errors, line 31)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++2a (test for excess errors)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++98  (test for errors, line 31)
FAIL: c-c++-common/spellcheck-reserved.c  -std=gnu++98 (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3633/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=c-c++-common/spellcheck-reserved.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)

Re: [patch] Rework CPP_BUILTINS_SPEC for powerpc-vxworks

2020-10-02 Thread Segher Boessenkool

Hi Olivier,

On Thu, Oct 01, 2020 at 11:30:55AM +0200, Olivier Hainque wrote:
> This change reworks CPP_BUILTINS_SPEC for powerpc-vxworks to
> prepare for the upcoming addition of 32 and 64 bit ports for
> VxWorks 7r2.

Cool, looking forward to it!

Your attachment is not quotable (it is application/octet-stream), so
I'll paste it in here, hopefully correct:

--- a/gcc/config/rs6000/vxworks.h
+++ b/gcc/config/rs6000/vxworks.h
@@ -26,21 +26,56 @@ along with GCC; see the file COPYING3.  If not see
 /* CPP predefined macros.  */

 #undef TARGET_OS_CPP_BUILTINS
-#define TARGET_OS_CPP_BUILTINS()   \
-  do   \
-{  \
-  builtin_define ("__ppc");\
-  builtin_define ("__PPC__");  \
-  builtin_define ("__EABI__"); \
-  builtin_define ("__ELF__");  \
-  if (!TARGET_SOFT_FLOAT)  \
-   builtin_define ("__hardfp");\
+#define TARGET_OS_CPP_BUILTINS()\

Hrm, you changed a lot of white space, was that on purpose?

+  do\
+{   \
+  /* CPU macros.  */   \
+  builtin_define ("__ppc"); \
+  builtin_define ("__ppc__");   \
+  builtin_define ("__PPC"); \
+  builtin_define ("__PPC__");   \
+  builtin_define ("__powerpc"); \
+  builtin_define ("__powerpc__");   \
+  if (TARGET_64BIT) \
+{   \
+  builtin_define ("__ppc64");   \
+  builtin_define ("__ppc64__"); \
+  builtin_define ("__PPC64");  \
+  builtin_define ("__PPC64__"); \
+  builtin_define ("__powerpc64");  \
+  builtin_define ("__powerpc64__"); \
+   }   \

Are all those new names actually defined by your ABIs?  If not, this is
counter-productive: it does not help anyone if there are six ways to
write things, where not all ways are supported by all compilers!
(Including older versions of the same compilers.)

-  /* C89 namespace violation! */   \
-  builtin_define ("CPU_FAMILY=PPC");   \

+  builtin_define ("CPU_FAMILY=PPC");   \

You removed the comment, but it is rather important still?  Of course
the "C89" part of it is dated, but it is true for all newer language
standards just the same.

Cheers,

Segher

Re: [PATCH] libgccjit: add some reflection functions in the jit C api [PR96889]

2020-10-02 Thread Antoni Boucher via Gcc-patches


Hi.
Thanks for the review.
I attached the updated patch file.
I don't have a copyright assignment, but I'll look at that.

On Fri, Oct 02, 2020 at 04:24:26PM -0400, David Malcolm wrote:

On Fri, 2020-10-02 at 16:17 -0400, David Malcolm wrote:

On Tue, 2020-09-01 at 21:01 -0400, Antoni Boucher via Jit wrote:
> Hello.
> This WIP patch implements new reflection functions in the C API as
> mentioned in bug 96889.
> I'm looking forward to feedback on this patch.
> It's WIP because I'll probably add a few more reflection functions.
> Thanks.

Sorry about the belated review, looks like I missed this one.

At a high level, it seems reasonable.

Do you have a copyright assignment in place for GCC contributions?
See https://gcc.gnu.org/contribute.html

[...]


> diff --git a/gcc/jit/docs/topics/compatibility.rst
> b/gcc/jit/docs/topics/compatibility.rst
> index bb3387fa583..7e786194ded 100644
> --- a/gcc/jit/docs/topics/compatibility.rst
> +++ b/gcc/jit/docs/topics/compatibility.rst
> @@ -219,3 +219,14 @@ entrypoints:
>* :func:`gcc_jit_version_minor`
>
>* :func:`gcc_jit_version_patchlevel`
> +
> +.. _LIBGCCJIT_ABI_14:
> +
> +``LIBGCCJIT_ABI_14``
> +
> +``LIBGCCJIT_ABI_14`` covers the addition of reflection functions via API
> +entrypoints:
> +
> +  * :func:`gcc_jit_function_get_return_type`
> +
> +  * :func:`gcc_jit_function_get_param_count`

This will now need bumping to 15; 14 covers the addition of
gcc_jit_global_set_initializer.

[...]

> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::function::get_return_type method, in
> +   jit-recording.h.  */
> +
> +gcc_jit_type *
> +gcc_jit_function_get_return_type (gcc_jit_function *func)
> +{

This one is missing a:
  RETURN_NULL_IF_FAIL (func, NULL, NULL, "NULL function");


> +return (gcc_jit_type *)func->get_return_type ();
> +}

[...]

> diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> index 1c5a12e9c01..6999ce25ca2 100644
> --- a/gcc/jit/libgccjit.h
> +++ b/gcc/jit/libgccjit.h

[...]

> @@ -1503,6 +1511,22 @@ gcc_jit_version_minor (void);
>  extern int
>  gcc_jit_version_patchlevel (void);
>
> +#define LIBGCCJIT_HAVE_gcc_jit_function_reflection
> +
> +/* Reflection functions to get the number of parameters and return types of
> +   a function from the C API.

"return type", is better grammar, I think, given that "function" is
singular.

> +
> +   "vec_type" should be a vector type, created using gcc_jit_type_get_vector.

This line about "vec_type" seems to be a leftover from a copy


> +   This API entrypoint was added in LIBGCCJIT_ABI_14; you can test for its
> +   presence using
> + #ifdef LIBGCCJIT_HAVE_gcc_jit_function_reflection

Version number will need bumping, as mentioned above.

[...]

> diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
> index 6137dd4b4b0..b28f81a7a32 100644
> --- a/gcc/jit/libgccjit.map
> +++ b/gcc/jit/libgccjit.map
> @@ -186,4 +186,10 @@ LIBGCCJIT_ABI_13 {
>  gcc_jit_version_major;
>  gcc_jit_version_minor;
>  gcc_jit_version_patchlevel;
> -} LIBGCCJIT_ABI_12;
> \ No newline at end of file
> +} LIBGCCJIT_ABI_12;
> +
> +LIBGCCJIT_ABI_14 {
> +  global:
> +gcc_jit_function_get_return_type;
> +gcc_jit_function_get_param_count;
> +} LIBGCCJIT_ABI_13;

Likewise.

[...]

Otherwise looks good to me.

Bonus points for adding C++ bindings (and docs for them), but I don't
know of anyone using the C++ bindings.



Also, please add "PR jit/96889" to the ChangeLog entries, and [PR96889]
to the subject line.

Dave

From ef3b7d15622cc50dc4cae62fb7c31aeacc0f1ed9 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Sat, 1 Aug 2020 17:52:17 -0400
Subject: [PATCH] This patch add some reflection functions in the jit C api
 [PR96889]

2020-09-1  Antoni Boucher  

gcc/jit/
PR target/96889
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.
* docs/topics/functions.rst: Add documentation for the
functions gcc_jit_function_get_return_type and
gcc_jit_function_get_param_count
* libgccjit.c (gcc_jit_function_get_param_count and
gcc_jit_function_get_return_type): New functions.
* libgccjit.h
* libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.

gcc/testsuite/
PR target/96889
* jit.dg/all-non-failing-tests.h: Add test-reflection.c.
* jit.dg/test-reflection.c: New test.
---
 gcc/jit/docs/topics/compatibility.rst| 11 
 gcc/jit/docs/topics/functions.rst| 10 +++
 gcc/jit/libgccjit.c  | 29 
 gcc/jit/libgccjit.h  | 22 +++
 gcc/jit/libgccjit.map|  6 
 gcc/testsuite/jit.dg/all-non-failing-tests.h | 10 +++
 gcc/testsuite/jit.dg/test-reflection.c   | 27 ++
 7 files changed, 115

Re: [PATCH] libstdc++: Diagnose visitors with different return types [PR95904]

2020-10-02 Thread Jonathan Wakely via Gcc-patches


On 29/09/20 19:35 +0300, Ville Voutilainen via Libstdc++ wrote:

On Tue, 29 Sep 2020 at 14:20, Jonathan Wakely  wrote:

I think this is what we want:

   template<typename _Tp, typename... _Types>
     constexpr inline bool __same_types = (is_same_v<_Tp, _Types> && ...);

is_same_v is very cheap, it uses the built-in directly, so you don't
need to instantiate any class templates at all.

>+
>+  template 

typename not class please.

>+decltype(auto) __check_visitor_result(_Visitor&& __vis,

New line after the decltype(auto) please, not in the middle of the
parameter list.


Aye.



diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index dd8847cf829..6f647d622c4 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -182,7 +182,7 @@ namespace __variant
  // used for raw visitation with indices passed in
  struct __variant_idx_cookie { using type = __variant_idx_cookie; };
  // Used to enable deduction (and same-type checking) for std::visit:
-  template<typename _Tp> struct __deduce_visit_result { };
+  template<typename _Tp> struct __deduce_visit_result { using type = _Tp; };

  // Visit variants that might be valueless.
  template
@@ -1017,7 +1017,22 @@ namespace __variant

  static constexpr auto
  _S_apply()
-  { return _Array_type{&__visit_invoke}; }
+  {
+   constexpr bool __visit_ret_type_mismatch =
+ _Array_type::__result_is_deduced::value
+ && !is_same_v(),
+   std::declval<_Variants>()...))>;
+   if constexpr (__visit_ret_type_mismatch)
+ {
+   static_assert(!__visit_ret_type_mismatch,
+ "std::visit requires the visitor to have the same "
+ "return type for all alternatives of a variant");
+   return __nonesuch{};
+ }
+   else
+ return _Array_type{&__visit_invoke};
+  }
};

  template
@@ -1692,6 +1707,26 @@ namespace __variant
   std::forward<_Variants>(__variants)...);
}

+  template<typename _Tp, typename... _Types>
+    constexpr inline bool __same_types = (is_same_v<_Tp, _Types> && ...);
+
+  template 
+decltype(auto)
+__check_visitor_result(_Visitor&& __vis, _Variant&& __variant)
+{
+  return std::forward<_Visitor>(__vis)(
+std::get<_Idx>(std::forward<_Variant>(__variant)));


Looks good, the new error is nice.

git apply warns about some whitespace errors:

/dev/shm/pr95904.diff:51: indent with spaces.
std::get<_Idx>(std::forward<_Variant>(__variant)));
/dev/shm/pr95904.diff:73: indent with spaces.
{
/dev/shm/pr95904.diff:77: indent with spaces.
std::variant_size...>::value>());
/dev/shm/pr95904.diff:92: indent with spaces.
  std::forward<_Visitor>(__visitor),
warning: 4 lines add whitespace errors.

OK for trunk with those leading spaces switched to tab.

Thanks!
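
The same-return-type check discussed in this thread can be illustrated outside libstdc++. The sketch below is not the library's implementation: same_types, visit_results_agree, length_visitor and mixed_visitor are hypothetical names, and it assumes a single variant visited as an lvalue, so each alternative is passed to the visitor as an lvalue reference.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <variant>

// True when every type in Rest is the same as First (fold over is_same_v).
template <typename First, typename... Rest>
inline constexpr bool same_types = (std::is_same_v<First, Rest> && ...);

// Primary template left undefined; only std::variant is handled.
template <typename Visitor, typename Variant>
struct visit_results_agree;

// Compare the visitor's result type for each alternative of the variant.
template <typename Visitor, typename First, typename... Rest>
struct visit_results_agree<Visitor, std::variant<First, Rest...>>
    : std::bool_constant<same_types<std::invoke_result_t<Visitor, First&>,
                                    std::invoke_result_t<Visitor, Rest&>...>> {};

// Returns std::size_t for every alternative: acceptable to std::visit.
struct length_visitor {
    std::size_t operator()(int&) const { return 1; }
    std::size_t operator()(std::string& s) const { return s.size(); }
};

// Returns int for one alternative and double for the other: the case the
// new static_assert is meant to diagnose.
struct mixed_visitor {
    int operator()(int& i) const { return i; }
    double operator()(std::string&) const { return 0.0; }
};
```

As noted in the review, is_same_v maps onto a compiler built-in in libstdc++, so a fold like this does not instantiate a class template per comparison.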

Re: This is my patch for fstream to fix the performance issue on Windows.

2020-10-02 Thread Jonathan Wakely via Gcc-patches


On 01/10/20 03:29 +, sotrdg sotrdg via Libstdc++ wrote:

From fb8d644a4c315058af141a3e84fcc083d665c8b9 Mon Sep 17 00:00:00 2001
From: ejsvifq_mabmip 
Date: Wed, 30 Sep 2020 23:26:47 -0400
Subject: [PATCH] Fix a long term performance issue of fstream on Windows since
MSVCRT defines BUFSIZ as 512 which causes the serious downgrade of I/O
performance.

Even stdio itself is using 4096 as real buffer size, the behavior should be the 
same as FILE* on Windows.


The attached patch seems a cleaner approach. Does it solve your
performance issues?

commit dd71d4081e34f8c95149c561456140ae59ea10ef
Author: Jonathan Wakely 
Date:   Fri Oct 2 22:54:50 2020

libstdc++: Override BUFSIZ for Windows targets [PR 94268]

This replaces uses of BUFSIZ with a new _GLIBCXX_BUFSIZ macro that can
be overridden in target-specific config headers.

That allows the mingw and mingw-w64 targets to override it, because
BUFSIZ is apparently defined to 512, resulting in poor performance. The
MSVCRT stdio apparently uses 4096, so we use that too.

libstdc++-v3/ChangeLog:

PR libstdc++/94268
* config/os/mingw32-w64/os_defines.h (_GLIBCXX_BUFSIZ):
Define.
* config/os/mingw32/os_defines.h (_GLIBCXX_BUFSIZ):
Define.
* include/bits/fstream.tcc: Use _GLIBCXX_BUFSIZ instead
of BUFSIZ.
* include/ext/stdio_filebuf.h: Likewise.
* include/std/fstream (_GLIBCXX_BUFSIZ): Define.

diff --git a/libstdc++-v3/config/os/mingw32-w64/os_defines.h b/libstdc++-v3/config/os/mingw32-w64/os_defines.h
index e535f6c2b85..39bdedd19e9 100644
--- a/libstdc++-v3/config/os/mingw32-w64/os_defines.h
+++ b/libstdc++-v3/config/os/mingw32-w64/os_defines.h
@@ -90,4 +90,7 @@
 
 #define _GLIBCXX_USE_CRT_RAND_S 1
 
+// See libstdc++/94268
+#define _GLIBCXX_BUFSIZ 4096
+
 #endif
diff --git a/libstdc++-v3/config/os/mingw32/os_defines.h b/libstdc++-v3/config/os/mingw32/os_defines.h
index 1fee89c49f5..9d2f2bda660 100644
--- a/libstdc++-v3/config/os/mingw32/os_defines.h
+++ b/libstdc++-v3/config/os/mingw32/os_defines.h
@@ -78,4 +78,7 @@
 // See libstdc++/59807
 #define _GTHREAD_USE_MUTEX_INIT_FUNC 1
 
+// See libstdc++/94268
+#define _GLIBCXX_BUFSIZ 4096
+
 #endif
diff --git a/libstdc++-v3/include/bits/fstream.tcc b/libstdc++-v3/include/bits/fstream.tcc
index 81d00c4d318..a4ebbb84fe7 100644
--- a/libstdc++-v3/include/bits/fstream.tcc
+++ b/libstdc++-v3/include/bits/fstream.tcc
@@ -80,7 +80,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 basic_filebuf<_CharT, _Traits>::
 basic_filebuf() : __streambuf_type(), _M_lock(), _M_file(&_M_lock),
 _M_mode(ios_base::openmode(0)), _M_state_beg(), _M_state_cur(),
-_M_state_last(), _M_buf(0), _M_buf_size(BUFSIZ),
+_M_state_last(), _M_buf(0), _M_buf_size(_GLIBCXX_BUFSIZ),
 _M_buf_allocated(false), _M_reading(false), _M_writing(false), _M_pback(), 
 _M_pback_cur_save(0), _M_pback_end_save(0), _M_pback_init(false),
 _M_codecvt(0), _M_ext_buf(0), _M_ext_buf_size(0), _M_ext_next(0),
diff --git a/libstdc++-v3/include/ext/stdio_filebuf.h b/libstdc++-v3/include/ext/stdio_filebuf.h
index fb95bec7350..3b297285ad3 100644
--- a/libstdc++-v3/include/ext/stdio_filebuf.h
+++ b/libstdc++-v3/include/ext/stdio_filebuf.h
@@ -75,7 +75,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  closed when the stdio_filebuf is closed/destroyed.
   */
   stdio_filebuf(int __fd, std::ios_base::openmode __mode,
-		size_t __size = static_cast<size_t>(BUFSIZ));
+		size_t __size = static_cast<size_t>(_GLIBCXX_BUFSIZ));
 
   /**
*  @param  __f  An open @c FILE*.
@@ -88,7 +88,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  stdio_filebuf is closed/destroyed.
   */
   stdio_filebuf(std::__c_file* __f, std::ios_base::openmode __mode,
-		size_t __size = static_cast<size_t>(BUFSIZ));
+		size_t __size = static_cast<size_t>(_GLIBCXX_BUFSIZ));
 
   /**
*  Closes the external data stream if the file descriptor constructor
diff --git a/libstdc++-v3/include/std/fstream b/libstdc++-v3/include/std/fstream
index efc99d1e5a5..c00f9d03895 100644
--- a/libstdc++-v3/include/std/fstream
+++ b/libstdc++-v3/include/std/fstream
@@ -44,6 +44,11 @@
 #include <string>  // For std::string overloads.
 #endif
 
+// This can be overridden by the target's os_defines.h
+#ifndef _GLIBCXX_BUFSIZ
+# define _GLIBCXX_BUFSIZ BUFSIZ
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
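
The override mechanism in this patch is the usual "define unless already defined" idiom. A minimal sketch of the same pattern outside libstdc++ follows; SKETCH_BUFSIZ and default_buffer_size are hypothetical names standing in for _GLIBCXX_BUFSIZ and the filebuf default.

```cpp
#include <cstddef>
#include <cstdio>

// A target-specific header would define SKETCH_BUFSIZ before this point,
// for example "#define SKETCH_BUFSIZ 4096" (hypothetical name and header).
#ifndef SKETCH_BUFSIZ
# define SKETCH_BUFSIZ BUFSIZ
#endif

// Default buffer size that a stream buffer would pick up from the macro.
constexpr std::size_t default_buffer_size()
{
    return static_cast<std::size_t>(SKETCH_BUFSIZ);
}
```

With no override in place the function simply reports the C library's BUFSIZ, which is the behaviour every target other than mingw keeps under the patch.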

[committed] libstdc++: Change test to work without 64-bit atomics

2020-10-02 Thread Jonathan Wakely via Gcc-patches

This fixes a linker error for older ARM cores without 64-bit atomics.

I think the { dg-add-options libatomic } is no longer needed, but it's
harmless to keep it there.

libstdc++-v3/ChangeLog:

* testsuite/29_atomics/atomic_float/value_init.cc: Use float
instead of double so that __atomic_load_8 isn't needed.

Tested powerpc64le-linux and armv7l-linux-gnueabihf.
Committed to trunk.

commit 324118378e4e26d9c0f86734af26538491c5c5fc
Author: Jonathan Wakely 
Date:   Fri Oct 2 22:14:06 2020

libstdc++: Change test to work without 64-bit atomics

This fixes a linker error for older ARM cores without 64-bit atomics.

I think the { dg-add-options libatomic } is no longer needed, but it's
harmless to keep it there.

libstdc++-v3/ChangeLog:

* testsuite/29_atomics/atomic_float/value_init.cc: Use float
instead of double so that __atomic_load_8 isn't needed.

diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/value_init.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_float/value_init.cc
index 38af9bdc8d4..dd8114d6034 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_float/value_init.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/value_init.cc
@@ -22,13 +22,13 @@
 #include 
 #include 
 
-constexpr std::atomic<double> a;
+constexpr std::atomic<float> a;
 
 void
 test01()
 {
   VERIFY(a.load() == 0);
-  static_assert(std::is_nothrow_default_constructible_v<std::atomic<double>>);
+  static_assert(std::is_nothrow_default_constructible_v<std::atomic<float>>);
 }
 
 int

Re: [committed] libstdc++: Use __libc_single_threaded to optimise atomics [PR 96817]

2020-10-02 Thread Jonathan Wakely via Gcc-patches


On 26/09/20 20:42 +0100, Jonathan Wakely wrote:

Glibc 2.32 adds a global variable that says whether the process is
single-threaded. We can use this to decide whether to elide atomic
operations, as a more precise and reliable indicator than
__gthread_active_p.

This means that guard variables for statics and reference counting in
shared_ptr can use less expensive, non-atomic ops even in processes that
are linked to libpthread, as long as no threads have been created yet.
It also means that we switch to using atomics if libpthread gets loaded
later via dlopen (this still isn't supported in general, for other
reasons).

We can't use __libc_single_threaded to replace __gthread_active_p
everywhere. If we replaced the uses of __gthread_active_p in std::mutex
then we would elide the pthread_mutex_lock in the code below, but not
the pthread_mutex_unlock:

 std::mutex m;
 m.lock();// pthread_mutex_lock
 std::thread t([]{}); // __libc_single_threaded = false
 t.join();
 m.unlock();  // pthread_mutex_unlock

We need the lock and unlock to use the same "is threading enabled"
predicate, and similarly for init/destroy pairs for mutexes and
condition variables, so that we don't try to release resources that were
never acquired.

There are other places that could use __libc_single_threaded, such as
_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
they can be changed later.

libstdc++-v3/ChangeLog:

PR libstdc++/96817
* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
New function wrapping __libc_single_threaded if available.
(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
(__cxa_guard_release): Likewise.
* testsuite/18_support/96817.cc: New test.
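
The lock/unlock mismatch described in this commit message can be modelled without pthreads. This is only a sketch: sketch_single_threaded stands in for __libc_single_threaded, and counting_mutex, naive_lock, naive_unlock and run_naive_sequence are hypothetical names that count calls instead of locking anything.

```cpp
// Stand-in for __libc_single_threaded; a plain flag keeps the sketch portable.
bool sketch_single_threaded = true;

// Counts real lock/unlock calls instead of touching a pthread mutex.
struct counting_mutex {
    int locks = 0;
    int unlocks = 0;
};

// Naive elision: consult the flag at every call (the problematic variant).
void naive_lock(counting_mutex& m)
{
    if (!sketch_single_threaded)
        ++m.locks;
}

void naive_unlock(counting_mutex& m)
{
    if (!sketch_single_threaded)
        ++m.unlocks;
}

// Mirrors the sequence in the commit message: lock, start a thread, unlock.
counting_mutex run_naive_sequence()
{
    counting_mutex m;
    sketch_single_threaded = true;
    naive_lock(m);                   // elided: still single-threaded
    sketch_single_threaded = false;  // what creating a std::thread would do
    naive_unlock(m);                 // not elided: unlock without a lock
    return m;
}
```

After run_naive_sequence() the counter shows one unlock and no lock, which is why std::mutex keeps using a predicate that cannot change between the two calls.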



The new test was broken, fixed with this.

Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.


commit 1ad08b64cea51d3cb989a1a176baeb8a18071231
Author: Jonathan Wakely 
Date:   Fri Oct 2 21:10:55 2020

libstdc++: Fix testcase by using terminate handler

This test was supposed to verify that when __libc_single_threaded is
available we successfully detect recursive static initialization even
when linked to libpthread. But I forgot that when recursive init is
detected, we terminate, and so the test fails.

This adds a terminate handler that exits cleanly, so the test passes
when recursive init is detected.

libstdc++-v3/ChangeLog:

* testsuite/18_support/96817.cc: Use terminate handler that
calls _Exit(0).

diff --git a/libstdc++-v3/testsuite/18_support/96817.cc b/libstdc++-v3/testsuite/18_support/96817.cc
index 4c4da40afa9..19399c473ef 100644
--- a/libstdc++-v3/testsuite/18_support/96817.cc
+++ b/libstdc++-v3/testsuite/18_support/96817.cc
@@ -21,6 +21,9 @@
 
 // PR libstdc++/96817
 
+#include 
+#include 
+
 int init()
 {
 #if __has_include()
@@ -32,8 +35,11 @@ int init()
   return 0;
 }
 
+void clean_terminate() { _Exit(0); }
+
 int
 main (int argc, char **argv)
 {
+  std::set_terminate(clean_terminate);
   init();
 }

Re: [PATCH] c++: Fix printing of C++20 template parameter object [PR97014]

2020-10-02 Thread Jason Merrill via Gcc-patches


On 10/1/20 5:49 PM, Marek Polacek wrote:

No one is interested in the mangled name of the C++20 template parameter
object for a class NTTP.  So instead of printing

   required for the satisfaction of ‘positive’ [with T = 
X<::_ZTAXtl5ratioLin1ELi2EEE>]

let's print

   required for the satisfaction of ‘positive’ [with T = X<{-1, 2}>]

I don't think adding a test is necessary for this.


OK.


gcc/cp/ChangeLog:

PR c++/97014
* cxx-pretty-print.c (pp_cxx_template_argument_list): If the
argument is template_parm_object_p, print its DECL_INITIAL.
---
  gcc/cp/cxx-pretty-print.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index d10c18db039..8bea79b93a2 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -1910,6 +1910,8 @@ pp_cxx_template_argument_list (cxx_pretty_printer *pp, tree t)
  if (TYPE_P (arg) || (TREE_CODE (arg) == TEMPLATE_DECL
   && TYPE_P (DECL_TEMPLATE_RESULT (arg))))
pp->type_id (arg);
+ else if (template_parm_object_p (arg))
+   pp->expression (DECL_INITIAL (arg));
  else
pp->expression (arg);
}

base-commit: dfaa24c974bab4bc1bd3840d67ca1701acc0010c

[PATCH] PR fortran/97272 - Wrong answer from MAXLOC with character arg

2020-10-02 Thread Harald Anlauf

The generation of the library call for the MINLOC/MAXLOC intrinsic
mishandled the optional KIND argument and resulted in a bad
argument list passed to the library function.  The fix is obvious.

Regtested on x86_64-pc-linux-gnu.

OK for master?  As it is technically wrong code, OK for backports?

Thanks,
Harald


PR fortran/97272 - Wrong answer from MAXLOC with character arg

The optional KIND argument to the MINLOC/MAXLOC intrinsic must not be
passed to the library function, as the kind conversion of the result
is treated explicitly elsewhere.

gcc/fortran/ChangeLog:

PR fortran/97272
* trans-intrinsic.c (gfc_conv_intrinsic_minmaxloc): Ignore KIND
argument here, as it is treated elsewhere.

gcc/testsuite/ChangeLog:

PR fortran/97272
* gfortran.dg/pr97272.f90: New test.

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 3b3bd8629cd..9e9898c2bbf 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -5211,7 +5211,9 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op)
   while (a->next)
 	{
 	  b = a->next;
-	  if (b->expr == NULL || strcmp (b->name, "dim") == 0)
+	  if (b->expr == NULL
+	  || strcmp (b->name, "dim") == 0
+	  || strcmp (b->name, "kind") == 0)
 	{
 	  a->next = b->next;
 	  b->next = NULL;
diff --git a/gcc/testsuite/gfortran.dg/pr97272.f90 b/gcc/testsuite/gfortran.dg/pr97272.f90
new file mode 100644
index 000..e81903860ea
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr97272.f90
@@ -0,0 +1,19 @@
+! { dg-do run }
+! PR fortran/97272 - Wrong answer from MAXLOC with character arg
+
+program test
+  implicit none
+  integer :: i, j, k, l = 10
+  character, allocatable :: a(:)
+  allocate (a(l))
+  a(:) = 'a'
+  l = l - 1
+  a(l) = 'b'
+  i = maxloc (a, dim=1)
+  j = maxloc (a, dim=1, kind=2)
+  k = maxloc (a, dim=1, kind=8, back=.true.)
+! print *, 'i = ', i, 'a(i) = ', a(i)
+! print *, 'j = ', j, 'a(j) = ', a(j)
+! print *, 'k = ', k, 'a(k) = ', a(k)
+  if (i /= l .or. j /= l .or. k /= l) stop 1
+end
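
The hunk in trans-intrinsic.c only shows the inside of the loop. A self-contained sketch of the same unlinking idea is given below, assuming a simplified argument list; sketch_arg and strip_optional_args are hypothetical names, not gfortran's data structures, and the real code may also free the removed node.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-in for an actual-argument list node.
struct sketch_arg {
    const char* name;   // keyword name such as "dim" or "kind"
    const void* expr;   // null when the optional argument is absent
    sketch_arg* next;
};

// Unlink every argument after `head` that is absent or named "dim" or
// "kind", so that only the remaining arguments reach the library call.
void strip_optional_args(sketch_arg* head)
{
    assert(head != nullptr);
    sketch_arg* a = head;
    while (a->next) {
        sketch_arg* b = a->next;
        if (b->expr == nullptr
            || std::strcmp(b->name, "dim") == 0
            || std::strcmp(b->name, "kind") == 0) {
            a->next = b->next;  // skip over b
            b->next = nullptr;  // detach b from the list
        } else {
            a = b;              // keep b and move on
        }
    }
}
```

For a call such as maxloc (a, dim=1, kind=8, back=.true.), this leaves the array followed directly by BACK, which matches the intent stated in the commit message that KIND is handled elsewhere.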

Re: [PATCH v4 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches


Hi Paul,

On 2020-10-02 22:14, Paul Eggert wrote:
> On 10/2/20 11:38 AM, Alejandro Colomar wrote:
>
>> .I void *
>>
>> renders with a space in between.
>
> That's odd, as "man(7)" says "All of the arguments will be printed next
> to each other without intervening spaces". I'd play it safe and quote
> the arg anyway.

Oops, that's a bug in man(7).
Don't worry about it.

Michael, you might want to have a look at it.

I'll also add Branden, who might have something to say about it.

>
>>  > %p works with any object pointer type (or in POSIX, any pointer type),
>>  > not just  void *.
>> In theory, no (if otherwise, I'd like to know why):
>
> Oh, you're right. I had missed that. In GNU/Linux hosts, though, any
> pointer (including function pointers) can be given to %p.
>
> The only platforms where %p wouldn't work on all pointers would be
> platforms like IBM i, which has both 64-bit (process local) pointers and
> 128-bit (tagged space) pointers and where you can declare and use
> pointers of different widths in the same program.

:-)

Cheers,

Alex
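
For code that wants to stay within what ISO C guarantees, the usual approach is to convert an object pointer to void * before handing it to %p. The following is a minimal sketch; format_pointer is a hypothetical helper, and the textual form of the result is implementation-defined, so no particular output is shown.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Format an object pointer with %p. Any object pointer converts implicitly
// to const void *, which is then passed to snprintf as the void * that the
// conversion specifier expects.
std::string format_pointer(const void* p)
{
    char buf[64];
    int n = std::snprintf(buf, sizeof buf, "%p", const_cast<void*>(p));
    if (n <= 0 || static_cast<std::size_t>(n) >= sizeof buf)
        return std::string();  // formatting failed or was truncated
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Function pointers are deliberately left out: as discussed above, passing them to %p works on GNU/Linux but is not something the C standard promises.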

Re: [PATCH] libgccjit: add some reflection functions in the jit C api

2020-10-02 Thread David Malcolm via Gcc-patches

On Fri, 2020-10-02 at 16:17 -0400, David Malcolm wrote:
> On Tue, 2020-09-01 at 21:01 -0400, Antoni Boucher via Jit wrote:
> > Hello.
> > This WIP patch implements new reflection functions in the C API as 
> > mentioned in bug 96889.
> > I'm looking forward to feedback on this patch.
> > It's WIP because I'll probably add a few more reflection functions.
> > Thanks.
> 
> Sorry about the belated review, looks like I missed this one.
> 
> At a high level, it seems reasonable.
> 
> Do you have a copyright assignment in place for GCC contributions?
> See https://gcc.gnu.org/contribute.html
> 
> [...]

> > diff --git a/gcc/jit/docs/topics/compatibility.rst
> > b/gcc/jit/docs/topics/compatibility.rst
> > index bb3387fa583..7e786194ded 100644
> > --- a/gcc/jit/docs/topics/compatibility.rst
> > +++ b/gcc/jit/docs/topics/compatibility.rst
> > @@ -219,3 +219,14 @@ entrypoints:
> >* :func:`gcc_jit_version_minor`
> >  
> >* :func:`gcc_jit_version_patchlevel`
> > +
> > +.. _LIBGCCJIT_ABI_14:
> > +
> > +``LIBGCCJIT_ABI_14``
> > +
> > +``LIBGCCJIT_ABI_14`` covers the addition of reflection functions via API
> > +entrypoints:
> > +
> > +  * :func:`gcc_jit_function_get_return_type`
> > +
> > +  * :func:`gcc_jit_function_get_param_count`
> 
> This will now need bumping to 15; 14 covers the addition of
> gcc_jit_global_set_initializer.
> 
> [...]
> 
> > +/* Public entrypoint.  See description in libgccjit.h.
> > +
> > +   After error-checking, the real work is done by the
> > +   gcc::jit::recording::function::get_return_type method, in
> > +   jit-recording.h.  */
> > +
> > +gcc_jit_type *
> > +gcc_jit_function_get_return_type (gcc_jit_function *func)
> > +{
> 
> This one is missing a:
>   RETURN_NULL_IF_FAIL (func, NULL, NULL, "NULL function");
> 
> 
> > +return (gcc_jit_type *)func->get_return_type ();
> > +}
> 
> [...]
> 
> > diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> > index 1c5a12e9c01..6999ce25ca2 100644
> > --- a/gcc/jit/libgccjit.h
> > +++ b/gcc/jit/libgccjit.h
> 
> [...]
> 
> > @@ -1503,6 +1511,22 @@ gcc_jit_version_minor (void);
> >  extern int
> >  gcc_jit_version_patchlevel (void);
> >  
> > +#define LIBGCCJIT_HAVE_gcc_jit_function_reflection
> > +
> > +/* Reflection functions to get the number of parameters and return types of
> > +   a function from the C API.
> 
> "return type", is better grammar, I think, given that "function" is
> singular.
> 
> > +
> > +   "vec_type" should be a vector type, created using gcc_jit_type_get_vector.
> 
> This line about "vec_type" seems to be a leftover from a copy
> 
> 
> > +   This API entrypoint was added in LIBGCCJIT_ABI_14; you can test for its
> > +   presence using
> > + #ifdef LIBGCCJIT_HAVE_gcc_jit_function_reflection
> 
> Version number will need bumping, as mentioned above.
> 
> [...]
> 
> > diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
> > index 6137dd4b4b0..b28f81a7a32 100644
> > --- a/gcc/jit/libgccjit.map
> > +++ b/gcc/jit/libgccjit.map
> > @@ -186,4 +186,10 @@ LIBGCCJIT_ABI_13 {
> >  gcc_jit_version_major;
> >  gcc_jit_version_minor;
> >  gcc_jit_version_patchlevel;
> > -} LIBGCCJIT_ABI_12;
> > \ No newline at end of file
> > +} LIBGCCJIT_ABI_12;
> > +
> > +LIBGCCJIT_ABI_14 {
> > +  global:
> > +gcc_jit_function_get_return_type;
> > +gcc_jit_function_get_param_count;
> > +} LIBGCCJIT_ABI_13;
> 
> Likewise.
> 
> [...]
> 
> Otherwise looks good to me.
> 
> Bonus points for adding C++ bindings (and docs for them), but I don't
> know of anyone using the C++ bindings.


Also, please add "PR jit/96889" to the ChangeLog entries, and [PR96889]
to the subject line.

Dave
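
The missing NULL check pointed out in the review follows a guard-macro pattern. The sketch below shows the general shape only: sketch_type, sketch_function and SKETCH_RETURN_NULL_IF_FAIL are hypothetical stand-ins, and the real RETURN_NULL_IF_FAIL in libgccjit.c takes a context and a location and reports through the context rather than to stderr.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the libgccjit types; not the real definitions.
struct sketch_type {
    const char* name;
};

struct sketch_function {
    sketch_type* return_type;
    sketch_type* get_return_type() const { return return_type; }
};

// Simplified guard in the spirit of RETURN_NULL_IF_FAIL: report and bail out.
#define SKETCH_RETURN_NULL_IF_FAIL(cond, msg)                   \
    do {                                                        \
        if (!(cond)) {                                          \
            std::fprintf(stderr, "%s: %s\n", __func__, (msg));  \
            return nullptr;                                     \
        }                                                       \
    } while (0)

// Public entrypoint shape: validate the argument, then delegate.
sketch_type* sketch_function_get_return_type(sketch_function* func)
{
    SKETCH_RETURN_NULL_IF_FAIL(func, "NULL function");
    return func->get_return_type();
}
```

Without the guard, a NULL argument would be dereferenced inside the entrypoint instead of producing a diagnostic, which is the problem the review comment is flagging.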

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Paul Eggert


On 10/2/20 1:03 PM, Alejandro Colomar wrote:

I know it's not in stdint,
but I mean that it behaves as any other stdint type.


It doesn't. There's no portable way to use scanf and printf on it. You can't 
reliably convert it to intmax_t. It doesn't have the associated _MIN and _MAX 
macros that the stdint types do. It's a completely different animal.


If all you need are a few bit-twiddling tricks on x86-64, it should work. But 
watch out if you try to do something fancy, like multiply or divide or read or 
print or atomics. There's a good reason it's excluded from intmax_t.

Re: [PATCH] libgccjit: add some reflection functions in the jit C api

2020-10-02 Thread David Malcolm via Gcc-patches

On Tue, 2020-09-01 at 21:01 -0400, Antoni Boucher via Jit wrote:
> Hello.
> This WIP patch implements new reflection functions in the C API as 
> mentioned in bug 96889.
> I'm looking forward for feedbacks on this patch.
> It's WIP because I'll probably add a few more reflection functions.
> Thanks.

Sorry about the belated review, looks like I missed this one.

At a high level, it seems reasonable.

Do you have a copyright assignment in place for GCC contributions?
See https://gcc.gnu.org/contribute.html

[...]
 
> diff --git a/gcc/jit/docs/topics/compatibility.rst 
> b/gcc/jit/docs/topics/compatibility.rst
> index bb3387fa583..7e786194ded 100644
> --- a/gcc/jit/docs/topics/compatibility.rst
> +++ b/gcc/jit/docs/topics/compatibility.rst
> @@ -219,3 +219,14 @@ entrypoints:
>* :func:`gcc_jit_version_minor`
>  
>* :func:`gcc_jit_version_patchlevel`
> +
> +.. _LIBGCCJIT_ABI_14:
> +
> +``LIBGCCJIT_ABI_14``
> +
> +``LIBGCCJIT_ABI_14`` covers the addition of reflection functions via API
> +entrypoints:
> +
> +  * :func:`gcc_jit_function_get_return_type`
> +
> +  * :func:`gcc_jit_function_get_param_count`

This will now need bumping to 15; 14 covers the addition of
gcc_jit_global_set_initializer.

[...]

> +/* Public entrypoint.  See description in libgccjit.h.
> +
> +   After error-checking, the real work is done by the
> +   gcc::jit::recording::function::get_return_type method, in
> +   jit-recording.h.  */
> +
> +gcc_jit_type *
> +gcc_jit_function_get_return_type (gcc_jit_function *func)
> +{

This one is missing a:
  RETURN_NULL_IF_FAIL (func, NULL, NULL, "NULL function");


> +return (gcc_jit_type *)func->get_return_type ();
> +}

[...]

> diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
> index 1c5a12e9c01..6999ce25ca2 100644
> --- a/gcc/jit/libgccjit.h
> +++ b/gcc/jit/libgccjit.h

[...]

> @@ -1503,6 +1511,22 @@ gcc_jit_version_minor (void);
>  extern int
>  gcc_jit_version_patchlevel (void);
>  
> +#define LIBGCCJIT_HAVE_gcc_jit_function_reflection
> +
> +/* Reflection functions to get the number of parameters and return types of
> +   a function from the C API.

"return type", is better grammar, I think, given that "function" is singular.

> +
> +   "vec_type" should be a vector type, created using gcc_jit_type_get_vector.

This line about "vec_type" seems to be a leftover from a copy-and-paste.


> +   This API entrypoint was added in LIBGCCJIT_ABI_14; you can test for its
> +   presence using
> + #ifdef LIBGCCJIT_HAVE_gcc_jit_function_reflection

Version number will need bumping, as mentioned above.

[...]

> diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
> index 6137dd4b4b0..b28f81a7a32 100644
> --- a/gcc/jit/libgccjit.map
> +++ b/gcc/jit/libgccjit.map
> @@ -186,4 +186,10 @@ LIBGCCJIT_ABI_13 {
>  gcc_jit_version_major;
>  gcc_jit_version_minor;
>  gcc_jit_version_patchlevel;
> -} LIBGCCJIT_ABI_12;
> \ No newline at end of file
> +} LIBGCCJIT_ABI_12;
> +
> +LIBGCCJIT_ABI_14 {
> +  global:
> +gcc_jit_function_get_return_type;
> +gcc_jit_function_get_param_count;
> +} LIBGCCJIT_ABI_13;

Likewise.

[...]

Otherwise looks good to me.

Bonus points for adding C++ bindings (and docs for them), but I don't
know of anyone using the C++ bindings.

Dave

Re: [PATCH v4 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Paul Eggert

On 10/2/20 11:38 AM, Alejandro Colomar wrote:

> .I void *
>
> renders with a space in between.

That's odd, as "man(7)" says "All of the arguments will be printed next to each 
other without intervening spaces". I'd play it safe and quote the arg anyway.

> > %p works with any object pointer type (or in POSIX, any pointer type),
> > not just  void *.
> In theory, no (if otherwise, I'd like to know why):

Oh, you're right. I had missed that. In GNU/Linux hosts, though, any pointer 
(including function pointers) can be given to %p.

The only platforms where %p wouldn't work on all pointers would be platforms 
like IBM i, which has both 64-bit (process local) pointers and 128-bit (tagged 
space) pointers and where you can declare and use pointers of different widths 
in the same program.
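
A minimal sketch of the portable form, for reference: since the standard text quoted in this thread says the %p argument shall be a pointer to void, converting explicitly keeps the call correct even on platforms with more than one pointer representation. format_ptr is a hypothetical helper name; the printed text is implementation-defined, so nothing here depends on its exact form.

```c
#include <stdio.h>

/* Write the textual form of P into BUF using %p.
   The explicit (void *) cast is what the standard wording asks for;
   on GNU/Linux the call would work without it as well.
   format_ptr is a hypothetical name used only for illustration.  */
static int
format_ptr (char *buf, size_t size, int *p)
{
  return snprintf (buf, size, "%p", (void *) p);
}
```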

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Paul,

On 2020-10-02 21:54, Paul Eggert wrote:
> On 10/2/20 12:01 PM, Alejandro Colomar wrote:
>> If you propose not to document the stdint types either,
>
> This is not a stdint.h issue. __int128 is not in stdint.h and is not a
> system data type in any real sense; it's purely a compiler issue.
> Besides, do we start repeating the GCC manual too, while we're at it? At
> some point we need to restrain ourselves and stay within the scope of
> the man pages.

I know it's not in stdint,
but I mean that it behaves as any other stdint type.
So I see value in having them documented together in the same page.
And it's very useful in some (very specific) cases
where portability isn't in mind
(although many compilers are starting to provide this type).

>
> PS. Have you ever tried to use __int128 in real code? I have, to my
> regret. It's a portability and bug minefield and should not be used
> unless you really know what you're doing, which most people do not.

I have.
I used unsigned __int128, for operating on a big bit matrix.
This type helped me remove a whole abstraction for the columns:

unsigned __int128 mat[128]; // This is a 128x128 bit matrix.

This way I avoided either having to use a shorter type,
which would have been a bit weird:

uint64_t mat[128][2]; // This is more complicated to see.

Or having to use GMP,
which would have also complicated unnecessarily my code.

And of course, I didn't care about portability,
because I just wanted it to work.
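
A minimal sketch of how such a matrix might be accessed, assuming GCC or Clang on a target that provides unsigned __int128. The names u128, bitmat_set and bitmat_get are hypothetical and are not taken from the code discussed in this thread.

```c
#include <stdbool.h>

typedef unsigned __int128 u128;	/* hypothetical shorthand */

/* Set the bit at (ROW, COL) in a 128x128 bit matrix.
   Each row is one 128-bit integer, so the column is just a bit index.  */
static void
bitmat_set (u128 mat[128], unsigned row, unsigned col)
{
  mat[row] |= (u128) 1 << col;
}

/* Return the bit at (ROW, COL).  */
static bool
bitmat_get (const u128 mat[128], unsigned row, unsigned col)
{
  return (mat[row] >> col) & 1;
}
```

With uint64_t mat[128][2], each access would first have to pick the half (col / 64) and then the bit (col % 64), which is the extra abstraction the single wide type avoids.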

Thanks,

Alex

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Paul Eggert


On 10/2/20 12:01 PM, Alejandro Colomar wrote:

> If you propose not to document the stdint types either,


This is not a stdint.h issue. __int128 is not in stdint.h and is not a system 
data type in any real sense; it's purely a compiler issue. Besides, do we start 
repeating the GCC manual too, while we're at it? At some point we need to 
restrain ourselves and stay within the scope of the man pages.


PS. Have you ever tried to use __int128 in real code? I have, to my regret. It's 
a portability and bug minefield and should not be used unless you really know 
what you're doing, which most people do not.

[PATCH v5 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add info about generic function parameters and 
return value

Reported-by: Paul Eggert 
Reported-by: David Laight 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add info about pointer arithmetic

Reported-by: Paul Eggert 
Reported-by: David Laight 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add Versions notes

Compatibility between function pointers and void * hasn't always been so.
Document when that was added to POSIX.

Reported-by: Michael Kerrisk 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: wfix

Reported-by: Jonathan Wakely 
Reported-by: Paul Eggert 
Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 76 ++--
 1 file changed, 74 insertions(+), 2 deletions(-)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index c82d3b388..7c1198802 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -679,7 +679,6 @@ See also the
 .I uintptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"- lconv /
@@ -1780,7 +1779,6 @@ See also the
 .I intptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"- va_list --/
@@ -1814,6 +1812,80 @@ See also:
 .BR va_copy (3),
 .BR va_end (3)
 .RE
+.\"- void * ---/
+.TP
+.I void *
+.RS
+According to the C language standard,
+a pointer to any object type may be converted to a pointer to
+.I void
+and back.
+POSIX further requires that any pointer,
+including pointers to functions,
+may be converted to a pointer to
+.I void
+and back.
+.PP
+Conversions from and to any other pointer type are done implicitly,
+not requiring casts at all.
+Note that this feature prevents any kind of type checking:
+the programmer should be careful not to convert a
+.I void *
+value to a type incompatible to that of the underlying data,
+because that would result in undefined behavior.
+.PP
+This type is useful in function parameters and return value
+to allow passing values of any type.
+The function will typically use some mechanism to know
+the real type of the data being passed via a pointer to
+.IR void .
+.PP
+A value of this type can't be dereferenced,
+as it would give a value of type
+.IR void ,
+which is not possible.
+Likewise, pointer arithmetic is not possible with this type.
+However, in GNU C, pointer arithmetic is allowed
+as an extension to the standard;
+this is done by treating the size of a
+.I void
+or of a function as 1.
+A consequence of this is that
+.I sizeof
+is also allowed on
+.I void
+and on function types, and returns 1.
+.PP
+The conversion specifier for
+.I void *
+for the
+.BR printf (3)
+and the
+.BR scanf (3)
+families of functions is
+.BR p .
+.PP
+Versions:
+The POSIX requirement about compatibility between
+.I void *
+and function pointers was added in
+POSIX.1-2008 Technical Corrigendum 1 (2013).
+.PP
+Conforming to:
+C99 and later; POSIX.1-2001 and later.
+.PP
+See also:
+.BR malloc (3),
+.BR memcmp (3),
+.BR memcpy (3),
+.BR memset (3)
+.PP
+See also the
+.I intptr_t
+and
+.I uintptr_t
+types in this page.
+.RE
 .\"/
 .SH NOTES
 The structures described in this manual page shall contain,
-- 
2.28.0

[PATCH v5 2/2] void.3: New link to system_data_types(7)

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man3/void.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/void.3

diff --git a/man3/void.3 b/man3/void.3
new file mode 100644
index 000000000..db50c0f09
--- /dev/null
+++ b/man3/void.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0

[PATCH v5 0/2] Document 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Michael,

Here I added a wfix fixing some wording issues and a few typos
spotted by Paul and Jonathan in the (many) threads.
As previously, it is squashed into a single commit.


Thanks again for those who reviewed the patch!

BTW, for those who don't have a local repo of the man-pages,
below you can see a rendered version of the patch.

Thanks,

Alex

[[
void *
  According  to  the  C language standard, a pointer to any object
  type may be converted to a pointer to void and back.  POSIX fur-
  ther requires that any pointer, including pointers to functions,
  may be converted to a pointer to void and back.

  Conversions from and to any other pointer type are done  implic-
  itly,  not  requiring casts at all.  Note that this feature pre-
  vents any kind of type checking: the programmer should be  care-
  ful not to convert a void * value to a type incompatible to that
  of the underlying data, because that would result  in  undefined
  behavior.

  This  type  is useful in function parameters and return value to
  allow passing values of any type.  The function  will  typically
  use  some  mechanism  to  know  the  real type of the data being
  passed via a pointer to void.

  A value of this type can't be dereferenced, as it would  give  a
  value  of  type  void, which is not possible.  Likewise, pointer
  arithmetic is not possible with this type.  However, in  GNU  C,
  pointer  arithmetic  is allowed as an extension to the standard;
  this is done by treating the size of a void or of a function  as
  1.  A consequence of this is that sizeof is also allowed on void
  and on function types, and returns 1.

  The conversion specifier for void * for the  printf(3)  and  the
  scanf(3) families of functions is p.

  Versions: The POSIX requirement about compatibility between void
  * and function pointers was added in POSIX.1-2008 Technical Cor-
  rigendum 1 (2013).

  Conforming to: C99 and later; POSIX.1-2001 and later.

  See also: malloc(3), memcmp(3), memcpy(3), memset(3)

  See also the intptr_t and uintptr_t types in this page.
]]
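
The generic-parameter use and the pointer-arithmetic restriction described in the rendered text can be sketched in C as below. This is an illustration only: swap_bytes is a hypothetical function, and it walks the objects through unsigned char * so that it does not rely on the GNU C extension for arithmetic on void *.

```c
#include <stddef.h>

/* Swap two objects of SIZE bytes each.
   The parameters are void * so that a pointer to any object type can
   be passed without a cast; the caller supplies the size because the
   real type is not visible here.
   swap_bytes is a hypothetical name used only for illustration.  */
static void
swap_bytes (void *a, void *b, size_t size)
{
  unsigned char *pa = a;	/* implicit conversion from void * */
  unsigned char *pb = b;

  for (size_t i = 0; i < size; i++)
    {
      unsigned char tmp = pa[i];

      pa[i] = pb[i];
      pb[i] = tmp;
    }
}
```

A call such as swap_bytes (&x, &y, sizeof x) compiles for any object type, which also shows the loss of type checking the text warns about: nothing stops a caller from passing pointers to objects of different types or a wrong size.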


Alejandro Colomar (2):
  system_data_types.7: Add 'void *'
  void.3: New link to system_data_types(7)

 man3/void.3  |  1 +
 man7/system_data_types.7 | 76 ++--
 2 files changed, 75 insertions(+), 2 deletions(-)
 create mode 100644 man3/void.3

-- 
2.28.0

c++: Kill DECL_ANTICIPATED

2020-10-02 Thread Nathan Sidwell



Here's the patch to remove DECL_ANTICIPATED, and with it hiddenness is
managed entirely in the symbol table.  Sadly I couldn't get rid of the
actual field without more investigation -- it's repurposed for
OMP_PRIVATIZED_MEMBER.  It looks like the VAR-related flags in
lang_decl_base are not completely orthogonal, so perhaps some can be
turned into an enumeration or something.  But that's more than I want
to do right now.

DECL_FRIEND_P is still slightly suspect as it appears to mean more
than just in-class definition.  However, I'm leaving that for now.

gcc/cp/
* cp-tree.h (lang_decl_base): anticipated_p is not used for
anticipatedness.
(DECL_ANTICIPATED): Delete.
* decl.c (duplicate_decls): Delete DECL_ANTICIPATED management,
use was_hidden.
(cxx_builtin_function): Drop DECL_ANTICIPATED setting.
(xref_tag_1): Drop DECL_ANTICIPATED assert.
* name-lookup.c (name_lookup::adl_class_only): Drop
DECL_ANTICIPATED check.
(name_lookup::search_adl): Always dedup.
(anticipated_builtin_p): Reimplement.
(do_pushdecl): Drop DECL_ANTICIPATED asserts & update.
(lookup_elaborated_type_1): Drop DECL_ANTICIPATED update.
(do_pushtag): Drop DECL_ANTICIPATED setting.
* pt.c (push_template_decl): Likewise.
(tsubst_friend_class): Likewise.
libcc1/
* libcp1plugin.cc (libcp1plugin.cc): Drop DECL_ANTICIPATED test.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 43e0c18ec03..c9ad75117ad 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -2657,8 +2657,10 @@ struct GTY(()) lang_decl_base {
   unsigned not_really_extern : 1;	   /* var or fn */
   unsigned initialized_in_class : 1;	   /* var or fn */
   unsigned threadprivate_or_deleted_p : 1; /* var or fn */
-  unsigned anticipated_p : 1;		   /* fn, type or template */
-  /* anticipated_p reused as DECL_OMP_PRIVATIZED_MEMBER in var */
+  /* anticipated_p is no longer used for anticipated_decls (fn, type
+ or template).  It is used as DECL_OMP_PRIVATIZED_MEMBER in
+ var.  */
+  unsigned anticipated_p : 1;
   unsigned friend_or_tls : 1;		   /* var, fn, type or template */
   unsigned unknown_bound_p : 1;		   /* var */
   unsigned odr_used : 1;		   /* var or fn */
@@ -4037,13 +4039,6 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
 #define DECL_BUILTIN_P(NODE) \
   (DECL_SOURCE_LOCATION(NODE) == BUILTINS_LOCATION)
 
-/* Nonzero if NODE is a DECL which we know about but which has not
-   been explicitly declared, such as a built-in function or a friend
-   declared inside a class.  */
-#define DECL_ANTICIPATED(NODE) \
-  (DECL_LANG_SPECIFIC (TYPE_FUNCTION_OR_TEMPLATE_DECL_CHECK (NODE)) \
-   ->u.base.anticipated_p)
-
 /* True for artificial decls added for OpenMP privatized non-static
data members.  */
 #define DECL_OMP_PRIVATIZED_MEMBER(NODE) \
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 6b306ee4667..f333a36b0e1 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -1444,7 +1444,7 @@ tree
 duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
 {
   unsigned olddecl_uid = DECL_UID (olddecl);
-  int olddecl_friend = 0, types_match = 0, hidden_friend = 0;
+  int olddecl_friend = 0, types_match = 0;
   int olddecl_hidden_friend = 0;
   int new_defines_function = 0;
   tree new_template_info;
@@ -1473,7 +1473,7 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
 	{
 	  /* Avoid warnings redeclaring built-ins which have not been
 	 explicitly declared.  */
-	  if (DECL_ANTICIPATED (olddecl))
+	  if (was_hidden)
 	{
 	  if (TREE_PUBLIC (newdecl)
 		  && CP_DECL_CONTEXT (newdecl) == global_namespace)
@@ -1645,7 +1645,7 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
   /* If a function is explicitly declared "throw ()", propagate that to
 	 the corresponding builtin.  */
   if (DECL_BUILT_IN_CLASS (olddecl) == BUILT_IN_NORMAL
-	  && DECL_ANTICIPATED (olddecl)
+	  && was_hidden
 	  && TREE_NOTHROW (newdecl)
 	  && !TREE_NOTHROW (olddecl))
 	{
@@ -2139,9 +2139,6 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
 {
   olddecl_friend = DECL_FRIEND_P (STRIP_TEMPLATE (olddecl));
   olddecl_hidden_friend = olddecl_friend && was_hidden;
-  hidden_friend = olddecl_hidden_friend && hiding;
-  if (!hidden_friend)
-	DECL_ANTICIPATED (olddecl) = false;
 }
 
   if (TREE_CODE (newdecl) == TEMPLATE_DECL)
@@ -2890,8 +2887,6 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
   DECL_UID (olddecl) = olddecl_uid;
   if (olddecl_friend)
 DECL_FRIEND_P (olddecl) = true;
-  if (hidden_friend)
-DECL_ANTICIPATED (olddecl) = true;
 
   /* NEWDECL contains the merged attribute lists.
  Update OLDDECL to be the same.  */
@@ -4690,21 +4685,15 @@ cxx_builtin_function (tree decl)
   const char

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Alejandro Colomar via Gcc-patches


Hi Paul,

On 2020-10-02 19:52, Paul Eggert wrote:
> Why describe __int128_t in these man pages at all? __int128_t is not a
> property of either the kernel or of glibc, so it's out of scope.


Well, as I see it, [unsigned] __int128 is as good as [u]int64_t.
They are part of the C interface in Linux.
As a programmer I never cared if it was
Glibc providing me the type with a typedef,
or GCC providing me the type with its magic.

If you propose not to document the stdint types either,
I can see your position, but I'll disagree.



A few personal lines:

I just want to use it, and to do that,
I need a reference manual where to look for how to use it.
And believe me that unsigned __int128 has been very useful to me.

I think having this page with most of the types,
in a centralized manner, is exactly
what I would have needed in the past.
I have had trouble finding where ssize_t was defined.
I could have looked at the POSIX manual,
and I would have easily found that it is defined in <sys/types.h>,
but I didn't even know that it was a POSIX thing
(and I can tell that I'm not the only one who didn't know this :),
so I didn't even know where to search for it.

When I wanted to use unsigned __int128, kind of the same thing,
where do you search for the documentation of something
that you don't even know who specified it?

When that happens, the first thing for me is to 'man something'.
If that doesn't give me any useful info, then duckduckgo something.

In the internet there's much info,
but also much of it is incomplete or incorrect,
so if I have the man, I trust the man over anything else
(except for the standard documents, of course).

But the standards documents usually provide the information
in a reverse fashion:
If you know where to look at, you'll find it.
But if you only know its name, it'll be hard to find where it is.

The man provides documentation with the name of what you want to
know about.  Simple and easy.

And man is faster than the internet :)


Regards,

Alex.

Re: [PATCH] calls.c:can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check

2020-10-02 Thread Segher Boessenkool

Hi!

On Fri, Oct 02, 2020 at 04:41:05PM +0930, Alan Modra wrote:
> This moves an #ifdef block of code from calls.c to
> targetm.function_ok_for_sibcall.  Only two targets, x86 and rs6000,
> define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros
> that might vary depending on the called function.  Macros like
> UNITS_PER_WORD don't change over a function boundary, nor does the
> MIPS ABI, nor does TARGET_64BIT on PA-RISC.  Other targets are even
> more trivially seen to not need the calls.c code.
> 
> Besides cleaning up a small piece of #ifdef code, the motivation for
> this patch is to allow tail calls on PowerPC for functions that
> require less reg_parm_stack_space than their caller.  The original
> code in calls.c only permitted tail calls when exactly equal.

> +  /* If reg parm stack space increases, we cannot sibcall.  */
> +  if (REG_PARM_STACK_SPACE (decl ? decl : fntype)
> +  > REG_PARM_STACK_SPACE (current_function_decl))
> +{
> +  maybe_complain_about_tail_call (exp,
> +   "inconsistent size of stack space"
> +   " allocated for arguments which are"
> +   " passed in registers");
> +  return false;
> +}

Maybe change the message?  You allow all sizes smaller or equal than
the current size, "inconsistent" isn't very great for that.

> +static int __attribute__ ((__noclone__, __noinline__))
> +reg_args (int j1, int j2, int j3, int j4, int j5, int j6, int j7, int j8)
> +{
> +  return j1 + j2 + j3 + j4 + j5 + j6 + j7 + j8;
> +}
> +
> +int __attribute__ ((__noclone__, __noinline__))
> +stack_args (int j1, int j2, int j3, int j4, int j5, int j6, int j7, int j8,
> + int j9)
> +{
> +  if (j9 == 0)
> +return 0;
> +  return reg_args (j1, j2, j3, j4, j5, j6, j7, j8);
> +}
> +
> +/* { dg-final { scan-assembler-not {(?n)^\s+bl\s} } } */

Nice testcase :-)  (You can do \M or even just a space instead of that
last \s, but this works fine.)

The rs6000 parts are fine (with the message improved perhaps).  Thanks!
The rest looks fine to me as well, fwiw.


Segher

Re: [PATCH v4 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Paul,

On 2020-10-02 18:53, Paul Eggert wrote:
> On 10/2/20 8:14 AM, Alejandro Colomar wrote:
>
>> +.I void *
>
> GNU style is a space between "void" and "*", so this should be '.I
> "void\ *"', both here and elsewhere. The backslash prevents a line break.

.I void *

renders with a space in between.
I'll show you the rendered version at the end of this email.

>
>> +Conversions from and to any other pointer type are done implicitly,
>> +not requiring casts at all.
>> +Note that this feature prevents any kind of type checking:
>> +the programmer should be careful not to cast a
>
> Change "cast" to "convert", since the point is that no cast is needed.

Ok.

>
>> +.PP
>> +The conversion specifier for
>> +.I void *
>> +for the
>> +.BR printf (3)
>> +and the
>> +.BR scanf (3)
>> +families of functions is
>> +.BR p ;
>> +resulting commonly in
>> +.B %p
>> +for printing
>> +.I void *
>> +values.
>
> %p works with any object pointer type (or in POSIX, any pointer type),
> not just  void *.
In theory, no (if otherwise, I'd like to know why):

[[
p
The argument shall be a pointer to void. The value of the pointer 
is converted to a sequence of printable characters, in an 
implementation-defined manner.

]] POSIX.1-2008

However, it's unlikely to cause any problems, I must admit.

>
> Should also mention "void const *", "void volatile *", etc.

I already answered to this:
https://lore.kernel.org/linux-man/cah6ehdqhh46tjvc72mewftwci7iouaod0ic1zlrga+c-36g...@mail.gmail.com/T/#m6f657e988558a556cb70f7c056ef7a24e73dbe4a

> Plus it
> really should talk about plain "void", saying that it's a placeholder as
> a return value for functions, for casting away values, and as a keyword
> in C11 for functions with no parameters (though this is being changed in
> the next C version!). I sent comments about most of this stuff already.

'void' is a completely different type from 'void *'.

This patch is for 'void *'.

If 'void' is documented,
it'll be in a different entry (although in the same page),
and therefore, that'll be for a different patch.

Thanks,

Alex

__

void *
  According  to  the  C language standard, a pointer to any object
  type may be converted to a pointer to void and back.  POSIX fur-
  ther requires that any pointer, including pointers to functions,
  may be converted to a pointer to void and back.

  Conversions from and to any other pointer type are done  implic-
  itly,  not  requiring casts at all.  Note that this feature pre-
  vents any kind of type checking: the programmer should be  care-
  ful not to cast a void * value to a type incompatible to that of
  the underlying data, because that would result in undefined  be-
  havior.

  This  type  is useful in function parameters and return value to
  allow passing values of any type.  The function will usually use
  some  mechanism to know of which type the underlying data passed
  to the function really is.

  A value of this type can't be dereferenced, as it would  give  a
  value  of  type  void  which is not possible.  Likewise, pointer
  arithmetic is not possible with this type.  However, in  GNU  C,
  pointer  arithmetic  is allowed as an extension to the standard;
  this is done by treating the size of a void or of a function  as
  1.  A consequence of this is that sizeof is also allowed on void
  and on function types, and returns 1.

  The conversion specifier for void * for the  printf(3)  and  the
  scanf(3)  families  of  functions is p; resulting commonly in %p
  for printing void * values.

  Versions: The POSIX requirement about compatibility between void
  * and function pointers was added in POSIX.1-2008 Technical Cor-
  rigendum 1 (2013).

  Conforming to: C99 and later; POSIX.1-2001 and later.

  See also: malloc(3), memcmp(3), memcpy(3), memset(3)

  See also the intptr_t and uintptr_t types in this page.

c++: Hash table iteration for namespace-member spelling suggestions

2020-10-02 Thread Nathan Sidwell



For 'no such binding' errors, we iterate over binding levels to find a
close match.  At the namespace level we were using DECL_ANTICIPATED to
skip undeclared builtins.  But (a) there are other unnameable things
there and (b) decl-anticipated is about to go away.  This changes the
namespace scanning to iterate over the hash table, and look at
non-hidden bindings.  This does mean we look at fewer strings
(hurrah), but the order we meet them is somewhat 'random'.  Our
distance measure is not very fine grained, and a couple of testcases
change their suggestion.  I notice for the c/c++ common one, we now
match the output of the C compiler.  For the other one we think 'int'
and 'int64_t' have the same distance from 'int64', and now meet the
former first.  That's a little unfortunate.  If it's too problematic I
suppose we could sort the strings via an intermediate array before
measuring distance.
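
To illustrate the tie mentioned above, here is a minimal sketch of a plain Levenshtein distance in C. It is not GCC's spellcheck code, which may weight edits differently; edit_distance is a hypothetical name, and inputs are assumed to be short identifiers.

```c
#include <string.h>

/* Plain Levenshtein distance between S and T: each insertion,
   deletion or substitution costs 1.  Uses a single rolling row.
   Returns (unsigned) -1 if T is too long for the fixed buffer.
   edit_distance is a hypothetical name used only for illustration.  */
static unsigned
edit_distance (const char *s, const char *t)
{
  size_t n = strlen (s), m = strlen (t);
  unsigned row[64];

  if (m >= sizeof row / sizeof row[0])
    return (unsigned) -1;

  for (size_t j = 0; j <= m; j++)
    row[j] = (unsigned) j;

  for (size_t i = 1; i <= n; i++)
    {
      unsigned diag = row[0];	/* value for (i-1, j-1) */

      row[0] = (unsigned) i;
      for (size_t j = 1; j <= m; j++)
	{
	  unsigned up = row[j];	/* value for (i-1, j) */
	  unsigned best = diag + (s[i - 1] != t[j - 1]);

	  if (up + 1 < best)
	    best = up + 1;		/* delete from S */
	  if (row[j - 1] + 1 < best)
	    best = row[j - 1] + 1;	/* insert into S */
	  row[j] = best;
	  diag = up;
	}
    }
  return row[m];
}
```

Under this metric 'int64' is two deletions away from 'int' and two insertions away from 'int64_t', so whichever candidate is met first wins, which matches the behaviour described in the message.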

gcc/cp/
* name-lookup.c (consider_decl): New, broken out of ...
(consider_binding_level): ... here.  Iterate the hash table for
namespace bindings.
gcc/testsuite/
* c-c++-common/spellcheck-reserved.c: Adjust diagnostic.
* g++.dg/spellcheck-typenames.C: Adjust diagnostic.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 620f3a6..4024ceaa74b 100644
--- i/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -6106,6 +6106,39 @@ qualified_namespace_lookup (tree scope, name_lookup *lookup)
   return found;
 }
 
+static void
+consider_decl (tree decl,  best_match <tree, const char *> &bm,
+	   bool consider_impl_names)
+{
+  /* Skip compiler-generated variables (e.g. __for_begin/__for_end
+ within range for).  */
+  if (TREE_CODE (decl) == VAR_DECL && DECL_ARTIFICIAL (decl))
+return;
+
+  tree suggestion = DECL_NAME (decl);
+  if (!suggestion)
+return;
+
+  /* Don't suggest names that are for anonymous aggregate types, as
+ they are an implementation detail generated by the compiler.  */
+  if (IDENTIFIER_ANON_P (suggestion))
+return;
+
+  const char *suggestion_str = IDENTIFIER_POINTER (suggestion);
+
+  /* Ignore internal names with spaces in them.  */
+  if (strchr (suggestion_str, ' '))
+return;
+
+  /* Don't suggest names that are reserved for use by the
+ implementation, unless NAME began with an underscore.  */
+  if (!consider_impl_names
+  && name_reserved_for_implementation_p (suggestion_str))
+return;
+
+  bm.consider (suggestion_str);
+}
+
 /* Helper function for lookup_name_fuzzy.
Traverse binding level LVL, looking for good name matches for NAME
(and BM).  */
@@ -6129,54 +6162,63 @@ consider_binding_level (tree name, best_match <tree, const char *> &bm,
  with an underscore.  */
   bool consider_implementation_names = (IDENTIFIER_POINTER (name)[0] == '_');
 
-  for (tree t = lvl->names; t; t = TREE_CHAIN (t))
-{
-  tree d = t;
-
-  /* OVERLOADs or decls from using declaration are wrapped into
-	 TREE_LIST.  */
-  if (TREE_CODE (d) == TREE_LIST)
-	d = OVL_FIRST (TREE_VALUE (d));
-
-  /* Don't use bindings from implicitly declared functions,
-	 as they were likely misspellings themselves.  */
-  if (TREE_TYPE (d) == error_mark_node)
-	continue;
-
-  /* Skip anticipated decls of builtin functions.  */
-  if (TREE_CODE (d) == FUNCTION_DECL
-	  && fndecl_built_in_p (d)
-	  && DECL_ANTICIPATED (d))
-	continue;
+  if (lvl->kind != sk_namespace)
+for (tree t = lvl->names; t; t = TREE_CHAIN (t))
+  {
+	tree d = t;
 
-  /* Skip compiler-generated variables (e.g. __for_begin/__for_end
-	 within range for).  */
-  if (TREE_CODE (d) == VAR_DECL
-	  && DECL_ARTIFICIAL (d))
-	continue;
+	/* OVERLOADs or decls from using declaration are wrapped into
+	   TREE_LIST.  */
+	if (TREE_CODE (d) == TREE_LIST)
+	  d = OVL_FIRST (TREE_VALUE (d));
 
-  tree suggestion = DECL_NAME (d);
-  if (!suggestion)
-	continue;
-
-  /* Don't suggest names that are for anonymous aggregate types, as
-	 they are an implementation detail generated by the compiler.  */
-  if (IDENTIFIER_ANON_P (suggestion))
-	continue;
+	/* Don't use bindings from implicitly declared functions,
+	   as they were likely misspellings themselves.  */
+	if (TREE_TYPE (d) == error_mark_node)
+	  continue;
 
-  const char *suggestion_str = IDENTIFIER_POINTER (suggestion);
+	/* If we want a typename, ignore non-types.  */
+	if (kind == FUZZY_LOOKUP_TYPENAME
+	&& TREE_CODE (STRIP_TEMPLATE (d)) != TYPE_DECL)
+	  continue;
 
-  /* Ignore internal names with spaces in them.  */
-  if (strchr (suggestion_str, ' '))
-	continue;
+	consider_decl (d, bm, consider_implementation_names);
+  }
+  else
+{
+  /* Iterate over the namespace hash table, that'll have fewer
+	 entries than the decl list.  */
+  tree ns = lvl->this_entity;
 
-  /* Don't suggest names that are reserved for use by the
-	 implementation, unless NAME began with an underscore.  */
-  if

Re: [rs6000] Avoid useless masking of count operand for rotation

2020-10-02 Thread Segher Boessenkool

Hi Eric,

On Fri, Oct 02, 2020 at 10:26:24AM +0200, Eric Botcazou wrote:
> > Don't call it "mask" please: *all* of our basic rotate instructions
> > already have something called "mask" (that is the "m" in "rlwnm" for
> > example; and "rotlw d,a,b" is just an extended mnemonic for
> > "rlwnm d,a,b,0,31").  The hardware also does not mask the shift amount
> > at all (instead, only the low 5 bits of RB *are* the shift amount).
> 
> "Mask" is a common terminology though, used e.g. in the x86 back-end.  That 
> being said, I agree that it conflicts with PowerPC specific stuff so please 
> suggest an alternative here.

Something that describes what it does well ("mask" does *not* in our
context).  "_shiftmask" maybe?  I'm sure you can think of something
better than that ;-)
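
As a C-level model of the point quoted above (only the low 5 bits of the count are the shift amount for a 32-bit rotate), the sketch below shows why an explicit AND of the count with 31 is redundant. rotl32 is a hypothetical helper written for this illustration; it is not code from the patch.

```c
#include <stdint.h>

/* Rotate X left by N, using only the low 5 bits of N as the count,
   which is how the thread describes the hardware's behaviour.
   rotl32 is a hypothetical name used only for illustration.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;				/* the count is the low 5 bits */
  return (x << n) | (x >> (-n & 31));	/* -n & 31 avoids a shift by 32 */
}
```

Because the count is reduced inside the operation, rotl32 (x, n & 31) and rotl32 (x, n) always agree, which is the redundancy the proposed pattern is meant to remove.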

> > > +(define_insn "*rotl<mode>3_mask"
> > > +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> > > + (rotate:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> > > + (and:GPR (match_operand:GPR 2 "gpc_reg_operand" "r")
> > > +  (match_operand:GPR 3 "const_int_operand" "n"))))]
> > > +  "(UINTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1))
> > > +   == (unsigned HOST_WIDE_INT) (GET_MODE_BITSIZE (<MODE>mode) - 1)"

> > Don't mask operands[3] please (in the UINTVAL): RTL with the number
> > outside of that range is *undefined*.  So just check that it is equal?
> 
> Copied from the x86 back-end as well, so I'd rather keep it that way, unless 
> you really insist...

Improve the x86 backend instead, if you want to have the two backends
similar?

> > Why?  This diverges the "dot" version from the non-dot for no reason.
> 
> OK, let's drop it.

Some rationale: almost all dot and dot2 patterns are exactly the same as
the non-dot versions (they can be generated by a script even), and just
the insn condition makes the insn only valid for Pmode (for the insns
where that is true!  Some SImode dot patterns are valid in 64-bit mode
as well!)  This way is more obvious, more explicit.

> > I don't see a patch with splitters only?  Huh.  Did you attach the same
> > patch twice?
> 
> AFAICS no, 2 different patches were attached.

Yes I see, but that uses define_insn(_and_split) for these still.

> > Since this won't be handled before combine (or what do I miss?), it is
> > fine to do splitters only (splitters for combine).  But the other
> > approach is fine as well.
> 
> Patch #2 uses define_and_split like the x86 back-end; moreover, we already 
> have define_and_split for the dot variants so maybe it's the best way to go?

Please just use define_split, or just define_insn.  define_split if you
want this to be split during combine (how could that work here though?),
or define_insn if you think other passes can generate this as well.
define_insn_and_split is *both*, and that isn't useful here.

Do you see dot patterns for this used ever, btw?

Segher

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Paul Eggert

Why describe __int128_t in these man pages at all? __int128_t is not a property 
of either the kernel or of glibc, so it's out of scope.

Re: [Patch] libgomp: Add, if existing, -latomic to libgomp.spec --as-needed (was: Re: [RFC] Offloading and automatic linking of libraries)

2020-10-02 Thread Jakub Jelinek via Gcc-patches

On Fri, Oct 02, 2020 at 05:33:13PM +, Joseph Myers wrote:
> On Fri, 2 Oct 2020, Tobias Burnus wrote:
> 
> > However, this flag can be added into the offload-target's libgomp.spec,
> > which avoids all kind of issues. That's what this patch now does.
> > 
> > I tested it with x86_64-gnu-linux w/o + w/ nvptx-none. Result:
> > * x86_64-gnu-linux's libgomp.spec:
> >   "*link_gomp: -lgomp %{static: -ldl } --as-needed -latomic --no-as-needed"
> > * nvptx-none's libgomp.spec:
> >   "*link_gomp: -lgomp  -latomic"
> > 
> > On x86-64, a simple test program did not use and also did not link 
> > libatomic.
> > 
> > OK?
> 
> Do all testsuites that link using libgomp.spec also use the testsuite 
> logic from atomic-dg.exp to locate libatomic for build-tree testing 
> (otherwise -latomic will fail if it can't be found and there isn't an 
> installed system copy that the driver might fall back on, even with 
> --as-needed)?

The only testsuite that uses libgomp.spec is libgomp/testsuite/, the
gcc/testsuite/ tests if they test -fopenmp must be only dg-do compile tests,
all the link and runtime tests are in libgomp.
There is no ordering between libatomic and libgomp build, so we shouldn't be
querying libatomic in libgomp configury.  Not sure if we can rely on if
libatomic is among configdirs then libatomic will be actually built.

Jakub

Re: [Patch] libgomp: Add, if existing, -latomic to libgomp.spec --as-needed (was: Re: [RFC] Offloading and automatic linking of libraries)

2020-10-02 Thread Joseph Myers

On Fri, 2 Oct 2020, Tobias Burnus wrote:

> However, this flag can be added into the offload-target's libgomp.spec,
> which avoids all kind of issues. That's what this patch now does.
> 
> I tested it with x86_64-gnu-linux w/o + w/ nvptx-none. Result:
> * x86_64-gnu-linux's libgomp.spec:
>   "*link_gomp: -lgomp %{static: -ldl } --as-needed -latomic --no-as-needed"
> * nvptx-none's libgomp.spec:
>   "*link_gomp: -lgomp  -latomic"
> 
> On x86-64, a simple test program did not use and also did not link libatomic.
> 
> OK?

Do all testsuites that link using libgomp.spec also use the testsuite 
logic from atomic-dg.exp to locate libatomic for build-tree testing 
(otherwise -latomic will fail if it can't be found and there isn't an 
installed system copy that the driver might fall back on, even with 
--as-needed)?

My view as noted in bug 81358 is that --as-needed -latomic --no-as-needed 
should *always* be used (given --as-needed supported and libatomic built), 
not restricted to when libgomp is needed.  But while adding it only for 
libgomp might make a useful improvement while reducing the extent to which 
global testsuite / configure test changes need to be made at the same 
time, some testsuite support should still be needed if not already in use.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix build of ppc64 target.

2020-10-02 Thread Segher Boessenkool

On Fri, Oct 02, 2020 at 10:46:12AM +0200, Aldy Hernandez wrote:
> On 10/2/20 2:17 AM, David Edelsohn wrote:
> >On Thu, Oct 1, 2020 at 8:02 PM Andrew MacLeod  wrote:
> >Thanks for investigating.  And I definitely am not suggesting that you
> >delay the great progress on Ranger to flatten and compact tree-vrp.h
> >and ssa.h immediately.
> >
> >Inclusion of any header file in tree-ssa-propagate.h is new, which
> >surprised me because of the GCC strategy for headers.  As you and Aldy
> >continue to develop Ranger, I wanted to alert you to the fragility of
> >the current header design.  The rs6000 port is a very effective
> >canary!
> 
> Indeed.  Actually, I had been using a ppc64le-linux machine as my 
> primary testing target for quite a while.  However, a few weeks ago ppc 
> bootstrap broke and I switched back to x86-64, because I was too lazy to 
> investigate why.
> 
> I'll test more regularly on ppc if it's back in working order.

It is.  We keep a close eye on it; it usually bootstraps (for at least
the configs we test) all the time.  Especially during stage 1 things do
break from time to time, but that isn't unique to Power ;-)

There isn't a rule that we can just revert any patches that break
bootstrap on primary targets (and I am not suggesting there should be),
so breakage now and then is unavoidable.  Maybe some CI thing can help
make stage 1 a less bumpy road.

Segher

PING^2 [PATCH] doc: gcc.c: Update documentation for spec files

2020-10-02 Thread Armin Brauns via Gcc-patches

On 06/09/2020 17.23, Armin Brauns wrote:
> There were some differences between the actual code in do_spec_1, its
> source comment, and the documentation in doc/invoke.texi. These should
> now be resolved.
PING: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553321.html

[PATCH 6/6] Hybrid EVRP

2020-10-02 Thread Andrew MacLeod via Gcc-patches

The patch switches to a hybrid EVRP pass which utilizes both the Ranger 
and the classic EVRP pass.


It introduces a new undocumented option:

-fevrp-mode=   which can be one of the following:

evrp-only    : Classic EVRP mode, identical to what is in trunk now.
rvrp-only    : Runs EVRP using *only* the ranger for ranges.
evrp-first   : A hybrid mode which uses EVRP to satisfy simplifications 
first and, if that fails, tries the ranger.
rvrp-first   : Another hybrid mode; this time it tries to simplify with 
the ranger first, then with EVRP.
rvrp-trace   : Same as rvrp-only, except that a lot of tracing 
information is also dumped to the listing file.
rvrp-debug   : Similar to rvrp-trace, except that it also includes a lot 
of debug information for the on-entry caches.
trace        : Runs in evrp-first mode, but turns on tracing in the 
ranger.


The default option currently enabled is evrp-first.
This gives us functionality similar to what trunk currently has, except 
that it is enhanced by trying the ranger to find additional cases.
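
The fallback order of the two hybrid modes can be pictured with a small
stand-alone sketch.  This is not the code from gimple-ssa-evrp.c; the
names engine and hybrid_simplify are hypothetical, and a "statement" is
just a string here.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for a range engine: return true and set RESULT
// when the engine can simplify STMT to a constant.
using engine = std::function<bool (int &result, const std::string &stmt)>;

// Sketch of the hybrid idea: ask the first engine, and only consult the
// second one when the first finds nothing.  Passing (evrp, rvrp) models
// evrp-first; passing (rvrp, evrp) models rvrp-first.
bool
hybrid_simplify (const engine &first, const engine &second,
                 const std::string &stmt, int &result)
{
  if (first (result, stmt))
    return true;
  return second (result, stmt);
}
```

Swapping the two engine arguments is all it takes to go from one hybrid
mode to the other in this toy model.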


We see numerous places where the ranger provides enhanced results. The 
primary cases are:
  a) When switches are involved, we provide very precise ranges to 
each switch case, including default, and we see cases where we can 
eliminate branches because of that.
  b) We track ranges on edges quite accurately, and are not limited to 
single-entry blocks. In particular, we are seeing a number of places 
where ranges are being propagated into PHIs that were not before.


ie, from PR 81192:

  if (j_8(D) != 2147483647)
    goto ; [50.00%]
  else
    goto ; [50.00%]
 :
  iftmp.2_11 = j_8(D) + 1;
 :
  # iftmp.2_12 = PHI 

Hybrid mode now recognizes that a constant can be propagated on the 3->5 
edge and produces
  # iftmp.2_12 = PHI <2147483647(3), iftmp.2_11(4)>

As a result, we're finding a lot of jump threading opportunities are 
being exposed earlier.



The patch provides 3 EVRP passes, and uses the option to choose which of 
the 3 are invoked. You can see from the patch how interchangeable we 
have managed to make the range engines.


The goal here is to continue exercising both engines regularly, while 
making it easy to detect when one engine is better.  When a dump_file is 
requested for the pass, any variance in results between the 2 engines 
will be highlighted by lines such as


   EVRP:hybrid: RVRP found singleton 3
   EVRP:hybrid: EVRP found singleton -1B
   EVRP:hybrid: Second query simplifed stmt

We'll be using these to work on identifying differences/issues  as we 
move towards replacing EVRP/VRP fully.


Andrew

	* flag-types.h (enum evrp_mode): New enumerated type EVRP_MODE_*.
	* common.opt (fevrp-mode): New undocumented flag.
	* vr-values.c (simplify_using_ranges::fold_cond): Try range_of_stmt
	first to see if it returns a value.
	(simplify_using_ranges::simplify_switch_using_ranges): Return true if
	any changes were made to the switch.
	* gimple-ssa-evrp.c: Include gimple-range.h
	(class rvrp_folder): EVRP folding using ranger exclusively.
	(rvrp_folder::rvrp_folder): New.
	(rvrp_folder::~rvrp_folder): New.
	(rvrp_folder::value_of_expr): New.  Use rangers value_of_expr.
	(rvrp_folder::value_on_edge): New.  Use rangers value_on_edge.
	(rvrp_folder::value_of_stmt): New.  Use rangers value_of_stmt.
	(rvrp_folder::fold_stmt): New.  Call the simplifier.
	(class hybrid_folder): EVRP folding using both engines.
	(hybrid_folder::hybrid_folder): New.
	(hybrid_folder::~hybrid_folder): New.
	(hybrid_folder::fold_stmt): New.  Simplify with one engine, then the
	other.
	(hybrid_folder::value_of_expr): New.  Use both value routines.
	(hybrid_folder::value_on_edge): New.  Use both value routines.
	(hybrid_folder::value_of_stmt): New.  Use both value routines.
	(hybrid_folder::choose_value): New.  Choose between range_analyzer and
	rangers values.
	(execute_early_vrp): Choose a folder based on flag_evrp_mode.

diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 852ea76eaa2..0ca2a1cff46 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -382,4 +382,16 @@ enum parloops_schedule_type
   PARLOOPS_SCHEDULE_RUNTIME
 };
 
+/* EVRP mode.  */
+enum evrp_mode
+{
+  EVRP_MODE_EVRP_ONLY,
+  EVRP_MODE_RVRP_ONLY,
+  EVRP_MODE_EVRP_FIRST,
+  EVRP_MODE_RVRP_FIRST,
+  EVRP_MODE_RVRP_TRACE,
+  EVRP_MODE_RVRP_DEBUG,
+  EVRP_MODE_TRACE
+};
+
 #endif /* ! GCC_FLAG_TYPES_H */
diff --git a/gcc/common.opt b/gcc/common.opt
index 292c2de694e..71b9292a22e 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2870,6 +2870,34 @@ ftree-vrp
 Common Report Var(flag_tree_vrp) Init(0) Optimization
 Perform Value Range Propagation on trees.
 
+fevrp-mode=
+Common Undocumented Joined RejectNegative Enum(evrp_mode) Var(flag_evrp_mode) Init(EVRP_MODE_EVRP_FIRST) Optimization
+-fevrp-mode=[evrp-only|rvrp-only|evrp-first|rvrp-first|rvrp-trace|rvrp-debug|trace] Specifies the mode Early VRP should operate in.
+
+Enum
+Name(evrp_mode) Type(enum evrp_mode)

[PATCH 5/6] gimple-range

2020-10-02 Thread Andrew MacLeod via Gcc-patches


This is the ranger component that pulls all the others bits together.

Its API is basically the range_query API we've already pushed into the 
compiler. The main entry points a client is likely to use are:


bool range_of_stmt (irange &r, gimple *, tree name = NULL)
bool range_of_expr (irange &r, tree name, gimple * = NULL)
bool range_on_edge (irange &r, edge e, tree name)

These calculate the range of an ssa name as the result of a statement, 
as used ON a statement, or ON an edge, respectively.


The ranger is responsible for calculating the range of any gimple 
statement, but before doing that, resolves any dependencies that 
statement has first.


Given
a_2 = b_3 + c_4

The ranger will first use its own API to resolve a range_of_expr (b_3), 
then a range_of_expr (c_4), then finally calculate a range for a_2 using 
the values it gets back.


A lot of the heavy lifting is done by the earlier components.  The ranger 
is mostly responsible for actually calculating a global value for an 
ssa_name based on the stmt, as well as resolving dependencies as much as 
possible before asking for a cache fill. Cycles are handled by breaking 
the request cycle if we get back to the same definition, and then using 
a best guess up to that point.
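
As a rough illustration of "resolve the operands first, break the cycle
if we come back to the same definition", here is a toy model.  It is not
the gimple_ranger code: toy_ranger, add_stmt and the interval type are
hypothetical, every statement is an addition, and the "best guess" used
when a cycle is hit is simply the full (varying) range.

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <set>
#include <string>

struct interval { long lo, hi; };
struct add_stmt { std::string op1, op2; };   // lhs = op1 + op2

class toy_ranger
{
public:
  std::map<std::string, interval> known;     // ranges already computed
  std::map<std::string, add_stmt> defs;      // defining statements

  // Return the range of NAME, resolving its operands on demand.
  interval range_of (const std::string &name)
  {
    std::map<std::string, interval>::iterator k = known.find (name);
    if (k != known.end ())
      return k->second;
    std::map<std::string, add_stmt>::iterator d = defs.find (name);
    // No definition, or we got back to a definition that is still being
    // evaluated: break the cycle and use varying as the best guess.
    if (d == defs.end () || !m_active.insert (name).second)
      return varying ();
    interval a = range_of (d->second.op1);
    interval b = range_of (d->second.op2);
    m_active.erase (name);
    interval r = { bound_add (a.lo, b.lo, LONG_MIN),
                   bound_add (a.hi, b.hi, LONG_MAX) };
    assert (r.lo <= r.hi);
    known[name] = r;
    return r;
  }

private:
  std::set<std::string> m_active;            // names being evaluated

  static interval varying ()
  {
    interval v = { LONG_MIN, LONG_MAX };
    return v;
  }

  // Add two bounds, keeping an infinite bound infinite.  The toy assumes
  // finite bounds are small enough not to overflow.
  static long bound_add (long x, long y, long inf)
  {
    return (x == inf || y == inf) ? inf : x + y;
  }
};
```

With b_3 known to be [1, 5] and c_4 known to be [10, 20], asking for a_2
in this model resolves both operands first and yields [11, 25].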


Future enhancements will involve updating back edges with more accurate 
values when they become available, but that's more of an issue for when 
we go to replace the ASSERT driven VRP pass.

	* gimple-range.h: New File.
	(class gimple_ranger): New.  Ranger base class.
	(gimple_range_handler): New.  Get range-ops handler.
	(gimple_range_ssa_p): New.  Is this a range compatible ssa_name.
	(gimple_range_global): New.  Retrieve legacy global value.
	(class trace_ranger): New.  Ranger with tracing capabilities.

	* gimple-range.cc: New File.
	(adjust_pointer_diff_expr): New.  Ptrdiff adjustments.
	(gimple_range_adjustment): New.  Multi-statement range adjustments.
	(get_tree_range): New.  What range is known about a tree node.
	(gimple_range_fold): New.  Gimple interface to range_op::fold.
	(gimple_range_base_of_assignment): New.  Base name of an assignment.
	(gimple_range_operand1): New.  Symbolic name of operand 1.
	(gimple_range_operand2): New.  Symbolic name of operand 2.
	(gimple_range_calc_op1): New.  Gimple interface to range-op::op1_range.
	(gimple_range_calc_op2): New.  Gimple interface to range-op::op2_range.
	(gimple_ranger::calc_stmt): New.  Calculate a range for a statement.
	(gimple_ranger::range_of_range_op): New.  Handle range_op statements.
	(gimple_ranger::range_of_non_trivial_assignment): New.  Handle pointer
	expressions on ssa_names.
	(gimple_ranger::range_of_phi): New.
	(gimple_ranger::range_of_call): New.
	(gimple_ranger::range_of_builtin_ubsan_call): New.
	(gimple_ranger::range_of_builtin_call): New.
	(gimple_ranger::range_of_cond_expr): New.  Handle x ? y : z.
	(gimple_ranger::range_of_expr): New.  Calculate range on stmt.
	(gimple_ranger::range_on_entry): New.  Calculate range at block entry.
	(gimple_ranger::range_on_exit): New.  Calculate range at block exit.
	(gimple_ranger::range_on_edge): New.  Calculate range on an edge.
	(gimple_ranger::range_of_stmt): New.  Calculate LHS of statement.
	(gimple_ranger::export_global_ranges): New.  Export known ranges to
	the legacy cache SSA_NAME_RANGE_INFO.
	(gimple_ranger::dump): New.
	(gimple_ranger::range_of_ssa_name_with_loop_info): New. SCEV hook.
	(trace_ranger::trace_ranger): New.
	(trace_ranger::dumping): New.  Tracing prefix handler.
	(trace_ranger::trailer): New.  Tracing suffix handler.
	(trace_ranger::range_on_edge): New.  Provide tracing.
	(trace_ranger::range_on_entry): New.  Provide tracing.
	(trace_ranger::range_on_exit): New.  Provide tracing.
	(trace_ranger::range_of_stmt): New.  Provide tracing.
	(trace_ranger::range_of_expr): New.  Provide tracing.

diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
new file mode 100644
index 000..c45760af393
--- /dev/null
+++ b/gcc/gimple-range.h
@@ -0,0 +1,170 @@
+/* Header file for the GIMPLE range interface.
+   Copyright (C) 2019-2020 Free Software Foundation, Inc.
+   Contributed by Andrew MacLeod 
+   and Aldy Hernandez .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef GCC_GIMPLE_RANGE_STMT_H
+#define GCC_GIMPLE_RANGE_STMT_H
+
+
+#include "range.h"
+#include "range-op.h"
+#include "gimple-range-edge.h"
+#include "gimple-range-gori.h"
+#include

[PATCH 4/6] gimple-range-cache

2020-10-02 Thread Andrew MacLeod via Gcc-patches


These are the various caches used by the ranger.

- non-null-ref :  Tracks non-null pointer dereferences in blocks so we 
can properly calculate non-null pointer ranges.
- on-entry cache :  The block range cache tracks the calculated values of 
ssa-names on entry to each basic block.
- global cache :  Stores what is globally known. Also used to determine if 
a stmt has been evaluated already.


And finally, the "Ranger Cache", which combines all those caches together.

It does far more than that, though.

It links them with a gori-computes component so that when it is 
calculating an outgoing range, it can "seed" the value with whatever has 
already been computed.  ie, if a_4 is used within a basic block, its 
value can be seeded with whatever the range-on-entry cache says a_4 is 
for that block.


Furthermore, the ranger cache also takes responsibility for populating the 
on-entry cache.  When the ranger asks the cache for the range of a_4 on 
entry to BB9, the ranger cache is responsible for either retrieving the 
stored value, or computing it the first time.


The first-time computation simply looks back to the definition (or 
previously populated entries), applies any intervening GORI edge 
calculations between the definition and this block, and propagates those 
values along the edges until the cache is filled.  Note that no new range 
calculations are requested; it is all based on what is already known 
between all the caches and what GORI can do with those values.


It's the ranger's responsibility to ensure, before asking for the cache 
to be filled, that any global ranges have been calculated.


This component is also where the iterative update engine is located, and 
it is purely based on injecting a new range into the on-entry cache, 
checking if propagating that range to successor blocks changes anything, 
and if so propagating those changes.   The update engine is currently 
only used within the cache fill for handling back edges and certain 
"poor" values that were used.

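The inject/check/propagate loop described above is essentially a
worklist algorithm.  The following is only a toy model under stated
assumptions, not the ranger_cache implementation: toy_cache and its
members are hypothetical, a single ssa-name is tracked, edges apply no
GORI adjustment, and ranges are merged as a simple convex hull.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <map>
#include <vector>

// An on-entry range for one ssa-name; DEFINED is false until a value
// has been stored for the block.
struct interval { int lo, hi; bool defined; };

static interval
hull (interval a, interval b)
{
  if (!a.defined)
    return b;
  if (!b.defined)
    return a;
  interval r = { std::min (a.lo, b.lo), std::max (a.hi, b.hi), true };
  return r;
}

static bool
same (interval a, interval b)
{
  if (a.defined != b.defined)
    return false;
  return !a.defined || (a.lo == b.lo && a.hi == b.hi);
}

struct toy_cache
{
  std::map<int, std::vector<int> > succs;   // block -> successor blocks
  std::map<int, interval> on_entry;         // block -> on-entry range

  // Inject range R on entry to block BB, then keep propagating to
  // successors for as long as some on-entry value changes.
  void inject (int bb, interval r)
  {
    on_entry[bb] = hull (on_entry[bb], r);
    std::deque<int> worklist;
    worklist.push_back (bb);
    while (!worklist.empty ())
      {
        int b = worklist.front ();
        worklist.pop_front ();
        for (int s : succs[b])
          {
            interval merged = hull (on_entry[s], on_entry[b]);
            if (!same (merged, on_entry[s]))
              {
                on_entry[s] = merged;
                worklist.push_back (s);
              }
          }
      }
  }
};
```

Because a block is only re-queued when its on-entry value actually
changes, the loop also terminates on a CFG with back edges in this
model.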

The API for this component is again pretty simple.  It exposes the 
caches it maintains, as well as 2 entry points:


 ssa_range_in_bb :  A virtual function used by the gori-computes 
component to access the on-entry or global cache for an ssa-name.
 block_range :  The entry point the ranger uses to ask for the on-entry 
range of a name, which may trigger a cache fill if needed.

	* gimple-range-cache.h: New File.
	(class non_null_ref): New.  Non-null reference tracker.
	(class block_range_cache): New.  Block range on-entry cache.
	(class ssa_global_cache): New.  Global range cache.
	(class ranger_cache): New.  GORI computes plus other caches combined.

	* gimple-range-cache.cc: New File.
	(non_null_ref::non_null_ref): New.
	(non_null_ref::~non_null_ref): New.
	(non_null_ref::non_null_deref_p): New.  Non-null in block query.
	(non_null_ref::process_name): New.  Find non-null block references.
	(class ssa_block_ranges): New.  Track ranges for a single SSA_NAME.
	(ssa_block_ranges::ssa_block_ranges): New.
	(ssa_block_ranges::~ssa_block_ranges): New.
	(ssa_block_ranges::set_bb_range): New.
	(ssa_block_ranges::set_bb_varying): New.
	(ssa_block_ranges::get_bb_range): New.
	(ssa_block_ranges::bb_range_p): New.
	(ssa_block_ranges::dump): New.
	(block_range_cache::block_range_cache): New.  Manage ssa_block_ranges.
	(block_range_cache::~block_range_cache): New.
	(block_range_cache::get_block_ranges): New.  Get an ssa_block_range.
	(block_range_cache::set_bb_range): New.
	(block_range_cache::set_bb_varying): New.
	(block_range_cache::get_bb_range): New.
	(block_range_cache::bb_range_p): New.
	(block_range_cache::dump): New.
	(ssa_global_cache::ssa_global_cache): New.
	(ssa_global_cache::~ssa_global_cache): New.
	(ssa_global_cache::get_global_range): New.
	(ssa_global_cache::set_global_range): New.
	(ssa_global_cache::clear_global_range): New.
	(ssa_global_cache::clear): New. Clear everything.
	(ssa_global_cache::dump): New.
	(ranger_cache::ranger_cache): New.
	(ranger_cache::~ranger_cache): New.
	(ranger_cache::push_poor_value): New.  Add value to recalculation queue.
	(ranger_cache::ssa_range_in_bb): New.  Provide best available range
	without any calculations for an ssa_name.
	(ranger_cache::block_range): New.  Fill block cache for ssa_name from
	definition block to this block, and return range.
	(ranger_cache::add_to_update): New.  Add to list of blocks to update.
	(ranger_cache::iterative_cache_update): New.  Empty update list by 
	updating block ranges until done.
	(ranger_cache::fill_block_cache): New.  Create list of blocks to
	update.

diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h
new file mode 100644
index 000..29ab01e2a98
--- /dev/null
+++ b/gcc/gimple-range-cache.h
@@ -0,0 +1,120 @@
+/* Header file for gimple ranger SSA cache.
+   Copyright (C) 2017-2020 Free Software Foundation, Inc.
+   Contributed by Andrew MacLeod .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or

[PATCH 3/6] gimple-range-gori

2020-10-02 Thread Andrew MacLeod via Gcc-patches


This is the true core of the ranger.

The GORI (Generates Outgoing Range Information) engine contains all the 
smarts to utilize the functionality provided via range-ops in order to 
calculate outgoing ranges on an edge, based on the control flow at the exit.


It functions *only* at the basic block level.   Based on the control 
values on the control edges, it can calculate dependent ranges within 
the block, as well as restrictions implied on incoming values in the block.


That's a mouthful.  A simple example is better.


  int t = a - 4;
  if (a > 3 && t < 11)
    return t;

This maps to gimple:

=== BB 2 
a_5(D)  int VARYING
     :
    t_6 = a_5(D) + -4;
    _1 = a_5(D) > 3;
    _2 = t_6 <= 10;
    _3 = _1 & _2;
    if (_3 != 0)
      goto <bb 3>; [INV]
    else
      goto <bb 4>; [INV]

For this block, the gori engine can figure out the following ranges:

globally true:
t_6 : int [-INF, 2147483643]

On the true edge 2->3:
2->3  (T) _1 :  _Bool [1, 1]
2->3  (T) _2 :  _Bool [1, 1]
2->3  (T) _3 :  _Bool [1, 1]
2->3  (T) a_5(D) :  int [4, 14]
2->3  (T) t_6 : int [0, 10]

And on the false edge 2->4:
2->4  (F) _3 :  _Bool [0, 0]
2->4  (F) a_5(D) :  int [-2147483644, 3][15, +INF]
2->4  (F) t_6 : int [-INF, -1][11, 2147483643]
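
The true-edge numbers above can be reproduced by hand with plain
interval arithmetic.  The sketch below is only that arithmetic, written
out under stated assumptions; it is not the gori_compute code, and the
names interval, plus_const, intersect, global_t and true_edge_ranges are
hypothetical.  The false edge is left out because it needs ranges made
of two sub-ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// A single closed range; long long is used so that int bounds can be
// shifted without overflowing.
struct interval { long long lo, hi; };

static interval
intersect (interval a, interval b)
{
  interval r = { std::max (a.lo, b.lo), std::min (a.hi, b.hi) };
  return r;
}

// Range of X + C, clamped to what an int can hold.
static interval
plus_const (interval x, long long c)
{
  interval r = { std::max<long long> (x.lo + c, INT_MIN),
                 std::min<long long> (x.hi + c, INT_MAX) };
  return r;
}

// Globally, t_6 = a_5 + -4 with a_5 VARYING.
interval
global_t ()
{
  interval a_varying = { INT_MIN, INT_MAX };
  return plus_const (a_varying, -4);
}

// On the true edge both _1 (a_5 > 3) and _2 (t_6 <= 10) hold.
void
true_edge_ranges (interval &a, interval &t)
{
  interval a_gt_3 = { 4, INT_MAX };             // from a_5 > 3
  interval t_le_10 = { INT_MIN, 10 };           // from t_6 <= 10
  interval a_from_t = plus_const (t_le_10, 4);  // solve t = a + -4 for a
  a = intersect (a_gt_3, a_from_t);
  t = intersect (plus_const (a, -4), t_le_10);
}
```

Solving t_6 <= 10 back through t_6 = a_5 + -4 is what caps a_5 at 14;
intersecting with a_5 > 3 then gives a_5 in [4, 14] and t_6 in [0, 10],
matching the 2->3 lines above.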


As long as there are entries in range-ops for the gimple operation, it 
can wind back through an arbitrary number of expressions and, utilizing 
the expression solving abilities of range-ops, come up with calculations.


The API for this class is pretty straightforward: simply ask for the 
range of X on an edge:

  bool outgoing_edge_range_p (irange &r, edge e, tree name);
  bool has_edge_range_p (edge e, tree name);

Everything else is hidden under the covers.

Andrew


	* gimple-range-gori.h: New File.
	(class gori_compute): New.  Generates Outgoing Range Info computation.
	(class gori_compute_cache): New.  Adds logical stmt cache.

	* gimple-range-gori.cc: New File.
	(class range_def_chain): New.  Definition chain calculator.
	(range_def_chain::range_def_chain): New.
	(range_def_chain::~range_def_chain): New.
	(range_def_chain::in_chain_p): New.  Query chain contents.
	(range_def_chain::build_def_chain): New.  Build a new def chain.
	(range_def_chain::has_def_chain): New.  Query existence of a chain.
	(range_def_chain::get_def_chain): New.  Get or calculate a chain.
	(class gori_map): New.  Add GORI information to def_chains.
	(gori_map::gori_map): New.
	(gori_map::~gori_map): New.
	(gori_map::exports): New.  Return export bitmap for a block.
	(gori_map::is_export_p): New.  Is SSA_NAME an export.
	(gori_map::def_chain_in_export_p): New.  Is any chain member an export.
	(gori_map::maybe_add_gori): New.  Add def chain exports if appropriate.
	(gori_map::calculate_gori): New.  Calculate summary for BB.
	(gori_map::dump): New. Dump contents.
	(gori_compute::gori_compute): New.
	(gori_compute::~gori_compute): New.
	(gori_compute::ssa_range_in_bb): New.  Get incoming ssa-name value.
	(gori_compute::expr_range_in_bb): New.  Get incoming expression value.
	(gori_compute::compute_name_range_op): New.  Calculate outgoing range
	for an ssa-name in a statement.
	(gori_compute::compute_operand_range_switch): New.  Handle switch.
	(is_gimple_logical_p): New.
	(gori_compute::compute_operand_range): New.
	(range_is_either_true_or_false): New.
	(gori_compute::logical_combine): New.  Combine ranges through a
	logical expression.
	(gori_compute::optimize_logical_operands): New.
	(gori_compute::compute_logical_operands_in_chain): New.
	(gori_compute::compute_logical_operands): New.  Calculate incoming
	ranges for both sides of a logical expression when appropriate.
	(gori_compute::compute_operand1_range): New.  Compute op1_range.
	(gori_compute::compute_operand2_range): New.  Compute op2_range.
	(gori_compute::compute_operand1_and_operand2_range): New.  op1_range
	and op2_range are the same SSA_NAME.
	(gori_compute::has_edge_range_p): New.  Can ssa-name be calculated.
	(gori_compute::dump): New.
	(gori_compute::outgoing_edge_range_p): New.  Calculate an outgoing edge
	range for ssa-name entry point.
	(class logical_stmt_cache): New.  logical evaluation cache.
	(logical_stmt_cache::logical_stmt_cache): New.
	(logical_stmt_cache::~logical_stmt_cache): New.
	(logical_stmt_cache::set_range): New.
	(logical_stmt_cache::get_range): New.
	(logical_stmt_cache::cached_name): New.  
	(logical_stmt_cache::same_cached_name): New.  Do cached names match.
	(logical_stmt_cache::cacheable_p): New.  Is statement cacheable.
	(logical_stmt_cache::slot_diagnostics): New.  Debugging diagnostics.
	(logical_stmt_cache::dump): New.
	(gori_compute_cache::gori_compute_cache): New.
	(gori_compute_cache::~gori_compute_cache): New.
	(gori_compute_cache::compute_operand_range): New.  Check cache first.
	(gori_compute_cache::cache_stmt): New.  Create entry if possible.

diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h
new file mode 100644
index 000..8ef452bf433
--- /dev/null
+++

[PATCH 2/6] gimple-range-edge

2020-10-02 Thread Andrew MacLeod via Gcc-patches

This file produces constant edge ranges.  It provides a class which can 
be instantiated and will return the range on any edge.


This was originally supposed to be trivial, but switches mucked that up :-)

There are 2 basic expressions that can result in a constant range on an 
edge.  Note that there is nothing related to ssa or anything else here; it 
is about ranges which are forced by the edge.


a gimple condition:

BB2:
    if (_3 != 0)
      goto <bb 3>; [INV]
    else
      goto <bb 5>; [INV]

a query for range on edge 2->3 returns [1,1]  (TRUE)
a query for range on edge 2->5 returns [0,0] (FALSE)

The other case, and the reason this exists, is for switches.  There is 
no simple way to map the values on switch edges implied by the labels 
without walking through the switch statement and calculating them.  The 
on-demand ranger was spending a lot of time recalculating these values 
for multiple requests, so this class was created to process a switch 
once and cache the values.


y = x + 4
switch (x)
 {
 case 0:
 case 1:
 return 1;
 case 2:
 return y;
 case 3:
 case 10:
 return 3;
 default:
    return x;
 }

Any request for the range on any edge leaving the switch will calculate 
all the edge ranges at once and cache them.  Subsequent queries will 
then just return the cached value.


 :
 switch (x_2(D))  [INV], case 0 ... 1:  [INV], 
case 2:  [INV], case 3:  [INV], case 10:  [INV]>


4->7  x_2(D) :  unsigned int [4, 9][11, +INF] // default 
edge

4->8  x_2(D) :  unsigned int [0, 1]
4->5  x_2(D) :  unsigned int [2, 2]
4->6  x_2(D) :  unsigned int [3, 3][10, 10]

There is no ranger dependency; this can be used on any valid gimple IL 
to get the ranges implied by the edge.


The ranger is needed to map those values to the switch variable, and 
also to apply any previous ranges or derived values (ie, if you ask for the 
range of 'y' in case 2, it will return unsigned int [6,6]).
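
To make the "process the switch once, then answer from the cache" idea
concrete, here is a small stand-alone model.  It is not the
outgoing_range implementation: case_label, calc_switch_ranges and
toy_edge_cache are hypothetical names, destinations are plain block
numbers, and the labels are assumed to be sorted and non-overlapping.

```cpp
#include <cassert>
#include <climits>
#include <map>
#include <utility>
#include <vector>

typedef std::pair<unsigned, unsigned> subrange;   // closed [first, second]
typedef std::vector<subrange> multirange;

struct case_label { unsigned lo, hi; int dest; };

// Walk the sorted labels once.  Each label contributes its values to its
// destination block; every gap between labels goes to the default block.
std::map<int, multirange>
calc_switch_ranges (const std::vector<case_label> &labels, int default_dest)
{
  std::map<int, multirange> result;
  unsigned long long next = 0;   // first value not yet covered by a label
  for (const case_label &l : labels)
    {
      if (l.lo > next)
        result[default_dest].push_back (subrange ((unsigned) next, l.lo - 1));
      result[l.dest].push_back (subrange (l.lo, l.hi));
      next = (unsigned long long) l.hi + 1;
    }
  if (next <= UINT_MAX)
    result[default_dest].push_back (subrange ((unsigned) next, UINT_MAX));
  return result;
}

// The first query computes every edge range; later queries are lookups.
struct toy_edge_cache
{
  std::map<int, multirange> table;
  bool filled = false;
  int computations = 0;

  const multirange &edge_range (const std::vector<case_label> &labels,
                                int default_dest, int dest)
  {
    if (!filled)
      {
        table = calc_switch_ranges (labels, default_dest);
        filled = true;
        ++computations;
      }
    return table[dest];
  }
};
```

Fed the labels from the example (0 ... 1 to block 8, 2 to block 5, 3 and
10 to block 6, default to block 7), the model gives [4, 9][11, +INF] for
the default edge, the same sub-ranges as the 4->7 line above.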

	* gimple-range-edge.h: New File.
	(outgoing_range): Calculate/cache constant outgoing edge ranges.

	* gimple-range-edge.cc: New file.
	(gimple_outgoing_range_stmt_p): New.  Find control statement.
	(outgoing_range::outgoing_range): New.
	(outgoing_range::~outgoing_range): New.
	(outgoing_range::get_edge_range): New.  Internal switch edge query.
	(outgoing_range::calc_switch_ranges): New.  Calculate switch ranges.
	(outgoing_range::edge_range_p): New.  Find constant range on edge.

diff --git a/gcc/gimple-range-edge.h b/gcc/gimple-range-edge.h
new file mode 100644
index 000..400c814ac7e
--- /dev/null
+++ b/gcc/gimple-range-edge.h
@@ -0,0 +1,55 @@
+/* Gimple range edge header file.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Andrew MacLeod 
+   and Aldy Hernandez .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef GIMPLE_RANGE_EDGE_H
+#define GIMPLE_RANGE_EDGE_H
+
+// This class is used to query ranges on constant edges in GIMPLE.
+//
+// For a COND_EXPR, the TRUE edge will return [1,1] and the false edge a [0,0].
+//
+// For SWITCH_EXPR, it is awkward to calculate ranges.  When a request
+// is made, the entire switch is evaluated and the results cached.
+// Any future requests to that switch will use the cached value, providing
+// dramatic decrease in computation time.
+//
+// The API is simple, just ask for the range on the edge.
+// The return value is NULL for no range, or the branch statement which the
+// edge gets the range from, along with the range.
+
+class outgoing_range
+{
+public:
+  outgoing_range ();
+  ~outgoing_range ();
+  gimple *edge_range_p (irange , edge e);
+private:
+  void calc_switch_ranges (gswitch *sw);
+  bool get_edge_range (irange , gimple *s, edge e);
+
+  hash_map *m_edge_table;
+  irange_allocator m_range_allocator;
+}; 
+
+// If there is a range control statement at the end of block BB, return it.
+gimple *gimple_outgoing_range_stmt_p (basic_block bb);
+
+#endif  // GIMPLE_RANGE_EDGE_H
diff --git a/gcc/gimple-range-edge.cc b/gcc/gimple-range-edge.cc
new file mode 100644
index 000..c5ee54fec51
--- /dev/null
+++ b/gcc/gimple-range-edge.cc
@@ -0,0 +1,197 @@
+/* Gimple range edge functionality.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   Contributed by Andrew MacLeod 
+   and Aldy Hernandez .
+
+This file is part of GCC.
+
+GCC is free

[PATCH 1/6] Ranger patches.

2020-10-02 Thread Andrew MacLeod via Gcc-patches


This patch set contains the various components that make up a ranger.

Ranger files are prefixed by   "gimple-range".

gimple-range-cache.{h,cc} :   Various caches used by the ranger.
gimple-range-edge.{h,cc} :    Outgoing edge range calculations, 
particularly switch edge ranges.
gimple-range-gori.{h,cc} :     "Generate Outgoing Range Info" module 
which calculates ranges on exit to basic blocks.
gimple-range.{h,cc} :         gimple_ranger which pulls together the 
other components and provides on-demand ranges.



These are pretty much self contained and do not require changing any 
other files.


Consumers need only include "gimple-range.h"   and then invoke a ranger 
for use within the pass


gimple_ranger ranger;
<..>
 if (ranger.range_of_expr (r, name, stmt))
    use_range (r);

Follow-on patches from Aldy, which convert some of the passes, will better 
demonstrate usage once we're in and running.


I plan to update the documentation over the next couple of weeks.

I will let these patches sit here for a few days, and barring any 
issues, will then commit them early next week.



Andrew





	* Makefile.in (OBJS): Add gimple-range*.o.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 50d6c83eb76..5a8fb0d7612 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1369,6 +1369,10 @@ OBJS = \
 	gimple-loop-versioning.o \
 	gimple-low.o \
 	gimple-pretty-print.o \
+	gimple-range.o \
+	gimple-range-cache.o \
+	gimple-range-edge.o \
+	gimple-range-gori.o \
 	gimple-ssa-backprop.o \
 	gimple-ssa-evrp.o \
 	gimple-ssa-evrp-analyze.o \

Project Ranger Submission / On Demand VRP

2020-10-02 Thread Andrew MacLeod via Gcc-patches


Where to start?

  This is the culmination of numerous years work on the ranger for 
generating accurate on-demand ranges in GCC.

There are 2 patch sets as a part of this submission

1)  The ranger &  Enhancements to EVRP
2)  Other passes converted to Ranger

There is still some missing functionality in the ranger which is 
required for it to be a full replacement of either EVRP or VRP, but it's 
complete enough to offer some enhanced functionality which we think is 
worthwhile.


It is still purely range based, which means it does not currently handle 
equivalences, nor other relations.  I have the relation oracle 
prototyped, and may still be able to get at least some of it in before 
the end of stage 1.


I will go into more details with the "hybrid" EVRP mode in the intro to 
that patchset, but basically you can run EVRP in one of 3 modes:

  - classic EVRP mode,
  - Ranger only (RVRP) or
  - hybrid mode where both engines are used.  This allows us to perform 
extra optimizations we could not perform before, all while continuing to 
exercise both engines fully, as well as articulating in dump files what 
the differences are.


performance:

Performance has been the primary holdup, and it is still not as good as it 
was.  The original irange class was wide_int based and presumed quick 
and efficient manipulation of ranges.  Back then, we were comparable to 
EVRP in speed.  When we merged value_range with irange last year to 
produce int_range, we changed the internal representation to be tree 
based.  This has slowed down the multi-range code noticeably as now all 
the various sub-range pairs are allocated as trees/hash table looked up 
each time.  That plus various other tweaks have slowed us down.  We have 
plans for addressing at least some of this discrepancy.


Regardless, the net effect is that our performance runs have shown 
ranger now takes about 70% longer than EVRP does when compiling.  
Since the new hybrid mode uses both engines, we see a 170% increase in 
time required to execute EVRP.  EVRP starts off as a very fast pass, and 
in a compile of 242 GCC source files, it amounts to about a 3 second 
increase out of 280, or about 1% overall.


This is offset by the speed up we get by converting the -Wrestrict pass 
to use the ranger.  That pass speeds up to 15% of its original time. 
Yes, it's an 85% reduction because it only calculates the few ranges it 
requires.
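
As a toy illustration of why calculating only the requested ranges can be
cheaper (this is not how the ranger is implemented; toy_range, range_of,
compute_range and N_NAMES are all made-up names), an on-demand lookup with
a cache might look like this:

```c
#include <assert.h>
#include <stdbool.h>

#define N_NAMES 8

/* Hypothetical toy range: [lo, hi].  */
struct toy_range { int lo, hi; };

static struct toy_range cache[N_NAMES];
static bool cached[N_NAMES];
static int n_computed;	/* How many ranges were actually calculated.  */

/* Stand-in for an expensive calculation.  */
static struct toy_range
compute_range (int name)
{
  n_computed++;
  struct toy_range r = { 0, name * 10 };
  return r;
}

/* Return the range for NAME, calculating it on first request only.  */
struct toy_range
range_of (int name)
{
  assert (name >= 0 && name < N_NAMES);
  if (!cached[name])
    {
      cache[name] = compute_range (name);
      cached[name] = true;
    }
  return cache[name];
}
```

A consumer that asks about two names out of eight pays for two
calculations, however many times it repeats the query.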


The end result of all patches being applied is we find overall compile 
time to typically be slightly faster with these changes than before, 
plus we are solving more cases within EVRP.  This data is backed up by a 
callgrind analysis of a compile where fewer instructions end up being 
executed with this path.




Regardless, our goal with this submission is to enable other passes to 
begin using the ranger's functionality now, as well as getting it 
regularly exercised while we move toward a complete VRP replacement.  
The more we replace, the faster things will get :-)


The idea is that we run in hybrid mode for the rest of stage1 which 
fully exercises both engines, and see where we end up early in the next 
stage.  We can assess based on functionality vs speed whether we want 
EVRP to run in classic mode, hybrid mode, or rvrp only mode for the next 
release.


During the remainder of stage 1, we will work on a combination of 
performance and functionality, and then once stage 1 closes, we'll focus 
primarily on performance.  We have a laundry list of things to attack 
and we hope to get most of the lost time back.


I will provide the ranger and the hybrid EVRP patches, and Aldy will 
follow on with the different optimization conversions.


These patches bootstrap, pass all regressions, and additionally have 
gone thru an entire build of fedora in the past week, in hybrid mode.  
At least on our branch :-)


More details in each of the patchsets.

I'll let these sit here for a few days, and barring issues, will 
re-post the latest versions early next week along with any required 
testcase tweaks for hybrid mode, and check them in.  (We're still 
tracking down a fedora build failure, and I have a couple of testcases 
to look at with the latest merge from trunk.)


Andrew

c++: cleanup ctor_omit_inherited_parms [PR97268]

2020-10-02 Thread Nathan Sidwell



ctor_omit_inherited_parms was being somewhat abused.  What I'd missed
is that it checks for a base-dtor name, before proceeding with the
check.  But we ended up passing it that during cloning before we'd
completed the cloning.  It was also using DECL_ORIGIN to get to the
in-charge ctor, but we sometimes zap DECL_ABSTRACT_ORIGIN, and it ends
up processing the incoming function -- which happens to work.  So,
this breaks out a predicate that expects to get the incharge ctor, and
will tell you whether its base ctor will need to omit the parms.  We
call that directly during cloning.

Then the original fn is essentially just a wrapper, but uses
DECL_CLONED_FUNCTION to get to the in-charge ctor.  That uncovered
abuse in add_method, which was happily passing TEMPLATE_DECLs to it.
Let's not do that.  add_method itself contained a loop mostly
containing an 'if (nomatch) continue' idiom, except for a final 'if
(match) {...}'  check, which itself contained instances of the former
idiom.  I refactored that to use the former idiom throughout.  In
doing that I found a place where we'd issue an error, but then not
actually reject the new member.
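
As a toy illustration of the loop shape described above (this is not the
add_method code; struct candidate and find_conflict are made-up names),
every rejection test uses 'if (nomatch) continue', so whatever follows the
last test only runs for a real match:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a member function declaration.  */
struct candidate
{
  const char *name;
  int n_parms;
  int is_static;
};

/* Return the index of the first entry in FNS that conflicts with
   METHOD, or -1 if there is none.  Every rejection test uses the
   same 'if (nomatch) continue' shape.  */
int
find_conflict (const struct candidate *fns, size_t n,
	       const struct candidate *method)
{
  assert (fns != NULL && method != NULL);
  for (size_t i = 0; i < n; i++)
    {
      const struct candidate *fn = &fns[i];

      if (strcmp (fn->name, method->name) != 0)
	continue;
      if (fn->is_static != method->is_static)
	continue;
      if (fn->n_parms != method->n_parms)
	continue;

      /* Everything matched: this is a real conflict.  */
      return (int) i;
    }
  return -1;
}
```

With a single idiom throughout, a path that diagnoses a problem but then
falls through without rejecting the candidate is easier to spot.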

gcc/cp/
* cp-tree.h (base_ctor_omit_inherited_parms): Declare.
* class.c (add_method): Refactor main loop, only pass fns to
ctor_omit_inherited_parms.
(build_cdtor_clones): Rename bool parms.
(clone_cdtor): Call base_ctor_omit_inherited_parms.
* method.c (base_ctor_omit_inherited_parms): New, broken out of
...
(ctor_omit_inherited_parms): ... here, call it with
DECL_CLONED_FUNCTION.
gcc/testsuite/
* g++.dg/inherit/pr97268.C: New.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git c/gcc/cp/class.c w/gcc/cp/class.c
index c9a1f753d56..01780fe8291 100644
--- c/gcc/cp/class.c
+++ w/gcc/cp/class.c
@@ -1006,10 +1006,6 @@ add_method (tree type, tree method, bool via_using)
   for (ovl_iterator iter (current_fns); iter; ++iter)
 {
   tree fn = *iter;
-  tree fn_type;
-  tree method_type;
-  tree parms1;
-  tree parms2;
 
   if (TREE_CODE (fn) != TREE_CODE (method))
 	continue;
@@ -1037,10 +1033,8 @@ add_method (tree type, tree method, bool via_using)
 	 functions in the derived class override and/or hide member
 	 functions with the same name and parameter types in a base
 	 class (rather than conflicting).  */
-  fn_type = TREE_TYPE (fn);
-  method_type = TREE_TYPE (method);
-  parms1 = TYPE_ARG_TYPES (fn_type);
-  parms2 = TYPE_ARG_TYPES (method_type);
+  tree fn_type = TREE_TYPE (fn);
+  tree method_type = TREE_TYPE (method);
 
   /* Compare the quals on the 'this' parm.  Don't compare
 	 the whole types, as used functions are treated as
@@ -1055,137 +1049,149 @@ add_method (tree type, tree method, bool via_using)
 	  || type_memfn_rqual (fn_type) != type_memfn_rqual (method_type)))
 	  continue;
 
-  /* For templates, the return type and template parameters
-	 must be identical.  */
-  if (TREE_CODE (fn) == TEMPLATE_DECL
-	  && (!same_type_p (TREE_TYPE (fn_type),
-			TREE_TYPE (method_type))
-	  || !comp_template_parms (DECL_TEMPLATE_PARMS (fn),
-   DECL_TEMPLATE_PARMS (method
+  tree real_fn = fn;
+  tree real_method = method;
+
+  /* Templates and conversion ops must match return types.  */
+  if ((DECL_CONV_FN_P (fn) || TREE_CODE (fn) == TEMPLATE_DECL)
+	  && !same_type_p (TREE_TYPE (fn_type), TREE_TYPE (method_type)))
 	continue;
+  
+  /* For templates, the template parameters must be identical.  */
+  if (TREE_CODE (fn) == TEMPLATE_DECL)
+	{
+	  if (!comp_template_parms (DECL_TEMPLATE_PARMS (fn),
+DECL_TEMPLATE_PARMS (method)))
+	continue;
 
-  if (! DECL_STATIC_FUNCTION_P (fn))
+	  real_fn = DECL_TEMPLATE_RESULT (fn);
+	  real_method = DECL_TEMPLATE_RESULT (method);
+	}
+
+  tree parms1 = TYPE_ARG_TYPES (fn_type);
+  tree parms2 = TYPE_ARG_TYPES (method_type);
+  if (! DECL_STATIC_FUNCTION_P (real_fn))
 	parms1 = TREE_CHAIN (parms1);
-  if (! DECL_STATIC_FUNCTION_P (method))
+  if (! DECL_STATIC_FUNCTION_P (real_method))
 	parms2 = TREE_CHAIN (parms2);
 
-  /* Bring back parameters omitted from an inherited ctor.  */
-  if (ctor_omit_inherited_parms (fn))
-	parms1 = FUNCTION_FIRST_USER_PARMTYPE (DECL_ORIGIN (fn));
-  if (ctor_omit_inherited_parms (method))
-	parms2 = FUNCTION_FIRST_USER_PARMTYPE (DECL_ORIGIN (method));
+  /* Bring back parameters omitted from an inherited ctor.  The
+	 method and the function can have different omittedness.  */
+  if (ctor_omit_inherited_parms (real_fn))
+	parms1 = FUNCTION_FIRST_USER_PARMTYPE (DECL_CLONED_FUNCTION (real_fn));
+  if (ctor_omit_inherited_parms (real_method))
+	parms2 = (FUNCTION_FIRST_USER_PARMTYPE
+		  (DECL_CLONED_FUNCTION (real_method)));
 
-  if (compparms (parms1, parms2)
-	  && (!DECL_CONV_FN_P (fn)
-	  || same_type_p (TREE_TYPE (fn_type),
-

Re: [PATCH v4 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Paul Eggert


On 10/2/20 8:14 AM, Alejandro Colomar wrote:


+.I void *


GNU style is a space between "void" and "*", so this should be '.I "void\ *"', 
both here and elsewhere. The backslash prevents a line break.



+Conversions from and to any other pointer type are done implicitly,
+not requiring casts at all.
+Note that this feature prevents any kind of type checking:
+the programmer should be careful not to cast a


Change "cast" to "convert", since the point is that no cast is needed.


+.PP
+The conversion specifier for
+.I void *
+for the
+.BR printf (3)
+and the
+.BR scanf (3)
+families of functions is
+.BR p ;
+resulting commonly in
+.B %p
+for printing
+.I void *
+values.


%p works with any object pointer type (or in POSIX, any pointer type), not just 
 void *.


Should also mention "void const *", "void volatile *", etc. Plus it really 
should talk about plain "void", saying that it's a placeholder as a return value 
for functions, for casting away values, and as a keyword in C11 for functions 
with no parameters (though this is being changed in the next C version!). I sent 
comments about most of this stuff already.
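
A minimal sketch of the points under discussion, in ISO C (the function
names same_bytes, format_address and round_trip are made up for
illustration): conversions to and from void * need no cast, qualified
variants such as const void * behave the same way, and %p is passed a
void * argument.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare two objects of SIZE bytes through
   pointers to const void; no casts are needed at the call site.  */
int
same_bytes (const void *a, const void *b, size_t size)
{
  assert (a != NULL && b != NULL);
  return memcmp (a, b, size) == 0;
}

/* Format the address of OBJ into BUF with %p.  ISO C specifies a
   'void *' argument for %p, hence the cast from 'const void *'.
   Returns whatever snprintf returns.  */
int
format_address (char *buf, size_t len, const void *obj)
{
  return snprintf (buf, len, "%p", (void *) obj);
}

/* Round trip: int * -> void * -> int *, all implicit in C.  */
int
round_trip (int *ip)
{
  void *vp = ip;	/* Implicit conversion to void *.  */
  int *back = vp;	/* Implicit conversion from void *.  */
  return *back;
}
```

The text that %p produces is implementation-defined, so the sketch only
formats the address and makes no claim about what it looks like.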

Re: [PATCH] optimize permutes in SLP, remove vect_attempt_slp_rearrange_stmts

2020-10-02 Thread Richard Sandiford via Gcc-patches

Richard Biener  writes:
> This introduces a permute optimization phase for SLP which is
> intended to cover the existing permute eliding for SLP reductions
> plus handling commonizing the easy cases.
>
> It currently uses graphds to compute a postorder on the reverse
> SLP graph and it handles all cases vect_attempt_slp_rearrange_stmts
> did (hopefully - I've adjusted most testcases that triggered it
> a few days ago).  It restricts itself to move around bijective
> permutations to simplify things for now, mainly around constant nodes.
>
> As a prerequisite it makes the SLP graph cyclic (ugh).  It looks
> like it would pay off to compute a PRE/POST order visit array
> once and elide all the recursive SLP graph walks and their
> visited hash-set.  At least for the time where we do not change
> the SLP graph during such walk.
>
> I do not like using graphds too much but at least I don't have to
> re-implement yet another RPO walk, so maybe it isn't too bad.
>
> Comments are welcome - I do want to see vect_attempt_slp_rearrange_stmts
> go away for GCC 11 and the permute optimization helps non-store
> BB vectorization opportunities where we can end up with a lot of
> useless load permutes otherwise.

Looks really nice.  Got a couple of questions that probably just show
my misunderstanding :-)

Is this intended to compute an optimal-ish solution?  It looked from
a quick read like it tried to push permutes as far away from loads as
possible without creating permuted and unpermuted versions of the same
node.  But I guess there will be cases where the optimal placement is
somewhere between the two extremes of permuting at the loads and
permuting as far away as possible.

Of course, whatever we do will be a heuristic.  I just wasn't sure how
often this would be best in practice.

It looks like the materialisation phase changes the choices for nodes
on the fly, is that right?  If so, how does that work for backedges?
I'd expected the materialisation phase to treat the permutation choice
as read-only, and simply implement what the graph already said.

Thanks,
Richard

>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Richard.
>
> 2020-10-02  Richard Biener  
>
>   * tree-vect-data-refs.c (vect_slp_analyze_instance_dependence):
>   Use SLP_TREE_REPRESENTATIVE.
>   * tree-vectorizer.h (_slp_tree::vertex): New member used
>   for graphds interfacing.
>   * tree-vect-slp.c (vect_build_slp_tree_2): Allocate space
>   for PHI SLP children.
>   (vect_analyze_slp_backedges): New function filling in SLP
>   node children for PHIs that correspond to backedge values.
>   (vect_analyze_slp): Call vect_analyze_slp_backedges for the
>   graph.
>   (vect_slp_analyze_node_operations): Deal with a cyclic graph.
>   (vect_schedule_slp_instance): Likewise.
>   (vect_schedule_slp): Likewise.
>   (slp_copy_subtree): Remove.
>   (vect_slp_rearrange_stmts): Likewise.
>   (vect_attempt_slp_rearrange_stmts): Likewise.
>   (vect_slp_build_vertices): New functions.
>   (vect_slp_permute): Likewise.
>   (vect_optimize_slp): Remove special code to elide
>   permutations with SLP reductions.  Implement generic
>   permute optimization.
>
>   * gcc.dg/vect/bb-slp-50.c: New testcase.
>   * gcc.dg/vect/bb-slp-51.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-50.c |  20 +
>  gcc/testsuite/gcc.dg/vect/bb-slp-51.c |  20 +
>  gcc/tree-vect-data-refs.c |   2 +-
>  gcc/tree-vect-slp.c   | 660 +-
>  gcc/tree-vectorizer.h |   2 +
>  5 files changed, 479 insertions(+), 225 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-50.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-51.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-50.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-50.c
> new file mode 100644
> index 000..80216be4ebf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-50.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_double } */
> +
> +double a[2];
> +double b[2];
> +double c[2];
> +double d[2];
> +double e[2];
> +void foo(double x)
> +{
> +  double tembc0 = b[1] + c[1];
> +  double tembc1 = b[0] + c[0];
> +  double temde0 = d[0] + e[1];
> +  double temde1 = d[1] + e[0];
> +  a[0] = tembc0 + temde0;
> +  a[1] = tembc1 + temde1;
> +}
> +
> +/* We should common the permutation on the tembc{0,1} operands.  */
> +/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\n\\r\]* = 
> VEC_PERM_EXPR" 2 "slp2" } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-51.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-51.c
> new file mode 100644
> index 000..1481018428e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-51.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_double } */
> +
> +double a[2];
> +double b[2];
> +double c[2];
> +double e[2];
> +void

[PATCH] libstdc++: Add C++2a synchronization support

2020-10-02 Thread Thomas Rodgers

From: Thomas Rodgers 

Updated patch incorporating latest feedback (revised).

Add support for -
  * atomic_flag::wait/notify_one/notify_all
  * atomic::wait/notify_one/notify_all
  * counting_semaphore
  * binary_semaphore
  * latch

libstdc++-v3/ChangeLog:

* include/Makefile.am (bits_headers): Add new header.
* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h (__atomic_flag::wait): Define.
(__atomic_flag::notify_one): Likewise.
(__atomic_flag::notify_all): Likewise.
(__atomic_base<_Itp>::wait): Likewise.
(__atomic_base<_Itp>::notify_one): Likewise.
(__atomic_base<_Itp>::notify_all): Likewise.
(__atomic_base<_Ptp*>::wait): Likewise.
(__atomic_base<_Ptp*>::notify_one): Likewise.
(__atomic_base<_Ptp*>::notify_all): Likewise.
(__atomic_impl::wait): Likewise.
(__atomic_impl::notify_one): Likewise.
(__atomic_impl::notify_all): Likewise.
(__atomic_float<_Fp>::wait): Likewise.
(__atomic_float<_Fp>::notify_one): Likewise.
(__atomic_float<_Fp>::notify_all): Likewise.
(__atomic_ref<_Tp>::wait): Likewise.
(__atomic_ref<_Tp>::notify_one): Likewise.
(__atomic_ref<_Tp>::notify_all): Likewise.
(atomic_wait<_Tp>): Likewise.
(atomic_wait_explicit<_Tp>): Likewise.
(atomic_notify_one<_Tp>): Likewise.
(atomic_notify_all<_Tp>): Likewise.
* include/bits/atomic_wait.h: New file.
* include/bits/atomic_timed_wait.h: New file.
* include/bits/semaphore_base.h: New file.
* include/std/atomic (atomic::wait): Define.
(atomic::notify_one): Likewise.
(atomic::notify_all): Likewise.
(atomic<_Tp>::wait): Likewise.
(atomic<_Tp>::notify_one): Likewise.
(atomic<_Tp>::notify_all): Likewise.
(atomic<_Tp*>::wait): Likewise.
(atomic<_Tp*>::notify_one): Likewise.
(atomic<_Tp*>::notify_all): Likewise.
* include/std/latch: New file.
* include/std/semaphore: New file.
* include/std/version: Add __cpp_lib_semaphore and
__cpp_lib_latch defines.
* testsuite/29_atomics/atomic/wait_notify/atomic_refs.cc: New test.
* testsuite/29_atomics/atomic/wait_notify/bool.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/integrals.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/floats.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/generic.h: New file.
* testsuite/29_atomics/atomic_flag/wait_notify/1.cc: New test.
* testsuite/30_threads/semaphore/1.cc: New test.
* testsuite/30_threads/semaphore/2.cc: Likewise.
* testsuite/30_threads/semaphore/least_max_value_neg.cc: Likewise.
* testsuite/30_threads/semaphore/try_acquire.cc: Likewise.
* testsuite/30_threads/semaphore/try_acquire_for.cc: Likewise.
* testsuite/30_threads/semaphore/try_acquire_posix.cc: Likewise.
* testsuite/30_threads/semaphore/try_acquire_until.cc: Likewise.
* testsuite/30_threads/latch/1.cc: New test.
* testsuite/30_threads/latch/2.cc: New test.
* testsuite/30_threads/latch/3.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |   5 +
 libstdc++-v3/include/Makefile.in  |   5 +
 libstdc++-v3/include/bits/atomic_base.h   | 195 +++-
 libstdc++-v3/include/bits/atomic_timed_wait.h | 281 
 libstdc++-v3/include/bits/atomic_wait.h   | 301 ++
 libstdc++-v3/include/bits/semaphore_base.h| 283 
 libstdc++-v3/include/std/atomic   |  73 +
 libstdc++-v3/include/std/latch|  90 ++
 libstdc++-v3/include/std/semaphore|  92 ++
 libstdc++-v3/include/std/version  |   2 +
 .../atomic/wait_notify/atomic_refs.cc | 103 ++
 .../29_atomics/atomic/wait_notify/bool.cc |  59 
 .../29_atomics/atomic/wait_notify/floats.cc   |  32 ++
 .../29_atomics/atomic/wait_notify/generic.cc  |  31 ++
 .../29_atomics/atomic/wait_notify/generic.h   | 160 ++
 .../atomic/wait_notify/integrals.cc   |  65 
 .../29_atomics/atomic/wait_notify/pointers.cc |  59 
 .../29_atomics/atomic_flag/wait_notify/1.cc   |  61 
 libstdc++-v3/testsuite/30_threads/latch/1.cc  |  27 ++
 libstdc++-v3/testsuite/30_threads/latch/2.cc  |  27 ++
 libstdc++-v3/testsuite/30_threads/latch/3.cc  |  50 +++
 .../testsuite/30_threads/semaphore/1.cc   |  27 ++
 .../testsuite/30_threads/semaphore/2.cc   |  27 ++
 .../semaphore/least_max_value_neg.cc  |  30 ++
 .../30_threads/semaphore/try_acquire.cc   |  55 
 .../30_threads/semaphore/try_acquire_for.cc   |  85 +
 .../30_threads/semaphore/try_acquire_posix.cc | 153 +

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches




> On Oct 1, 2020, at 11:20 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> Hi, Richard,
>> 
>> To answer the question, which registers should be included in “ALL”. 
>> I studied the x86 hard register set in more detail, and also consulted with 
>> H.J. Lu, and found:
>> 
>> In the current x86 implementation, mask registers, MM0-MM7 registers, and 
>> ST0-ST7 registers are not zeroed.
>> 
>> The reasons actually are:
>> 1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the 
>> following place, reginfo.c, init_reg_sets_1)
>> 
>>  /* If a register is too limited to be treated as a register operand,
>> then it should never be allocated to a pseudo.  */
>>  if (!TEST_HARD_REG_BIT (operand_reg_set, i))
>>fixed_regs[i] = 1;
> 
> But isn't that only true when AVX512F is disabled?

You are right.

Yes, when AVX512F is present, mask registers are no longer fixed registers.

I just added the zeroing of mask registers into i386 implementation and also 
the testing case.

Thanks.

Qing
> 
> The question is more why the registers shouldn't be zeroed when
> they're available.
>

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches




> On Oct 2, 2020, at 10:15 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>>> 
>>> Going back to the default hook, I guess one option is:
>>> 
>>>  rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>>>  rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>>>  if (!valid_insn_p (insn))
>>>sorry (…);
>> 
>> “Sorry” here will tell the user that the implementation on this platform is 
>> not valid?
> 
> Right.  If we didn't have a default implementation of the target hook,
> we would presumably need to issue a sorry () if the user tried to use
> the option on a target that didn't define the hook.  The above is a
> compromise: we instead make a reasonable attempt to handle the option,
> and issue the sorry only if that attempt fails.

Sounds reasonable, I will do this.

thanks.

Qing
> 
> Thanks,
> Richard
> 
>> 
>> Qing
>>> 
>>> but with some mechanism to avoid spewing the user with messages
>>> for the same problem.
>>> 
>>> Thanks,
>>> Richard

[PATCH v4 2/2] void.3: New link to system_data_types(7)

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man3/void.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/void.3

diff --git a/man3/void.3 b/man3/void.3
new file mode 100644
index 0..db50c0f09
--- /dev/null
+++ b/man3/void.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Richard Sandiford via Gcc-patches

Qing Zhao  writes:
>> 
>> Going back to the default hook, I guess one option is:
>> 
>>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>>   if (!valid_insn_p (insn))
>> sorry (…);
>
> “Sorry” here will tell the user that the implementation on this platform is 
> not valid?

Right.  If we didn't have a default implementation of the target hook,
we would presumably need to issue a sorry () if the user tried to use
the option on a target that didn't define the hook.  The above is a
compromise: we instead make a reasonable attempt to handle the option,
and issue the sorry only if that attempt fails.

Thanks,
Richard

>
> Qing
>> 
>> but with some mechanism to avoid spewing the user with messages
>> for the same problem.
>> 
>> Thanks,
>> Richard
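
One way to read the "mechanism to avoid spewing the user with messages"
is a once-only flag.  The sketch below is standalone C rather than GCC
code: report_unsupported stands in for sorry (), and zero_one_reg,
even_regs_only and n_reports are made-up names used only for the
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for GCC's sorry (); counts the calls so the
   behaviour can be checked.  */
static int n_reports;

static void
report_unsupported (const char *what)
{
  n_reports++;
  fprintf (stderr, "sorry, unimplemented: %s\n", what);
}

/* Hypothetical validity test: pretend only even registers can be
   zeroed with a plain move of zero.  */
bool
even_regs_only (int regno)
{
  return regno % 2 == 0;
}

/* Zero register REGNO if VALID says the target can do it; otherwise
   complain, but only the first time the problem is seen.  Returns
   true if the register was handled.  */
bool
zero_one_reg (int regno, bool (*valid) (int))
{
  static bool issued_error;

  assert (valid != NULL);
  if (valid (regno))
    return true;

  if (!issued_error)
    {
      issued_error = true;
      report_unsupported ("zeroing call-used registers");
    }
  return false;
}
```

Here two failing registers produce a single message; a real
implementation would have to decide whether the flag is per function,
per translation unit, or per kind of failure.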

[PATCH v4 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add info about generic function parameters and 
return value

Reported-by: Paul Eggert 
Reported-by: David Laight 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add info about pointer arithmetic

Reported-by: Paul Eggert 
Reported-by: David Laight 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add Versions notes

Compatibility between function pointers and void * hasn't always been so.
Document when that was added to POSIX.

Reported-by: Michael Kerrisk 
Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 80 +++-
 1 file changed, 78 insertions(+), 2 deletions(-)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index c82d3b388..277e05b12 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -679,7 +679,6 @@ See also the
 .I uintptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"- lconv /
@@ -1780,7 +1779,6 @@ See also the
 .I intptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"- va_list --/
@@ -1814,6 +1812,84 @@ See also:
 .BR va_copy (3),
 .BR va_end (3)
 .RE
+.\"- void * ---/
+.TP
+.I void *
+.RS
+According to the C language standard,
+a pointer to any object type may be converted to a pointer to
+.I void
+and back.
+POSIX further requires that any pointer,
+including pointers to functions,
+may be converted to a pointer to
+.I void
+and back.
+.PP
+Conversions from and to any other pointer type are done implicitly,
+not requiring casts at all.
+Note that this feature prevents any kind of type checking:
+the programmer should be careful not to cast a
+.I void *
+value to a type incompatible to that of the underlying data,
+because that would result in undefined behavior.
+.PP
+This type is useful in function parameters and return value
+to allow passing values of any type.
+The function will usually use some mechanism to know
+of which type the underlying data passed to the function really is.
+.PP
+A value of this type can't be dereferenced,
+as it would give a value of type
+.I void
+which is not possible.
+Likewise, pointer arithmetic is not possible with this type.
+However, in GNU C, pointer arithmetic is allowed
+as an extension to the standard;
+this is done by treating the size of a
+.I void
+or of a function as 1.
+A consequence of this is that
+.I sizeof
+is also allowed on
+.I void
+and on function types, and returns 1.
+.PP
+The conversion specifier for
+.I void *
+for the
+.BR printf (3)
+and the
+.BR scanf (3)
+families of functions is
+.BR p ;
+resulting commonly in
+.B %p
+for printing
+.I void *
+values.
+.PP
+Versions:
+The POSIX requirement about compatibility between
+.I void *
+and function pointers was added in
+POSIX.1-2008 Technical Corrigendum 1 (2013).
+.PP
+Conforming to:
+C99 and later; POSIX.1-2001 and later.
+.PP
+See also:
+.BR malloc (3),
+.BR memcmp (3),
+.BR memcpy (3),
+.BR memset (3)
+.PP
+See also the
+.I intptr_t
+and
+.I uintptr_t
+types in this page.
+.RE
 .\"/
 .SH NOTES
 The structures described in this manual page shall contain,
-- 
2.28.0

[PATCH v4 0/2] Document 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Michael,

I'm sorry I forgot to increase the version count.
And given there are conversations continuing in old threads,
you may mix them easily.  I'm a bit lost in the emails too.
I'll resend the latest patch (identically) as v4
(there's no v3, but this is the 4th time or so, so let's call it v4).

Regards,

Alex

Alejandro Colomar (2):
  system_data_types.7: Add 'void *'
  void.3: New link to system_data_types(7)

 man3/void.3  |  1 +
 man7/system_data_types.7 | 80 +++-
 2 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 man3/void.3

-- 
2.28.0

RE: [PATCH][GCC 10] arm: Add support for Neoverse N2 CPU

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Alex Coplan 
> Sent: 02 October 2020 15:49
> To: gcc-patches@gcc.gnu.org
> Cc: ni...@redhat.com; Richard Earnshaw ;
> Ramana Radhakrishnan ; Kyrylo
> Tkachov 
> Subject: [PATCH][GCC 10] arm: Add support for Neoverse N2 CPU
> 
> This patch backports the AArch32 support for Arm's Neoverse N2 CPU to
> GCC 10.
> 
> Testing:
>  * Bootstrapped and regtested on arm-none-linux-gnueabihf.
> 
> OK for GCC 10 branch?

Ok, as well as the other branches.
Thanks,
Kyrill

> 
> Thanks,
> Alex
> 
> ---
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm-cpus.in (neoverse-n2): New.
>   * config/arm/arm-tables.opt: Regenerate.
>   * config/arm/arm-tune.md: Regenerate.
>   * doc/invoke.texi: Document support for Neoverse N2.

[PATCH][GCC 8] arm: Add support for Neoverse N2 CPU

2020-10-02 Thread Alex Coplan via Gcc-patches

This patch backports the AArch32 support for Arm's Neoverse N2 CPU to
GCC 8.

Testing:
 * Bootstrapped and regtested on arm-none-linux-gnueabihf.

OK for GCC 8 branch?

Thanks,
Alex

---

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-n2): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* config/arm/driver-arm.c (arm_cpu_table): Add Neoverse N2.
* doc/invoke.texi: Document support for Neoverse N2.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index edfe5b378da..39a9e8b76ba 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1588,6 +1588,17 @@ begin cpu neoverse-v1
 end cpu neoverse-v1
 
 
+# Armv8.5 A-profile Architecture Processors
+begin cpu neoverse-n2
+  cname neoversen2
+  tune for cortex-a57
+  tune flags LDSCHED
+  architecture armv8.4-a+fp16
+  option crypto add FP_ARMv8 CRYPTO
+  costs cortex_a57
+end cpu neoverse-n2
+
+
 # V8 M-profile implementations.
 begin cpu cortex-m23
  cname cortexm23
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 36dba62003a..a0fb8323ee3 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -354,6 +354,9 @@ Enum(processor_type) String(cortex-a75.cortex-a55) Value( 
TARGET_CPU_cortexa75co
 EnumValue
 Enum(processor_type) String(neoverse-v1) Value( TARGET_CPU_neoversev1)
 
+EnumValue
+Enum(processor_type) String(neoverse-n2) Value( TARGET_CPU_neoversen2)
+
 EnumValue
 Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index c972ce55576..ea3dbcda43f 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -57,6 +57,6 @@
cortexa73,exynosm1,xgene1,
cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
cortexa73cortexa53,cortexa55,cortexa75,
-   cortexa75cortexa55,neoversev1,cortexm23,
-   cortexm33,cortexr52"
+   cortexa75cortexa55,neoversev1,neoversen2,
+   cortexm23,cortexm33,cortexr52"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/driver-arm.c b/gcc/config/arm/driver-arm.c
index a53c2272864..45ad92ef0e0 100644
--- a/gcc/config/arm/driver-arm.c
+++ b/gcc/config/arm/driver-arm.c
@@ -56,6 +56,7 @@ static struct vendor_cpu arm_cpu_table[] = {
 {"0xd09", "armv8-a+crc", "cortex-a73"},
 {"0xd05", "armv8.2-a+fp16+dotprod", "cortex-a55"},
 {"0xd0a", "armv8.2-a+fp16+dotprod", "cortex-a75"},
+{"0xd49", "armv8.4-a+fp16", "neoverse-n2"},
 {"0xc14", "armv7-r", "cortex-r4"},
 {"0xc15", "armv7-r", "cortex-r5"},
 {"0xc17", "armv7-r", "cortex-r7"},
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b91366daafd..78ca7738df2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16334,8 +16334,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{cortex-a9}, @samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17},
 @samp{cortex-a32}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
-@samp{neoverse-v1}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
-@samp{cortex-r7}, @samp{cortex-r8}, @samp{cortex-r52},
+@samp{neoverse-v1}, @samp{neoverse-n2}, @samp{cortex-r4}, @samp{cortex-r4f},
+@samp{cortex-r5}, @samp{cortex-r7}, @samp{cortex-r8}, @samp{cortex-r52},
 @samp{cortex-m33},
 @samp{cortex-m23},
 @samp{cortex-m7},

[PATCH][GCC 9] arm: Add support for Neoverse N2 CPU

2020-10-02 Thread Alex Coplan via Gcc-patches

This patch backports the AArch32 support for Arm's Neoverse N2 CPU to
GCC 9.

Testing:
 * Bootstrapped and regtested on arm-none-linux-gnueabihf.

OK for GCC 9 branch?

Thanks,
Alex

---

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-n2): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse N2.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 747767ab386..3c375b9a7b9 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1372,6 +1372,18 @@ begin cpu neoverse-v1
   costs cortex_a57
 end cpu neoverse-v1
 
+# Armv8.5 A-profile Architecture Processors
+begin cpu neoverse-n2
+  cname neoversen2
+  tune for cortex-a57
+  tune flags LDSCHED
+  architecture armv8.5-a+fp16
+  option crypto add FP_ARMv8 CRYPTO
+  costs cortex_a57
+  vendor 41
+  part 0xd49
+end cpu neoverse-n2
+
 # V8 M-profile implementations.
 begin cpu cortex-m23
  cname cortexm23
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 5384284b53a..5befadddf9e 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -246,6 +246,9 @@ Enum(processor_type) String(cortex-a76.cortex-a55) Value( 
TARGET_CPU_cortexa76co
 EnumValue
 Enum(processor_type) String(neoverse-v1) Value( TARGET_CPU_neoversev1)
 
+EnumValue
+Enum(processor_type) String(neoverse-n2) Value( TARGET_CPU_neoversen2)
+
 EnumValue
 Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 1257daff074..102765e6568 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -45,6 +45,6 @@
cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,
cortexa73cortexa53,cortexa55,cortexa75,
cortexa76,neoversen1,cortexa75cortexa55,
-   cortexa76cortexa55,neoversev1,cortexm23,
-   cortexm33,cortexr52"
+   cortexa76cortexa55,neoversev1,neoversen2,
+   cortexm23,cortexm33,cortexr52"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e4cc83ba5cb..2eb2c82dc47 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17570,9 +17570,9 @@ Permissible names are: @samp{arm7tdmi}, 
@samp{arm7tdmi-s}, @samp{arm710t},
 @samp{cortex-m4}, @samp{cortex-m7}, @samp{cortex-m23}, @samp{cortex-m33},
 @samp{cortex-m1.small-multiply}, @samp{cortex-m0.small-multiply},
 @samp{cortex-m0plus.small-multiply}, @samp{exynos-m1}, @samp{marvell-pj4},
-@samp{neoverse-n1}, @samp{neoverse-v1}, @samp{xscale}, @samp{iwmmxt},
-@samp{iwmmxt2}, @samp{ep9312}, @samp{fa526}, @samp{fa626}, @samp{fa606te},
-@samp{fa626te}, @samp{fmp626}, @samp{fa726te}, @samp{xgene1}.
+@samp{neoverse-n1}, @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{xscale},
+@samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}, @samp{fa526}, @samp{fa626},
+@samp{fa606te}, @samp{fa626te}, @samp{fmp626}, @samp{fa726te}, @samp{xgene1}.
 
 Additionally, this option can specify that GCC should tune the performance
 of the code for a big.LITTLE system.  Permissible names are:

[PATCH][GCC 10] arm: Add support for Neoverse N2 CPU

2020-10-02 Thread Alex Coplan via Gcc-patches

This patch backports the AArch32 support for Arm's Neoverse N2 CPU to
GCC 10.

Testing:
 * Bootstrapped and regtested on arm-none-linux-gnueabihf.

OK for GCC 10 branch?

Thanks,
Alex

---

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-n2): New.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Document support for Neoverse N2.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index b1fe48eb087..ca772bdcf6d 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1488,6 +1488,18 @@ begin cpu neoverse-v1
   costs cortex_a57
 end cpu neoverse-v1
 
+# Armv8.5 A-profile Architecture Processors
+begin cpu neoverse-n2
+  cname neoversen2
+  tune for cortex-a57
+  tune flags LDSCHED
+  architecture armv8.5-a+fp16+bf16+i8mm
+  option crypto add FP_ARMv8 CRYPTO
+  costs cortex_a57
+  vendor 41
+  part 0xd49
+end cpu neoverse-n2
+
 # V8 M-profile implementations.
 begin cpu cortex-m23
  cname cortexm23
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 1a7c3191784..c8f83b03b6f 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -252,6 +252,9 @@ Enum(processor_type) String(cortex-a76.cortex-a55) Value( 
TARGET_CPU_cortexa76co
 EnumValue
 Enum(processor_type) String(neoverse-v1) Value( TARGET_CPU_neoversev1)
 
+EnumValue
+Enum(processor_type) String(neoverse-n2) Value( TARGET_CPU_neoversen2)
+
 EnumValue
 Enum(processor_type) String(cortex-m23) Value( TARGET_CPU_cortexm23)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 3874f42a26b..f98f7ca9ae5 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -46,6 +46,7 @@ (define_attr "tune"
cortexa73cortexa53,cortexa55,cortexa75,
cortexa76,cortexa76ae,cortexa77,
neoversen1,cortexa75cortexa55,cortexa76cortexa55,
-   neoversev1,cortexm23,cortexm33,
-   cortexm35p,cortexm55,cortexr52"
+   neoversev1,neoversen2,cortexm23,
+   cortexm33,cortexm35p,cortexm55,
+   cortexr52"
(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4c08258bf57..1d924085b02 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18824,9 +18824,9 @@ Permissible names are: @samp{arm7tdmi}, 
@samp{arm7tdmi-s}, @samp{arm710t},
 @samp{cortex-m35p}, @samp{cortex-m55},
 @samp{cortex-m1.small-multiply}, @samp{cortex-m0.small-multiply},
 @samp{cortex-m0plus.small-multiply}, @samp{exynos-m1}, @samp{marvell-pj4},
-@samp{neoverse-n1} @samp{neoverse-v1}, @samp{xscale}, @samp{iwmmxt},
-@samp{iwmmxt2}, @samp{ep9312}, @samp{fa526}, @samp{fa626}, @samp{fa606te},
-@samp{fa626te}, @samp{fmp626}, @samp{fa726te}, @samp{xgene1}.
+@samp{neoverse-n1}, @samp{neoverse-n2}, @samp{neoverse-v1}, @samp{xscale},
+@samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}, @samp{fa526}, @samp{fa626},
+@samp{fa606te}, @samp{fa626te}, @samp{fmp626}, @samp{fa726te}, @samp{xgene1}.
 
 Additionally, this option can specify that GCC should tune the performance
 of the code for a big.LITTLE system.  Permissible names are:

RE: [PATCH] arm: Add missing part number for Neoverse V1

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Alex Coplan 
> Sent: 02 October 2020 15:41
> To: gcc-patches@gcc.gnu.org
> Cc: ni...@redhat.com; Richard Earnshaw ;
> Ramana Radhakrishnan ; Kyrylo
> Tkachov 
> Subject: [PATCH] arm: Add missing part number for Neoverse V1
> 
> This patch adds vendor and part numbers which were missing from the
> initial entry for Neoverse V1 in AArch32 GCC.
> 
> OK for trunk and backports to GCC 10 and 9?

Ok.
Thanks,
Kyrill

> 
> I believe GCC 8 handles these differently so that will need fixing
> separately.
> 
> Thanks,
> Alex
> 
> ---
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm-cpus.in (neoverse-v1): Add missing vendor and
>   part numbers.

[PATCH] arm: Add missing part number for Neoverse V1

2020-10-02 Thread Alex Coplan via Gcc-patches

This patch adds vendor and part numbers which were missing from the
initial entry for Neoverse V1 in AArch32 GCC.

OK for trunk and backports to GCC 10 and 9?

I believe GCC 8 handles these differently so that will need fixing
separately.
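
For readers unfamiliar with what the vendor and part numbers are matched against: they correspond to the implementer and primary part number fields of the Arm Main ID Register (MIDR). Below is a minimal sketch of that decoding, not the driver code itself. The helper names and the sample register value in the comment are made up for illustration, and I am assuming that "vendor 41" in arm-cpus.in denotes the hexadecimal implementer code 0x41.

```c
#include <stdint.h>

/* Implementer code: bits [31:24] of the Main ID Register.  */
static inline unsigned
midr_implementer (uint32_t midr)
{
  return (midr >> 24) & 0xffu;
}

/* Primary part number: bits [15:4] of the Main ID Register.  */
static inline unsigned
midr_part (uint32_t midr)
{
  return (midr >> 4) & 0xfffu;
}

/* Hypothetical helper: does this MIDR value match the Neoverse V1 entry
   (vendor 41, part 0xd40) added by the patch?  A sample value such as
   0x410fd400 would match.  */
static inline int
is_neoverse_v1 (uint32_t midr)
{
  return midr_implementer (midr) == 0x41 && midr_part (midr) == 0xd40;
}
```

Without the vendor/part entry there is nothing for such a comparison to match, which is why the entry needs the two extra lines.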

Thanks,
Alex

---

gcc/ChangeLog:

* config/arm/arm-cpus.in (neoverse-v1): Add missing vendor and
part numbers.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 9abb59a00ba..27ce0001633 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1519,6 +1519,8 @@ begin cpu neoverse-v1
   architecture armv8.4-a+fp16+bf16+i8mm
   option crypto add FP_ARMv8 CRYPTO
   costs cortex_a57
+  vendor 41
+  part 0xd40
 end cpu neoverse-v1
 
 # Armv8.5 A-profile Architecture Processors

[PATCH][GCC 8] AArch64: Add Neoverse V1 tuning struct

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches

Hi all,

This GCC 8 patch duplicates the Cortex-A72 tuning struct that's currently used 
for Neoverse V1 and adds the AARCH64_EXTRA_TUNE_PREFER_ADVSIMD_AUTOVEC tune flag 
to prefer Advanced SIMD over SVE autovectorisation.

Bootstrapped and tested on GCC 8, pushing to the branch.

Thanks,
Kyrill

gcc/
* config/aarch64/aarch64.c (neoversev1_tunings): Define.
* config/aarch64/aarch64-cores.def (zeus): Use it.
(neoverse-v1): Likewise.


v1-tune-8.patch
Description: v1-tune-8.patch

[PATCH][GCC 9] AArch64: Add Neoverse V1 tuning struct

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches

Hi all,

This GCC 9 patch duplicates the Neoverse N1 tuning struct that's currently used 
for Neoverse V1 and adds the AARCH64_EXTRA_TUNE_PREFER_ADVSIMD_AUTOVEC tune flag 
to prefer Advanced SIMD over SVE autovectorisation.

Bootstrapped and tested on GCC 9, pushing to the branch.

Thanks,
Kyrill

gcc/
* config/aarch64/aarch64.c (neoversev1_tunings): Define.
* config/aarch64/aarch64-cores.def (zeus): Use it.
(neoverse-v1): Likewise.


v1-tune-9.patch
Description: v1-tune-9.patch

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches




> 
> Going back to the default hook, I guess one option is:
> 
>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>   if (!valid_insn_p (insn))
> sorry (…);

Would “sorry” here tell the user that this is not implemented on this 
platform?

Qing
> 
> but with some mechanism to avoid spewing the user with messages
> for the same problem.
> 
> Thanks,
> Richard

[PATCH] AArch64: Add neoversev1_tunings struct

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches

Hi all,

This patch adds a Neoverse V1-specific tuning struct that is currently just a 
duplicate of the N1 struct it was using before, with the SVE width specified.
This will allow us to tweak Neoverse V1 tuning in the future as needed.

Bootstrapped and tested on aarch64-none-linux-gnu.
Committing to trunk and will backport similar patches to the branches.

Thanks,
Kyrill

gcc/
* config/aarch64/aarch64.c (neoversev1_tunings): Define.
* config/aarch64/aarch64-cores.def (zeus): Use it.
(neoverse-v1): Likewise.


v1-tune.patch
Description: v1-tune.patch

Re: [PATCH] options: Save and restore opts_set for Optimization and Target options

2020-10-02 Thread Stefan Schulze Frielinghaus via Gcc-patches

On Fri, Oct 02, 2020 at 10:46:33AM +0200, Jakub Jelinek wrote:
> On Wed, Sep 30, 2020 at 03:24:08PM +0200, Stefan Schulze Frielinghaus via 
> Gcc-patches wrote:
> > On Wed, Sep 30, 2020 at 01:39:11PM +0200, Jakub Jelinek wrote:
> > > On Wed, Sep 30, 2020 at 01:21:44PM +0200, Stefan Schulze Frielinghaus 
> > > wrote:
> > > > I think the problem boils down to the fact that on S/390 we distinguish
> > > > between four states of a flag: explicitly set to yes/no and implicitly
> > > > set to yes/no.  If set explicitly, the option wins.  For example, the options
> > > > `-march=z10 -mhtm` should enable the hardware transactional memory
> > > > option although z10 does not have one.  In the past, whether a flag was set
> > > > explicitly or not was encoded into opts_set->x_target_flags ... for each
> > > > flag individually, e.g. TARGET_OPT_HTM_P (opts_set->x_target_flags) was
> > > 
> > > Oops, seems I've missed that set_option has special treatment for
> > > CLVC_BIT_CLEAR/CLVC_BIT_SET.
> > > Which means I'll need to change the generic handling, so that for
> > > global_options_set elements mentioned in CLVC_BIT_* options are treated
> > > differently, instead of using the accumulated bitmasks they'll need to use
> > > their specific bitmask variables during the option saving/restoring.
> > > Is it ok if I defer it for tomorrow? Need to prepare for OpenMP meeting 
> > > now.
> > 
> > Sure, no problem at all.  In that case I'll stop investigating further and
> > wait for you.
> 
> Here is a patch that implements that.
> 
> Can you please check if it fixes the s390x regressions that I couldn't
> reproduce in a cross?

Bootstrapped and regtested on S/390. Now all tattr-*.c test cases run
successfully with the patch. All other tests remain the same.
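
For anyone following along, the "explicit wins over implicit" rule quoted above can be sketched roughly as below. This is only an illustration under my own assumptions: `struct target_opts`, `OPT_HTM`, `set_explicit` and `apply_arch_defaults` are hypothetical names, not what opth-gen.awk generates.

```c
#include <stdbool.h>

/* Hypothetical per-target option state: one word of flag values plus a
   same-typed word recording which of those flags the user set explicitly.  */
struct target_opts
{
  unsigned flags;
  unsigned explicit_mask;
};

#define OPT_HTM 0x1u  /* hypothetical bit for -mhtm */

/* Record an option given on the command line (e.g. -mhtm or -mno-htm).  */
void
set_explicit (struct target_opts *o, unsigned bit, bool on)
{
  if (on)
    o->flags |= bit;
  else
    o->flags &= ~bit;
  o->explicit_mask |= bit;
}

/* Apply the defaults implied by -march, but only to the bits the user
   did not set explicitly.  ARCH_MASK says which bits the architecture
   has an opinion on; ARCH_FLAGS gives the values for those bits.  */
void
apply_arch_defaults (struct target_opts *o, unsigned arch_flags,
                     unsigned arch_mask)
{
  unsigned implicit = arch_mask & ~o->explicit_mask;
  o->flags = (o->flags & ~implicit) | (arch_flags & implicit);
}
```

With this shape, `-march=z10 -mhtm` keeps the HTM bit set even though the z10 defaults would clear it, which is the behaviour the tattr tests depend on as far as I understand it.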

Thanks for the quick follow up!

Cheers,
Stefan

> 
> Bootstrapped/regtested on x86_64-linux and i686-linux so far.
> I don't have a convenient way to test it on the trunk on other
> architectures ATM, so I've just started testing a backport of the patchset
> to 10 on {x86_64,i686,powerpc64le,s390x,armv7hl,aarch64}-linux (though, don't
> intend to actually commit the backport).
> 
> 2020-10-02  Jakub Jelinek  
> 
>   * opth-gen.awk: For variables referenced in Mask and InverseMask,
>   don't use the explicit_mask bitmask array, but add separate
>   explicit_mask_* members with the same types as the variables.
>   * optc-save-gen.awk: Save, restore, compare and hash the separate
>   explicit_mask_* members.
> 
> --- gcc/opth-gen.awk.jj   2020-09-14 09:04:35.866854351 +0200
> +++ gcc/opth-gen.awk  2020-10-01 21:52:30.855122749 +0200
> @@ -209,6 +209,7 @@ n_target_int = 0;
>  n_target_enum = 0;
>  n_target_other = 0;
>  n_target_explicit = n_extra_target_vars;
> +n_target_explicit_mask = 0;
>  
>  for (i = 0; i < n_target_save; i++) {
>   if (target_save_decl[i] ~ "^((un)?signed +)?int +[_" alnum "]+$")
> @@ -240,6 +241,12 @@ if (have_save) {
>   var_save_seen[name]++;
>   n_target_explicit++;
>   otype = var_type_struct(flags[i])
> +
> + if (opt_args("Mask", flags[i]) != "" \
> + || opt_args("InverseMask", flags[i]))
> + 
> var_target_explicit_mask[n_target_explicit_mask++] \
> + = otype "explicit_mask_" name;
> +
>   if (otype ~ "^((un)?signed +)?int *$")
>   var_target_int[n_target_int++] = otype "x_" 
> name;
>  
> @@ -259,6 +266,8 @@ if (have_save) {
>  } else {
>   var_target_int[n_target_int++] = "int x_target_flags";
>   n_target_explicit++;
> + var_target_explicit_mask[n_target_explicit_mask++] \
> + = "int explicit_mask_target_flags";
>  }
>  
>  for (i = 0; i < n_target_other; i++) {
> @@ -281,8 +290,12 @@ for (i = 0; i < n_target_char; i++) {
>   print "  " var_target_char[i] ";";
>  }
>  
> -print "  /* " n_target_explicit " members */";
> -print "  unsigned HOST_WIDE_INT explicit_mask[" int ((n_target_explicit + 
> 63) / 64) "];";
> +print "  /* " n_target_explicit - n_target_explicit_mask " members */";
> +print "  unsigned HOST_WIDE_INT explicit_mask[" int ((n_target_explicit - 
> n_target_explicit_mask + 63) / 64) "];";
> +
> +for (i = 0; i < n_target_explicit_mask; i++) {
> + print "  " var_target_explicit_mask[i] ";";
> +}
>  
>  print "};";
>  print "";
> --- gcc/optc-save-gen.awk.jj  2020-09-16 10:06:23.018093486 +0200
> +++ gcc/optc-save-gen.awk 2020-10-01 21:48:10.933868862 +0200
> @@ -516,6 +516,10 @@ if (have_save) {
>  
>   var_save_seen[name]++;
>   otype = var_type_struct(flags[i])
> + if (opt_args("Mask", flags[i]) != "" \
> + || opt_args("InverseMask", flags[i]))
> + var_target_explicit_mask[name] = 1;
> +
>   if (otype ~ "^((un)?signed +)?int *$")
>

Re: [PATCH] Add if-chain to switch conversion pass.

2020-10-02 Thread Andrew MacLeod via Gcc-patches


On 10/2/20 9:26 AM, Martin Liška wrote:
>> Yes, you simply get all sorts of conditions that hold when a condition is
>> true, not just those based on the SSA name you put in.  But it occurred
>> to me that the use-case is somewhat different - for switch-conversion
>> you want to know whether the test _exactly_ matches a range test,
>> the VRP worker will not tell you that.  For example if you had
>> if (x &&  a > 3 && a < 7) then it will give you 'a in [4, 6]' and it might
>> not give you 'x in [1, 1]' (for example if x is float).  But that's required
>> for correctness.
>
> Hello.
>
> Adding Ranger guys. Is it something that can be handled by the upcoming
> changes in VRP?


Presumably. It depends on exactly how the code lays out.  We don't 
process floats, so we won't know anything about the float (at least in this 
release :-).  We will sort through complex logicals and tell you what we 
do know, so if x is integral



   if (x &&  a > 3 && a < 7)

will give you, on the final true edge:

x_5(D)  int [-INF, -1][1, +INF]
a_6(D)  int [4, 6]


If x is a float, then we won't give you anything for x, obviously, but on 
the eventual true edge we'd still give you

a_6(D)  int [4, 6]
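
As a sanity check on those ranges for the integral case, here is a small brute-force sketch in plain C. It has nothing to do with the Ranger API; `guard`, `in_reported_ranges` and `count_range_mismatches` are names made up for this example. It compares the guard against "x != 0 and a in [4, 6]" over a window of values and counts disagreements.

```c
#include <stdbool.h>

/* The guard from the example above, with x and a both integral.  */
static bool
guard (int x, int a)
{
  return x && a > 3 && a < 7;
}

/* The ranges reported on the final true edge:
   x in [-INF, -1][1, +INF] (i.e. x != 0) and a in [4, 6].  */
static bool
in_reported_ranges (int x, int a)
{
  return x != 0 && a >= 4 && a <= 6;
}

/* Compare the two predicates for every (x, a) pair in [lo, hi] x [lo, hi]
   and return how many pairs disagree.  */
int
count_range_mismatches (int lo, int hi)
{
  int mismatches = 0;
  for (int x = lo; x <= hi; x++)
    for (int a = lo; a <= hi; a++)
      if (guard (x, a) != in_reported_ranges (x, a))
        mismatches++;
  return mismatches;
}
```

For integral x and a, the two predicates should agree on every pair, so the ranges on the true edge describe the guard exactly in this case. With a float x there would be no range for x at all, and the range for a alone would not be enough to show the exact match that switch conversion needs.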


Andrew

RE: [PATCH v2][GCC] arm: Add +nomve and +nomve.fp options to -mcpu=cortex-m55

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches

Hi Joe,

> -----Original Message-----
> From: Gcc-patches  On Behalf Of Joe
> Ramsay
> Sent: 19 August 2020 17:12
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v2][GCC] arm: Add +nomve and +nomve.fp options to
> -mcpu=cortex-m55
> 
> From: Joe Ramsay 
> 
> Hi all,
> 
> This patch rearranges feature bits for MVE and FP to implement the
> following flags for -mcpu=cortex-m55.
> 
>   - +nomve:    equivalent to armv8.1-m.main+fp.dp+dsp.
>   - +nomve.fp: equivalent to armv8.1-m.main+mve+fp.dp (+dsp is implied by
>                +mve).
>   - +nofp:     equivalent to armv8.1-m.main+mve (+dsp is implied by +mve).
>   - +nodsp:    equivalent to armv8.1-m.main+fp.dp.
> 
> Combinations of the above:
> 
>   - +nomve+nofp: equivalent to armv8.1-m.main+dsp.
>   - +nodsp+nofp: equivalent to armv8.1-m.main.
> 
> Due to MVE and FP sharing vfp_base, some new syntax was required in the CPU
> description to implement the concept of 'implied bits'. These are non-named
> features added to the ISA late, depending on whether one or more features
> which depend on them are present. This means vfp_base can be present when
> only one of MVE and FP is removed, but absent when both are removed.
> 
> Bootstrapped and tested on arm-none-eabi. OK for master?

Ok, since Richard reviewed the code and his requested extra testing looks ok to 
me.
Thanks,
Kyrill
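
For reference, the 'implied bits' idea from the quoted description can be sketched as follows. This is only a rough model under my own assumptions: the `FEAT_*` constants and `add_implied_bits` are hypothetical, and the real ISA representation in arm.c and parsecpu.awk is different.

```c
/* Hypothetical feature bits; the real ISA bit layout differs.  */
#define FEAT_FP        (1u << 0)
#define FEAT_MVE       (1u << 1)
#define FEAT_DSP       (1u << 2)
#define FEAT_VFP_BASE  (1u << 3)  /* implied, never named by the user */

/* Add implied bits late, after all +ext/+noext options have been applied:
   vfp_base is present iff at least one feature depending on it (FP or MVE
   in this sketch) is still present.  */
unsigned
add_implied_bits (unsigned isa)
{
  isa &= ~FEAT_VFP_BASE;
  if (isa & (FEAT_FP | FEAT_MVE))
    isa |= FEAT_VFP_BASE;
  return isa;
}
```

So removing only one of MVE and FP keeps vfp_base, while +nomve+nofp drops it, matching the behaviour described in the patch.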

> 
> Thanks,
> Joe
> 
> gcc/ChangeLog:
> 
> 2020-07-31  Joe Ramsay  
> 
>   * config/arm/arm-cpus.in:
>   (ALL_FPU_INTERNAL): Remove vfp_base.
>   (VFPv2): Remove vfp_base.
>   (MVE): Remove vfp_base.
>   (vfp_base): Redefine as implied bit dependent on MVE or FP
>   (cortex-m55): Add flags to disable MVE, MVE FP, FP and DSP
> extensions.
>   * config/arm/arm.c (arm_configure_build_target): Add implied bits
> to ISA.
>   * config/arm/parsecpu.awk:
>   (gen_isa): Print implied bits and their dependencies to ISA header.
>   (gen_data): Add parsing for implied feature bits.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-07-31  Joe Ramsay  
> 
>   * gcc.target/arm/multilib.exp: Add tests for -mcpu=cortex-m55.
>   * gcc.target/arm/cortex-m55-nodsp-flag.c: New test.
>   * gcc.target/arm/cortex-m55-nodsp-nofp-flag.c: New test.
>   * gcc.target/arm/cortex-m55-nofp-flag.c: New test.
>   * gcc.target/arm/cortex-m55-nofp-nomve-flag.c: New test.
>   * gcc.target/arm/cortex-m55-nomve-flag.c: New test.
>   * gcc.target/arm/cortex-m55-nomve.fp-flag.c: New test.
> ---
>  gcc/config/arm/arm-cpus.in | 26 ---
>  gcc/config/arm/arm.c   | 14 ++
>  gcc/config/arm/parsecpu.awk| 51 
> ++
>  .../gcc.target/arm/cortex-m55-nodsp-flag-hard.c| 15 +++
>  .../gcc.target/arm/cortex-m55-nodsp-flag-softfp.c  | 15 +++
>  .../arm/cortex-m55-nodsp-nofp-flag-softfp.c| 15 +++
>  .../gcc.target/arm/cortex-m55-nofp-flag-hard.c | 15 +++
>  .../gcc.target/arm/cortex-m55-nofp-flag-softfp.c   | 15 +++
>  .../arm/cortex-m55-nofp-nomve-flag-softfp.c| 15 +++
>  .../gcc.target/arm/cortex-m55-nomve-flag-hard.c| 15 +++
>  .../gcc.target/arm/cortex-m55-nomve-flag-softfp.c  | 15 +++
>  .../gcc.target/arm/cortex-m55-nomve.fp-flag-hard.c | 15 +++
>  .../arm/cortex-m55-nomve.fp-flag-softfp.c  | 15 +++
>  gcc/testsuite/gcc.target/arm/multilib.exp  | 16 +++
>  14 files changed, 250 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nodsp-flag-hard.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nodsp-flag-softfp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nodsp-nofp-flag-softfp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nofp-flag-hard.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nofp-flag-softfp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nofp-nomve-flag-softfp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nomve-flag-hard.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nomve-flag-softfp.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nomve.fp-flag-hard.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/cortex-m55-nomve.fp-flag-softfp.c
> 
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index c98f8ed..5083028 100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -135,10 +135,6 @@ define feature armv8_1m_main
>  # Floating point and Neon extensions.
>  # VFPv1 is not supported in GCC.
> 
> -# This feature bit is enabled for all VFP, MVE and
> -# MVE with floating point extensions.
> -define feature vfp_base
> -
>  # Vector floating point v2.
>  define feature vfpv2
> 
> @@ -251,7 +247,7 @@ define fgroup ALL_SIMD
>   ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
> 
>  # List of all FPU bits to strip out if -mfpu is used to override

RE: [PATCH] arm: Remove coercion from scalar argument to vmin & vmax intrinsics

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches

Hi Joe,

> -----Original Message-----
> From: Gcc-patches  On Behalf Of Joe
> Ramsay
> Sent: 13 August 2020 14:16
> To: Joe Ramsay ; gcc-patches@gcc.gnu.org
> Subject: [PATCH] arm: Remove coercion from scalar argument to vmin &
> vmax intrinsics
> 
> From: Joe Ramsay 
> 
> Hi,
> 
> This patch fixes an issue with vmin* and vmax* intrinsics which accept
> a scalar argument. Previously when the scalar was of different width
> to the vector elements this would generate __ARM_undef. This change
> allows the scalar argument to be implicitly converted to the correct
> width. Also tidied up the relevant unit tests, some of which would
> have passed even if only one of two or three intrinsic calls had
> compiled correctly.
> 
> Bootstrapped and tested on arm-none-eabi, gcc and CMSIS_DSP
> testsuites are clean. OK for trunk?

Ok.
Sorry for the delay,
Kyrill

> 
> Thanks,
> Joe
> 
> gcc/ChangeLog:
> 
> 2020-08-10  Joe Ramsay 
> 
>   * config/arm/arm_mve.h (__arm_vmaxnmavq): Remove coercion of
> scalar
>   argument.
>   (__arm_vmaxnmvq): Likewise.
>   (__arm_vminnmavq): Likewise.
>   (__arm_vminnmvq): Likewise.
>   (__arm_vmaxnmavq_p): Likewise.
>   (__arm_vmaxnmvq_p): Likewise (and delete duplicate definition).
>   (__arm_vminnmavq_p): Likewise.
>   (__arm_vminnmvq_p): Likewise.
>   (__arm_vmaxavq): Likewise.
>   (__arm_vmaxavq_p): Likewise.
>   (__arm_vmaxvq): Likewise.
>   (__arm_vmaxvq_p): Likewise.
>   (__arm_vminavq): Likewise.
>   (__arm_vminavq_p): Likewise.
>   (__arm_vminvq): Likewise.
>   (__arm_vminvq_p): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-08-10  Joe Ramsay 
> 
>   * gcc.target/arm/mve/intrinsics/vmaxavq_p_s16.c: Add test for
> mismatched
>   width of scalar argument.
>   * gcc.target/arm/mve/intrinsics/vmaxavq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxavq_p_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxavq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxavq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxavq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmavq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmavq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmvq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmvq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_p_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_p_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_p_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_p_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_p_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vmaxvq_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminavq_p_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminavq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminavq_p_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminavq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminavq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminavq_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmavq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmavq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmavq_p_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmavq_p_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmvq_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmvq_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmvq_p_f16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminnmvq_p_f32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_p_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_p_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_p_s8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_p_u16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_p_u32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_p_u8.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_s16.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_s32.c: Likewise.
>   * gcc.target/arm/mve/intrinsics/vminvq_s8.c: Likewise.
>   *

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches




> On Oct 1, 2020, at 11:20 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> Hi, Richard,
>> 
>> To answer the question, which registers should be included in “ALL”. 
>> I studied X86 hard register set in more details. And also consulted with 
>> H.J.Lu, And found:
>> 
>> In the current x86 implementation, mask registers, MM0-MM7 registers, and 
>> ST0-ST7 registers are not zeroed.
>> 
>> The reasons actually are:
>> 1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the 
>> following place, reginfo.c, init_reg_sets_1)
>> 
>>  /* If a register is too limited to be treated as a register operand,
>> then it should never be allocated to a pseudo.  */
>>  if (!TEST_HARD_REG_BIT (operand_reg_set, i))
>>fixed_regs[i] = 1;
> 
> But isn't that only true when AVX512F is disabled?
> 
> The question is more why the registers shouldn't be zeroed when
> they're available.

If they are not treated as fixed, then we should zero them.

> 
>> 2. MM0-MM7 registers and ST0-ST7 registers are aliased with each other, 
>> (i.e, the same set of registers have two different
>> Names), so, zero individual mm or st register will be very impractical. 
>> However, we can zero them together with emms. 
> 
> Ah, OK.
> 
>> So, my conclusion is, 
>> 
>> 1. For “ALL”, we should include all call_used_regs that are not fixed_regs. 
>> 2. For X86 implementation, I added more comments, and also add clearing all 
>> mm and st registers with emms.
>> 
>> In general, “ALL” should include all call_used_regs that are not fixed_regs. 
> 
> Right.  I thought the original implementation already excluded fixed
> registers, but perhaps I'm misremembering.

Yes, my current implementation in middle end already excluded fixed registers.
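
To make the rule concrete, here is a tiny sketch using plain bitmasks rather than the real HARD_REG_SET machinery. The 64-register limit and the name `regs_to_zero_for_all` are assumptions made only for this example.

```c
#include <stdint.h>

/* Bit i set means hard register i is in the set.  This sketch assumes a
   hypothetical target with at most 64 hard registers.  */
uint64_t
regs_to_zero_for_all (uint64_t call_used, uint64_t fixed)
{
  /* "ALL" = every call-used register that is not a fixed register.  */
  return call_used & ~fixed;
}
```

A register that is call-used but marked fixed (as the mask registers are when they fail the operand_reg_set test) simply drops out of the set.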

>  I agree that that's the
> sensible default behaviour.
> 
> Going back to the default hook, I guess one option is:
> 
>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>   if (!valid_insn_p (insn))
> sorry (…);

The valid_insn_p approach will exclude all FP registers for rs6000, as I 
mentioned in the previous email, so I don’t think that this is the right fix 
for the rs6000 ICE. 
I’d rather leave this problem as it is for now, and hopefully an rs6000 
developer can address this issue very soon?
> 
> but with some mechanism to avoid spewing the user with messages
> for the same problem.
> 
> Thanks,
> Richard

Re: Track access ranges in ipa-modref

2020-10-02 Thread Richard Biener

On Fri, 2 Oct 2020, Jan Hubicka wrote:

> Hi,
> this patch implements tracking of access ranges.  This is only applied when
> the base pointer is an argument. Incrementally I will extend it to also track
> the TBAA basetype so we can disambiguate ranges for accesses to the same
> basetype (which makes it quite a bit more effective). For this reason I track
> the access offset separately from the parameter offset (the second tracks
> combined adjustments to the parameter). This is, I think, the last feature I
> would like to add to the memory access summary this stage1.
> 
> Further work will be needed to optimize the summary and merge adjacent
> ranges/make collapsing more intelligent (so we do not lose track that often),
> but I wanted to keep the basic patch simple.
> 
> According to the cc1plus stats:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 64108082 disambiguations, 74386675 queries
>   ref_maybe_used_by_call_p: 142319 disambiguations, 65004781 queries
>   call_may_clobber_ref_p: 23587 disambiguations, 29420 queries
>   nonoverlapping_component_refs_p: 0 disambiguations, 38117 queries
>   nonoverlapping_refs_since_match_p: 19489 disambiguations, 55748 must 
> overlaps, 76044 queries
>   aliasing_component_refs_p: 54763 disambiguations, 755876 queries
>   TBAA oracle: 24184658 disambiguations 56823187 queries
>16260329 are in alias set 0
>10617146 queries asked about the same object
>125 queries asked about the same alias set
>0 access volatile
>3960555 are dependent in the DAG
>1800374 are aritificially in conflict with void *
> 
> Modref stats:
>   modref use: 10656 disambiguations, 47037 queries
>   modref clobber: 1473322 disambiguations, 1961464 queries
>   5027242 tbaa queries (2.563005 per modref query)
>   649087 base compares (0.330920 per modref query)
> 
> PTA query stats:
>   pt_solution_includes: 977385 disambiguations, 13609749 queries
>   pt_solutions_intersect: 1032703 disambiguations, 13187507 queries
> 
> These numbers should still be comparable with
> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554930.html
> There are about 2% more load disambiguations and 3.6% more store
> disambiguations, which is not great, but the TBAA part helps noticeably more
> and this should also help with -fno-strict-aliasing.
> 
> I plan to work on improving param tracking too.
> 
> Bootstrapped/regtested x86_64-linux with the other changes, OK?

LGTM.

Richard.
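
For readers of the archive, the kind of disambiguation that known access ranges enable can be sketched as below. This is a simplification under stated assumptions: both accesses are relative to the same parameter pointer, offsets and sizes are compile-time constants in bits (the patch uses poly_int64), and `struct access_range` and `ranges_may_overlap` are hypothetical names rather than anything in ipa-modref.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified access description, relative to one base
   pointer.  Units are bits; max_size < 0 means "unknown extent".  */
struct access_range
{
  int64_t offset;
  int64_t max_size;
};

/* Return false only when the two accesses provably touch disjoint bits;
   return true (may conflict) whenever that cannot be shown.  */
bool
ranges_may_overlap (struct access_range a, struct access_range b)
{
  if (a.max_size < 0 || b.max_size < 0)
    return true;
  /* Half-open intervals [offset, offset + max_size) intersect iff each
     one starts before the other one ends.  */
  return a.offset < b.offset + b.max_size
	 && b.offset < a.offset + a.max_size;
}
```

Two 32-bit accesses at bit offsets 0 and 32 from the same parameter are then disambiguated, while an access of unknown extent conservatively conflicts with everything.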

> 2020-10-02  Jan Hubicka  
> 
>   * ipa-modref-tree.c (test_insert_search_collapse): Update handling
>   of accesses.
>   (test_merge): Likewise.
>   * ipa-modref-tree.h (struct modref_access_node): Add offset, size,
>   max_size, parm_offset and parm_offset_known.
>   (modref_access_node::useful_p): Constify.
>   (modref_access_node::range_info_useful_p): New predicate.
>   (modref_access_node::operator==): New.
>   (struct modref_parm_map): New structure.
>   (modref_tree::merge): Update for tracking parameters.
>   * ipa-modref.c (dump_access): Dump new fields.
>   (get_access): Fill in new fields.
>   (merge_call_side_effects): Update handling of parm map.
>   (write_modref_records): Stream new fields.
>   (read_modref_records): Stream new fields.
>   (compute_parm_map): Update for new parm map.
>   (ipa_merge_modref_summary_after_inlining): Update.
>   (modref_propagate_in_scc): Update.
>   * tree-ssa-alias.c (modref_may_conflict): Handle known ranges.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-10-02  Jan Hubicka  
> 
>   * gcc.dg/tree-ssa/modref-3.c: New test.
> 
> diff --git a/gcc/ipa-modref-tree.c b/gcc/ipa-modref-tree.c
> index 499dc60b75e..1a595090b6c 100644
> --- a/gcc/ipa-modref-tree.c
> +++ b/gcc/ipa-modref-tree.c
> @@ -35,7 +35,7 @@ test_insert_search_collapse ()
>  {
>modref_base_node *base_node;
>modref_ref_node *ref_node;
> -  modref_access_node a = { -1 };
> +  modref_access_node a = unspecified_modref_access_node;
>  
>modref_tree *t = new modref_tree(1, 2, 2);
>ASSERT_FALSE (t->every_base);
> @@ -118,7 +118,7 @@ test_merge ()
>  {
>modref_tree *t1, *t2;
>modref_base_node *base_node;
> -  modref_access_node a = { -1 };
> +  modref_access_node a = unspecified_modref_access_node;
>  
>t1 = new modref_tree(3, 4, 1);
>t1->insert (1, 1, a);
> diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
> index abf3fc18b05..b37280d18c7 100644
> --- a/gcc/ipa-modref-tree.h
> +++ b/gcc/ipa-modref-tree.h
> @@ -44,17 +44,56 @@ struct ipa_modref_summary;
>  /* Memory access.  */
>  struct GTY(()) modref_access_node
>  {
> +
> +  /* Access range information (in bits).  */
> +  poly_int64 offset;
> +  poly_int64 size;
> +  poly_int64 max_size;
> +
> +  /* Offset from parmeter pointer to the base of the access (in bytes).  */
> +  poly_int64 parm_offset;
> +
>/* Index of parameter which specifies the base of access. -1 if base is not
>   a function parameter.  */
>int

[PATCH] optimize permutes in SLP, remove vect_attempt_slp_rearrange_stmts

2020-10-02 Thread Richard Biener

This introduces a permute optimization phase for SLP which is
intended to cover the existing permute eliding for SLP reductions
plus handling commonizing the easy cases.

It currently uses graphds to compute a postorder on the reverse
SLP graph and it handles all cases vect_attempt_slp_rearrange_stmts
did (hopefully - I've adjusted most testcases that triggered it
a few days ago).  It restricts itself to move around bijective
permutations to simplify things for now, mainly around constant nodes.

As a prerequisite it makes the SLP graph cyclic (ugh).  It looks
like it would pay off to compute a PRE/POST order visit array
once and elide all the recursive SLP graph walks and their
visited hash-set.  At least for the time where we do not change
the SLP graph during such walk.

I do not like using graphds too much but at least I don't have to
re-implement yet another RPO walk, so maybe it isn't too bad.

Comments are welcome - I do want to see vect_attempt_slp_rearrange_stmts
go away for GCC 11 and the permute optimization helps non-store
BB vectorization opportunities where we can end up with a lot of
useless load permutes otherwise.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2020-10-02  Richard Biener  

* tree-vect-data-refs.c (vect_slp_analyze_instance_dependence):
Use SLP_TREE_REPRESENTATIVE.
* tree-vectorizer.h (_slp_tree::vertex): New member used
for graphds interfacing.
* tree-vect-slp.c (vect_build_slp_tree_2): Allocate space
for PHI SLP children.
(vect_analyze_slp_backedges): New function filling in SLP
node children for PHIs that correspond to backedge values.
(vect_analyze_slp): Call vect_analyze_slp_backedges for the
graph.
(vect_slp_analyze_node_operations): Deal with a cyclic graph.
(vect_schedule_slp_instance): Likewise.
(vect_schedule_slp): Likewise.
(slp_copy_subtree): Remove.
(vect_slp_rearrange_stmts): Likewise.
(vect_attempt_slp_rearrange_stmts): Likewise.
(vect_slp_build_vertices): New functions.
(vect_slp_permute): Likewise.
(vect_optimize_slp): Remove special code to elide
permutations with SLP reductions.  Implement generic
permute optimization.

* gcc.dg/vect/bb-slp-50.c: New testcase.
* gcc.dg/vect/bb-slp-51.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-50.c |  20 +
 gcc/testsuite/gcc.dg/vect/bb-slp-51.c |  20 +
 gcc/tree-vect-data-refs.c |   2 +-
 gcc/tree-vect-slp.c   | 660 +-
 gcc/tree-vectorizer.h |   2 +
 5 files changed, 479 insertions(+), 225 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-50.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-51.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-50.c b/gcc/testsuite/gcc.dg/vect/bb-slp-50.c
new file mode 100644
index 000..80216be4ebf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-50.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+
+double a[2];
+double b[2];
+double c[2];
+double d[2];
+double e[2];
+void foo(double x)
+{
+  double tembc0 = b[1] + c[1];
+  double tembc1 = b[0] + c[0];
+  double temde0 = d[0] + e[1];
+  double temde1 = d[1] + e[0];
+  a[0] = tembc0 + temde0;
+  a[1] = tembc1 + temde1;
+}
+
+/* We should common the permutation on the tembc{0,1} operands.  */
+/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\n\\r\]* = VEC_PERM_EXPR" 2 "slp2" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-51.c b/gcc/testsuite/gcc.dg/vect/bb-slp-51.c
new file mode 100644
index 000..1481018428e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-51.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+
+double a[2];
+double b[2];
+double c[2];
+double e[2];
+void foo(double x)
+{
+  double tembc0 = b[1] + c[1];
+  double tembc1 = b[0] + c[0];
+  double temde0 = 5 + e[1];
+  double temde1 = 11 + e[0];
+  a[0] = tembc0 + temde0;
+  a[1] = tembc1 + temde1;
+}
+
+/* We should common the permutations on the tembc{0,1} and temde{0,1}
+   operands.  */
+/* { dg-final { scan-tree-dump-times "add new stmt: \[^\\r\\n\]* VEC_PERM_EXPR" 1 "slp2" } } */
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 5bf93e2942b..fdc1f47dded 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -841,7 +841,7 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, slp_instance instance)
 
   /* The stores of this instance are at the root of the SLP tree.  */
   slp_tree store = SLP_INSTANCE_TREE (instance);
-  if (! STMT_VINFO_DATA_REF (SLP_TREE_SCALAR_STMTS (store)[0]))
+  if (! STMT_VINFO_DATA_REF (SLP_TREE_REPRESENTATIVE (store)))
 store = NULL;
 
   /* Verify we can sink stores to the vectorized stmt insert location.  */
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c

Re: Perforate fnspec attribute strings

2020-10-02 Thread Richard Biener

On Fri, 2 Oct 2020, Jan Hubicka wrote:

> Hi,
> as discussed this patch makes return value and arg specifiers to be 2
> characters long and updates (I hope) all fnspec strings.
> I also enabled part of the verification (just accepting the fortran bug
> with 'R' and 'W' in return value specifiers) to get some evidence that
> the update is complete.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Richard.

> gcc/ChangeLog:
> 
> 2020-10-02  Jan Hubicka  
> 
>   * attr-fnspec.h: Update documentation.
>   (attr_fnspec::return_desc_size): Set to 2.
>   (attr_fnspec::arg_desc_size): Set to 2.
>   * builtin-attrs.def (STR1): Update fnspec.
>   * internal-fn.def (UBSAN_NULL): Update fnspec.
>   (UBSAN_VPTR): Update fnspec.
>   (UBSAN_PTR): Update fnspec.
>   (ASAN_CHECK): Update fnspec.
>   (GOACC_DIM_SIZE): Remove fnspec.
>   (GOACC_DIM_POS): Remove fnspec.
>   * tree-ssa-alias.c (attr_fnspec::verify): Update verification.
> 
> gcc/fortran/ChangeLog:
> 
> 2020-10-02  Jan Hubicka  
> 
>   * trans-decl.c (gfc_build_library_function_decl_with_spec): Verify
>   fnspec.
>   (gfc_build_intrinsic_function_decls): Update fnspecs.
>   (gfc_build_builtin_function_decls): Update fnspecs.
>   * trans-io.c (gfc_build_io_library_fndecls): Update fnspecs.
>   * trans-types.c (create_fn_spec): Update fnspecs.
> 
> 
> diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
> index 607c0cf0f54..921bb48ae6a 100644
> --- a/gcc/attr-fnspec.h
> +++ b/gcc/attr-fnspec.h
> @@ -25,15 +25,22 @@
>   '1'...'4'  specifies number of argument function returns (as in memset)
>   'm' specifies that returned value is noalias (as in malloc)
>   '.' specifies that nothing is known.
> +   character 1  specifies additional function properties
> + ' 'specifies that nothing is known
>  
> -   character 1+i specifies properties of argument number i as follows:
> +   character 2+2i specifies properties of argument number i as follows:
>   'x' or 'X' specifies that parameter is unused.
>   'r' or 'R' specifies that parameter is only read and memory pointed to is
>   never dereferenced.
>   'w' or 'W' specifies that parameter is only written to.
>   '.' specifies that nothing is known.
> The uppercase letter in addition specifies that parameter
> -   is non-escaping.  */
> +   is non-escaping. 
> +
> +   character 3+2i specifies additional properties of argument number i
> +   as follows:
> + ' 'nothing is known
> + */
>  
>  #ifndef ATTR_FNSPEC_H
>  #define ATTR_FNSPEC_H
> @@ -46,9 +53,9 @@ private:
>/* length of the fn spec string.  */
>const unsigned len;
>/* Number of characters specifying return value.  */
> -  const unsigned int return_desc_size = 1;
> +  const unsigned int return_desc_size = 2;
>/* Number of characters specifying size.  */
> -  const unsigned int arg_desc_size = 1;
> +  const unsigned int arg_desc_size = 2;
>  
>/* Return start of specifier of arg i.  */
>unsigned int arg_idx (int i)
> diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
> index 3239311b5a4..778bc8a43a1 100644
> --- a/gcc/builtin-attrs.def
> +++ b/gcc/builtin-attrs.def
> @@ -66,7 +66,7 @@ DEF_ATTR_FOR_INT (6)
>DEF_ATTR_STRING (ATTR_##ENUM, VALUE)   \
>DEF_ATTR_TREE_LIST (ATTR_LIST_##ENUM, ATTR_NULL,   \
> ATTR_##ENUM, ATTR_NULL)
> -DEF_ATTR_FOR_STRING (STR1, "1")
> +DEF_ATTR_FOR_STRING (STR1, "1 ")
>  #undef DEF_ATTR_FOR_STRING
>  
>  /* Construct a tree for a list of two integers.  */
> diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
> index 2be9df40d2c..5940a1fd10c 100644
> --- a/gcc/fortran/trans-decl.c
> +++ b/gcc/fortran/trans-decl.c
> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gomp-constants.h"
>  #include "gimplify.h"
>  #include "omp-general.h"
> +#include "attr-fnspec.h"
>  
>  #define MAX_LABEL_VALUE 9
>  
> @@ -3306,6 +3307,11 @@ gfc_build_library_function_decl_with_spec (tree name, 
> const char *spec,
>tree ret;
>va_list args;
>va_start (args, nargs);
> +  if (flag_checking)
> +{
> +  attr_fnspec fnspec (spec, strlen (spec));
> +  fnspec.verify ();
> +}
>ret = build_library_function_decl_1 (name, spec, rettype, nargs, args);
>va_end (args);
>return ret;
> @@ -3325,144 +3331,144 @@ gfc_build_intrinsic_function_decls (void)
>  
>/* String functions.  */
>gfor_fndecl_compare_string = gfc_build_library_function_decl_with_spec (
> - get_identifier (PREFIX("compare_string")), "..R.R",
> + get_identifier (PREFIX("compare_string")), ". . R . R ",
>   integer_type_node, 4, gfc_charlen_type_node, pchar1_type_node,
>   gfc_charlen_type_node, pchar1_type_node);
>DECL_PURE_P (gfor_fndecl_compare_string) = 1;
>TREE_NOTHROW (gfor_fndecl_compare_string) = 1;
>  
>gfor_fndecl_concat_string =

PING [PATCH] x86: Require MMX for __builtin_ia32_maskmovq

2020-10-02 Thread H.J. Lu via Gcc-patches

On Mon, Sep 21, 2020 at 6:09 AM H.J. Lu  wrote:
>
> On Mon, Sep 21, 2020 at 5:54 AM H.J. Lu  wrote:
> >
> > Since "MASKMOVQ mm1, mm2" is an SSE instruction which requires MMX and
> > MMX/SSE ISAs are handled separately, make __builtin_ia32_maskmovq require
> > MMX instead of SSE.
> >
> > gcc/
> >
> > PR target/97140
> > * config/i386/i386-expand.c (ix86_expand_builtin): Require MMX
> > for __builtin_ia32_maskmovq.
> > * config/i386/mmx.md (mmx_maskmovq): Replace TARGET_SSE with
> > TARGET_MMX.
> > (*mmx_maskmovq): Likewise.
> >
> > gcc/testsuite/
> >
> > PR target/97140
> > * gcc.target/i386/pr97140.c: New test.
> > ---
> >  gcc/config/i386/i386-expand.c   |  6 +-
> >  gcc/config/i386/mmx.md  |  4 ++--
> >  gcc/testsuite/gcc.target/i386/pr97140.c | 10 ++
> >  3 files changed, 17 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr97140.c
> >
> > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> > index e6f8b314f18..e6285cf592e 100644
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -10982,7 +10982,11 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> > subtarget,
> > == (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4))
> >&& (isa & (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4)) != 0)
> >  isa |= (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_FMA4);
> > -  if ((bisa & OPTION_MASK_ISA_MMX) && !TARGET_MMX && TARGET_MMX_WITH_SSE)
> > +  /* NB: __builtin_ia32_maskmovq requires MMX.  */
> > +  if (fcode != IX86_BUILTIN_MASKMOVQ
> > +  && (bisa & OPTION_MASK_ISA_MMX)
> > +  && !TARGET_MMX
> > +  && TARGET_MMX_WITH_SSE)
> >  {
> >bisa &= ~OPTION_MASK_ISA_MMX;
> >bisa |= OPTION_MASK_ISA_SSE2;
> > diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> > index 7c9640d4f9f..610e4b591f7 100644
> > --- a/gcc/config/i386/mmx.md
> > +++ b/gcc/config/i386/mmx.md
> > @@ -2549,7 +2549,7 @@ (define_expand "mmx_maskmovq"
> >   (match_operand:V8QI 2 "register_operand")
> >   (match_dup 0)]
> >  UNSPEC_MASKMOV))]
> > -  "TARGET_SSE || TARGET_3DNOW_A")
> > +  "TARGET_MMX || TARGET_3DNOW_A")
> >
> >  (define_insn "*mmx_maskmovq"
> >[(set (mem:V8QI (match_operand:P 0 "register_operand" "D"))
> > @@ -2557,7 +2557,7 @@ (define_insn "*mmx_maskmovq"
> >   (match_operand:V8QI 2 "register_operand" "y")
> >   (mem:V8QI (match_dup 0))]
> >  UNSPEC_MASKMOV))]
> > -  "TARGET_SSE || TARGET_3DNOW_A"
> > +  "TARGET_MMX || TARGET_3DNOW_A"
> >;; @@@ check ordering of operands in intel/nonintel syntax
> >"maskmovq\t{%2, %1|%1, %2}"
> >[(set_attr "type" "mmxcvt")
>
>  Leave mmx.md alone since maskmovq isn't an MMX instruction.
>

PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554385.html


-- 
H.J.

RE: [PATCH 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread David Laight via Gcc-patches

> > +.I void *
> > +.RS
> > +According to the C language standard,
> > +a pointer to any object type may be converted to a pointer to
> > +.I void
> > +and back.
> > +POSIX further requires that any pointer,
> > +including pointers to functions,
> > +may be converted to a pointer to
> > +.I void
> > +and back.
> > +.PP
> > +Conversions from and to any other pointer type are done implicitly,
> > +not requiring casts at all.
> > +Note that this feature prevents any kind of type checking:
> > +the programmer should be careful not to cast a
> > +.I void *
> > +value to a type incompatible to that of the underlying data,
> > +because that would result in undefined behavior.
> > +.PP
> > +This type is useful in function parameters and return value
> > +to allow passing values of any type.
> > +The function will usually use some mechanism to know
> > +of which type the underlying data passed to the function really is.
> 
> This sentence seems clunky.
> 
> How about "The function will typically use some mechanism to know the
> real type of the data being passed via a pointer to void."
> 
> An example of "some mechanism" might be useful, though I don't have
> one to offer.

It's also bollocks.

There are two main places 'void *' is used:
1) buffers (eg functions like read() and write()) when the
   associated byte length is also passed.
   This (sort of) includes memory allocation functions.
2) Passing a parameter for a callback function.
   In this case the pointer is always cast back to
   the original type before being used.
   
What it shouldn't be used for is structures you don't
want other code to look inside - use incomplete structs.
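The two uses listed above, and the incomplete-struct alternative, can be sketched in C. This is only an illustration: the `counter` type and its functions are hypothetical names invented for the example, and `qsort()` stands in for a generic callback interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Callback use: qsort() hands each element back as 'const void *';
   the comparator converts it back to the original type before use.  */
static int
cmp_int (const void *a, const void *b)
{
  const int *x = a;   /* implicit conversion from void *, no cast needed */
  const int *y = b;
  return (*x > *y) - (*x < *y);
}

/* Opaque type: a public header would contain only these declarations,
   so callers can hold a 'struct counter *' but cannot look inside.  */
struct counter;
struct counter *counter_new (void);
int counter_next (struct counter *c);
void counter_free (struct counter *c);

/* Private implementation (hypothetical, normally in its own .c file).  */
struct counter
{
  int value;
};

struct counter *
counter_new (void)
{
  /* Buffer use: calloc() returns 'void *' for storage of any type.  */
  return calloc (1, sizeof (struct counter));
}

int
counter_next (struct counter *c)
{
  assert (c != NULL);
  return ++c->value;
}

void
counter_free (struct counter *c)
{
  free (c);
}
```

Unlike a `void *` handle, passing some unrelated pointer to `counter_next()` is diagnosed by the compiler, which is the point of preferring the incomplete struct.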

David


Re: [PATCH] Add if-chain to switch conversion pass.

2020-10-02 Thread Martin Liška


On 9/29/20 10:46 AM, Richard Biener wrote:

On Fri, Sep 25, 2020 at 4:05 PM Martin Liška  wrote:


On 9/24/20 2:41 PM, Richard Biener wrote:

On Wed, Sep 2, 2020 at 1:53 PM Martin Liška  wrote:


On 9/1/20 4:50 PM, David Malcolm wrote:

Hope this is constructive
Dave


Thank you David. All of them very very useful!

There's updated version of the patch.


Hey.

What a juicy patch review!



I noticed several functions without a function-level comment.


Yep, but several of them are documented in a class declaration. Anyway, I will
improve for the next time.



-  cluster (tree case_label_expr, basic_block case_bb, profile_probability prob,
-  profile_probability subtree_prob);
+  inline cluster (tree case_label_expr, basic_block case_bb,
+ profile_probability prob, profile_probability subtree_prob);

I thought we generally leave this to the compiler ...

+@item -fconvert-if-to-switch
+@opindex fconvert-if-to-switch
+Perform conversion of an if cascade into a switch statement.
+Do so if the switch can be later transformed using a jump table
+or a bit test.  The transformation can help to produce faster code for
+the switch statement.  This flag is enabled by default
+at @option{-O2} and higher.

this mentions we do this only when we later can convert the
switch again but both passes (we still have two :/) have
independent guards.


Yes, we have the option for jump tables (-fjump-tables), but we lack one for
bit tests.
Moreover, as mentioned in the cover email, converting an if-chain to a switch
can be beneficial even without any BT or JT, because the expansion can then
use a balanced decision tree.
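For readers unfamiliar with the transformation, here is a source-level sketch of what it amounts to. The function names and case values are made up for illustration; the actual pass works on GIMPLE, not on C source.

```c
#include <assert.h>

/* An if cascade testing one variable against constants.  */
static int
classify_if (int c)
{
  if (c == 1 || c == 10)
    return 1;
  else if (c == 20)
    return 2;
  else if (c == 30)
    return 3;
  return 0;
}

/* The equivalent switch, which switch lowering may then expand as a
   jump table, a bit test, or a balanced decision tree.  */
static int
classify_switch (int c)
{
  switch (c)
    {
    case 1:
    case 10:
      return 1;
    case 20:
      return 2;
    case 30:
      return 3;
    default:
      return 0;
    }
}

/* Check that both forms agree on every value in [LO, HI].  */
static void
check_same (int lo, int hi)
{
  for (int c = lo; c <= hi; c++)
    assert (classify_if (c) == classify_switch (c));
}
```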



+  /* For now, just wipe the dominator information.  */
+  free_dominance_info (CDI_DOMINATORS);

could at least be conditional on the vop renaming condition...

+  if (!all_candidates.is_empty ())
+mark_virtual_operands_for_renaming (fun);


Yep.



+  if (bitmap_bit_p (*visited_bbs, bb->index))
+   break;
+  bitmap_set_bit (*visited_bbs, bb->index);

since you are using a bitmap and not a sbitmap (why?)
you can combine those into


New to me, thanks.



 if (!bitmap_set_bit (*visited_bbs, bb->index))
  break;
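The combined form relies on the set operation reporting whether the bit actually changed. A minimal sketch of that idiom with a toy 64-bit set follows; `toy_set_bit` and `walk_until_revisit` are hypothetical stand-ins written for this example, not GCC's bitmap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set bit IDX in *SET and return true if it was previously clear.  */
static bool
toy_set_bit (uint64_t *set, unsigned idx)
{
  assert (idx < 64);
  uint64_t mask = UINT64_C (1) << idx;
  bool was_clear = (*set & mask) == 0;
  *set |= mask;
  return was_clear;
}

/* Visit block indices in order and stop at the first one already seen.
   Returns the number of distinct blocks visited before stopping.  */
static unsigned
walk_until_revisit (const unsigned *blocks, unsigned n)
{
  uint64_t visited = 0;
  unsigned count = 0;
  for (unsigned i = 0; i < n; i++)
    {
      /* One call both tests and marks the block.  */
      if (!toy_set_bit (&visited, blocks[i]))
	break;
      count++;
    }
  return count;
}
```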

+  /* Current we support following patterns (situations):
+
+1) if condition with equal operation:
+
...

did you see whether using

 register_edge_assert_for (lhs, true_edge, code, lhs, rhs, asserts);

works equally well?  It fills the 'asserts' vector with relations
derived from 'lhs'.  There's also
vr_values::extract_range_for_var_from_comparison_expr
to compute the case_range


Good point! I must admit that my patch doesn't properly handle negative conditions:

if (argc != 1)
{
  if (argc == 1)
global = 222;
  ...
}

which VRP can correctly identify as an anti-range:
int ~[1, 1]  EQUIVALENCES: { argc_8(D) } (1 elements)$1 = void

I have a question about OR and AND conditions:

 :
_1 = aChar_8(D) == 1;
_2 = aChar_8(D) == 10;
_3 = _1 | _2;
if (_3 != 0)
  goto ; [INV]
else
  goto ; [INV]

 :
_1 = aChar_8(D) != 1;
_2 = aChar_8(D) != 10;
_3 = _1 & _2;
if (_3 != 0)
  goto ; [INV]
else
  goto ; [INV]

Can I somehow get that from VRP (as I ask register_edge_assert_for only for LHS
of a condition)?


Yes, you simply get all sorts of conditions that hold when a condition is
true, not just those based on the SSA name you put in.  But it occurred
to me that the use-case is somewhat different - for switch-conversion
you want to know whether the test _exactly_ matches a range test,
the VRP worker will not tell you that.  For example if you had
if (x &&  a > 3 && a < 7) then it will give you 'a in [4, 6]' and it might
not give you 'x in [1, 1]' (for example if x is float).  But that's required
for correctness.
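The gap between "a range that holds when the condition is true" and "a condition that exactly matches a range test" can be shown with a small check. This is a sketch using the example condition above with an integer x; the helper names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* The example condition from the discussion.  */
static bool
guard (int x, int a)
{
  return x && a > 3 && a < 7;
}

/* The range a VRP-style analysis would derive for 'a'.  */
static bool
a_in_range (int a)
{
  return a >= 4 && a <= 6;
}

/* Count (x, a) pairs where the range holds but the guard does not.
   The guard always implies the range, but not the other way round,
   so the range alone cannot replace the guard as a case label.  */
static int
count_range_only (void)
{
  int count = 0;
  for (int x = 0; x <= 1; x++)
    for (int a = 0; a < 10; a++)
      {
	if (guard (x, a))
	  assert (a_in_range (a));
	else if (a_in_range (a))
	  count++;
      }
  return count;
}
```

The three mismatching pairs are x == 0 with a in {4, 5, 6}: the derived range is satisfied while the original test is false.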


Hello.

Adding Ranger guys. Is it something that can be handled by the upcoming changes in VRP?



So we're back to your custom crafted code unless we manage to somehow
refactor the VRP condition analysis to handle both cases (should be
possible I think, but have not looked too closely).  Maybe the actual
matching pieces can be shared at least.



+  /* If it's not the first condition, then we need a BB without
+any statements.  */
+  if (!first)
+   {
+ unsigned stmt_count = 0;
+ for (gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb);
+  !gsi_end_p (gsi); gsi_next_nondebug ())
+   ++stmt_count;
+
+ if (stmt_count - visited_stmt_count != 0)
+   break;

hmm, OK, this might be a bit iffy to get correct then, still it's a lot
of pattern matching code that is there elsewhere already.
ifcombine simply hoists any stmts without side-effects up the
dominator tree and thus only requires BBs without side-effects
(IIRC there's a predicate fn for that).


Yes, I completely lack support for code hoisting (except for the first BB,
where we put the gswitch).
If I'm correct hoisting should be possible

Re: [PATCH] Add if-chain to switch conversion pass.

2020-10-02 Thread Martin Liška


On 9/24/20 2:41 PM, Richard Biener wrote:

On Wed, Sep 2, 2020 at 1:53 PM Martin Liška  wrote:


On 9/1/20 4:50 PM, David Malcolm wrote:

Hope this is constructive
Dave


Thank you David. All of them very very useful!

There's updated version of the patch.


I noticed several functions without a function-level comment.

-  cluster (tree case_label_expr, basic_block case_bb, profile_probability prob,
-  profile_probability subtree_prob);
+  inline cluster (tree case_label_expr, basic_block case_bb,
+ profile_probability prob, profile_probability subtree_prob);

I thought we generally leave this to the compiler ...


Hey.

This one is needed, otherwise we'll have a compilation error (multiple definitions).



+@item -fconvert-if-to-switch
+@opindex fconvert-if-to-switch
+Perform conversion of an if cascade into a switch statement.
+Do so if the switch can be later transformed using a jump table
+or a bit test.  The transformation can help to produce faster code for
+the switch statement.  This flag is enabled by default
+at @option{-O2} and higher.

this mentions we do this only when we later can convert the
switch again but both passes (we still have two :/) have
independent guards.


All right, I'm planning to come up with a -fbit-tests option, and this
transformation will happen only if BT or JT are enabled.



+  /* For now, just wipe the dominator information.  */
+  free_dominance_info (CDI_DOMINATORS);

could at least be conditional on the vop renaming condition...


How do you mean this?



+  if (!all_candidates.is_empty ())
+mark_virtual_operands_for_renaming (fun);

+  if (bitmap_bit_p (*visited_bbs, bb->index))
+   break;
+  bitmap_set_bit (*visited_bbs, bb->index);

since you are using a bitmap and not a sbitmap (why?)
you can combine those into


Yes, sbitmap would be better.



if (!bitmap_set_bit (*visited_bbs, bb->index))
 break;


Unfortunately, bitmap_set_bit for sbitmap is a void return function.
Should I change it?



+  /* Current we support following patterns (situations):
+
+1) if condition with equal operation:
+
...

did you see whether using

register_edge_assert_for (lhs, true_edge, code, lhs, rhs, asserts);

works equally well?  It fills the 'asserts' vector with relations
derived from 'lhs'.  There's also
vr_values::extract_range_for_var_from_comparison_expr
to compute the case_range

+  /* If it's not the first condition, then we need a BB without
+any statements.  */
+  if (!first)
+   {
+ unsigned stmt_count = 0;
+ for (gimple_stmt_iterator gsi = gsi_start_nondebug_bb (bb);
+  !gsi_end_p (gsi); gsi_next_nondebug ())
+   ++stmt_count;
+
+ if (stmt_count - visited_stmt_count != 0)
+   break;

hmm, OK, this might be a bit iffy to get correct then, still it's a lot
of pattern matching code that is there elsewhere already.
ifcombine simply hoists any stmts without side-effects up the
dominator tree and thus only requires BBs without side-effects
(IIRC there's a predicate fn for that).

+  /* Prevent loosing information for a PHI node where 2 edges will
+be folded into one.  Note that we must do the same also for false_edge
+(for last BB in a if-elseif chain).  */
+  if (!chain->record_phi_arguments (true_edge)
+ || !chain->record_phi_arguments (false_edge))

I don't really get this - looking at record_phi_arguments it seems
we're requiring that all edges into the same PHI from inside the case
(irrespective of from which case label) have the same value for the
PHI arg?

+ if (arg != *v)
+   return false;


This one is really needed for situations like:

cat wchar.i
int i;

int
pg_utf_mblen() {
  int len;
  if (i == 4)
len = 3;
  else if (i == 2)
len = 4;
  else if (i == 6)
len = 1;
  return len;
}

where we end up just with one edge from switch BB to a destination BB where
we have the PHI:
  # len_4 = PHI <3(2), 4(3), len_6(D)(4), 1(5)>



should use operand_equal_p at least, REAL_CSTs are for example
not shared tree nodes.  I'll also notice that if record_phi_arguments
fails we still may have altered its hash-map even though the particular
edge will not participate in the current chain, so it will affect other
chains ending in the same BB.  Overall this looks a bit too conservative
(and random, based on visiting order).


No, the m_phi_map is destroyed when we call 'delete chain'.



+expanded_location loc
+= expand_location (gimple_location (chain->m_first_condition));
+  if (dump_file)
+   {
+ fprintf (dump_file, "Condition chain (at %s:%d) with %d conditions "
+  "(%d BBs) transformed into a switch statement.\n",
+  loc.file, loc.line, total_case_values,
+  chain->m_entries.length ());

Use dump_printf_loc and you can pass a gimple * stmt as location.


Good idea.



+  /* Follow if-elseif-elseif

Re: PING^3 [PATCH] x86: Inline strncmp only with -minline-all-stringops

2020-10-02 Thread Jan Hubicka

> > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > > index fb677e17817..f3fbed81c4a 100644
> > > --- a/gcc/config/i386/i386.md
> > > +++ b/gcc/config/i386/i386.md
> > > @@ -18007,7 +18007,13 @@ (define_expand "cmpstrnsi"
> > >  {
> > >rtx addr1, addr2, countreg, align, out;
> > >
> > > -  if (optimize_insn_for_size_p () && !TARGET_INLINE_ALL_STRINGOPS)
> > > +  /* Expand strncmp only with -minline-all-stringops since
> > > + "repz cmpsb" can be much slower than strncmp functions
> > > + implemented with vector instructions, see
> > > +
> > > + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > +   */
> > > +  if (!TARGET_INLINE_ALL_STRINGOPS)
> > >  FAIL;

I think this is hitting the more general problem that we want to have
two levels of optimization for size (at least internally): one for parts
of the program guessed to be cold, which should not perform such extreme
changes (i.e. translate stringops, division etc.), and another for when we
really want to optimize for size.

I will try to push out patches for two-state optimize_size next week.

Honza
> > >
> > >/* Can't use this if the user has appropriated ecx, esi or edi.  */
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr95458-1.c 
> > > b/gcc/testsuite/gcc.target/i386/pr95458-1.c
> > > new file mode 100644
> > > index 000..231a4787dce
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr95458-1.c
> > > @@ -0,0 +1,11 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -minline-all-stringops" } */
> > > +
> > > +int
> > > +func (char *d, unsigned int l)
> > > +{
> > > +  return __builtin_strncmp (d, "foo", l) ? 1 : 2;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not "call\[\\t \]*_?strncmp" } } */
> > > +/* { dg-final { scan-assembler "cmpsb" } } */
> > > diff --git a/gcc/testsuite/gcc.target/i386/pr95458-2.c 
> > > b/gcc/testsuite/gcc.target/i386/pr95458-2.c
> > > new file mode 100644
> > > index 000..1a620444770
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/pr95458-2.c
> > > @@ -0,0 +1,7 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -mno-inline-all-stringops" } */
> > > +
> > > +#include "pr95458-1.c"
> > > +
> > > +/* { dg-final { scan-assembler "call\[\\t \]*_?strncmp" } } */
> > > +/* { dg-final { scan-assembler-not "cmpsb" } } */
> > > --
> > > 2.26.2
> > >
> >
> > PING.
> 
> PING.
> 
> -- 
> H.J.

PING^3 [PATCH] x86: Add cmpmemsi for -minline-all-stringops

2020-10-02 Thread H.J. Lu via Gcc-patches

On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
>
> On Wed, Aug 19, 2020 at 6:09 AM H.J. Lu  wrote:
> >
> > On Tue, May 19, 2020 at 5:14 AM H.J. Lu  wrote:
> > >
> > > On Tue, May 19, 2020 at 1:48 AM Uros Bizjak  wrote:
> > > >
> > > > On Sun, May 17, 2020 at 7:06 PM H.J. Lu  wrote:
> > > > >
> > > > > Duplicate the cmpstrn pattern for cmpmem.  The only difference is that
> > > > > > the length argument of cmpmem is guaranteed to be less than or equal to
> > > > > lengths of 2 memory areas.  Since "repz cmpsb" can be much slower than
> > > > > memcmp function implemented with vector instruction, see
> > > > >
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > > > >
> > > > > expand cmpmem to "repz cmpsb" only with -mgeneral-regs-only.
> > > >
> > > > If there is no benefit compared to the library implementation, then
> > > > enable these patterns only when -minline-all-stringops is used.
> > >
> > > Fixed.
> > >
> > > > Eventually these should be reimplemented with SSE4 string instructions.
> > > >
> > > > Honza is the author of the block handling x86 system, I'll leave the
> > > > review to him.
> > >
> > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> > > by
> > >
> > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> > > Author: Nick Clifton 
> > > Date:   Fri Aug 12 16:26:11 2011 +
> > >
> > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
> > >
> > > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> > > pattern.
> > > * doc/md.texi (cmpstrn): Note that the comparison stops if 
> > > both
> > > fetched bytes are zero.
> > > (cmpstr): Likewise.
> > > (cmpmem): Note that the comparison does not stop if both of 
> > > the
> > > fetched bytes are zero.
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95151
> > >
> > > is a regression.
> > >
> > > Honza, can you take a look at this?
> > >
> >
> > PING:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html
> >
>
> PING.
>

PING.

-- 
H.J.
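The cmpstrn/cmpmem distinction quoted from the 2011 commit message above is visible from plain C through strncmp() and memcmp(). A minimal sketch follows; the wrapper names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* String semantics: the comparison stops once both bytes are NUL.  */
static bool
equal_as_strings (const char *a, const char *b, size_t n)
{
  assert (a != NULL && b != NULL);
  return strncmp (a, b, n) == 0;
}

/* Memory semantics: all N bytes are compared, NULs included.  */
static bool
equal_as_memory (const void *a, const void *b, size_t n)
{
  assert (a != NULL && b != NULL);
  return memcmp (a, b, n) == 0;
}
```

For two 4-byte buffers that agree up to and including an embedded NUL and differ only afterwards, the first function reports equality and the second does not, which is why a cmpstrn expansion cannot be reused for memcmp.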

Perforate fnspec attribute strings

2020-10-02 Thread Jan Hubicka

Hi,
as discussed this patch makes return value and arg specifiers to be 2
characters long and updates (I hope) all fnspec strings.
I also enabled part of the verification (just accepting the fortran bug
with 'R' and 'W' in return value specifiers) to get some evidence that
the update is complete.

Bootstrapped/regtested x86_64-linux, OK?
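As a rough illustration of the new layout (two characters for the return value, then two per argument), the indexing and the mechanical update of an old string can be sketched as below. `fnspec_widen` is a hypothetical helper written for this note; it is not part of the patch, which edits the strings by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Descriptor widths after the patch (both were 1 before).  */
enum { RETURN_DESC_SIZE = 2, ARG_DESC_SIZE = 2 };

/* Index of the first character describing argument I (0-based).  */
static size_t
fnspec_arg_idx (size_t i)
{
  return RETURN_DESC_SIZE + ARG_DESC_SIZE * i;
}

/* Hypothetical helper: widen an old one-character-per-slot spec by
   appending ' ' ("nothing is known") after every character.  */
static void
fnspec_widen (const char *old_spec, char *out, size_t out_size)
{
  size_t n = strlen (old_spec);
  assert (out_size >= 2 * n + 1);
  for (size_t i = 0; i < n; i++)
    {
      out[2 * i] = old_spec[i];
      out[2 * i + 1] = ' ';
    }
  out[2 * n] = '\0';
}
```

Applied to the old compare_string spec "..R.R" this yields ". . R . R ", matching the updated string in trans-decl.c, and "1" becomes "1 " as in builtin-attrs.def.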

gcc/ChangeLog:

2020-10-02  Jan Hubicka  

* attr-fnspec.h: Update documentation.
(attr_fnspec::return_desc_size): Set to 2.
(attr_fnspec::arg_desc_size): Set to 2.
* builtin-attrs.def (STR1): Update fnspec.
* internal-fn.def (UBSAN_NULL): Update fnspec.
(UBSAN_VPTR): Update fnspec.
(UBSAN_PTR): Update fnspec.
(ASAN_CHECK): Update fnspec.
(GOACC_DIM_SIZE): Remove fnspec.
(GOACC_DIM_POS): Remove fnspec.
* tree-ssa-alias.c (attr_fnspec::verify): Update verification.

gcc/fortran/ChangeLog:

2020-10-02  Jan Hubicka  

* trans-decl.c (gfc_build_library_function_decl_with_spec): Verify
fnspec.
(gfc_build_intrinsic_function_decls): Update fnspecs.
(gfc_build_builtin_function_decls): Update fnspecs.
* trans-io.c (gfc_build_io_library_fndecls): Update fnspecs.
* trans-types.c (create_fn_spec): Update fnspecs.


diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
index 607c0cf0f54..921bb48ae6a 100644
--- a/gcc/attr-fnspec.h
+++ b/gcc/attr-fnspec.h
@@ -25,15 +25,22 @@
  '1'...'4'  specifies number of argument function returns (as in memset)
  'm'   specifies that returned value is noalias (as in malloc)
  '.'   specifies that nothing is known.
+   character 1  specifies additional function properties
+ ' 'specifies that nothing is known
 
-   character 1+i specifies properties of argument number i as follows:
+   character 2+2i specifies properties of argument number i as follows:
  'x' or 'X' specifies that parameter is unused.
  'r' or 'R' specifies that parameter is only read and memory pointed to is
never dereferenced.
  'w' or 'W' specifies that parameter is only written to.
  '.'   specifies that nothing is known.
The uppercase letter in addition specifies that parameter
-   is non-escaping.  */
+   is non-escaping. 
+
+   character 3+2i specifies additional properties of argument number i
+   as follows:
+ ' 'nothing is known
+ */
 
 #ifndef ATTR_FNSPEC_H
 #define ATTR_FNSPEC_H
@@ -46,9 +53,9 @@ private:
   /* length of the fn spec string.  */
   const unsigned len;
   /* Number of characters specifying return value.  */
-  const unsigned int return_desc_size = 1;
+  const unsigned int return_desc_size = 2;
   /* Number of characters specifying size.  */
-  const unsigned int arg_desc_size = 1;
+  const unsigned int arg_desc_size = 2;
 
   /* Return start of specifier of arg i.  */
   unsigned int arg_idx (int i)
diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
index 3239311b5a4..778bc8a43a1 100644
--- a/gcc/builtin-attrs.def
+++ b/gcc/builtin-attrs.def
@@ -66,7 +66,7 @@ DEF_ATTR_FOR_INT (6)
   DEF_ATTR_STRING (ATTR_##ENUM, VALUE) \
   DEF_ATTR_TREE_LIST (ATTR_LIST_##ENUM, ATTR_NULL, \
  ATTR_##ENUM, ATTR_NULL)
-DEF_ATTR_FOR_STRING (STR1, "1")
+DEF_ATTR_FOR_STRING (STR1, "1 ")
 #undef DEF_ATTR_FOR_STRING
 
 /* Construct a tree for a list of two integers.  */
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 2be9df40d2c..5940a1fd10c 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "gimplify.h"
 #include "omp-general.h"
+#include "attr-fnspec.h"
 
 #define MAX_LABEL_VALUE 9
 
@@ -3306,6 +3307,11 @@ gfc_build_library_function_decl_with_spec (tree name, 
const char *spec,
   tree ret;
   va_list args;
   va_start (args, nargs);
+  if (flag_checking)
+{
+  attr_fnspec fnspec (spec, strlen (spec));
+  fnspec.verify ();
+}
   ret = build_library_function_decl_1 (name, spec, rettype, nargs, args);
   va_end (args);
   return ret;
@@ -3325,144 +3331,144 @@ gfc_build_intrinsic_function_decls (void)
 
   /* String functions.  */
   gfor_fndecl_compare_string = gfc_build_library_function_decl_with_spec (
-   get_identifier (PREFIX("compare_string")), "..R.R",
+   get_identifier (PREFIX("compare_string")), ". . R . R ",
integer_type_node, 4, gfc_charlen_type_node, pchar1_type_node,
gfc_charlen_type_node, pchar1_type_node);
   DECL_PURE_P (gfor_fndecl_compare_string) = 1;
   TREE_NOTHROW (gfor_fndecl_compare_string) = 1;
 
   gfor_fndecl_concat_string = gfc_build_library_function_decl_with_spec (
-   get_identifier (PREFIX("concat_string")), "..W.R.R",
+   get_identifier (PREFIX("concat_string")), ". . W . R . R ",
void_type_node, 6, gfc_charlen_type_node, pchar1_type_node,
gfc_charlen_type_node, pchar1_type_node,

[PATCH][omp, simt] Handle alternative IV

2020-10-02 Thread Tom de Vries

Hi,

Consider the test-case libgomp.c/pr81778.c added in this commit, with
this core loop (note: CANARY_SIZE set to 0 for simplicity):
...
  int s = 1;
  #pragma omp target simd
  for (int i = N - 1; i > -1; i -= s)
a[i] = 1;
...
which, given that N is 32, sets a[0..31] to 1.

After omp-expand, this looks like:
...
   :
  simduid.7 = .GOMP_SIMT_ENTER (simduid.7);
  .omp_simt.8 = .GOMP_SIMT_ENTER_ALLOC (simduid.7);
  D.3193 = -s;
  s.9 = s;
  D.3204 = .GOMP_SIMT_LANE ();
  D.3205 = -s.9;
  D.3206 = (int) D.3204;
  D.3207 = D.3205 * D.3206;
  i = D.3207 + 31;
  D.3209 = 0;
  D.3210 = -s.9;
  D.3211 = D.3210 - i;
  D.3210 = -s.9;
  D.3212 = D.3211 / D.3210;
  D.3213 = (unsigned int) D.3212;
  D.3213 = i >= 0 ? D.3213 : 0;

   :
  if (D.3209 < D.3213)
goto ; [87.50%]
  else
goto ; [12.50%]

   :
  a[i] = 1;
  D.3215 = -s.9;
  D.3219 = .GOMP_SIMT_VF ();
  D.3216 = (int) D.3219;
  D.3220 = D.3215 * D.3216;
  i = D.3220 + i;
  D.3209 = D.3209 + 1;
  goto ; [100.00%]
...

On nvptx, the first time bb6 is executed, i is in the 0..31 range (depending
on the lane that is executing) at bb entry.

So we have the following sequence:
- a[0..31] is set to 1
- i is updated to -32..-1
- D.3209 is updated to 1 (being 0 initially)
- bb19 is executed, and if condition (D.3209 < D.3213) == (1 < 32) evaluates
  to true
- bb6 is once more executed, which should not happen because all the elements
  that needed to be handled were already handled.
- consequently, elements that should not be written are written
- with CANARY_SIZE == 0, we may run into a libgomp error:
  ...
  libgomp: cuCtxSynchronize error: an illegal memory access was encountered
  ...
  and with CANARY_SIZE unmodified, we run into:
  ...
  Expected 0, got 1 at base[-961]
  Aborted (core dumped)
  ...

The cause of this is as follows:
- because the step s is a variable rather than a constant, an alternative
  IV (D.3209 in our example) is generated in expand_omp_simd, and the
  loop condition is tested in terms of the alternative IV rather than
  the original IV (i in our example).
- the SIMT code in expand_omp_simd works by modifying step and initial value.
- The initial value fd->loop.n1 is loaded into a variable n1, which is
  modified by the SIMT code and then used there-after.
- The step fd->loop.step is loaded into a variable step, which is modified
  by the SIMT code, but afterwards there are uses of both step and
  fd->loop.step.
- There are uses of fd->loop.step in the alternative IV handling code,
  which should use step instead.

Fix this by introducing an additional variable orig_step, which is not
modified by the SIMT code and replacing all remaining uses of fd->loop.step
by either step or orig_step.
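
The arithmetic can be replayed in plain C, outside the compiler.  This is
only a model of the dump above, assuming N = 32, 32 SIMT lanes and n2 = -1;
the names sim_result, alt_iv_trip_count and simulate are made up for the
illustration.  It counts writes instead of performing them, so the
out-of-bounds case is safe to run:

```c
#include <assert.h>

enum { N = 32, VF = 32 };  /* Array size and number of SIMT lanes.  */

struct sim_result
{
  int in_bounds;      /* Writes that land in a[0..N-1].  */
  int out_of_bounds;  /* Writes that land outside the array.  */
  int lowest_index;   /* Smallest index written.  */
};

/* Trip count of the alternative IV for a '>' loop condition, following the
   expression built in expand_omp_simd: (step + 1 + n2 - v) / step.  */
static int
alt_iv_trip_count (int step, int n2, int v)
{
  if (v <= n2)
    return 0;
  return (step + 1 + n2 - v) / step;
}

/* Model all lanes of "for (i = N - 1; i > -1; i -= s)".  If USE_SIMT_STEP
   is zero, the trip count is computed from the original step (the bug);
   otherwise from the step as modified by the SIMT code (the fix).  */
static struct sim_result
simulate (int s, int use_simt_step)
{
  struct sim_result r = { 0, 0, N };
  int orig_step = -s;
  int simt_step = orig_step * VF;

  assert (s > 0);
  for (int lane = 0; lane < VF; lane++)
    {
      int i = (N - 1) + orig_step * lane;
      int count = alt_iv_trip_count (use_simt_step ? simt_step : orig_step,
                                     -1, i);
      for (int k = 0; k < count; k++)
        {
          if (i >= 0 && i < N)
            r.in_bounds++;
          else
            r.out_of_bounds++;
          if (i < r.lowest_index)
            r.lowest_index = i;
          i += simt_step;
        }
    }
  return r;
}
```

In this model, simulate (1, 0) reports 32 in-bounds and 496 out-of-bounds
writes, the lowest at index -961, which matches the canary failure quoted
above; simulate (1, 1) reports 32 in-bounds writes and none out of bounds.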

Built on x86_64-linux with nvptx accelerator, tested libgomp.

This fixes for-5.c and for-6.c FAILs I'm currently seeing on a quadro m1200
with driver 450.66.

OK for trunk?

Thanks,
- Tom

[omp, simt] Handle alternative IV

gcc/ChangeLog:

2020-10-02  Tom de Vries  

* omp-expand.c (expand_omp_simd): Add orig_step, and replace uses of
fd->loop.step by either step or orig_step.

libgomp/ChangeLog:

2020-10-02  Tom de Vries  

* testsuite/libgomp.c/pr81778.c: New test.

---
 gcc/omp-expand.c  | 11 
 libgomp/testsuite/libgomp.c/pr81778.c | 48 +++
 2 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 99cb4f9dda4..80e35ac0294 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -6307,6 +6307,7 @@ expand_omp_simd (struct omp_region *region, struct 
omp_for_data *fd)
   n2 = OMP_CLAUSE_DECL (innerc);
 }
   tree step = fd->loop.step;
+  tree orig_step = step; /* May be different from step if is_simt.  */
 
   bool is_simt = omp_find_clause (gimple_omp_for_clauses (fd->for_stmt),
  OMP_CLAUSE__SIMT_);
@@ -6455,7 +6456,7 @@ expand_omp_simd (struct omp_region *region, struct 
omp_for_data *fd)
   tree altv = NULL_TREE, altn2 = NULL_TREE;
   if (fd->collapse == 1
   && !broken_loop
-  && TREE_CODE (fd->loops[0].step) != INTEGER_CST)
+  && TREE_CODE (orig_step) != INTEGER_CST)
 {
   /* The vectorizer currently punts on loops with non-constant steps
 for the main IV (can't compute number of iterations and gives up
@@ -6471,7 +6472,7 @@ expand_omp_simd (struct omp_region *region, struct 
omp_for_data *fd)
itype = signed_type_for (itype);
   t = build_int_cst (itype, (fd->loop.cond_code == LT_EXPR ? -1 : 1));
   t = fold_build2 (PLUS_EXPR, itype,
-  fold_convert (itype, fd->loop.step), t);
+  fold_convert (itype, step), t);
   t = fold_build2 (PLUS_EXPR, itype, t, fold_convert (itype, n2));
   t = fold_build2 (MINUS_EXPR, itype, t,
   fold_convert (itype, fd->loop.v));
@@ -6479,10 +6480,10 @@ expand_omp_simd (struct omp_region *region, struct 
omp_for_data *fd)

Re: [PATCH 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Jonathan Wakely via Gcc-patches

On Fri, 2 Oct 2020 at 13:17, Alejandro Colomar  wrote:
>
> Signed-off-by: Alejandro Colomar 
>
> system_data_types.7: void *: Add info about generic function parameters and 
> return value
>
> Reported-by: Paul Eggert 
> Reported-by: David Laight 
> Signed-off-by: Alejandro Colomar 
>
> system_data_types.7: void *: Add info about pointer arithmetic
>
> Reported-by: Paul Eggert 
> Reported-by: David Laight 
> Signed-off-by: Alejandro Colomar 
>
> system_data_types.7: void *: Add Versions notes
>
> Compatibility between function pointers and void * hasn't always been so.
> Document when that was added to POSIX.
>
> Reported-by: Michael Kerrisk 
> Signed-off-by: Alejandro Colomar 
> ---
>  man7/system_data_types.7 | 80 +++-
>  1 file changed, 78 insertions(+), 2 deletions(-)
>
> diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
> index c82d3b388..277e05b12 100644
> --- a/man7/system_data_types.7
> +++ b/man7/system_data_types.7
> @@ -679,7 +679,6 @@ See also the
>  .I uintptr_t
>  and
>  .I void *
> -.\" TODO: Document void *
>  types in this page.
>  .RE
>  .\"- lconv /
> @@ -1780,7 +1779,6 @@ See also the
>  .I intptr_t
>  and
>  .I void *
> -.\" TODO: Document void *
>  types in this page.
>  .RE
>  .\"- va_list --/
> @@ -1814,6 +1812,84 @@ See also:
>  .BR va_copy (3),
>  .BR va_end (3)
>  .RE
> +.\"- void * ---/
> +.TP
> +.I void *
> +.RS
> +According to the C language standard,
> +a pointer to any object type may be converted to a pointer to
> +.I void
> +and back.
> +POSIX further requires that any pointer,
> +including pointers to functions,
> +may be converted to a pointer to
> +.I void
> +and back.
> +.PP
> +Conversions from and to any other pointer type are done implicitly,
> +not requiring casts at all.
> +Note that this feature prevents any kind of type checking:
> +the programmer should be careful not to cast a
> +.I void *
> +value to a type incompatible to that of the underlying data,
> +because that would result in undefined behavior.
> +.PP
> +This type is useful in function parameters and return value
> +to allow passing values of any type.
> +The function will usually use some mechanism to know
> +of which type the underlying data passed to the function really is.

This sentence seems clunky.

How about "The function will typically use some mechanism to know the
real type of the data being passed via a pointer to void."

An example of "some mechanism" might be useful, though I don't have
one to offer.
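
One possible illustration, not taken from the patch: the caller passes a
tag alongside the pointer, and the function uses the tag to choose the type
to convert back to.  The enum and function names here are invented for the
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag naming the real type behind the void pointer.  */
enum value_kind { KIND_INT, KIND_DOUBLE };

/* Read the object DATA points to as the type named by KIND and return it
   as a double.  Passing a KIND that does not match the underlying object
   would be undefined behavior, which is the hazard the page warns about.  */
static double
value_as_double (enum value_kind kind, const void *data)
{
  assert (data != NULL);
  switch (kind)
    {
    case KIND_INT:
      return (double) *(const int *) data;
    case KIND_DOUBLE:
      return *(const double *) data;
    }
  return 0.0;
}
```

The size argument of memcpy(3) and the comparison callback of qsort(3) play
a similar role: the extra information tells the callee how to interpret the
bytes behind the pointer.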

> +.PP
> +A value of this type can't be dereferenced,
> +as it would give a value of type
> +.I void
> +which is not possible.
> +Likewise, pointer arithmetic is not possible with this type.
> +However, in GNU C, poitner arithmetic is allowed

Typo: pointer


> +as an extension to the standard;
> +this is done by treating the size of a
> +.I void
> +or of a function as 1.
> +A consequence of this is that
> +.I sizeof
> +is also allowed on
> +.I void
> +and on function types, and returns 1.
> +.PP
> +The conversion specifier for
> +.I void *
> +for the
> +.BR printf (3)
> +and the
> +.BR scanf (3)
> +families of functions is
> +.BR p ;
> +resulting commonly in
> +.B %p
> +for printing
> +.I void *
> +values.

What does "resulting commonly in %p for printing void * values" mean?

Is this just explaining that the format specifier p is commonly used
as "%p" (but sometimes as e.g. "%20p") ?
I'm not sure the "resulting commonly ..." part adds anything of value,
rather than just being confusing.
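
For reference, a small sketch of both spellings.  The helper name
format_pointer is hypothetical, and the exact text produced by %p is
implementation-defined, so only the minimum field width is predictable:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format PTR into BUF, padded to at least WIDTH characters; a WIDTH of 0
   behaves like plain "%p", a WIDTH of 20 like "%20p".  Returns what
   snprintf returns: the length the full output needs, or a negative value
   on error.  */
static int
format_pointer (char *buf, size_t size, int width, void *ptr)
{
  assert (buf != NULL && size > 0);
  /* %p expects an argument of type void *.  */
  return snprintf (buf, size, "%*p", width, ptr);
}
```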

> +.PP
> +Versions:
> +The POSIX requirement about compatibility between
> +.I void *
> +and function pointers was added in
> +POSIX.1-2008 Technical Corrigendum 1 (2013).
> +.PP
> +Conforming to:
> +C99 and later; POSIX.1-2001 and later.
> +.PP
> +See also:
> +.BR malloc (3),
> +.BR memcmp (3),
> +.BR memcpy (3),
> +.BR memset (3)
> +.PP
> +See also the
> +.I intptr_t
> +and
> +.I uintptr_t
> +types in this page.
> +.RE
>  .\"/
>  .SH NOTES
>  The structures described in this manual page shall contain,
> --
> 2.28.0
>

PING^3 [PATCH] x86: Inline strncmp only with -minline-all-stringops

2020-10-02 Thread H.J. Lu via Gcc-patches

On Wed, Sep 16, 2020 at 10:07 PM H.J. Lu  wrote:
>
> On Wed, Aug 19, 2020 at 6:25 AM H.J. Lu  wrote:
> >
> > On Wed, Jul 15, 2020 at 10:42:27AM -0700, H.J. Lu wrote:
> > > Expand strncmp to "repz cmpsb" only with -minline-all-stringops since
> > > "repz cmpsb" can be much slower than strncmp function implemented with
> > > vector instructions, see
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > >
> > > gcc/
> > >
> > >   PR target/95458
> > >   * config/i386/i386-expand.c (ix86_expand_cmpstrn_or_cmpmem):
> > >   Return false for -mno-inline-all-stringops.
> > >
> > > gcc/testsuite/
> > >
> > >   PR target/95458
> > >   * gcc.target/i386/pr95458-1.c: New test.
> > >   * gcc.target/i386/pr95458-2.c: Likewise.
> >
> > Expand strncmp to "repz cmpsb" only with -minline-all-stringops since
> > "repz cmpsb" can be much slower than strncmp function implemented with
> > vector instructions, see
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> >
> > gcc/
> >
> > PR target/95458
> > * config/i386/i386.md (cmpstrnsi): Expand only with
> > TARGET_INLINE_ALL_STRINGOPS.
> >
> > gcc/testsuite/
> >
> > PR target/95458
> > * gcc.target/i386/pr95458-1.c: New test.
> > * gcc.target/i386/pr95458-2.c: Likewise.
> > ---
> >  gcc/config/i386/i386.md   |  8 +++-
> >  gcc/testsuite/gcc.target/i386/pr95458-1.c | 11 +++
> >  gcc/testsuite/gcc.target/i386/pr95458-2.c |  7 +++
> >  3 files changed, 25 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95458-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr95458-2.c
> >
> > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > index fb677e17817..f3fbed81c4a 100644
> > --- a/gcc/config/i386/i386.md
> > +++ b/gcc/config/i386/i386.md
> > @@ -18007,7 +18007,13 @@ (define_expand "cmpstrnsi"
> >  {
> >rtx addr1, addr2, countreg, align, out;
> >
> > -  if (optimize_insn_for_size_p () && !TARGET_INLINE_ALL_STRINGOPS)
> > +  /* Expand strncmp only with -minline-all-stringops since
> > + "repz cmpsb" can be much slower than strncmp functions
> > + implemented with vector instructions, see
> > +
> > + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
> > +   */
> > +  if (!TARGET_INLINE_ALL_STRINGOPS)
> >  FAIL;
> >
> >/* Can't use this if the user has appropriated ecx, esi or edi.  */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr95458-1.c 
> > b/gcc/testsuite/gcc.target/i386/pr95458-1.c
> > new file mode 100644
> > index 000..231a4787dce
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr95458-1.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -minline-all-stringops" } */
> > +
> > +int
> > +func (char *d, unsigned int l)
> > +{
> > +  return __builtin_strncmp (d, "foo", l) ? 1 : 2;
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "call\[\\t \]*_?strncmp" } } */
> > +/* { dg-final { scan-assembler "cmpsb" } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/pr95458-2.c 
> > b/gcc/testsuite/gcc.target/i386/pr95458-2.c
> > new file mode 100644
> > index 000..1a620444770
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr95458-2.c
> > @@ -0,0 +1,7 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mno-inline-all-stringops" } */
> > +
> > +#include "pr95458-1.c"
> > +
> > +/* { dg-final { scan-assembler "call\[\\t \]*_?strncmp" } } */
> > +/* { dg-final { scan-assembler-not "cmpsb" } } */
> > --
> > 2.26.2
> >
>
> PING.

PING.

-- 
H.J.

PING: [PATCH] Use the section flag 'o' for __patchable_function_entries

2020-10-02 Thread H.J. Lu via Gcc-patches

On Thu, Feb 6, 2020 at 6:57 PM H.J. Lu  wrote:
>
> This commit in GNU binutils 2.35:
>
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b7d072167715829eed0622616f6ae0182900de3e
>
> added the section flag 'o' to .section directive:
>
> .section __patchable_function_entries,"awo",@progbits,foo
>
> which specifies the symbol name which the section references.  Assembler
> creates a unique __patchable_function_entries section with the section,
> where foo is defined, as its linked-to section.  Linker keeps a section
> if its linked-to section is kept during garbage collection.
>
> This patch checks assembler support for the section flag 'o' and uses
> it to implement __patchable_function_entries section.  Since Solaris may
> use GNU assembler with Solaris ld, GNU assembler support for the
> section flag 'o' doesn't mean that Solaris ld supports it.  This
> feature is disabled for Solaris targets.
>
> gcc/
>
> PR middle-end/93195
> PR middle-end/93197
> * configure.ac (HAVE_GAS_SECTION_LINK_ORDER): New.  Define if
> the assembler supports the section flag 'o' for specifying
> section with link-order.
> * dwarf2out.c (output_comdat_type_unit): Pass 0 as flags2
> to targetm.asm_out.named_section.
> * config/sol2.c (solaris_elf_asm_comdat_section): Likewise.
> * output.h (SECTION2_LINK_ORDER): New.
> (switch_to_section): Add an unsigned int argument.
> (default_no_named_section): Likewise.
> (default_elf_asm_named_section): Likewise.
> * target.def (asm_out.named_section): Likewise.
> * targhooks.c (default_print_patchable_function_entry): Pass
> current_function_decl to get_section and SECTION2_LINK_ORDER
> to switch_to_section.
> * varasm.c (default_no_named_section): Add an unsigned int
> argument.
> (default_elf_asm_named_section): Add an unsigned int argument,
> flags2.  Use 'o' flag for SECTION2_LINK_ORDER if assembler
> supports it.
> (switch_to_section): Add an unsigned int argument and pass it
> to targetm.asm_out.named_section.
> (handle_vtv_comdat_section): Pass 0 to
> targetm.asm_out.named_section.
> * config.in: Regenerated.
> * configure: Likewise.
> * doc/tm.texi: Likewise.
>
> gcc/testsuite/
>
> PR middle-end/93195
> * g++.dg/pr93195a.C: New test.
> * g++.dg/pr93195b.C: Likewise.
> * lib/target-supports.exp
> (check_effective_target_o_flag_in_section): New proc.

PING

https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539963.html

-- 
H.J.

PING: [PATCH] x86: Add

2020-10-02 Thread H.J. Lu via Gcc-patches

On Wed, Sep 23, 2020 at 10:58 AM H.J. Lu  wrote:
>
> For sources which can't use any vector instructions,  and
>  cannot be included for compiler intrinsics:
>
> $ echo "#include " | gcc -S -O2 -mno-sse -mno-mmx -x c -
> In file included from /usr/include/stdlib.h:1013,
>  from 
> /usr/lib/gcc/x86_64-redhat-linux/10/include/mm_malloc.h:27,
>  from 
> /usr/lib/gcc/x86_64-redhat-linux/10/include/xmmintrin.h:34,
>  from 
> /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:29,
>  from 
> /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32,
>  from :1:
> /usr/include/bits/stdlib-float.h: In function ‘atof’:
> /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE 
> disabled
>26 | {
>   | ^
> $
>
> libgcc/config/i386/shadow-stack-unwind.h has a workaround:
>
> /* NB: We need _get_ssp and _inc_ssp from .  But we can't
>include  which ends up including , which
>includes  and  unconditionally.  But we can't
>include any libc system headers unconditionally from libgcc.  Avoid
>including  here by defining _IMMINTRIN_H_INCLUDED.  */
>  #define _IMMINTRIN_H_INCLUDED
>  #include 
>  #undef _IMMINTRIN_H_INCLUDED
>
> Add a standalone intrinsic header file, , to provide
> integer only intrinsics.  All integer only intrinsics are placed in
> .   and  simply include
> .
>
> Add the FSF copyright to ,  and
> .
>
> gcc/
>
> PR target/97148
> * config.gcc (extra_headers): Add x86gprintrin.h.
> * config/i386/adxintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
> .
> * config/i386/bmi2intrin.h: Likewise.
> * config/i386/bmiintrin.h: Likewise.
> * config/i386/cetintrin.h: Likewise.
> * config/i386/cldemoteintrin.h: Likewise.
> * config/i386/clflushoptintrin.h: Likewise.
> * config/i386/clwbintrin.h: Likewise.
> * config/i386/fxsrintrin.h: Likewise.
> * config/i386/ia32intrin.h: Likewise.
> * config/i386/lwpintrin.h: Likewise.
> * config/i386/lzcntintrin.h: Likewise.
> * config/i386/movdirintrin.h: Likewise.
> * config/i386/pkuintrin.h: Likewise.
> * config/i386/rdseedintrin.h: Likewise.
> * config/i386/rtmintrin.h: Likewise.
> * config/i386/serializeintrin.h: Likewise.
> * config/i386/tbmintrin.h: Likewise.
> * config/i386/waitpkgintrin.h: Likewise.
> * config/i386/xsavecintrin.h: Likewise.
> * config/i386/xsaveintrin.h: Likewise.
> * config/i386/xsaveoptintrin.h: Likewise.
> * config/i386/xsavesintrin.h: Likewise.
> * config/i386/xtestintrin.h: Likewise.
> * config/i386/enqcmdintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
> .  Replace  with 
> in the error message.
> * config/i386/immintrin.h: Include  instead of
> , , ,
> , , ,
> , , , ,
> , ,
> , , ,
> , , ,
> , ,  and
> .
> (_wbinvd): Moved to config/i386/x86gprintrin.h.
> (_rdrand16_step): Likewise.
> (_rdrand32_step): Likewise.
> (_rdpid_u32): Likewise.
> (_readfsbase_u32): Likewise.
> (_readfsbase_u64): Likewise.
> (_readgsbase_u32): Likewise.
> (_readgsbase_u64): Likewise.
> (_writefsbase_u32): Likewise.
> (_writefsbase_u64): Likewise.
> (_writegsbase_u32): Likewise.
> (_writegsbase_u64): Likewise.
> (_rdrand64_step): Likewise.
> (_ptwrite64): Likewise.
> (_ptwrite32): Likewise.
> * config/i386/x86gprintrin.h: New file.
> * config/i386/pconfigintrin.h: Add the FSF copyright.  Check
> _X86GPRINTRIN_H_INCLUDED for .
> * config/i386/tsxldtrkintrin.h: Likewise.
> * config/i386/wbnoinvdintrin.h: Likewise.
> * config/i386/x86intrin.h: Include .  Don't
> include , , ,
> ,  and .
>
> gcc/testsuite/
>
> * gcc.target/i386/avx-1.c (__builtin_ia32_lwpval32): New to
> support  included in .
> (__builtin_ia32_lwpval64): Likewise.
> (__builtin_ia32_lwpins32): Likewise.
> (__builtin_ia32_lwpins64): Likewise.
> (__builtin_ia32_bextri_u32): New to support 
> included in .
> (__builtin_ia32_bextri_u64): Likewise.
> * gcc.target/i386/x86gprintrin-1.c: New test.
> * gcc.target/i386/x86gprintrin-2.c: Likewise.
> * gcc.target/i386/x86gprintrin-3.c: Likewise.
> * gcc.target/i386/x86gprintrin-4.c: Likewise.
> * gcc.target/i386/x86gprintrin-4a.c: Likewise.
> * gcc.target/i386/x86gprintrin-5.c: Likewise.
> * gcc.target/i386/x86gprintrin-5a.c: Likewise.
> * gcc.target/i386/x86gprintrin-5b.c: Likewise.
> * gcc.target/i386/x86gprintrin-6.c: Likewise.
>
> libgcc/
>
> PR target/97148
> * config/i386/shadow-stack-unwind.h: Include 
>

Re: [PATCH 4/6] ipa: Multiple predicates for loop properties, with frequencies

2020-10-02 Thread Martin Jambor

Hi,

On Wed, Sep 30 2020, Jan Hubicka wrote:
>> This patch enhances the ability of IPA to reason under what conditions
>> loops in a function have known iteration counts or strides because it
>> replaces single predicates which currently hold conjunction of
>> predicates for all loops with vectors capable of holding multiple
>> predicates, each with a cumulative frequency of loops with the
>> property.
>> 
>> This second property is then used by IPA-CP to much more aggressively
>> boost its heuristic score for cloning opportunities which make
>> iteration counts or strides of frequent loops compile time constant.
>> 
>> gcc/ChangeLog:
>> 
>> 2020-09-03  Martin Jambor  
>> 
>>  * ipa-fnsummary.h (ipa_freqcounting_predicate): New type.
>>  (ipa_fn_summary): Change the type of loop_iterations and loop_strides
>>  to vectors of ipa_freqcounting_predicate.
>>  (ipa_fn_summary::ipa_fn_summary): Construct the new vectors.
>>  (ipa_call_estimates): New fields loops_with_known_iterations and
>>  loops_with_known_strides.
>>  * ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus
>>  with the expected frequencies of loops with known iteration count or
>>  stride.
>>  * ipa-fnsummary.c (add_freqcounting_predicate): New function.
>>  (ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of
>>  just two predicates.
>>  (remap_hint_predicate_after_duplication): Replace with function
>>  remap_freqcounting_preds_after_dup.
>>  (ipa_fn_summary_t::duplicate): Use it or duplicate new vectors.
>>  (ipa_dump_fn_summary): Dump the new vectors.
>>  (analyze_function_body): Compute the loop property vectors.
>>  (ipa_call_context::estimate_size_and_time): Calculate also
>>  loops_with_known_iterations and loops_with_known_strides.  Adjusted
>>  dumping accordingly.
>>  (remap_hint_predicate): Replace with function
>>  remap_freqcounting_predicate.
>>  (ipa_merge_fn_summary_after_inlining): Use it.
>>  (inline_read_section): Stream loopcounting vectors instead of two
>>  simple predicates.
>>  (ipa_fn_summary_write): Likewise.
>>  * params.opt (ipa-max-loop-predicates): New parameter.
>>  * doc/invoke.texi (ipa-max-loop-predicates): Document new param.
>> 
>> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
>> index 6082f34d63f..94aa930 100644
>> --- a/gcc/ipa-fnsummary.c
>> +++ b/gcc/ipa-fnsummary.c
>> @@ -310,6 +310,36 @@ set_hint_predicate (predicate **p, predicate 
>> new_predicate)
>>  }
>>  }
>>  
>> +/* Find if NEW_PREDICATE is already in V and if so, increment its freq.
>> +   Otherwise add a new item to the vector with this predicate and freq 
>> equal
>> +   to add_freq, unless the number of predicates would exceed 
>> MAX_NUM_PREDICATES
>> +   in which case the function does nothing.  */
>> +
>> +static void
>> +add_freqcounting_predicate (vec **v,
>> +const predicate _predicate, sreal add_freq,
>> +unsigned max_num_predicates)
>> +{
>> +  if (new_predicate == false || new_predicate == true)
>> +return;
>> +  ipa_freqcounting_predicate *f;
>> +  for (int i = 0; vec_safe_iterate (*v, i, ); i++)
>> +if (new_predicate == f->predicate)
>> +  {
>> +f->freq += add_freq;
>> +return;
>> +  }
>> +  if (vec_safe_length (*v) >= max_num_predicates)
>> +/* Too many different predicates to account for.  */
>> +return;
>> +
>> +  ipa_freqcounting_predicate fcp;
>> +  fcp.predicate = NULL;
>> +  set_hint_predicate (, new_predicate);
>> +  fcp.freq = add_freq;
>> +  vec_safe_push (*v, fcp);
>> +  return;
>> +}
>>  
>>  /* Compute what conditions may or may not hold given information about
>> parameters.  RET_CLAUSE returns truths that may hold in a specialized 
>> copy,
>> @@ -710,13 +740,17 @@ ipa_call_summary::~ipa_call_summary ()
>>  
>>  ipa_fn_summary::~ipa_fn_summary ()
>>  {
>> -  if (loop_iterations)
>> -edge_predicate_pool.remove (loop_iterations);
>> -  if (loop_stride)
>> -edge_predicate_pool.remove (loop_stride);
>> +  unsigned len = vec_safe_length (loop_iterations);
>> +  for (unsigned i = 0; i < len; i++)
>> +edge_predicate_pool.remove ((*loop_iterations)[i].predicate);
>> +  len = vec_safe_length (loop_strides);
>> +  for (unsigned i = 0; i < len; i++)
>> +edge_predicate_pool.remove ((*loop_strides)[i].predicate);
>
> For edges predicates are pointers since most of them have no interesting
> predicate and thus NULL is more compact.  I guess here it would make
> sense to make predicates inline. Is there a problem with vectors not
> liking non-pods?
>>vec_free (conds);
>>vec_free (size_time_table);
>>vec_free (call_size_time_table);
>> +  vec_free (loop_iterations);
>> +  vec_free (loop_strides);
>
> However auto_vecs should work in the brave new C++ world.

Well, the summary lives in GC memory, so I don't think I can put
auto_vecs there.

I will add a note to

[PATCH v2 2/4] __int128.3: New link to system_data_types(7)

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man3/__int128.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/__int128.3

diff --git a/man3/__int128.3 b/man3/__int128.3
new file mode 100644
index 0..db50c0f09
--- /dev/null
+++ b/man3/__int128.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0

[PATCH v2 4/4] unsigned-__int128.3: New link to system_data_types(7)

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man3/unsigned-__int128.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/unsigned-__int128.3

diff --git a/man3/unsigned-__int128.3 b/man3/unsigned-__int128.3
new file mode 100644
index 0..db50c0f09
--- /dev/null
+++ b/man3/unsigned-__int128.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0

[PATCH v2 3/4] system_data_types.7: Add 'unsigned __int128'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index 5f9aa648f..3cf3f0ec9 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -1822,6 +1822,41 @@ and
 .I void *
 types in this page.
 .RE
+.\"- unsigned __int128 /
+.TP
+.I unsigned __int128
+.RS
+An unsigned integer type
+of a fixed width of exactly 128 bits.
+.PP
+When using GCC,
+it is supported only for targets where
+the compiler is able to generate efficient code for 128-bit arithmetic.
+.PP
+Versions:
+GCC 4.6.0 and later.
+.PP
+Conforming to:
+This is a non-standard extension, present in GCC.
+It is not standardized by the C language standard nor POSIX.
+.PP
+Notes:
+This type is available without including any header.
+.PP
+Bugs:
+It is not possible to express an integer constant of type
+.I unsigned __int128
+in implementations where
+.I unsigned long long
+is less than 128 bits wide.
+.PP
+See also the
+.IR __int128 ,
+.I uintmax_t
+and
+.IR uint N _t
+types in this page.
+.RE
 .\"- va_list --/
 .TP
 .I va_list
-- 
2.28.0

[PATCH v2 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 40 
 1 file changed, 40 insertions(+)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index e545aa1a0..5f9aa648f 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -40,6 +40,8 @@ system_data_types \- overview of system data types
 .\"* Description (no "Description" header)
 .\"A few lines describing the type.
 .\"
+.\"* Versions (optional)
+.\"
 .\"* Conforming to (see NOTES)
 .\"Format: CXY and later; POSIX.1- and later.
 .\"
@@ -48,6 +50,44 @@ system_data_types \- overview of system data types
 .\"* Bugs (if any)
 .\"
 .\"* See also
+.\"- __int128 -/
+.TP
+.I __int128
+.RS
+.RI [ signed ]
+.I __int128
+.PP
+A signed integer type
+of a fixed width of exactly 128 bits.
+.PP
+When using GCC,
+it is supported only for targets where
+the compiler is able to generate efficient code for 128-bit arithmetic.
+.PP
+Versions:
+GCC 4.6.0 and later.
+.PP
+Conforming to:
+This is a non-standard extension, present in GCC.
+It is not standardized by the C language standard nor POSIX.
+.PP
+Notes:
+This type is available without including any header.
+.PP
+Bugs:
+It is not possible to express an integer constant of type
+.I __int128
+in implementations where
+.I long long
+is less than 128 bits wide.
+.PP
+See also the
+.IR intmax_t ,
+.IR int N _t
+and
+.I unsigned __int128
+types in this page.
+.RE
 .\"- aiocb /
 .TP
 .I aiocb
-- 
2.28.0
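
To make the Bugs note concrete: a 128-bit value can still be assembled from
two 64-bit halves, and printed by hand, since printf(3) has no conversion
for it.  This sketch assumes GCC or a compatible compiler on a target that
provides unsigned __int128; the typedef and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Build a 128-bit value from its high and low 64-bit halves, since a
   single integer constant cannot express it.  */
static u128
make_u128 (uint64_t hi, uint64_t lo)
{
  return ((u128) hi << 64) | lo;
}

static uint64_t
u128_hi (u128 v)
{
  return (uint64_t) (v >> 64);
}

static uint64_t
u128_lo (u128 v)
{
  return (uint64_t) v;
}

/* Write the decimal form of V into BUF (at least 40 bytes: up to 39
   digits plus the terminating null) and return a pointer to the first
   digit, which lies inside BUF.  */
static char *
u128_to_str (u128 v, char buf[40])
{
  char *p;

  assert (buf != NULL);
  p = buf + 39;
  *p = '\0';
  do
    {
      *--p = (char) ('0' + (int) (v % 10));
      v /= 10;
    }
  while (v != 0);
  return p;
}
```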

[PATCH v2 0/4] Document 128-bit types

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Michael,

I fixed the stray '"' noticed by Florian.

Cheers,

Alex

Alejandro Colomar (4):
  system_data_types.7: Add '__int128'
  __int128.3: New link to system_data_types(7)
  system_data_types.7: Add 'unsigned __int128'
  unsigned-__int128.3: New link to system_data_types(7)

 man3/__int128.3  |  1 +
 man3/unsigned-__int128.3 |  1 +
 man7/system_data_types.7 | 75 
 3 files changed, 77 insertions(+)
 create mode 100644 man3/__int128.3
 create mode 100644 man3/unsigned-__int128.3

-- 
2.28.0

[PATCH 2/2] void.3: New link to system_data_types(7)

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 
---
 man3/void.3 | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 man3/void.3

diff --git a/man3/void.3 b/man3/void.3
new file mode 100644
index 0..db50c0f09
--- /dev/null
+++ b/man3/void.3
@@ -0,0 +1 @@
+.so man7/system_data_types.7
-- 
2.28.0

[PATCH 1/2] system_data_types.7: Add 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add info about generic function parameters and 
return value

Reported-by: Paul Eggert 
Reported-by: David Laight 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add info about pointer arithmetic

Reported-by: Paul Eggert 
Reported-by: David Laight 
Signed-off-by: Alejandro Colomar 

system_data_types.7: void *: Add Versions notes

Function pointers have not always been convertible to and from void *.
Document when that was added to POSIX.

Reported-by: Michael Kerrisk 
Signed-off-by: Alejandro Colomar 
---
 man7/system_data_types.7 | 80 +++-
 1 file changed, 78 insertions(+), 2 deletions(-)

diff --git a/man7/system_data_types.7 b/man7/system_data_types.7
index c82d3b388..277e05b12 100644
--- a/man7/system_data_types.7
+++ b/man7/system_data_types.7
@@ -679,7 +679,6 @@ See also the
 .I uintptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"- lconv /
@@ -1780,7 +1779,6 @@ See also the
 .I intptr_t
 and
 .I void *
-.\" TODO: Document void *
 types in this page.
 .RE
 .\"- va_list --/
@@ -1814,6 +1812,84 @@ See also:
 .BR va_copy (3),
 .BR va_end (3)
 .RE
+.\"- void * ---/
+.TP
+.I void *
+.RS
+According to the C language standard,
+a pointer to any object type may be converted to a pointer to
+.I void
+and back.
+POSIX further requires that any pointer,
+including pointers to functions,
+may be converted to a pointer to
+.I void
+and back.
+.PP
+Conversions from and to any other pointer type are done implicitly,
+not requiring casts at all.
+Note that this feature prevents any kind of type checking:
+the programmer should be careful not to cast a
+.I void *
+value to a type incompatible with that of the underlying data,
+because that would result in undefined behavior.
+.PP
+This type is useful in function parameters and return values
+to allow passing values of any type.
+The function will usually use some mechanism to know
+the actual type of the underlying data passed to it.
+.PP
+A value of this type can't be dereferenced,
+as it would give a value of type
+.I void
+which is not possible.
+Likewise, pointer arithmetic is not possible with this type.
+However, in GNU C, pointer arithmetic is allowed
+as an extension to the standard;
+this is done by treating the size of a
+.I void
+or of a function as 1.
+A consequence of this is that
+.I sizeof
+is also allowed on
+.I void
+and on function types, and returns 1.
+.PP
+The conversion specifier for
+.I void *
+for the
+.BR printf (3)
+and the
+.BR scanf (3)
+families of functions is
+.BR p ;
+resulting commonly in
+.B %p
+for printing
+.I void *
+values.
+.PP
+Versions:
+The POSIX requirement about compatibility between
+.I void *
+and function pointers was added in
+POSIX.1-2008 Technical Corrigendum 1 (2013).
+.PP
+Conforming to:
+C99 and later; POSIX.1-2001 and later.
+.PP
+See also:
+.BR malloc (3),
+.BR memcmp (3),
+.BR memcpy (3),
+.BR memset (3)
+.PP
+See also the
+.I intptr_t
+and
+.I uintptr_t
+types in this page.
+.RE
 .\"/
 .SH NOTES
 The structures described in this manual page shall contain,
-- 
2.28.0
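
To make the description in the patch above concrete, here is a small sketch
of a function that takes its data through a void * parameter and relies on a
separate tag to know the real type of the underlying data.  The names
elem_type and sum_array are hypothetical, chosen only for illustration; the
conversions from const void * need no casts because this is C (C++ would
require them).

```c
#include <assert.h>
#include <stddef.h>

/* Tag telling the function what the void * really points to.  */
enum elem_type { ELEM_INT, ELEM_DOUBLE };

/* Sum the N elements of the array at DATA, whose element type is TYPE.  */
static double
sum_array (const void *data, size_t n, enum elem_type type)
{
  double sum = 0.0;

  assert (data != NULL || n == 0);

  if (type == ELEM_INT)
    {
      const int *p = data;	/* Implicit conversion, no cast needed.  */
      for (size_t i = 0; i < n; i++)
	sum += p[i];
    }
  else
    {
      const double *p = data;	/* Likewise.  */
      for (size_t i = 0; i < n; i++)
	sum += p[i];
    }

  return sum;
}
```

Passing ELEM_DOUBLE for an array of int would compile without any
diagnostic and then misbehave at run time, which is the lack of type
checking the manual page text warns about.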

[PATCH 0/2] Document 'void *'

2020-10-02 Thread Alejandro Colomar via Gcc-patches

Hi Michael,

As you asked, I squashed.
And added the POSIX.1-2008 note too. Thanks for that!

Cheers,

Alex

Alejandro Colomar (2):
  system_data_types.7: Add 'void *'
  void.3: New link to system_data_types(7)

 man3/void.3  |  1 +
 man7/system_data_types.7 | 80 +++-
 2 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 man3/void.3

-- 
2.28.0

c++: Simplify __FUNCTION__ creation

2020-10-02 Thread Nathan Sidwell


I had reason to wander into cp_make_fname, and noticed it's the only
caller of cp_fname_init.  Folding it in makes the code simpler.

gcc/cp/
* cp-tree.h (cp_fname_init): Delete declaration.
* decl.c (cp_fname_init): Merge into only caller ...
(cp_make_fname): ... here & refactor.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 3ccd54ce24b..aa93b11b91f 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -6520,7 +6520,6 @@ extern tree create_implicit_typedef		(tree, tree);
 extern int local_variable_p			(const_tree);
 extern tree register_dtor_fn			(tree);
 extern tmpl_spec_kind current_tmpl_spec_kind	(int);
-extern tree cp_fname_init			(const char *, tree *);
 extern tree cxx_builtin_function		(tree decl);
 extern tree cxx_builtin_function_ext_scope	(tree decl);
 extern tree cxx_simulate_builtin_function_decl	(tree);
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index d2a8d4012ab..6b306ee4667 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -4592,38 +4592,6 @@ cxx_init_decl_processing (void)
 using_eh_for_cleanups ();
 }
 
-/* Generate an initializer for a function naming variable from
-   NAME. NAME may be NULL, to indicate a dependent name.  TYPE_P is
-   filled in with the type of the init.  */
-
-tree
-cp_fname_init (const char* name, tree *type_p)
-{
-  tree domain = NULL_TREE;
-  tree type;
-  tree init = NULL_TREE;
-  size_t length = 0;
-
-  if (name)
-{
-  length = strlen (name);
-  domain = build_index_type (size_int (length));
-  init = build_string (length + 1, name);
-}
-
-  type = cp_build_qualified_type (char_type_node, TYPE_QUAL_CONST);
-  type = build_cplus_array_type (type, domain);
-
-  *type_p = type;
-
-  if (init)
-TREE_TYPE (init) = type;
-  else
-init = error_mark_node;
-
-  return init;
-}
-
 /* Create the VAR_DECL for __FUNCTION__ etc. ID is the name to give
the decl, LOC is the location to give the decl, NAME is the
initialization string and TYPE_DEP indicates whether NAME depended
@@ -4634,31 +4602,45 @@ cp_fname_init (const char* name, tree *type_p)
 static tree
 cp_make_fname_decl (location_t loc, tree id, int type_dep)
 {
-  const char * name = NULL;
-  bool release_name = false;
+  tree domain = NULL_TREE;
+  tree init = NULL_TREE;
+
   if (!(type_dep && in_template_function ()))
 {
+  const char *name = NULL;
+  bool release_name = false;
+
   if (current_function_decl == NULL_TREE)
 	name = "top level";
-  else if (type_dep == 1) /* __PRETTY_FUNCTION__ */
-	name = cxx_printable_name (current_function_decl, 2);
-  else if (type_dep == 0) /* __FUNCTION__ */
+  else if (type_dep == 0)
 	{
+	  /* __FUNCTION__ */	  
 	  name = fname_as_string (type_dep);
 	  release_name = true;
 	}
   else
-	gcc_unreachable ();
+	{
+	  /* __PRETTY_FUNCTION__ */
+	  gcc_checking_assert (type_dep == 1);
+	  name = cxx_printable_name (current_function_decl, 2);
+	}
+
+  size_t length = strlen (name);
+  domain = build_index_type (size_int (length));
+  init = build_string (length + 1, name);
+  if (release_name)
+	free (const_cast<char *> (name));
 }
-  tree type;
-  tree init = cp_fname_init (name, &type);
-  tree decl = build_decl (loc, VAR_DECL, id, type);
 
-  if (release_name)
-free (CONST_CAST (char *, name));
+  tree type = cp_build_qualified_type (char_type_node, TYPE_QUAL_CONST);
+  type = build_cplus_array_type (type, domain);
 
-  /* As we're using pushdecl_with_scope, we must set the context.  */
-  DECL_CONTEXT (decl) = current_function_decl;
+  if (init)
+TREE_TYPE (init) = type;
+  else
+init = error_mark_node;
+
+  tree decl = build_decl (loc, VAR_DECL, id, type);
 
   TREE_READONLY (decl) = 1;
   DECL_ARTIFICIAL (decl) = 1;
@@ -4667,13 +4649,10 @@ cp_make_fname_decl (location_t loc, tree id, int type_dep)
 
   TREE_USED (decl) = 1;
 
-  if (init)
-{
-  SET_DECL_VALUE_EXPR (decl, init);
-  DECL_HAS_VALUE_EXPR_P (decl) = 1;
-  /* For decl_constant_var_p.  */
-  DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = 1;
-}
+  SET_DECL_VALUE_EXPR (decl, init);
+  DECL_HAS_VALUE_EXPR_P (decl) = 1;
+  /* For decl_constant_var_p.  */
+  DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (decl) = 1;
 
   if (current_function_decl)
 {
@@ -4685,7 +4664,7 @@ cp_make_fname_decl (location_t loc, tree id, int type_dep)
   else
 {
   DECL_THIS_STATIC (decl) = true;
-  pushdecl_top_level_and_finish (decl, NULL_TREE);
+  decl = pushdecl_top_level_and_finish (decl, NULL_TREE);
 }
 
   return decl;
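
The patch above builds the array type with an index domain of 0..length and
a string of length + 1 bytes, so the variable holds the name plus its
terminating NUL.  The same sizing is visible from user code.  The sketch
below uses C's standard __func__ identifier rather than the C++ front end
itself, and func_name_size is a hypothetical helper, not something from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the size in bytes of this function's own __func__ array.  */
static size_t
func_name_size (void)
{
  /* __func__ acts as a static const char array holding the function
     name followed by a NUL.  */
  assert (strcmp (__func__, "func_name_size") == 0);
  return sizeof __func__;
}
```

The result is strlen ("func_name_size") + 1, matching the length + 1 passed
to build_string in the patch.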

Re: [PATCH 1/6] ipa: Bundle vectors describing argument values

2020-10-02 Thread Jan Hubicka

> 2020-09-01  Martin Jambor  
> 
>   * ipa-prop.h (ipa_auto_call_arg_values): New type.
>   (class ipa_call_arg_values): Likewise.
>   (ipa_get_indirect_edge_target): Replaced vector arguments with
>   ipa_call_arg_values in declaration.  Added an overload for
>   ipa_auto_call_arg_values.
>   * ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
>   m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
>   new members m_avals, store_to_cache and equivalent_to_p.  Adjusted
>   constructor arguments.
>   (estimate_ipcp_clone_size_and_time): Replaced vector arguments
>   with ipa_auto_call_arg_values in declaration.
>   (evaluate_properties_for_edge): Likewise.
>   * ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
>   ipa_call_arg_values rather than on separate vectors.  Added an
>   overload for ipa_auto_call_arg_values.
>   (devirtualization_time_bonus): Adjusted to work on
>   ipa_auto_call_arg_values rather than on separate vectors.
>   (gather_context_independent_values): Adjusted to work on
>   ipa_auto_call_arg_values rather than on separate vectors.
>   (perform_estimation_of_a_value): Likewise.
>   (estimate_local_effects): Likewise.
>   (modify_known_vectors_with_val): Adjusted both variants to work on
>   ipa_auto_call_arg_values and rename them to
>   copy_known_vectors_add_val.
>   (decide_about_value): Adjusted to work on ipa_call_arg_values rather
>   than on separate vectors.
>   (decide_whether_version_node): Likewise.
>   * ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
>   (evaluate_properties_for_edge): Likewise.
>   (ipa_fn_summary_t::duplicate): Likewise.
>   (estimate_edge_devirt_benefit): Adjusted to work on
>   ipa_call_arg_values rather than on separate vectors.
>   (estimate_edge_size_and_time): Likewise.
>   (estimate_calls_size_and_time_1): Likewise.
>   (summarize_calls_size_and_time): Adjusted calls to
>   estimate_edge_size_and_time.
>   (estimate_calls_size_and_time): Adjusted to work on
>   ipa_call_arg_values rather than on separate vectors.
>   (ipa_call_context::ipa_call_context): Construct from a pointer to
>   ipa_auto_call_arg_values instead of individual vectors.
>   (ipa_call_context::duplicate_from): Adjusted to access vectors within
>   m_avals.
>   (ipa_call_context::release): Likewise.
>   (ipa_call_context::equal_to): Likewise.
>   (ipa_call_context::estimate_size_and_time): Adjusted to work on
>   ipa_call_arg_values rather than on separate vectors.
>   (estimate_ipcp_clone_size_and_time): Adjusted to work with
>   ipa_auto_call_arg_values rather than on separate vectors.
>   (ipa_merge_fn_summary_after_inlining): Likewise.  Adjusted call to
>   estimate_edge_size_and_time.
>   (ipa_update_overall_fn_summary): Adjusted call to
>   estimate_edge_size_and_time.
>   * ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with
>   ipa_auto_call_arg_values rather than with separate vectors.
>   (do_estimate_edge_size): Likewise.
>   (do_estimate_edge_hints): Likewise.
>   * ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
>   New destructor.

> @@ -328,14 +326,9 @@ private:
>/* Inline summary maintains info about change probabilities.  */
>vec m_inline_param_summary;
>  
> -  /* The following is used only to resolve indirect calls.  */
> -
> -  /* Vector describing known values of parameters.  */
> -  vec m_known_vals;
> -  /* Vector describing known polymorphic call contexts.  */
> -  vec m_known_contexts;
> -  /* Vector describing known aggregate values.  */
> -  vec m_known_aggs;
> +  /* Even after having calculated clauses, the information about argument
> + values is used to resolve indirect calls.  */
> +  ipa_call_arg_values m_avals;

In cached context we keep the vectors populated only if they are going
to be used by the predicates.  Is this preserved?

Otherwise the patch looks OK.  It would be nice to test it by building a
bigger project and see how memory allocation is affected by your change.

Honza

Re: [PATCH 1/4] system_data_types.7: Add '__int128'

2020-10-02 Thread Florian Weimer via Gcc-patches

* Alejandro Colomar via Libc-alpha:

> +the compiler is able to generate efficient code for 128-bit arithmetic".

Stray "?

> +.PP
> +See also the
> +.IR intmax_t ,
> +.IR int N _t

I think it's surprising that intmax_t can be smaller than __int128 (and
usually is, I think), so that's probably worth mentioning.  Assuming
that we want to document __int128 at all as part of the man-pages
project.

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
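
Florian's point that intmax_t can be narrower than __int128 can be checked
directly.  The sketch below is an illustration only: fits_in_intmax is a
hypothetical helper, and the comment about INTMAX_MAX + 1 assumes a target
where intmax_t is 64 bits wide, as is usual with GCC on x86-64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The helper below is only meaningful if intmax_t is no wider than
   __int128.  */
static_assert (sizeof (intmax_t) <= sizeof (__int128),
	       "intmax_t is wider than __int128");

/* Return true if V can be converted to intmax_t without changing its
   value.  The comparisons are carried out in __int128.  */
static bool
fits_in_intmax (__int128 v)
{
  return v >= INTMAX_MIN && v <= INTMAX_MAX;
}
```

On such a target fits_in_intmax ((__int128) INTMAX_MAX + 1) is false, so
the name "maximum-width integer type" does not cover __int128.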

Re: GCC 10 backports

2020-10-02 Thread Martin Liška


On 10/1/20 9:18 PM, Martin Liška wrote:

I'm going to install the following 3 tested backports.

Martin


... and one more.

Martin  
From f97ef0b2dfdad07db3d564b932c7b7373654b7d4 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 25 Sep 2020 10:53:26 +0200
Subject: [PATCH] GCOV: do not mangle .gcno files.

gcc/ChangeLog:

	PR gcov-profile/97193
	* coverage.c (coverage_init): GCDA note files should not be
	mangled and should end in output directory.

(cherry picked from commit f8dcbea5d2fb17dca3a7de97f15fc49997222365)
---
 gcc/coverage.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 7d82e44c152..38820bc170f 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1200,6 +1200,8 @@ coverage_obj_finish (vec *ctor)
 void
 coverage_init (const char *filename)
 {
+  const char *original_filename = filename;
+  int original_len = strlen (original_filename);
 #if HAVE_DOS_BASED_FILE_SYSTEM
   const char *separator = "\\";
 #else
@@ -1271,9 +1273,9 @@ coverage_init (const char *filename)
 	bbg_file_name = xstrdup (profile_note_location);
   else
 	{
-	  bbg_file_name = XNEWVEC (char, len + strlen (GCOV_NOTE_SUFFIX) + 1);
-	  memcpy (bbg_file_name, filename, len);
-	  strcpy (bbg_file_name + len, GCOV_NOTE_SUFFIX);
+	  bbg_file_name = XNEWVEC (char, original_len + strlen (GCOV_NOTE_SUFFIX) + 1);
+	  memcpy (bbg_file_name, original_filename, original_len);
+	  strcpy (bbg_file_name + original_len, GCOV_NOTE_SUFFIX);
 	}
 
   if (!gcov_open (bbg_file_name, -1))
-- 
2.28.0
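
The three changed lines in the patch above follow a common pattern: allocate
room for the name, the suffix and the NUL, copy the name, then append the
suffix.  A standalone sketch of that pattern is below.  It uses plain malloc
instead of GCC's XNEWVEC, a local NOTE_SUFFIX macro standing in for
GCOV_NOTE_SUFFIX, and a hypothetical helper name make_note_name; none of
these come from coverage.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for GCOV_NOTE_SUFFIX, assumed here to be ".gcno".  */
#define NOTE_SUFFIX ".gcno"

/* Return a newly allocated string made of ORIGINAL_FILENAME followed by
   NOTE_SUFFIX, or NULL if allocation fails.  The caller frees it.  */
static char *
make_note_name (const char *original_filename)
{
  assert (original_filename != NULL);

  size_t original_len = strlen (original_filename);
  char *name = malloc (original_len + strlen (NOTE_SUFFIX) + 1);

  if (name == NULL)
    return NULL;

  /* Copy the unmangled name, then append the suffix and its NUL.  */
  memcpy (name, original_filename, original_len);
  strcpy (name + original_len, NOTE_SUFFIX);
  return name;
}
```

The point of the fix is only which string is used as input: the name is
taken from the filename as passed in, before any mangling is applied to it.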

Re: GCC 9 backports

2020-10-02 Thread Martin Liška


On 10/2/20 12:05 PM, Martin Liška wrote:

There are 2 more I've just tested.

Martin


and one more.

Martin
From e204fd5113a4ae92713442555ab4258abd84942a Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Fri, 25 Sep 2020 10:53:26 +0200
Subject: [PATCH] GCOV: do not mangle .gcno files.

gcc/ChangeLog:

	PR gcov-profile/97193
	* coverage.c (coverage_init): GCDA note files should not be
	mangled and should end in output directory.

(cherry picked from commit f8dcbea5d2fb17dca3a7de97f15fc49997222365)
---
 gcc/coverage.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 9be446a862d..c442e2fb008 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1201,6 +1201,8 @@ coverage_obj_finish (vec *ctor)
 void
 coverage_init (const char *filename)
 {
+  const char *original_filename = filename;
+  int original_len = strlen (original_filename);
 #if HAVE_DOS_BASED_FILE_SYSTEM
   const char *separator = "\\";
 #else
@@ -1255,9 +1257,9 @@ coverage_init (const char *filename)
   /* Name of bbg file.  */
   if (flag_test_coverage && !flag_compare_debug)
 {
-  bbg_file_name = XNEWVEC (char, len + strlen (GCOV_NOTE_SUFFIX) + 1);
-  memcpy (bbg_file_name, filename, len);
-  strcpy (bbg_file_name + len, GCOV_NOTE_SUFFIX);
+  bbg_file_name = XNEWVEC (char, original_len + strlen (GCOV_NOTE_SUFFIX) + 1);
+  memcpy (bbg_file_name, original_filename, original_len);
+  strcpy (bbg_file_name + original_len, GCOV_NOTE_SUFFIX);
 
   if (!gcov_open (bbg_file_name, -1))
 	{
-- 
2.28.0

[Patch] libgomp: Add, if existing, -latomic to libgomp.spec --as-needed (was: Re: [RFC] Offloading and automatic linking of libraries)

2020-10-02 Thread Tobias Burnus


On 10/2/20 12:55 AM, Joseph Myers wrote:


As discussed in bug 81358, I think --as-needed -latomic --no-as-needed
should be used by the driver by default (when the compiler is configured
with libatomic supported).


I initially made a thinko by believing that '-latomic' had to
be specified by the user-invoked compiler via '-foffload=-latomic'
and that this had to be put into the .spec file, causing all kinds
of problems.

However, this flag can be added into the offload target's libgomp.spec,
which avoids all kinds of issues. That's what this patch now does.

I tested it with x86_64-gnu-linux w/o + w/ nvptx-none. Result:
* x86_64-gnu-linux's libgomp.spec:
  "*link_gomp: -lgomp %{static: -ldl } --as-needed -latomic --no-as-needed"
* nvptx-none's libgomp.spec:
  "*link_gomp: -lgomp  -latomic"

On x86-64, a simple test program did not use and also did not link libatomic.

OK?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
libgomp: Add, if existing, -latomic to libgomp.spec --as-needed

libgomp/ChangeLog:

	* acinclude.m4 (LIBGOMP_CHECK_LIBATOMIC): New; set
	@LIBATOMICSPEC@ if the target libatomic is built.
	* configure.ac: Call LIBGOMP_CHECK_LIBATOMIC.
	* libgomp.spec.in: Add @LIBATOMICSPEC@.
	* Makefile.in: Regenerate.
	* configure: Regenerate.
	* testsuite/Makefile.in: Regenerate.

 libgomp/Makefile.in   |   1 +
 libgomp/acinclude.m4  |  63 ++
 libgomp/configure | 100 +-
 libgomp/configure.ac  |   2 +
 libgomp/libgomp.spec.in   |   2 +-
 libgomp/testsuite/Makefile.in |   1 +
 6 files changed, 166 insertions(+), 3 deletions(-)

diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 00d5e2919ee..a8ec69f1822 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -395,6 +395,7 @@ INSTALL_STRIP_PROGRAM = @INSTALL_STRIP_PROGRAM@
 INTPTR_T_KIND = @INTPTR_T_KIND@
 LD = @LD@
 LDFLAGS = @LDFLAGS@
+LIBATOMICSPEC = @LIBATOMICSPEC@
 LIBOBJS = @LIBOBJS@
 LIBS = @LIBS@
 LIBTOOL = @LIBTOOL@
diff --git a/libgomp/acinclude.m4 b/libgomp/acinclude.m4
index dbf54d06db9..3d7e5e08c3a 100644
--- a/libgomp/acinclude.m4
+++ b/libgomp/acinclude.m4
@@ -365,3 +365,66 @@ if test $enable_symvers != no ; then
 esac
 fi
 ])
+
+dnl Check whether libatomic exists
+AC_DEFUN([LIBGOMP_CHECK_LIBATOMIC], [
+  LIBATOMICSPEC=
+  libgomp_libatomic=no
+
+  if echo " ${TARGET_CONFIGDIRS} " | grep " libatomic " > /dev/null 2>&1 ; then
+libgomp_libatomic=yes;
+  fi
+
+  AC_MSG_CHECKING([for target-libatomic support])
+  AC_MSG_RESULT([$libgomp_libatomic])
+
+  if test "x$libgomp_libatomic" = xyes; then
+dnl Check whether -Wl,--as-needed resp. -Wl,-zignore is supported
+dnl
+dnl Turn warnings into error to avoid testsuite breakage.  So enable
+dnl AC_LANG_WERROR, but there's currently (autoconf 2.64) no way to turn
+dnl it off again.  As a workaround, save and restore werror flag like
+dnl AC_PATH_XTRA.
+dnl Cf. http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01889.html
+ac_xsave_[]_AC_LANG_ABBREV[]_werror_flag=$ac_[]_AC_LANG_ABBREV[]_werror_flag
+AC_CACHE_CHECK([whether --as-needed/-z ignore works],
+  [libgomp_cv_have_as_needed],
+  [
+  # Test for native Solaris options first.
+  # No whitespace after -z to pass it through -Wl.
+  libgomp_cv_as_needed_option="-zignore"
+  libgomp_cv_no_as_needed_option="-zrecord"
+  save_LDFLAGS="$LDFLAGS"
+  LDFLAGS="$LDFLAGS -Wl,$libgomp_cv_as_needed_option -lm -Wl,$libgomp_cv_no_as_needed_option"
+  libgomp_cv_have_as_needed=no
+  AC_LANG_WERROR
+  AC_LINK_IFELSE([AC_LANG_PROGRAM([])],
+		 [libgomp_cv_have_as_needed=yes],
+		 [libgomp_cv_have_as_needed=no])
+  LDFLAGS="$save_LDFLAGS"
+  if test "x$libgomp_cv_have_as_needed" = xno; then
+	libgomp_cv_as_needed_option="--as-needed"
+	libgomp_cv_no_as_needed_option="--no-as-needed"
+	save_LDFLAGS="$LDFLAGS"
+	LDFLAGS="$LDFLAGS -Wl,$libgomp_cv_as_needed_option -lm -Wl,$libgomp_cv_no_as_needed_option"
+	libgomp_cv_have_as_needed=no
+	AC_LANG_WERROR
+	AC_LINK_IFELSE([AC_LANG_PROGRAM([])],
+		   [libgomp_cv_have_as_needed=yes],
+		   [libgomp_cv_have_as_needed=no])
+	LDFLAGS="$save_LDFLAGS"
+  fi
+  ac_[]_AC_LANG_ABBREV[]_werror_flag=$ac_xsave_[]_AC_LANG_ABBREV[]_werror_flag
+])
+
+dnl Depend on libatomic only if needed.
+if test "x$libgomp_cv_have_as_needed" = xyes; then
+  LIBATOMICSPEC="$libgomp_cv_as_needed_option -latomic $libgomp_cv_no_as_needed_option"
+else
+  LIBATOMICSPEC="-latomic"
+fi
+  fi
+
+  dnl For the spec file
+  AC_SUBST(LIBATOMICSPEC)
+])
diff --git a/libgomp/configure b/libgomp/configure
index e48371d5093..a4d93974084 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -630,6 +630,8 @@ ac_includes_default="\

[committed] aarch64: Remove aarch64_sve_pred_dominates_p

2020-10-02 Thread Richard Sandiford via Gcc-patches

In r11-2922, Przemek fixed a post-RA instruction match failure
caused by the SVE FP subtraction patterns.  This patch applies
the same fix to the other patterns.

To recap, the issue is around the handling of predication.
We want to do two things:

- Optimise cases in which a predicate is known to be all-true.

- Differentiate cases in which the predicate on an _x ACLE function has
  to be kept as-is from cases in which we can make more lanes active.
  The former is true by default, the latter is true for certain
  combinations of flags in the -ffast-math group.

This is handled by a boolean flag in the unspecs to say whether the
predicate is “strict” or “relaxed”.  When combining multiple strict
operations, the predicates used in the operations generally need to
match.  When combining multiple relaxed operations, we can ignore the
predicates on nested operations and just use the predicate on the
“outermost” operation.

Originally I'd tried to reduce combinatorial explosion by using
aarch64_sve_pred_dominates_p.  This required matching predicates
for strict operations but allowed more combinations for relaxed
operations.

The problem (as I should have remembered) is that C conditions on
insn patterns can't reliably enforce matching operands.  If the
same register is used in two different input operands, the RA is
allowed to use different hard registers for those input operands
(and sometimes it has to).  So operands that match before RA
might not match afterwards.  The only sure way to force a match
is via match_dup.

This patch splits the cases into two.  I cry bitter tears at having
to do this, but I think it's the only backportable fix.  There might
be some way of using define_subst to generate the cond_* patterns from
the pred_* patterns, with some alternatives strategically disabled in
each case, but that's future work and might not be an improvement.

Since so many patterns now do this, I moved the comments from the
subtraction pattern to a new banner comment at the head of the file.

Tested on aarch64-linux-gnu (with and without SVE), pushed to trunk.

Richard


gcc/
* config/aarch64/aarch64-protos.h (aarch64_sve_pred_dominates_p):
Delete.
* config/aarch64/aarch64.c (aarch64_sve_pred_dominates_p): Likewise.
* config/aarch64/aarch64-sve.md: Add banner comment describing
how merging predicated FP operations are represented.
(*cond__2): Split into...
(*cond__2_relaxed): ...this and...
(*cond__2_strict): ...this.
(*cond__any): Split into...
(*cond__any_relaxed): ...this and...
(*cond__any_strict): ...this.
(*cond__2): Split into...
(*cond__2_relaxed): ...this and...
(*cond__2_strict): ...this.
(*cond__any): Split into...
(*cond__any_relaxed): ...this
and...
(*cond__any_strict): ...this.
(*cond__2): Split into...
(*cond__2_relaxed): ...this and...
(*cond__2_strict): ...this.
(*cond__2_const): Split into...
(*cond__2_const_relaxed): ...this
and...
(*cond__2_const_strict): ...this.
(*cond__3): Split into...
(*cond__3_relaxed): ...this and...
(*cond__3_strict): ...this.
(*cond__any): Split into...
(*cond__any_relaxed): ...this and...
(*cond__any_strict): ...this.
(*cond__any_const): Split into...
(*cond__any_const_relaxed): ...this
and...
(*cond__any_const_strict): ...this.
(*cond_add_2_const): Split into...
(*cond_add_2_const_relaxed): ...this and...
(*cond_add_2_const_strict): ...this.
(*cond_add_any_const): Split into...
(*cond_add_any_const_relaxed): ...this and...
(*cond_add_any_const_strict): ...this.
(*cond__2): Split into...
(*cond__2_relaxed): ...this and...
(*cond__2_strict): ...this.
(*cond__any): Split into...
(*cond__any_relaxed): ...this and...
(*cond__any_strict): ...this.
(*cond_sub_3_const): Split into...
(*cond_sub_3_const_relaxed): ...this and...
(*cond_sub_3_const_strict): ...this.
(*aarch64_pred_abd): Split into...
(*aarch64_pred_abd_relaxed): ...this and...
(*aarch64_pred_abd_strict): ...this.
(*aarch64_cond_abd_2): Split into...
(*aarch64_cond_abd_2_relaxed): ...this and...
(*aarch64_cond_abd_2_strict): ...this.
(*aarch64_cond_abd_3): Split into...
(*aarch64_cond_abd_3_relaxed): ...this and...
(*aarch64_cond_abd_3_strict): ...this.
(*aarch64_cond_abd_any): Split into...
(*aarch64_cond_abd_any_relaxed): ...this and...
(*aarch64_cond_abd_any_strict): ...this.
(*cond__2): Split into...
(*cond__2_relaxed): ...this and...
(*cond__2_strict): ...this.
(*cond__4): Split into...
(*cond__4_relaxed): ...this and...
(*cond__4_strict): ...this.

[PATCH] testsuite: Fix FAIL for older ARM cores

2020-10-02 Thread Jonathan Wakely via Gcc-patches

Since hard-float is not implemented for cores that only support thumb1
(and not thumb2) this test fails for configurations using hard float but
older -mcpu settings.

gcc/testsuite/ChangeLog:

* g++.dg/inherit/thunk10.C: Skip test for arm_hf_eabi if -mthumb
doesn't generate thumb2 instructions.

Tested x86_64-linux, and armv7l-linux-gnueabihf with
RUNTESTFLAGS='--target_board=unix\{,-mcpu=cortex-a9\}'

No change for x86_64, for ARM this test goes from:

PASS:   8
UNRESOLVED: 4
FAIL:   4

to

UNSUPPORTED:4
PASS:   8

OK for trunk?

commit ce743e1768c75445a9bf8e49d1116b4a7ce99cfc
Author: Jonathan Wakely 
Date:   Fri Oct 2 11:30:51 2020

testsuite: Fix FAIL for older ARM cores

Since hard-float is not implemented for cores that only support thumb1
(and not thumb2) this test fails for configurations using hard float but
older -mcpu settings.

gcc/testsuite/ChangeLog:

* g++.dg/inherit/thunk10.C: Skip test for arm_hf_eabi if -mthumb
doesn't generate thumb2 instructions.

diff --git a/gcc/testsuite/g++.dg/inherit/thunk10.C 
b/gcc/testsuite/g++.dg/inherit/thunk10.C
index 702067749fa..0188436b6cd 100644
--- a/gcc/testsuite/g++.dg/inherit/thunk10.C
+++ b/gcc/testsuite/g++.dg/inherit/thunk10.C
@@ -1,4 +1,5 @@
 /* { dg-options "-mthumb" { target arm*-*-* } } */
+/* { dg-skip-if "hf needs thumb2" { arm_hf_eabi && { ! arm_thumb2_ok } } } */
 /* { dg-do run } */
 /* { dg-timeout 100 } */

RE: [PATCH] arm: Make more use of the new mode macros

2020-10-02 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Richard Sandiford 
> Sent: 02 October 2020 11:39
> To: Christophe Lyon 
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Dennis Zhang ;
> gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> 
> Subject: [PATCH] arm: Make more use of the new mode macros
> 
> Christophe Lyon  writes:
> > On Tue, 29 Sep 2020 at 12:38, Kyrylo Tkachov 
> wrote:
> >>
> >>
> >>
> >> > -Original Message-
> >> > From: Richard Sandiford 
> >> > Sent: 29 September 2020 11:27
> >> > To: Kyrylo Tkachov 
> >> > Cc: gcc-patches@gcc.gnu.org; ni...@redhat.com; Richard Earnshaw
> >> > ; Ramana Radhakrishnan
> >> > ; Dennis Zhang
> >> > 
> >> > Subject: Ping: [PATCH] arm: Add new vector mode macros
> >> >
> >> > Ping
> >> >
> >> > Richard Sandiford  writes:
> >> > > Kyrylo Tkachov  writes:
> >> > >> This looks like a productive way forward to me.
> >> > >> Okay if the other maintainers don't object by the end of the week.
> >> > >
> >> > > Thanks.  Dennis pointed out off-list that it regressed
> >> > > armv8_2-fp16-arith-2.c, which was expecting FP16 vectorisation
> >> > > to be rejected for -fno-fast-math.  As mentioned above, that shouldn't
> >> > > be necessary given that FP16 arithmetic (unlike FP32 arithmetic) has a
> >> > > flush-to-zero control.
> >> > >
> >> > > This version therefore updates the test to expect the same output
> >> > > as the -ffast-math version.
> >> > >
> >> > > Tested on arm-linux-gnueabi (hopefully for real this time -- I must
> >> > > have messed up the testing last time).  OK for trunk?
> >> > >
> >>
> >> Ok.
> >> Thanks,
> >> Kyrill
> >>
> >
> > Hi Richard,
> >
> > I didn't notice the initial regression you mention above, but after
> 
> (FWIW, the patch wasn't committed in the original form.  Dennis noticed
> the problem when applying it locally.)
> 
> > this commit (r11-3522),
> > I see:
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vabs\\.f16\\ts[0-9]+, s[0-9]+ 2
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vmul\\.f16\\td[0-9]+, d[0-9]+, d[0-9]+ 1
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vmul\\.f16\\tq[0-9]+, q[0-9]+, q[0-9]+ 1
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vmul\\.f16\\ts[0-9]+, s[0-9]+, s[0-9]+ 1
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vsub\\.f16\\td[0-9]+, d[0-9]+, d[0-9]+ 1
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vsub\\.f16\\tq[0-9]+, q[0-9]+, q[0-9]+ 1
> > FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> > vsub\\.f16\\ts[0-9]+, s[0-9]+, s[0-9]+ 1
> >
> > Looks like we are running validations differently?
> 
> Hmm, seems like these tests are unsupported on an arm-linux-gnueabihf
> bootstrap (even though Advanced SIMD is enabled by default) since the
> tests specifically want to be able to compile softfp code. :-(
> 
> This patch also uses the macros for patterns that are currently specific
> to neon.md.  Tested on arm-linux-gnueabihf (FWIW), arm-eabi with MVE,
> and armeb-eabi with various combinations.  I see the above failures
> are fixed for the latter two.  OK to install?

Ok.
Thanks,
Kyrill

> 
> Richard

[PATCH] arm: Make more use of the new mode macros

2020-10-02 Thread Richard Sandiford via Gcc-patches

Christophe Lyon  writes:
> On Tue, 29 Sep 2020 at 12:38, Kyrylo Tkachov  wrote:
>>
>>
>>
>> > -Original Message-
>> > From: Richard Sandiford 
>> > Sent: 29 September 2020 11:27
>> > To: Kyrylo Tkachov 
>> > Cc: gcc-patches@gcc.gnu.org; ni...@redhat.com; Richard Earnshaw
>> > ; Ramana Radhakrishnan
>> > ; Dennis Zhang
>> > 
>> > Subject: Ping: [PATCH] arm: Add new vector mode macros
>> >
>> > Ping
>> >
>> > Richard Sandiford  writes:
>> > > Kyrylo Tkachov  writes:
>> > >> This looks like a productive way forward to me.
>> > >> Okay if the other maintainers don't object by the end of the week.
>> > >
>> > > Thanks.  Dennis pointed out off-list that it regressed
>> > > armv8_2-fp16-arith-2.c, which was expecting FP16 vectorisation
>> > > to be rejected for -fno-fast-math.  As mentioned above, that shouldn't
>> > > be necessary given that FP16 arithmetic (unlike FP32 arithmetic) has a
>> > > flush-to-zero control.
>> > >
>> > > This version therefore updates the test to expect the same output
>> > > as the -ffast-math version.
>> > >
>> > > Tested on arm-linux-gnueabi (hopefully for real this time -- I must
>> > > have messed up the testing last time).  OK for trunk?
>> > >
>>
>> Ok.
>> Thanks,
>> Kyrill
>>
>
> Hi Richard,
>
> I didn't notice the initial regression you mention above, but after

(FWIW, the patch wasn't committed in the original form.  Dennis noticed
the problem when applying it locally.)

> this commit (r11-3522),
> I see:
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vabs\\.f16\\ts[0-9]+, s[0-9]+ 2
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vmul\\.f16\\td[0-9]+, d[0-9]+, d[0-9]+ 1
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vmul\\.f16\\tq[0-9]+, q[0-9]+, q[0-9]+ 1
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vmul\\.f16\\ts[0-9]+, s[0-9]+, s[0-9]+ 1
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vsub\\.f16\\td[0-9]+, d[0-9]+, d[0-9]+ 1
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vsub\\.f16\\tq[0-9]+, q[0-9]+, q[0-9]+ 1
> FAIL: gcc.target/arm/armv8_2-fp16-arith-2.c scan-assembler-times
> vsub\\.f16\\ts[0-9]+, s[0-9]+, s[0-9]+ 1
>
> Looks like we are running validations differently?

Hmm, seems like these tests are unsupported on an arm-linux-gnueabihf
bootstrap (even though Advanced SIMD is enabled by default) since the
tests specifically want to be able to compile softfp code. :-(

This patch also uses the macros for patterns that are currently specific
to neon.md.  Tested on arm-linux-gnueabihf (FWIW), arm-eabi with MVE,
and armeb-eabi with various combinations.  I see the above failures
are fixed for the latter two.  OK to install?

Richard

From 0100f29846e2d775d6cd1f1cef7614d7b67f5fdd Mon Sep 17 00:00:00 2001
From: Richard Sandiford 
Date: Wed, 30 Sep 2020 20:16:19 +0100
Subject: [PATCH] arm: Make more use of the new mode macros

As Christophe pointed out, my r11-3522 patch didn't in fact fix
all of the armv8_2-fp16-arith-2.c failures introduced by allowing
FP16 vectorisation without -funsafe-math-optimizations.  I must have
only tested the final patch on my usual arm-linux-gnueabihf bootstrap,
which it turns out treats the test as unsupported.

The focus of the original patch was to use mode macros for
patterns that are shared between Advanced SIMD, iwMMXt and MVE.
This patch uses the mode macros for general neon.md patterns too.

gcc/
* config/arm/neon.md (*sub3_neon): Use the new mode macros
for the insn condition.
(sub3, *mul3_neon): Likewise.
(mul3add_neon): Likewise.
(mul3add_neon): Likewise.
(mul3negadd_neon): Likewise.
(fma4, fma4, *fmsub4): Likewise.
(quad_halves_v4sf, reduc_plus_scal_): Likewise.
(reduc_plus_scal_, reduc_smin_scal_): Likewise.
(reduc_smin_scal_, reduc_smax_scal_): Likewise.
(reduc_smax_scal_, mul3): Likewise.
(neon_vabd_2, neon_vabd_3): Likewise.
(fma4_intrinsic): Delete.
(neon_vadd): Use the new mode macros to decide which
form of instruction to generate.
(neon_vmla, neon_vmls): Likewise.
(neon_vsub): Likewise.
(neon_vfma): Generate the main fma4 form instead
of using fma4_intrinsic.

gcc/testsuite/
* gcc.target/arm/armv8_2-fp16-arith-2.c (float16_t): Use _Float16_t
rather than __fp16.
(float16x4_t, float16x4_t): Likewise.
(fp16_abs): Use __builtin_fabsf16.
---
 gcc/config/arm/neon.md| 64 ---
 .../gcc.target/arm/armv8_2-fp16-arith-2.c |  8 +--
 2 files changed, 29 insertions(+), 43 deletions(-)

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 58832cbf484..85e424e6cf4 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -513,7 +513,7 @@ (define_insn "*sub3_neon"
   [(set (match_operand:VDQ 0 "s_register_operand" "=w")
 (minus:VDQ

Re: [PATCH] libstdc++: Add missing P0896 changes to

2020-10-02 Thread Jonathan Wakely via Gcc-patches


On 01/10/20 10:35 -0400, Patrick Palka via Libstdc++ wrote:

I noticed that the following changes from this paper were not yet
implemented.


Oops, thanks.


OK to commit after testing on x86_64-pc-linux-gnu finishes successfully?

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (reverse_iterator::iter_move):
Define for C++20 as per P0896.
(reverse_iterator::iter_swap): Likewise.
(move_iterator::operator*): Apply P0896 changes for C++20.
(move_iterator::operator[]): Likewise.
* testsuite/24_iterators/reverse_iterator/cust.cc: New test.
---
libstdc++-v3/include/bits/stl_iterator.h  | 33 
.../24_iterators/reverse_iterator/cust.cc | 51 +++
2 files changed, 84 insertions(+)
create mode 100644 libstdc++-v3/testsuite/24_iterators/reverse_iterator/cust.cc

diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
index f29bae92706..ca3c4cda329 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -362,6 +362,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  operator[](difference_type __n) const
  { return *(*this + __n); }

+#if __cplusplus > 201703L && __cpp_lib_concepts
+  friend constexpr iter_rvalue_reference_t<_Iterator>
+  iter_move(const reverse_iterator& __i)
+  noexcept(is_nothrow_copy_constructible_v<_Iterator>
+  && noexcept(ranges::iter_move(--declval<_Iterator&>(


All the declval calls need to be std::-qualified.

OK for master and gcc-10 with that change, thanks.
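
For readers following along, here is a minimal standalone sketch of the kind of noexcept-specifier under review, written with the requested std:: qualification. It is not the libstdc++ code: the names nothrow_dec_deref_v and ThrowingIter are hypothetical, and the expression only approximates the condition that is cut off in the quoted hunk.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper (not part of libstdc++): true when It is nothrow
// copy constructible and pre-decrementing an lvalue of type It, then
// dereferencing the result, cannot throw.  std::declval is written with
// explicit std:: qualification, as requested in the review.
template<typename It>
constexpr bool nothrow_dec_deref_v =
    std::is_nothrow_copy_constructible<It>::value
    && noexcept(*--std::declval<It&>());

// Hypothetical iterator-like type whose operations are all potentially
// throwing.  The members are only declared because they are used solely
// in unevaluated operands.
struct ThrowingIter
{
  ThrowingIter(const ThrowingIter&);   // not noexcept
  ThrowingIter& operator--();          // not noexcept
  int& operator*() const;              // not noexcept
};

static_assert(nothrow_dec_deref_v<int*>,
              "raw pointers never throw here");
static_assert(!nothrow_dec_deref_v<ThrowingIter>,
              "a potentially-throwing operation makes the check false");
```

Because std::declval is only usable in unevaluated operands, the check has no runtime cost; it just lets the friend function advertise noexcept when the wrapped iterator's operations allow it.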

[Patch,committed] Re-generate libgomp's Makefile.in with automake 1.15.1

2020-10-02 Thread Tobias Burnus


libgomp's Makefile.in was last regenerated with automake 1.16.1. As a
result, --enable-maintainer-mode now tries to find and use automake-16
when generating that file.

Committed as obvious in commit r11-3613-g2fe5a545e09edb4001bbfb74c571c806ebc7cb25

Tobias

commit 2fe5a545e09edb4001bbfb74c571c806ebc7cb25
Author: Tobias Burnus 
Date:   Fri Oct 2 12:07:57 2020 +0200

libgomp: Regenerate configure files with automake 1.15.1

libgomp/ChangeLog:
* Makefile.in: Regenerate with automake 1.15.1.
* aclocal.m4: Likewise.
* configure: Likewise.
* testsuite/Makefile.in: Likewise.


