[PATCH] Add analyzer plugin support and CPython GIL example

2020-10-14 Thread David Malcolm via Gcc-patches
This patch adds a new GCC plugin event: PLUGIN_ANALYZER_INIT, called
when -fanalyzer is starting, allowing for GCC plugins to register
additional state-machine-based checks within -fanalyzer.  The idea
is that 3rd-party code might want to add domain-specific checks for
its own APIs - with the caveat that the analyzer is itself still
rather experimental.

As an example, the patch adds a proof-of-concept plugin to the testsuite
for checking CPython code: verifying that code that relinquishes
CPython's global interpreter lock doesn't attempt to do anything with
PyObjects in the sections where the lock isn't held.  It also adds a
warning about nested releases of the lock, which are forbidden.
For example:

demo.c: In function 'foo':
demo.c:11:3: warning: use of PyObject '*(obj)' without the GIL
   11 |   Py_INCREF (obj);
      |   ^
  'test': events 1-3
    |
    |   15 | void test (PyObject *obj)
    |      |      ^~~~
    |      |      |
    |      |      (1) entry to 'test'
    |   16 | {
    |   17 |   Py_BEGIN_ALLOW_THREADS
    |      |   ~~
    |      |   |
    |      |   (2) releasing the GIL here
    |   18 |   foo (obj);
    |      |   ~
    |      |   |
    |      |   (3) calling 'foo' from 'test'
    |
    +--> 'foo': events 4-5
           |
           |    9 | foo (PyObject *obj)
           |      | ^~~
           |      | |
           |      | (4) entry to 'foo'
           |   10 | {
           |   11 |   Py_INCREF (obj);
           |      |   ~
           |      |   |
           |      |   (5) PyObject '*(obj)' used here without the GIL
           |

Doing so requires adding some logic for ignoring macro expansions in
analyzer diagnostics, since the insides of Py_INCREF and
Py_BEGIN_ALLOW_THREADS are not of interest to the user for these cases.
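
To make the shape of the report easier to follow, here is a minimal,
self-contained sketch of a source file with the same call structure.  It
is not the demo.c from the patch: the PyObject type and the Py_* macros
below are simplified stand-ins rather than the real CPython definitions,
and gil_held and gil_violations are hypothetical names that model at run
time what the plugin checks statically.

```c
#include <stddef.h>

/* Stand-ins for CPython declarations; NOT the real <Python.h>.  */
typedef struct { long ob_refcnt; } PyObject;

static int gil_held = 1;        /* hypothetical run-time model of the GIL */
static int gil_violations = 0;  /* PyObject uses seen without the GIL */

#define Py_BEGIN_ALLOW_THREADS { gil_held = 0;
#define Py_END_ALLOW_THREADS     gil_held = 1; }
#define Py_INCREF(op) \
  do { if (!gil_held) gil_violations++; (op)->ob_refcnt++; } while (0)

static void
foo (PyObject *obj)
{
  Py_INCREF (obj);  /* reached from 'test' while the GIL is released */
}

void
test (PyObject *obj)
{
  Py_BEGIN_ALLOW_THREADS
  foo (obj);
  Py_END_ALLOW_THREADS
}
```

Calling test () once in this model records exactly one violation, which
corresponds to event (5) in the path above.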

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Are the non-analyzer parts OK for master?

gcc/analyzer/ChangeLog:
* analyzer-pass.cc (pass_analyzer::execute): Move sorry call to...
(sorry_no_analyzer): New.
* analyzer.h (class state_machine): New forward decl.
(class logger): New forward decl.
(class plugin_analyzer_init_iface): New.
(sorry_no_analyzer): New decl.
* checker-path.cc (checker_path::fixup_locations): New.
* checker-path.h (checker_event::set_location): New.
(checker_path::fixup_locations): New decl.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Call
checker_path::fixup_locations, and call fixup_location
on the primary location.
* engine.cc: Include "plugin.h".
(class plugin_analyzer_init_impl): New.
(impl_run_checkers): Invoke PLUGIN_ANALYZER_INIT callbacks.
* pending-diagnostic.h (pending_diagnostic::fixup_location): New
vfunc.

gcc/ChangeLog:
* doc/plugins.texi (Plugin callbacks): Add PLUGIN_ANALYZER_INIT.
* plugin.c (register_callback): Likewise.
(invoke_plugin_callbacks_full): Likewise.
* plugin.def (PLUGIN_ANALYZER_INIT): New event.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_gil_plugin.c: New test.
* gcc.dg/plugin/gil-1.c: New test.
* gcc.dg/plugin/gil.h: New header.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the new plugin
and test.
---
 gcc/analyzer/analyzer-pass.cc |  18 +-
 gcc/analyzer/analyzer.h   |  15 +
 gcc/analyzer/checker-path.cc  |   9 +
 gcc/analyzer/checker-path.h   |   4 +
 gcc/analyzer/diagnostic-manager.cc|   9 +-
 gcc/analyzer/engine.cc|  31 ++
 gcc/analyzer/pending-diagnostic.h |   8 +
 gcc/doc/plugins.texi  |   4 +
 gcc/plugin.c  |   2 +
 gcc/plugin.def|   4 +
 .../gcc.dg/plugin/analyzer_gil_plugin.c   | 436 ++
 gcc/testsuite/gcc.dg/plugin/gil-1.c   |  90 
 gcc/testsuite/gcc.dg/plugin/gil.h |  32 ++
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   2 +
 14 files changed, 660 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/gil-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/gil.h

diff --git a/gcc/analyzer/analyzer-pass.cc b/gcc/analyzer/analyzer-pass.cc
index a27421e46d4..1f65bf8b154 100644
--- a/gcc/analyzer/analyzer-pass.cc
+++ b/gcc/analyzer/analyzer-pass.cc
@@ -83,9 +83,7 @@ pass_analyzer::execute (function *)
 #if ENABLE_ANALYZER
   ana::run_checkers ();
 #else
-  sorry ("%qs was not enabled in this build of GCC"
-" (missing configure-time option %qs)",
-"-fanalyzer", "--enable-analyzer");
+  sorry_no_analyzer ();
 #endif
 
   return 0;
@@ -100,3 +98,17 @@ make_pass_analyzer (gcc::context *ctxt)
 {
   return new pass_analyzer (ctxt);
 }
+

Re: [PATCH] IPA: fix profile handling in IRA

2020-10-14 Thread Vladimir Makarov via Gcc-patches



On 2020-10-14 10:21 a.m., Martin Liška wrote:

> Hello.
>
> There's a new version of the patch that fixes profile scaling
> in IRA.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?


Yes.  Thank you, Martin.




libgo patch committed: Print reason code if throw fails

2020-10-14 Thread Ian Lance Taylor via Gcc-patches
This libgo patch by Nikhil Benesch prints the reason code if throwing
an unwind exception fails.  Calls to _Unwind_RaiseException and
friends *can* return due to bugs in libgo or memory corruption.  When
this occurs, print a message to stderr with the reason code before
aborting to aid debugging.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.
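
The pattern is roughly the following, shown as a standalone sketch rather
than the libgo code itself.  The raise_fn type is a hypothetical stand-in
for _Unwind_RaiseException and friends, and format_throw_failure and
throw_or_abort are names made up for this illustration; the message text
follows the one added by the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an unwinder entry point: it normally does
   not return, but yields a reason code when it does.  */
typedef int (*raise_fn) (void *exception);

/* Format the diagnostic.  Split out so that the message can be checked
   without aborting the process.  */
int
format_throw_failure (char *buf, size_t len, int reason)
{
  return snprintf (buf, len,
                   "failed to throw unwind exception (reason=%d)\n",
                   reason);
}

void
throw_or_abort (raise_fn raise, void *exception)
{
  char msg[80];
  int reason = raise (exception);

  /* Raising an exception should not return; if it did, say why before
     giving up.  */
  format_throw_failure (msg, sizeof msg, reason);
  fputs (msg, stderr);
  abort ();
}
```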

Ian
61cc6ab950a64bb1a49f9c07c3efdd11d19484d5
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index e6eb8e5c335..45a7b422a29 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-cc1f7d613f9b0666bbf8aac3dd208d5adfe88546
+b73a8f17dfe8d7c7ecc9ccd0317be5abe71c5509
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/runtime/go-unwind.c b/libgo/runtime/go-unwind.c
index ad3142cb05d..16e05252ec9 100644
--- a/libgo/runtime/go-unwind.c
+++ b/libgo/runtime/go-unwind.c
@@ -59,20 +59,22 @@ void
 rethrowException ()
 {
   struct _Unwind_Exception *hdr;
+  _Unwind_Reason_Code reason;
 
   hdr = (struct _Unwind_Exception *) runtime_g()->exception;
 
 #ifdef __USING_SJLJ_EXCEPTIONS__
-  _Unwind_SjLj_Resume_or_Rethrow (hdr);
+  reason = _Unwind_SjLj_Resume_or_Rethrow (hdr);
 #else
 #if defined(_LIBUNWIND_STD_ABI)
-  _Unwind_RaiseException (hdr);
+  reason = _Unwind_RaiseException (hdr);
 #else
-  _Unwind_Resume_or_Rethrow (hdr);
+  reason = _Unwind_Resume_or_Rethrow (hdr);
 #endif
 #endif
 
   /* Rethrowing the exception should not return.  */
+  runtime_printf ("failed to rethrow unwind exception (reason=%d)\n", reason);
   abort();
 }
 
@@ -105,6 +107,7 @@ throwException ()
 {
   struct _Unwind_Exception *hdr;
   uintptr align;
+  _Unwind_Reason_Code reason;
 
   hdr = (struct _Unwind_Exception *)runtime_g ()->exception;
 
@@ -119,12 +122,13 @@ throwException ()
   hdr->exception_cleanup = NULL;
 
 #ifdef __USING_SJLJ_EXCEPTIONS__
-  _Unwind_SjLj_RaiseException (hdr);
+  reason = _Unwind_SjLj_RaiseException (hdr);
 #else
-  _Unwind_RaiseException (hdr);
+  reason = _Unwind_RaiseException (hdr);
 #endif
 
   /* Raising an exception should not return.  */
+  runtime_printf ("failed to throw unwind exception (reason=%d)\n", reason);
   abort ();
 }
 


Re: [RFC] Automatic linking of libatomic via gcc.c or ...? [PR81358] (dependency for libgomp on nvptx dep, configure overriddable, ...)

2020-10-14 Thread Joseph Myers
On Wed, 14 Oct 2020, Tobias Burnus wrote:

> Question: Where do you think should it be in the driver?

I think it should be somewhere in the expansion of %(link_gcc_c_sequence) 
(i.e. LINK_GCC_C_SEQUENCE_SPEC, which has various target-specific 
definitions), since that's what expands to something like -lgcc -lc -lgcc.  
Maybe after the first -lgcc and before the -lc, since libatomic has 
references to libc functions.
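
As a rough illustration of the ordering only (this is not a GCC spec
string, and expand_link_sequence is a hypothetical helper), the libraries
would end up on the link line like this:

```c
#include <stdio.h>

/* Hypothetical helper, not part of GCC: writes the library sequence the
   spec would expand to, optionally placing -latomic after the first
   -lgcc and before -lc.  Returns the length of the sequence.  */
int
expand_link_sequence (char *buf, size_t len, int want_libatomic)
{
  return snprintf (buf, len, "-lgcc %s-lc -lgcc",
                   want_libatomic ? "-latomic " : "");
}
```

With want_libatomic set, the buffer holds "-lgcc -latomic -lc -lgcc", so
libatomic's references to libc functions are resolved by the -lc that
follows it.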

-- 
Joseph S. Myers
jos...@codesourcery.com


Ping: Re: [PATCH, wwwdocs] gcc-11/changes: C++11 is now required to build GCC

2020-10-14 Thread David Malcolm via Gcc-patches
On Wed, 2020-10-07 at 06:26 -0400, David Malcolm wrote:
> This summarizes GCC 11's change in build requirements from C++98 to
> C++11, for the release notes.  I've put it in the Caveats immediately
> below the "The default mode for C++ is..." change hence the wording.
> 
> I've based it on the change to gcc/doc/install.texi in the
> GCC source tree, which was 5329b59a2e13dabbe2038af0fe2e3cf5fc7f98ed
> there.
> 
> Validates.
> 
> OK to commit?

Ping.
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555697.html

> ---
>  htdocs/gcc-11/changes.html | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index e2a32e51..e33abe44 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -35,6 +35,11 @@ a work-in-progress.
>features with -fno-new-ttp-matching.
>
>  
> +  When building GCC itself, the host compiler must now support C++11,
> +rather than C++98.  In particular bootstrapping GCC 11 using an older
> +version of GCC requires a binary of GCC 4.8 or later, rather than of
> +GCC 3.4 or later as was the case for bootstrapping GCC 10.
> +
>Naming and location of auxiliary and dump output files changed.
>If you compile multiple input files in a single command, if you
>enable Link Time Optimization, or if you use -dumpbase,



Re: [PATCH] peel off one less layer of array types (PR 97391)

2020-10-14 Thread Jeff Law via Gcc-patches


On 10/14/20 2:55 PM, Martin Sebor wrote:
> The new gimple_parm_array_size() function computes the size
> of an array function parameter implied by its upper bound.
> The code that does this strips one too many layers of array
> types from the parameter, returning a smaller size than
> implied by the second most significant bound.  (I must have
> copied the code from somewhere else and not tested it very
> well.)
>
> The attached fix removes the additional peeling.  I'll commit
> this obvious patch shortly.
>
> Martin
>
> PS The test exposes two minor bugs/limitations in the warning.
> One is xfailed in the test itself and the other is pr97425.
>
> gcc-97391.diff
>
> PR middle-end/97391 - bogus -Warray-bounds accessing a multidimensional array 
> parameter
>
>   PR middle-end/97391
>   * builtins.c (gimple_parm_array_size): Peel off one less layer
>   of array types.
>
> gcc/testsuite/ChangeLog:
>
>   PR middle-end/97391
>   * gcc.dg/Warray-bounds-68.c: New test.

OK

jeff



Re: [PATCH] RISC-V: Add support for -mcpu option.

2020-10-14 Thread Jim Wilson
On Tue, Oct 13, 2020 at 3:09 AM Kito Cheng  wrote:
>  - The behavior of -mcpu basically equal to -march plus -mtune, but it
>has lower priority than -march and -mtune.

This looks OK to me.

I noticed a few things while testing.  These don't need to be fixed
before the patch is committed.

Using an invalid cpu name results in two errors.
rohan:2116$ ./xgcc -B./ -mcpu=sifive-e51 tmp.c
cc1: error: ‘-mcpu=sifive-e51’: unknown CPU
cc1: error: unknown cpu ‘sifive-e51’ for ‘-mtune’
Ideally that should be one error.  The second error may be confusing
as the user did not specify the -mtune option.  Maybe riscv_parse_tune
can have another option to indicate whether it was passed a tune
string or cpu string, and only error if it was given a tune string,
since the cpu string would already have given an error.  Or maybe pass
in both the cpu and tune strings, and it can ignore the cpu string if
a tune string was specified.  And only give an error if we are using
the tune string.  You can also avoid passing the tune string to
riscv_find_cpu which doesn't appear to be useful.

Using a valid cpu name that has a different default arch than the
configured one gives a possibly confusing error:
rohan:2117$ ./xgcc -B./ -mcpu=sifive-s51 tmp.c
cc1: error: requested ABI requires ‘-march’ to subsume the ‘D’ extension
This is for a toolchain with lp64d as default ABI.  There is a
deliberate choice here not to force the ABI from the -mcpu option, but
maybe the error can be improved, as using -mabi to fix this is more
likely to be correct than using -march.  Also, I never liked the use
of "subsume" here.  We shouldn't force users to crack open a
dictionary to figure out what an error message means.  Also, the user
didn't request an ABI, the compiler is using the configured default,
nor did the user specify a -march option.  Maybe something like
   cc1: error: ABI %s requires the 'D' extension, arch %s does not include it
where each %s expands to the ABI and arch that the compiler is using.
There is another similar error that uses "subsume"; it should be changed
at the same time to keep these messages consistent.  If we want to fix
this, it should be a separate patch.
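
To sketch the first suggestion (one error instead of two): the snippet
below is a standalone model, not the riscv.c code.  The known_cpus table
and the check_cpu_and_tune function are hypothetical, and a NULL argument
stands for "the user did not pass this option".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical one-entry table; the real list lives in the RISC-V
   backend.  */
static const char *const known_cpus[] = { "sifive-s51" };

static int
is_known (const char *name)
{
  for (size_t i = 0; i < sizeof known_cpus / sizeof known_cpus[0]; i++)
    if (strcmp (name, known_cpus[i]) == 0)
      return 1;
  return 0;
}

/* CPU and TUNE are the -mcpu and -mtune arguments, or NULL when the
   option was not given.  Returns the number of errors reported.  */
int
check_cpu_and_tune (const char *cpu, const char *tune)
{
  int errors = 0;

  if (cpu && !is_known (cpu))
    {
      fprintf (stderr, "error: '-mcpu=%s': unknown CPU\n", cpu);
      errors++;
    }

  /* Only complain about -mtune when the user actually passed it; a bad
     -mcpu value has already been diagnosed above.  */
  if (tune && !is_known (tune))
    {
      fprintf (stderr, "error: unknown cpu '%s' for '-mtune'\n", tune);
      errors++;
    }

  return errors;
}
```

With this shape, -mcpu=sifive-e51 on its own gives a single error, and the
-mtune error only appears when -mtune itself was given a bad name.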

Jim


libgo patch committed: Export NetBSD specific types in mksysinfo.sh

2020-10-14 Thread Ian Lance Taylor via Gcc-patches
The libgo syscall package depends on many NetBSD-specific types on
NetBSD.  This libgo patch by Nikhil Benesch teaches mksysinfo.sh to
export these types.

This alone is not sufficient to get the syscall package to compile on
NetBSD, but it's a start.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
ca4422ac96183e3c455b1de7f50187eba50bd587
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index c37df37db51..e6eb8e5c335 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-c5505c4e626fa4217911443b4db8b065855a0206
+cc1f7d613f9b0666bbf8aac3dd208d5adfe88546
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/configure.ac b/libgo/configure.ac
index 9a10d3305ab..f87ab65e3ba 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -580,7 +580,7 @@ AC_C_BIGENDIAN
 
 GCC_CHECK_UNWIND_GETIPINFO
 
-AC_CHECK_HEADERS(port.h sched.h semaphore.h sys/file.h sys/mman.h syscall.h 
sys/epoll.h sys/event.h sys/inotify.h sys/ptrace.h sys/syscall.h sys/user.h 
sys/utsname.h sys/select.h sys/socket.h net/if.h net/if_arp.h net/route.h 
netpacket/packet.h sys/prctl.h sys/mount.h sys/vfs.h sys/statfs.h sys/timex.h 
sys/sysinfo.h utime.h linux/ether.h linux/fs.h linux/ptrace.h linux/reboot.h 
netinet/in_syst.h netinet/ip.h netinet/ip_mroute.h netinet/if_ether.h)
+AC_CHECK_HEADERS(port.h sched.h semaphore.h sys/file.h sys/mman.h syscall.h 
sys/epoll.h sys/event.h sys/inotify.h sys/ptrace.h sys/syscall.h sys/sysctl.h 
sys/user.h sys/utsname.h sys/select.h sys/socket.h net/bpf.h net/if.h 
net/if_arp.h net/route.h netpacket/packet.h sys/prctl.h sys/mount.h sys/vfs.h 
sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/ether.h linux/fs.h 
linux/ptrace.h linux/reboot.h netinet/in_syst.h netinet/ip.h 
netinet/ip_mroute.h netinet/if_ether.h)
 
 AC_CHECK_HEADERS([netinet/icmp6.h], [], [],
 [#include 
diff --git a/libgo/go/runtime/os_netbsd.go b/libgo/go/runtime/os_netbsd.go
index 89a8d076f12..9ebb6520771 100644
--- a/libgo/go/runtime/os_netbsd.go
+++ b/libgo/go/runtime/os_netbsd.go
@@ -33,13 +33,6 @@ func lwp_unpark(lwp int32, hint unsafe.Pointer) int32
 //extern sysctl
 func sysctl(*uint32, uint32, *byte, *uintptr, *byte, uintptr) int32
 
-// From NetBSD's <sys/sysctl.h>
-const (
-   _CTL_HW  = 6
-   _HW_NCPU = 3
-   _HW_PAGESIZE = 7
-)
-
 func getncpu() int32 {
mib := [2]uint32{_CTL_HW, _HW_NCPU}
out := uint32(0)
diff --git a/libgo/mksysinfo.sh b/libgo/mksysinfo.sh
index 9671e394cb8..607c97d26fe 100755
--- a/libgo/mksysinfo.sh
+++ b/libgo/mksysinfo.sh
@@ -225,6 +225,22 @@ if ! grep '^const _AT_FDCWD = ' ${OUT} >/dev/null 2>&1; 
then
   echo "const _AT_FDCWD = -100" >> ${OUT}
 fi
 
+# sysctl constants.
+grep '^const _CTL' gen-sysinfo.go |
+  sed -e 's/^\(const \)_\(CTL[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+  grep '^const _SYSCTL' gen-sysinfo.go |
+  sed -e 's/^\(const \)_\(SYSCTL[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+  grep '^const _NET_RT' gen-sysinfo.go |
+  sed -e 's/^\(const \)_\(NET_RT[^= ]*\)\(.*\)$/\1\2 = _\2/' >> ${OUT}
+
+# The sysctlnode struct.
+grep '^type _sysctlnode ' gen-sysinfo.go | \
+sed -e 's/_sysctlnode/Sysctlnode/' \
+   -e 's/sysctl_flags/Flags/' \
+-e 's/sysctl_name/Name/' \
+-e 's/sysctl_num/Num/' \
+   >> ${OUT}
+
 # sysconf constants.
 grep '^const __SC' gen-sysinfo.go |
   sed -e 's/^\(const \)__\(SC[^= ]*\)\(.*\)$/\1\2 = __\2/' >> ${OUT}
@@ -533,6 +549,7 @@ fi | sed -e 's/type _dirent64/type Dirent/' \
  -e 's/d_name \[0+1\]/d_name [0+256]/' \
  -e 's/d_name/Name/' \
  -e 's/]int8/]byte/' \
+ -e 's/d_fileno/Fileno/' \
  -e 's/d_ino/Ino/' \
  -e 's/d_namlen/Namlen/' \
  -e 's/d_off/Off/' \
@@ -994,6 +1011,39 @@ grep '^type _rtgenmsg ' gen-sysinfo.go | \
   -e 's/rtgen_family/Family/' \
 >> ${OUT}
 
+# The rt_msghdr struct.
+grep '^type _rt_msghdr ' gen-sysinfo.go | \
+sed -e 's/_rt_msghdr/RtMsghdr/g' \
+-e 's/rtm_msglen/Msglen/' \
+-e 's/rtm_version/Version/' \
+-e 's/rtm_type/Type/' \
+-e 's/rtm_index/Index/' \
+-e 's/rtm_flags/Flags/' \
+-e 's/rtm_addrs/Addrs/' \
+-e 's/rtm_pid/Pid/' \
+-e 's/rtm_seq/Seq/' \
+-e 's/rtm_errno/Errno/' \
+-e 's/rtm_use/Use/' \
+-e 's/rtm_inits/Inits/' \
+-e 's/rtm_rmx/Rmx/' \
+-e 's/_rt_metrics/RtMetrics/' \
+  >> ${OUT}
+
+# The rt_metrics struct.
+grep '^type _rt_metrics ' gen-sysinfo.go | \
+sed -e 's/_rt_metrics/RtMetrics/g' \
+-e 's/rmx_locks/Locks/' \
+-e 's/rmx_mtu/Mtu/' \
+-e 's/rmx_hopcount/Hopcount/' \
+-e 's/rmx_recvpipe/Recvpipe/' \
+-e 's/rmx_sendpipe/Sendpipe/' \
+-e 's/rmx_ssthresh/Ssthresh/' \
+-e 's/rmx_rtt/Rtt/' \
+-e 's/rmx_rttvar/Rttvar/' \
+-e 's/rmx_expire/Expire/' \
+-e 

libgo patch committed: correct semaphore implementation on NetBSD

2020-10-14 Thread Ian Lance Taylor via Gcc-patches
This libgo patch by Nikhil Benesch corrects the semaphore
implementation on NetBSD.  NetBSD's semaphores use the underlying
lightweight process (LWP) mechanism rather than pthreads.
This means m.procid needs to be set to the LWP ID rather than the
pthread ID in order for unpark notifications to get sent to the right
place.

This introduces a new getProcID() method that selects the correct ID
for the platform.
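
For readers more at home in C, the per-platform selection looks roughly
like this.  It is an analogue of the idea, not the libgo code:
get_proc_id is a hypothetical name, the Linux branch goes through
syscall(SYS_gettid), and the final fallback of 0 is just a placeholder
for platforms this sketch does not cover.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE  /* for syscall() in <unistd.h> on glibc */
#endif

#include <stdint.h>

#if defined(__NetBSD__)
# include <lwp.h>
#elif defined(__linux__)
# include <sys/syscall.h>
# include <unistd.h>
#endif

/* Return the ID that wakeup notifications for the calling thread should
   be addressed to on this platform.  */
uint64_t
get_proc_id (void)
{
#if defined(__NetBSD__)
  return (uint64_t) _lwp_self ();          /* LWP ID, not the pthread ID */
#elif defined(__linux__)
  return (uint64_t) syscall (SYS_gettid);  /* kernel thread ID */
#else
  return 0;                                /* placeholder: not covered here */
#endif
}
```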

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
05bc5d61b7a25381684b92932592f134a940135f
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 2c7a9bde825..c37df37db51 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-6cb7b9e924d84125f21f4a2a96aa0d59466056fe
+c5505c4e626fa4217911443b4db8b065855a0206
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/runtime/os_aix.go b/libgo/go/runtime/os_aix.go
index 951aeb6cffd..f49b83ccbe4 100644
--- a/libgo/go/runtime/os_aix.go
+++ b/libgo/go/runtime/os_aix.go
@@ -21,6 +21,10 @@ type mOS struct {
waitsema uintptr // semaphore for parking on locks
 }
 
+func getProcID() uint64 {
+   return uint64(gettid())
+}
+
 //extern malloc
 func libc_malloc(uintptr) unsafe.Pointer
 
diff --git a/libgo/go/runtime/os_gccgo.go b/libgo/go/runtime/os_gccgo.go
index ab190229860..a8859c085a3 100644
--- a/libgo/go/runtime/os_gccgo.go
+++ b/libgo/go/runtime/os_gccgo.go
@@ -27,8 +27,7 @@ func mpreinit(mp *m) {
 func minit() {
minitSignals()
 
-   // FIXME: only works on linux for now.
-   getg().m.procid = uint64(gettid())
+   getg().m.procid = getProcID()
 }
 
 // Called from dropm to undo the effect of an minit.
diff --git a/libgo/go/runtime/os_hurd.go b/libgo/go/runtime/os_hurd.go
index b3c6f8062ca..1613b410e2c 100644
--- a/libgo/go/runtime/os_hurd.go
+++ b/libgo/go/runtime/os_hurd.go
@@ -18,6 +18,10 @@ type mOS struct {
waitsema uintptr // semaphore for parking on locks
 }
 
+func getProcID() uint64 {
+   return uint64(gettid())
+}
+
 //extern malloc
 func libc_malloc(uintptr) unsafe.Pointer
 
diff --git a/libgo/go/runtime/os_linux.go b/libgo/go/runtime/os_linux.go
index 5d550646715..627b6d6d43c 100644
--- a/libgo/go/runtime/os_linux.go
+++ b/libgo/go/runtime/os_linux.go
@@ -13,6 +13,10 @@ type mOS struct {
unused byte
 }
 
+func getProcID() uint64 {
+   return uint64(gettid())
+}
+
 func futex(addr unsafe.Pointer, op int32, val uint32, ts, addr2 
unsafe.Pointer, val3 uint32) int32 {
return int32(syscall(_SYS_futex, uintptr(addr), uintptr(op), 
uintptr(val), uintptr(ts), uintptr(addr2), uintptr(val3)))
 }
diff --git a/libgo/go/runtime/os_netbsd.go b/libgo/go/runtime/os_netbsd.go
index 69d2c710449..89a8d076f12 100644
--- a/libgo/go/runtime/os_netbsd.go
+++ b/libgo/go/runtime/os_netbsd.go
@@ -14,12 +14,19 @@ type mOS struct {
waitsemacount uint32
 }
 
+func getProcID() uint64 {
+   return uint64(lwp_self())
+}
+
+//extern _lwp_self
+func lwp_self() int32
+
 //go:noescape
-//extern lwp_park
+//extern _lwp_park
 func lwp_park(ts int32, rel int32, abstime *timespec, unpark int32, hint, 
unparkhint unsafe.Pointer) int32
 
 //go:noescape
-//extern lwp_unpark
+//extern _lwp_unpark
 func lwp_unpark(lwp int32, hint unsafe.Pointer) int32
 
 //go:noescape
@@ -88,7 +95,7 @@ func semasleep(ns int64) int32 {
tsp = 
}
ret := lwp_park(_CLOCK_MONOTONIC, _TIMER_RELTIME, tsp, 0, 
unsafe.Pointer(&_g_.m.waitsemacount), nil)
-   if ret == _ETIMEDOUT {
+   if ret != 0 && errno() == _ETIMEDOUT {
return -1
}
}
@@ -101,10 +108,10 @@ func semawakeup(mp *m) {
// "If the target LWP is not currently waiting, it will return
// immediately upon the next call to _lwp_park()."
ret := lwp_unpark(int32(mp.procid), unsafe.Pointer(&mp.waitsemacount))
-   if ret != 0 && ret != _ESRCH {
+   if ret != 0 && errno() != _ESRCH {
// semawakeup can be called on signal stack.
systemstack(func() {
-   print("thrwakeup addr=", &mp.waitsemacount, " sem=", 
mp.waitsemacount, " ret=", ret, "\n")
+   print("thrwakeup addr=", &mp.waitsemacount, " sem=", 
mp.waitsemacount, " errno=", errno(), "\n")
})
}
 }
diff --git a/libgo/go/runtime/os_solaris.go b/libgo/go/runtime/os_solaris.go
index 63b5cd70c8c..c568629e566 100644
--- a/libgo/go/runtime/os_solaris.go
+++ b/libgo/go/runtime/os_solaris.go
@@ -10,6 +10,10 @@ type mOS struct {
waitsema uintptr // semaphore for parking on locks
 }
 
+func getProcID() uint64 {
+   return uint64(gettid())
+}
+
 //extern malloc
 func libc_malloc(uintptr) unsafe.Pointer
 


[PATCH] peel off one less layer of array types (PR 97391)

2020-10-14 Thread Martin Sebor via Gcc-patches

The new gimple_parm_array_size() function computes the size
of an array function parameter implied by its upper bound.
The code that does this strips one too many layers of array
types from the parameter, returning a smaller size than
implied by the second most significant bound.  (I must have
copied the code from somewhere else and not tested it very
well.)
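
A small worked example of the arithmetic, using the char a[3][5]
parameter from the new test.  The function names are made up for this
illustration and the code only mirrors the size computation, not the
GCC internals.

```c
#include <stddef.h>

/* The parameter decays to char (*)[5], so the bound 3 has to be
   multiplied by the size of what the pointer points to:
   3 * sizeof (char[5]) == 15.  */
size_t
parm_size_fixed (char a[3][5])
{
  return 3 * sizeof *a;   /* *a has type char[5] */
}

/* Peeling one array layer too many multiplies the bound by the size of
   the innermost element instead: 3 * sizeof (char) == 3.  */
size_t
parm_size_overpeeled (char a[3][5])
{
  return 3 * sizeof **a;  /* **a has type char */
}
```

With the smaller size, valid accesses such as a[2][4] appear to lie
outside the object, which is where the bogus -Warray-bounds came from.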

The attached fix removes the additional peeling.  I'll commit
this obvious patch shortly.

Martin

PS The test exposes two minor bugs/limitations in the warning.
One is xfailed in the test itself and the other is pr97425.
PR middle-end/97391 - bogus -Warray-bounds accessing a multidimensional array parameter

	PR middle-end/97391
	* builtins.c (gimple_parm_array_size): Peel off one less layer
	of array types.

gcc/testsuite/ChangeLog:

	PR middle-end/97391
	* gcc.dg/Warray-bounds-68.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3f799e54d5f..72627b5b859 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -4493,12 +4493,9 @@ gimple_parm_array_size (tree ptr, wide_int rng[2],
 
   rng[0] = wi::zero (prec);
   rng[1] = wi::uhwi (access->minsize, prec);
-  /* If the PTR argument points to an array multiply MINSIZE by the size
- of array element type.  Otherwise, multiply it by the size of what
- the pointer points to.  */
+  /* Multiply the array bound encoded in the attribute by the size
+ of what the pointer argument to which it decays points to.  */
   tree eltype = TREE_TYPE (TREE_TYPE (ptr));
-  if (TREE_CODE (eltype) == ARRAY_TYPE)
-eltype = TREE_TYPE (eltype);
   tree size = TYPE_SIZE_UNIT (eltype);
   if (!size || TREE_CODE (size) != INTEGER_CST)
 return NULL_TREE;
diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-68.c b/gcc/testsuite/gcc.dg/Warray-bounds-68.c
new file mode 100644
index 000..d6616695471
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-bounds-68.c
@@ -0,0 +1,118 @@
+/* PR middle-end/97391 - bogus -Warray-bounds accessing a multidimensional
+   array parameter
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+
+void nowarn_access_loop_idx (char a[3][5])
+{
+  for (int i = 0; i < 3; i++)
+for (int j = 0; j < 5; j++)
+  a[i][j] = 0;
+}
+
+void warn_access_loop_idx (char a[3][5])
+{
+  for (int i = 0; i < 3; i++)
+for (int j = 0; j < 5; j++)
+  a[j][i] = 0;// { dg-warning "\\\[-Warray-bounds" }
+}
+
+
+void nowarn_access_cst_idx (int a[5][7][9])
+{
+  a[0][0][0] = __LINE__;
+  a[0][0][8] = __LINE__;
+
+  a[0][6][0] = __LINE__;
+  a[0][6][8] = __LINE__;
+
+  a[4][0][0] = __LINE__;
+  a[4][0][8] = __LINE__;
+  a[4][6][8] = __LINE__;
+}
+
+
+void test_ptr_access_cst_idx (int a[5][7][9])
+{
+  int *p = &a[0][0][0];
+
+  p[0] = __LINE__;
+  p[8] = __LINE__;
+
+  /* The following access should trigger a warning but it's represented
+ the same  as the valid access in
+   p = a[0][1][0];
+   p[1] = __LINE__;
+ both as
+   MEM[(int *)a_1(D) + 36B] = __LINE__;  */
+
+  p[9] = __LINE__;// { dg-warning "\\\[-Warray-bounds" "pr?" { xfail *-*-* } }
+
+  p[315] = __LINE__;
+  // { dg-warning "subscript 315 is outside array bounds of 'int\\\[5]\\\[7]\\\[9]'" "pr97425" { xfail *-*-* } .-1 }
+  // { dg-warning "subscript 315 is outside array bounds " "" { target *-*-* } .-2 }
+
+  p = &a[0][6][0];
+  p[0] = __LINE__;
+  p[8] = __LINE__;
+
+  p = &a[4][6][0];
+  p[0] = __LINE__;
+  p[8] = __LINE__;
+}
+
+
+void warn_access_cst_idx (int a[5][7][9])
+{
+  a[0][0][9] = __LINE__;  // { dg-warning "subscript 9 is above array bounds of 'int\\\[9]'" }
+  a[0][7][0] = __LINE__;  // { dg-warning "subscript 7 is above array bounds of 'int\\\[7]\\\[9]'" }
+  a[5][0][0] = __LINE__;
+  // { dg-warning "subscript 5 is outside array bounds of 'int\\\[5]\\\[7]\\\[9]'" "pr97425" { xfail *-*-* } .-1 }
+  // { dg-warning "subscript \\d+ is outside array bounds" "" { target *-*-* } .-2 }
+}
+
+
+void test_ptrarray_access_cst_idx (int (*pa)[5][7][9])
+{
+  (*pa)[0][0][0] = __LINE__;
+  (*pa)[0][0][8] = __LINE__;
+  (*pa)[0][0][9] = __LINE__;  // { dg-warning "subscript 9 is above array bounds of 'int\\\[9]'" }
+
+  (*pa)[0][6][0] = __LINE__;
+  (*pa)[0][7][0] = __LINE__;  // { dg-warning "subscript 7 is above array bounds of 'int\\\[7]\\\[9]'" }
+  (*pa)[0][8][0] = __LINE__;  // { dg-warning "subscript 8 is above array bounds of 'int\\\[7]\\\[9]'" }
+
+  (*pa)[4][6][8] = __LINE__;
+  (*pa)[5][0][0] = __LINE__;  // { dg-warning "subscript 5 is above array bounds of 'int\\\[5]\\\[7]\\\[9]'" }
+}
+
+
+void test_ptr_ptrarray_access_cst_idx (int (*pa)[5][7][9])
+{
+  int *p = &(*pa)[0][0][0];
+
+  p[0] = __LINE__;
+  p[8] = __LINE__;
+
+  /* The following access should trigger a warning but it's represented
+ the same  as the valid access in
+   p = a[0][1][0];
+   p[1] = __LINE__;
+ both as
+   MEM[(int *)a_1(D) + 36B] = __LINE__;  */
+
+  p[9] = __LINE__;// { dg-warning "\\\[-Warray-bounds" "pr?" { xfail *-*-* } }
+
+  

[committed] analyzer: fix ICE on globals with unknown size [PR93388]

2020-10-14 Thread David Malcolm via Gcc-patches
This patch fixes an ICE seen when attempting to build various existing
tests in our testsuite with -fanalyzer, including
gcc.c-torture/compile/980816-1.c.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 61a43de58cb6de7212a622060500ad0a0fd94fae.

gcc/analyzer/ChangeLog:
PR analyzer/93388
* region-model.cc (region_model::get_initial_value_for_global):
Fall back to returning an initial_svalue if
decl_region::get_svalue_for_initializer fails.
* region.cc (decl_region::get_svalue_for_initializer): Don't
attempt to create a compound_svalue if the region has an unknown
size.

gcc/testsuite/ChangeLog:
PR analyzer/93388
* gcc.dg/analyzer/data-model-21.c: New test.
---
 gcc/analyzer/region-model.cc  | 37 ++-
 gcc/analyzer/region.cc| 16 ++--
 gcc/testsuite/gcc.dg/analyzer/data-model-21.c |  8 
 3 files changed, 40 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/data-model-21.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 922e0361e59..06c0c8668ac 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1345,26 +1345,27 @@ region_model::get_initial_value_for_global (const 
region *reg) const
   if ((called_from_main_p () && !DECL_EXTERNAL (decl))
   || TREE_READONLY (decl))
 {
-  /* Get the initializer value for base_reg.  */
-  const svalue *base_reg_init
-   = base_reg->get_svalue_for_initializer (m_mgr);
-  gcc_assert (base_reg_init);
-  if (reg == base_reg)
-   return base_reg_init;
-  else
+  /* Attempt to get the initializer value for base_reg.  */
+  if (const svalue *base_reg_init
+   = base_reg->get_svalue_for_initializer (m_mgr))
{
- /* Get the value for REG within base_reg_init.  */
- binding_cluster c (base_reg);
- c.bind (m_mgr->get_store_manager (), base_reg, base_reg_init,
- BK_direct);
- const svalue *sval
-   = c.get_any_binding (m_mgr->get_store_manager (), reg);
- if (sval)
+ if (reg == base_reg)
+   return base_reg_init;
+ else
{
- if (reg->get_type ())
-   sval = m_mgr->get_or_create_cast (reg->get_type (),
- sval);
- return sval;
+ /* Get the value for REG within base_reg_init.  */
+ binding_cluster c (base_reg);
+ c.bind (m_mgr->get_store_manager (), base_reg, base_reg_init,
+ BK_direct);
+ const svalue *sval
+   = c.get_any_binding (m_mgr->get_store_manager (), reg);
+ if (sval)
+   {
+ if (reg->get_type ())
+   sval = m_mgr->get_or_create_cast (reg->get_type (),
+ sval);
+ return sval;
+   }
}
}
 }
diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 0820893a9b4..adf0e2c3ce3 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -927,7 +927,9 @@ decl_region::get_svalue_for_constructor (tree ctor,
 
Get an svalue for the initial value of this region at entry to
"main" (either based on DECL_INITIAL, or implicit initialization to
-   zero.  */
+   zero.
+
+   Return NULL if there is a problem.  */
 
 const svalue *
 decl_region::get_svalue_for_initializer (region_model_manager *mgr) const
@@ -935,12 +937,20 @@ decl_region::get_svalue_for_initializer 
(region_model_manager *mgr) const
   tree init = DECL_INITIAL (m_decl);
   if (!init)
 {
-  /* Implicit initialization to zero; use a compound_svalue for it.  */
+  /* Implicit initialization to zero; use a compound_svalue for it.
+Doing so requires that we have a concrete binding for this region,
+which can fail if we have a region with unknown size
+(e.g. "extern const char arr[];").  */
+  const binding_key *binding
+   = binding_key::make (mgr->get_store_manager (), this, BK_direct);
+  if (binding->symbolic_p ())
+   return NULL;
+
   binding_cluster c (this);
   c.zero_fill_region (mgr->get_store_manager (), this);
   return mgr->get_or_create_compound_svalue (TREE_TYPE (m_decl),
 c.get_map ());
- }
+}
 
   if (TREE_CODE (init) == CONSTRUCTOR)
 return get_svalue_for_constructor (init, mgr);
diff --git a/gcc/testsuite/gcc.dg/analyzer/data-model-21.c 
b/gcc/testsuite/gcc.dg/analyzer/data-model-21.c
new file mode 100644
index 000..b952bcb9748
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/data-model-21.c
@@ -0,0 +1,8 @@
+extern const char XtStrings[];
+
+void unknown_fn (void *);
+
+void test (void)
+{
+  unknown_fn ((char*)&XtStrings[429]);
+}
-- 
2.26.2



[committed] analyzer: fix build with ada [PR93723]

2020-10-14 Thread David Malcolm via Gcc-patches
This patch fixes an ICE seen in various ada source files within the
analyzer when attempting to bootstrap with
  --with-build-config=bootstrap-analyzer
where:
$ cat config/bootstrap-analyzer.mk 
STAGE2_CFLAGS += -fanalyzer
STAGE3_CFLAGS += -fanalyzer

With this patch, the bootstrap succeeded (after 7 hours; it normally
takes 40-45 minutes on this machine) and the only regression test
failures were in gcc.dg/plugin/poly-int-*_plugin.c, due to hitting
the 5 minutes per test timeouts (where -fanalyzer was being injected
into the test runs).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-3895-g12b267cc606a48a2fef809189c35573c4a51d3a5.

gcc/analyzer/ChangeLog:
PR analyzer/93723
* store.cc (binding_map::apply_ctor_to_region): Remove redundant
assertion.
---
 gcc/analyzer/store.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 11585123561..7e91addd035 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -455,7 +455,6 @@ binding_map::apply_ctor_to_region (const region 
*parent_reg, tree ctor,
 {
   gcc_assert (parent_reg);
   gcc_assert (TREE_CODE (ctor) == CONSTRUCTOR);
-  gcc_assert (!CONSTRUCTOR_NO_CLEARING (ctor));
 
   unsigned ix;
   tree index;
-- 
2.26.2



[committed] analyzer: don't use <setjmp.h> in tests [PR97394]

2020-10-14 Thread David Malcolm via Gcc-patches
PR analyzer/97394 reports issues with analyzer setjmp results
when testing against MUSL.  This patch fixes up gcc.dg/analyzer
so that it doesn't use <setjmp.h>.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-3894-g974e3975c5bd14ee8817f892532d1e55492227df.

gcc/testsuite/ChangeLog:
PR analyzer/97394
* gcc.dg/analyzer/setjmp-pr93378.c: Use test-setjmp.h rather than
<setjmp.h>.
* gcc.dg/analyzer/sigsetjmp-5.c: Likewise.
* gcc.dg/analyzer/sigsetjmp-6.c: Likewise.
* gcc.dg/analyzer/test-setjmp.h: Don't include <setjmp.h>.
Provide decls of jmp_buf, sigjmp_buf, setjmp, sigsetjmp,
longjmp, and siglongjmp.
---
 gcc/testsuite/gcc.dg/analyzer/setjmp-pr93378.c |  2 +-
 gcc/testsuite/gcc.dg/analyzer/sigsetjmp-5.c|  2 +-
 gcc/testsuite/gcc.dg/analyzer/sigsetjmp-6.c|  2 +-
 gcc/testsuite/gcc.dg/analyzer/test-setjmp.h| 15 ---
 4 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/setjmp-pr93378.c 
b/gcc/testsuite/gcc.dg/analyzer/setjmp-pr93378.c
index 6e2468e701a..e31e127d09d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/setjmp-pr93378.c
+++ b/gcc/testsuite/gcc.dg/analyzer/setjmp-pr93378.c
@@ -1,7 +1,7 @@
 /* { dg-additional-options "-O1 -g" } */
 /* { dg-require-effective-target indirect_jumps } */
 
-#include <setjmp.h>
+#include "test-setjmp.h"
 
 jmp_buf buf;
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-5.c 
b/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-5.c
index 2bc73e80f2d..d6a9910478c 100644
--- a/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-5.c
+++ b/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-5.c
@@ -1,6 +1,6 @@
 /* { dg-require-effective-target sigsetjmp } */
 
-#include <setjmp.h>
+#include "test-setjmp.h"
 #include 
 #include "analyzer-decls.h"
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-6.c 
b/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-6.c
index d45804b951a..f89277efc48 100644
--- a/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-6.c
+++ b/gcc/testsuite/gcc.dg/analyzer/sigsetjmp-6.c
@@ -1,6 +1,6 @@
 /* { dg-require-effective-target sigsetjmp } */
 
-#include <setjmp.h>
+#include "test-setjmp.h"
 #include 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/test-setjmp.h 
b/gcc/testsuite/gcc.dg/analyzer/test-setjmp.h
index ee0e1ec7d75..db2422709e2 100644
--- a/gcc/testsuite/gcc.dg/analyzer/test-setjmp.h
+++ b/gcc/testsuite/gcc.dg/analyzer/test-setjmp.h
@@ -7,10 +7,19 @@
 
setjmp is a function on some systems and a macro on others.
This header provides a SETJMP macro in a (fake) system header,
-   for consistency of output across such systems.  */
-
-#include <setjmp.h>
+   along with precanned decls of setjmp, for consistency of output across
+   different systems.  */
 
 #pragma GCC system_header
 
+struct __jmp_buf_tag {};
+typedef struct __jmp_buf_tag jmp_buf[1];
+typedef struct __jmp_buf_tag sigjmp_buf[1];
+
+extern int setjmp(jmp_buf env);
+extern int sigsetjmp(sigjmp_buf env, int savesigs);
+
+extern void longjmp(jmp_buf env, int val);
+extern void siglongjmp(sigjmp_buf env, int val);
+
 #define SETJMP(E) setjmp(E)
-- 
2.26.2



[PATCH] openmp: Implement support for OMP_TARGET_OFFLOAD

2020-10-14 Thread Kwok Cheung Yeung

Hello

This implements support for the OMP_TARGET_OFFLOAD environment variable 
introduced in the OpenMP 5.0 standard, which controls how offloading is handled 
in an OpenMP program.


If set to MANDATORY, libgomp aborts the program with a gomp_fatal if an
offload device is not found, or if execution falls back to the host for any
reason. If set to DISABLED, gomp_target_init returns early, so that libgomp
acts as if no offload devices were found and the host fallback is always
used. If set to DEFAULT, nothing is done, which preserves the original
behaviour.
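
As a rough illustration of the three settings, here is a standalone C++17
sketch of the value parsing. It is not the libgomp code: TargetOffload and
parse_target_offload are names made up for this example, and the real patch
reports invalid values through gomp_error rather than returning an empty
optional. It only aims to mirror the matching rules described above
(surrounding whitespace ignored, keyword compared case-insensitively,
nothing else allowed).

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for libgomp's gomp_target_offload_t.
enum class TargetOffload { Default, Mandatory, Disabled };

// Parse an OMP_TARGET_OFFLOAD-style value.  Leading and trailing whitespace
// is ignored and the keyword is matched case-insensitively.  Any other text
// yields std::nullopt (assumption: the caller keeps its previous setting).
std::optional<TargetOffload> parse_target_offload(std::string_view text) {
  auto is_space = [](char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
  };
  while (!text.empty() && is_space(text.front())) text.remove_prefix(1);
  while (!text.empty() && is_space(text.back())) text.remove_suffix(1);

  // Lower-case a copy so the comparison below is case-insensitive.
  std::string lowered;
  for (char c : text)
    lowered.push_back(
        static_cast<char>(std::tolower(static_cast<unsigned char>(c))));

  if (lowered == "default") return TargetOffload::Default;
  if (lowered == "mandatory") return TargetOffload::Mandatory;
  if (lowered == "disabled") return TargetOffload::Disabled;
  return std::nullopt;
}
```

With this sketch, a value such as "  MANDATORY " is accepted, while
"defaultx" or an empty string is rejected.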


I'm not sure how this can be tested automatically, as the behaviour depends on 
whether the compiler has been built with offloading support, and whether any 
supported offloading hardware has been installed on the system. I have not 
included any testcases for now.


Okay for trunk?

Thanks

Kwok

commit a22f434d5ec9e62c158912b693275ce89a2cbab0
Author: Kwok Cheung Yeung 
Date:   Thu Oct 8 10:08:27 2020 -0700

openmp: Implement support for OMP_TARGET_OFFLOAD environment variable

This implements support for the OMP_TARGET_OFFLOAD environment variable
introduced in the OpenMP 5.0 standard, which controls how offloading
is handled.  It may be set to MANDATORY (abort if offloading cannot be
performed), DISABLED (no offloading to devices) or DEFAULT (offload to
device if possible, fall back to host if not).

2020-10-14  Kwok Cheung Yeung  

libgomp/
* env.c (gomp_target_offload_var): New.
(parse_target_offload): New.
(handle_omp_display_env): Print value of OMP_TARGET_OFFLOAD.
(initialize_env): Parse OMP_TARGET_OFFLOAD.
* libgomp.h (gomp_target_offload_t): New.
(gomp_target_offload_var): New.
* libgomp.texi (OMP_TARGET_OFFLOAD): New section.
* target.c (resolve_device): Generate error if device not found and
offloading is mandatory.
(gomp_target_fallback): Generate error if offloading is mandatory.
(gomp_target_fallback): Likewise.
(gomp_target_init): Return early if offloading is disabled.

diff --git a/libgomp/env.c b/libgomp/env.c
index d730c48..d0eae8d 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -75,6 +75,7 @@ struct gomp_task_icv gomp_global_icv = {
 
 unsigned long gomp_max_active_levels_var = gomp_supported_active_levels;
 bool gomp_cancel_var = false;
+enum gomp_target_offload_t gomp_target_offload_var = 
GOMP_TARGET_OFFLOAD_DEFAULT;
 int gomp_max_task_priority_var = 0;
 #ifndef HAVE_SYNC_BUILTINS
 gomp_mutex_t gomp_managed_threads_lock;
@@ -374,6 +375,48 @@ parse_unsigned_long_list (const char *name, unsigned long 
*p1stvalue,
   return false;
 }
 
+static void
+parse_target_offload (const char *name, enum gomp_target_offload_t *offload)
+{
+  const char *env;
+  bool found = false;
+  enum gomp_target_offload_t new_offload;
+
+  env = getenv (name);
+  if (env == NULL)
+return;
+
+  while (isspace ((unsigned char) *env))
+++env;
+  if (strncasecmp (env, "default", 7) == 0)
+{
+  env += 7;
+  found = true;
+  new_offload = GOMP_TARGET_OFFLOAD_DEFAULT;
+}
+  else if (strncasecmp (env, "mandatory", 9) == 0)
+{
+  env += 9;
+  found = true;
+  new_offload = GOMP_TARGET_OFFLOAD_MANDATORY;
+}
+  else if (strncasecmp (env, "disabled", 8) == 0)
+{
+  env += 8;
+  found = true;
+  new_offload = GOMP_TARGET_OFFLOAD_DISABLED;
+}
+  while (isspace ((unsigned char) *env))
+++env;
+  if (found && *env == '\0')
+{
+  *offload = new_offload;
+  return;
+}
+
+  gomp_error ("Invalid value for environment variable OMP_TARGET_OFFLOAD");
+}
+
 /* Parse environment variable set to a boolean or list of omp_proc_bind_t
enum values.  Return true if one was present and it was successfully
parsed.  */
@@ -1334,6 +1377,21 @@ handle_omp_display_env (unsigned long stacksize, int 
wait_policy)
 }
   fputs ("'\n", stderr);
 
+  fputs ("  OMP_TARGET_OFFLOAD = '", stderr);
+  switch (gomp_target_offload_var)
+{
+case GOMP_TARGET_OFFLOAD_DEFAULT:
+  fputs ("DEFAULT", stderr);
+  break;
+case GOMP_TARGET_OFFLOAD_MANDATORY:
+  fputs ("MANDATORY", stderr);
+  break;
+case GOMP_TARGET_OFFLOAD_DISABLED:
+  fputs ("DISABLED", stderr);
+  break;
+}
+  fputs ("'\n", stderr);
+
   if (verbose)
 {
   fputs ("  GOMP_CPU_AFFINITY = ''\n", stderr);
@@ -1366,6 +1424,7 @@ initialize_env (void)
   parse_boolean ("OMP_CANCELLATION", &gomp_cancel_var);
   parse_boolean ("OMP_DISPLAY_AFFINITY", &gomp_display_affinity_var);
   parse_int ("OMP_DEFAULT_DEVICE", &gomp_global_icv.default_device_var, true);
+  parse_target_offload ("OMP_TARGET_OFFLOAD", &gomp_target_offload_var);
   parse_int ("OMP_MAX_TASK_PRIORITY", &gomp_max_task_priority_var, true);
   parse_unsigned_long ("OMP_MAX_ACTIVE_LEVELS", &gomp_max_active_levels_var,
   true);
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 

Re: [PATCH] wrap long tree chains in a list to avoid attribute error (PR 97413)

2020-10-14 Thread Jeff Law via Gcc-patches


On 10/14/20 2:09 PM, Martin Sebor via Gcc-patches wrote:
> The attribute access implicitly added to function declarations
> with VLA parameters includes the top-level VLA bounds chained
> together similarly to other attribute arguments.  However,
> since attribute access is limited to at most 3 arguments
> including the mode, a function that takes three or more VLA
> arguments triggers an error from decl_attributes complaining
> about excess arguments.
>
> The attached patch gets around it by wrapping the VLA bound
> chain in a list.  It doesn't feel like the most elegant solution
> but it's simple and I couldn't think of anything better.  If no
> one comes up with a suggestion for a better way of dealing with
> this I'll commit the fix sometime tomorrow as obvious.
>
> Martin
>
>
> gcc-97413.diff
>
> PR c/97413 - bogus error on function declaration with many VLA arguments
>
> gcc/ChangeLog:
>
>   PR c/97413
>   * attribs.c (init_attr_rdwr_indices): Unwrap extra list layer.
>
> gcc/c-family/ChangeLog:
>
>   PR c/97413
>   * c-attribs.c (build_attr_access_from_parms): Wrap chain of VLA
>   bounds in an extra list.
>
> gcc/testsuite/ChangeLog:
>
>   PR c/97413
>   * gcc.dg/Wvla-parameter-8.c: New test.

So as I mentioned in IRC, at the time you posted this I had just
extracted a testcase from a package build failure.  Your timing was
impeccable to get my immediate attention :-)


OK.


jeff



Re: [PATCH 1/8] [RS6000] rs6000_rtx_costs comment

2020-10-14 Thread Segher Boessenkool
Hi!

On Thu, Oct 08, 2020 at 09:27:53AM +1030, Alan Modra wrote:
> This lays out the ground rules for following patches.
> 
>   * config/rs6000/rs6000.c (rs6000_rtx_costs): Expand comment.

This is okay for trunk.  Thanks!


Segher


[PATCH] wrap long tree chains in a list to avoid attribute error (PR 97413)

2020-10-14 Thread Martin Sebor via Gcc-patches

The attribute access implicitly added to function declarations
with VLA parameters includes the top-level VLA bounds chained
together similarly to other attribute arguments.  However,
since attribute access is limited to at most 3 arguments
including the mode, a function that takes three or more VLA
arguments triggers an error from decl_attributes complaining
about excess arguments.

The attached patch gets around it by wrapping the VLA bound
chain in a list.  It doesn't feel like the most elegant solution
but it's simple and I couldn't think of anything better.  If no
one comes up with a suggestion for a better way of dealing with
this I'll commit the fix sometime tomorrow as obvious.

Martin

PR c/97413 - bogus error on function declaration with many VLA arguments

gcc/ChangeLog:

	PR c/97413
	* attribs.c (init_attr_rdwr_indices): Unwrap extra list layer.

gcc/c-family/ChangeLog:

	PR c/97413
	* c-attribs.c (build_attr_access_from_parms): Wrap chain of VLA
	bounds in an extra list.

gcc/testsuite/ChangeLog:

	PR c/97413
	* gcc.dg/Wvla-parameter-8.c: New test.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 94b9e02699f..3bdb2ffda81 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -2049,6 +2049,8 @@ init_attr_rdwr_indices (rdwr_map *rwm, tree attrs)
 
   /* The (optional) list of VLA bounds.  */
   tree vblist = TREE_CHAIN (mode);
+  if (vblist)
+   vblist = TREE_VALUE (vblist);
 
   mode = TREE_VALUE (mode);
   if (TREE_CODE (mode) != STRING_CST)
diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index c779d13f023..8283e959c89 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -4547,10 +4547,11 @@ handle_access_attribute (tree node[3], tree name, tree args,
result in the following attribute access:
 
  value: "+^2[*],$0$1^3[*],$1$1"
- chain: <0, x> <1, y>
+ list:  < <0, x> <1, y> >
 
-   where each  on the chain corresponds to one VLA bound for each
-   of the two parameters.  */
+   where the list has a single value which itself is a list each
+   of whose s corresponds to one VLA bound for each of the two
+   parameters.  */
 
 tree
 build_attr_access_from_parms (tree parms, bool skip_voidptr)
@@ -4654,13 +4655,17 @@ build_attr_access_from_parms (tree parms, bool skip_voidptr)
   if (!spec.length ())
 return NULL_TREE;
 
+  /* Attribute access takes two or three arguments.  Wrap VBLIST in
+ another list in case it has more nodes than would otherwise fit.  */
+vblist = build_tree_list (NULL_TREE, vblist);
+
   /* Build a single attribute access with the string describing all
  array arguments and an optional list of any non-parameter VLA
  bounds in order.  */
   tree str = build_string (spec.length (), spec.c_str ());
   tree attrargs = tree_cons (NULL_TREE, str, vblist);
   tree name = get_identifier ("access");
-  return tree_cons (name, attrargs, NULL_TREE);
+  return build_tree_list (name, attrargs);
 }
 
 /* Handle a "nothrow" attribute; arguments as in
diff --git a/gcc/testsuite/gcc.dg/Wvla-parameter-8.c b/gcc/testsuite/gcc.dg/Wvla-parameter-8.c
new file mode 100644
index 000..11e417df7e6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wvla-parameter-8.c
@@ -0,0 +1,86 @@
+/* PR c/97413 - bogus error on function declaration with many VLA arguments:
+   wrong number of arguments specified for 'access' attribute
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+extern int n;
+
+void f1 (int[n]);
+void f2 (int[n], int[n]);
+void f3 (int[n], int[n], int[n]);
+void f4 (int[n], int[n], int[n], int[n]);
+void f5 (int[n], int[n], int[n], int[n], int[n]);
+void f6 (int[n], int[n], int[n], int[n], int[n], int[n]);
+void f7 (int[n], int[n], int[n], int[n], int[n], int[n], int[n]);
+void f8 (int[n], int[n], int[n], int[n], int[n], int[n], int[n], int[n]);
+void f9 (int[n], int[n], int[n], int[n], int[n], int[n], int[n], int[n],
+	 int[n]);
+void f10 (int[n], int[n], int[n], int[n], int[n], int[n], int[n], int[n],
+	  int[n], int[n]);
+
+
+void f1 (int[n]);
+void f2 (int[n], int[n]);
+void f3 (int[n], int[n], int[n]);
+void f4 (int[n], int[n], int[n], int[n]);
+void f5 (int[n], int[n], int[n], int[n], int[n]);
+void f6 (int[n], int[n], int[n], int[n], int[n], int[n]);
+void f7 (int[n], int[n], int[n], int[n], int[n], int[n], int[n]);
+void f8 (int[n], int[n], int[n], int[n], int[n], int[n], int[n], int[n]);
+void f9 (int[n], int[n], int[n], int[n], int[n], int[n], int[n], int[n],
+	 int[n]);
+void f10 (int[n], int[n], int[n], int[n], int[n], int[n], int[n], int[n],
+	  int[n], int[n]);
+
+
+void g (int n)
+{
+  typedef int A[n];
+
+  void g1 (A);
+  void g2 (A, A);
+  void g3 (A, A, A);
+  void g4 (A, A, A, A);
+  void g5 (A, A, A, A, A);
+  void g6 (A, A, A, A, A, A);
+  void g7 (A, A, A, A, A, A, A);
+  void g8 (A, A, A, A, A, A, A, A);
+  void g9 (A, A, A, A, A, A, A, A, A);
+  void g10 (A, A, A, A, A, A, A, A, A, A);
+
+  void g1 (A);
+  void g2 (A, A);
+  void g3 (A, A, A);
+  void g4 (A, A, A, 

Re: [PATCH][testsuite] Don't overwrite compiler_flags in check_compile

2020-10-14 Thread Mike Stump via Gcc-patches
On Oct 14, 2020, at 6:46 AM, Tom de Vries  wrote:
> 
> OK for trunk?

Ok.


[PATCH] c++: Improve printing of pointers-to-members [PR97406, PR85901]

2020-10-14 Thread Marek Polacek via Gcc-patches
This PR points out that when printing the parameter mapping for a
pointer-to-member-function, the output was truncated:

  [with T = void (X::*]

Fixed by printing the abstract declarator for pointers-to-members in
cxx_pretty_printer::type_id.  So now we print:

  [with T = void (X::*)()]

But when I tried a pointer-to-data-member, I got

  [with T = ‘offset_type’ not supported by simple_type_specifier)‘offset_type’ 
not supported by direct_abstract_declarator]

so had to fix that too so that we now print:

  [with T = int X::*]

or

  [with T = int (X::*)[5]]

when the type is an array type.  Which is what PR85901 was about.
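
For readers less familiar with the three spellings, the following C++ sketch
shows the types that the fixed printer is meant to render. It only uses
standard type traits. The struct X mirrors the one in the new test, while
read_member and third_of are hypothetical helpers added for illustration.

```cpp
#include <type_traits>

struct X {
  void f() {}
  int a;
  int arr[5];
};

// The three kinds of pointer-to-member discussed above.
using PtrToMemFn = void (X::*)();     // pointer to member function
using PtrToData = int X::*;           // pointer to data member
using PtrToArray = int (X::*)[5];     // pointer to data member of array type

static_assert(std::is_same<decltype(&X::f), PtrToMemFn>::value, "");
static_assert(std::is_same<decltype(&X::a), PtrToData>::value, "");
static_assert(std::is_same<decltype(&X::arr), PtrToArray>::value, "");
static_assert(std::is_member_function_pointer<PtrToMemFn>::value, "");
static_assert(std::is_member_object_pointer<PtrToArray>::value, "");

// Hypothetical helpers showing how each data-member form is dereferenced.
inline int read_member(const X& x, PtrToData p) { return x.*p; }
inline int third_of(const X& x, PtrToArray p) { return (x.*p)[2]; }
```

Note that only the member-function and array forms need the parenthesised
(X::*) spelling, which is why abstract_declarator prints ')' for those two
cases only.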

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97406
PR c++/85901
* cxx-pretty-print.c (pp_cxx_type_specifier_seq): Handle OFFSET_TYPE.
(cxx_pretty_printer::abstract_declarator): Fix the printing of ')'.
(cxx_pretty_printer::direct_abstract_declarator): Handle OFFSET_TYPE.
(cxx_pretty_printer::type_id): Likewise.  Print the abstract declarator
for pointers-to-members.

gcc/testsuite/ChangeLog:

PR c++/97406
PR c++/85901
* g++.dg/diagnostic/ptrtomem1.C: New test.
* g++.dg/diagnostic/ptrtomem2.C: New test.
---
 gcc/cp/cxx-pretty-print.c   | 33 -
 gcc/testsuite/g++.dg/diagnostic/ptrtomem1.C | 31 +++
 gcc/testsuite/g++.dg/diagnostic/ptrtomem2.C | 14 +
 3 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/ptrtomem1.C
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/ptrtomem2.C

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index 8bea79b93a2..058b9c2f4fc 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -1420,6 +1420,16 @@ pp_cxx_type_specifier_seq (cxx_pretty_printer *pp, tree 
t)
}
   /* fall through */
 
+case OFFSET_TYPE:
+  if (TYPE_PTRDATAMEM_P (t))
+   {
+ pp_cxx_type_specifier_seq (pp, TREE_TYPE (t));
+ pp_cxx_whitespace (pp);
+ pp_cxx_ptr_operator (pp, t);
+ break;
+   }
+  /* fall through */
+
 default:
   if (!(TREE_CODE (t) == FUNCTION_DECL && DECL_CONSTRUCTOR_P (t)))
pp_c_specifier_qualifier_list (pp, t);
@@ -1753,7 +1763,20 @@ pp_cxx_function_definition (cxx_pretty_printer *pp, tree 
t)
 void
 cxx_pretty_printer::abstract_declarator (tree t)
 {
-  if (TYPE_PTRMEM_P (t))
+  /* pp_cxx_ptr_operator prints '(' for a pointer-to-member function,
+ or a pointer-to-data-member of array type:
+
+   void (X::*)()
+   int (X::*)[5]
+
+ but not for a pointer-to-data-member of non-array type:
+
+   int X::*
+
+ so be mindful of that.  */
+  if (TYPE_PTRMEMFUNC_P (t)
+  || (TYPE_PTRDATAMEM_P (t)
+ && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE))
 pp_cxx_right_paren (this);
   else if (INDIRECT_TYPE_P (t))
 {
@@ -1785,6 +1808,11 @@ cxx_pretty_printer::direct_abstract_declarator (tree t)
direct_abstract_declarator (TYPE_PTRMEMFUNC_FN_TYPE (t));
   break;
 
+case OFFSET_TYPE:
+  if (TYPE_PTRDATAMEM_P (t))
+   direct_abstract_declarator (TREE_TYPE (t));
+  break;
+
 case METHOD_TYPE:
 case FUNCTION_TYPE:
   pp_cxx_parameter_declaration_clause (this, t);
@@ -1837,7 +1865,10 @@ cxx_pretty_printer::type_id (tree t)
 case UNDERLYING_TYPE:
 case DECLTYPE_TYPE:
 case TEMPLATE_ID_EXPR:
+case OFFSET_TYPE:
   pp_cxx_type_specifier_seq (this, t);
+  if (TYPE_PTRMEM_P (t))
+   abstract_declarator (t);
   break;
 
 case TYPE_PACK_EXPANSION:
diff --git a/gcc/testsuite/g++.dg/diagnostic/ptrtomem1.C 
b/gcc/testsuite/g++.dg/diagnostic/ptrtomem1.C
new file mode 100644
index 000..bb1327f7af1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/ptrtomem1.C
@@ -0,0 +1,31 @@
+// PR c++/97406
+// { dg-do compile { target c++20 } }
+
+struct X {
+  void f() { }
+  int a;
+  int arr[5];
+};
+
+// Duplicated so that I can check dg-message.
+template<typename T>
+requires (sizeof(T)==1) // { dg-message {\[with T = void \(X::\*\)\(\)\]} }
+void f1(T)
+{ }
+
+template<typename T>
+requires (sizeof(T)==1) // { dg-message {\[with T = int X::\*\]} }
+void f2(T)
+{ }
+
+template<typename T>
+requires (sizeof(T)==1) // dg-message {\[with T = int \(X::\*\)\[5\]\]} }
+void f3(T)
+{ }
+
+int main()
+{
+  f1(&X::f); // { dg-error "no matching function for call" }
+  f2(&X::a); // { dg-error "no matching function for call" }
+  f3(&X::arr); // { dg-error "no matching function for call" }
+}
diff --git a/gcc/testsuite/g++.dg/diagnostic/ptrtomem2.C 
b/gcc/testsuite/g++.dg/diagnostic/ptrtomem2.C
new file mode 100644
index 000..f3b29a07a99
--- /dev/null
+++ b/gcc/testsuite/g++.dg/diagnostic/ptrtomem2.C
@@ -0,0 +1,14 @@
+// PR c++/85901
+// { dg-do compile { target c++11 } }
+
+template struct A;
+
+template
+struct A {
+template
+static auto c(int U::*p, TT o) -> decltype(o.*p); 

[pushed] c++: Diagnose bogus variadic lambda. [PR97358]

2020-10-14 Thread Jason Merrill via Gcc-patches
If the lambda has a capture pack, it cannot be used unexpanded within the
body of the lambda.  If you want to expand the pack across multiple lambdas,
don't capture the whole pack.
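
The two well-formed alternatives can be sketched as follows (C++11). This is
an illustration rather than part of the patch: call_all,
sum_via_separate_lambdas and captured_count are made-up names. The first
template creates one lambda per pack element by capturing a single element
in each, and the second captures the whole pack and uses it only inside a
pack expansion.

```cpp
// Call every callable and add up the results (hypothetical helper).
template <class... F>
int call_all(F... f) {
  int sum = 0;
  // Braced-init-list used only to expand the pack in order.
  int expand[] = {0, (sum += f(), 0)...};
  (void)expand;
  return sum;
}

// One lambda per pack element: each lambda captures a single x, and the
// trailing "..." expands the whole lambda expression.
template <class... T>
int sum_via_separate_lambdas(T... x) {
  return call_all([x] { return static_cast<int>(x); }...);
}

// A single lambda that captures the whole pack; inside the body the pack
// only appears within a pack expansion (here, sizeof...).
template <class... T>
int captured_count(T... x) {
  auto count = [x...] { return sizeof...(x); };
  return static_cast<int>(count());
}
```

What the patch diagnoses is the mix of the two: capturing with [x...] and
then naming x unexpanded in the body.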

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/97358
* pt.c (check_for_bare_parameter_packs): Diagnose use of
capture pack.

gcc/testsuite/ChangeLog:

PR c++/97358
* g++.dg/cpp0x/lambda/lambda-variadic11.C: New test.
---
 gcc/cp/pt.c   | 17 +++-
 .../g++.dg/cpp0x/lambda/lambda-variadic11.C   | 20 +++
 2 files changed, 32 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic11.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 555dc47b464..e98bf83117d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4230,11 +4230,6 @@ check_for_bare_parameter_packs (tree t, location_t loc 
/* = UNKNOWN_LOCATION */)
   if (!processing_template_decl || !t || t == error_mark_node)
 return false;
 
-  /* A lambda might use a parameter pack from the containing context.  */
-  if (current_class_type && LAMBDA_TYPE_P (current_class_type)
-  && CLASSTYPE_TEMPLATE_INFO (current_class_type))
-return false;
-
   if (TREE_CODE (t) == TYPE_DECL)
 t = TREE_TYPE (t);
 
@@ -4244,6 +4239,18 @@ check_for_bare_parameter_packs (tree t, location_t loc 
/* = UNKNOWN_LOCATION */)
   cp_walk_tree (&t, &find_parameter_packs_r, &ppd, ppd.visited);
   delete ppd.visited;
 
+  /* It's OK for a lambda to have an unexpanded parameter pack from the
+ containing context, but do complain about unexpanded capture packs.  */
+  if (current_class_type && LAMBDA_TYPE_P (current_class_type)
+  && CLASSTYPE_TEMPLATE_INFO (current_class_type))
+for (; parameter_packs;
+parameter_packs = TREE_CHAIN (parameter_packs))
+  {
+   tree pack = TREE_VALUE (parameter_packs);
+   if (is_capture_proxy (pack))
+ break;
+  }
+
   if (parameter_packs)
 {
   if (loc == UNKNOWN_LOCATION)
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic11.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic11.C
new file mode 100644
index 000..aa4ffd70df7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic11.C
@@ -0,0 +1,20 @@
+// PR c++/97358
+// { dg-do compile { target c++11 } }
+
+template <class... T> void foo (T... x) {}
+
+template <class... T> void bar (T... x)
+{
+  foo ([x...] { return x; }...); // { dg-error "not expanded|no parameter packs" }
+#if __cpp_init_captures >= 201803L
+  foo ([...y = x] { return y; }...); // { dg-error "not expanded|no parameter packs" "" { target c++20 } }
+#endif
+}
+
+void
+test ()
+{
+  bar ();
+  bar (1);
+  bar (2.0, 3LL, 4);
+}

base-commit: 87d75a11a5cb93668ae0bf6d97030e01b2eae3f2
-- 
2.18.1



[r11-3876 Regression] FAIL: gcc.dg/tree-ssa/modref-4.c (test for excess errors) on Linux/x86_64

2020-10-14 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

4d90edb96e199e2e73ba71de5ab3b7c1c0aad6d0 is the first bad commit
commit 4d90edb96e199e2e73ba71de5ab3b7c1c0aad6d0
Author: Jan Hubicka 
Date:   Wed Oct 14 16:01:39 2020 +0200

Handle POINTER_PLUS_EXPR in jump functions in ipa-modref.

caused

FAIL: gcc.dg/ipa/modref-1.c scan-ipa-dump modref "param offset: 1"
FAIL: gcc.dg/ipa/modref-1.c scan-ipa-dump modref "param offset: 2"
FAIL: gcc.dg/ipa/modref-1.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/modref-4.c scan-tree-dump modref1 "param offset: 1"
FAIL: gcc.dg/tree-ssa/modref-4.c (test for excess errors)

with GCC configured with

Configured with: ../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3876/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/modref-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/modref-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/modref-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="ipa.exp=gcc.dg/ipa/modref-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-4.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-4.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-4.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-4.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [PATCH] [PR rtl-optimization/97249]Simplify vec_select of paradoxical subreg.

2020-10-14 Thread Segher Boessenkool
On Wed, Oct 14, 2020 at 07:55:55PM +0200, Richard Biener wrote:
> On October 14, 2020 7:35:32 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >On Wed, Oct 14, 2020 at 01:43:45PM +0800, Hongtao Liu wrote:
> >> On Wed, Oct 14, 2020 at 4:01 AM Segher Boessenkool
> >>  wrote:
> >> > On Tue, Oct 13, 2020 at 04:40:53PM +0800, Hongtao Liu wrote:
> >> > >   For rtx like
> >> > >   (vec_select:V2SI (subreg:V4SI (inner:V2SI) 0)
> >> > >(parallel [(const_int 0) (const_int 1)]))
> >> > >  it could be simplified as inner.
> >> >
> >> > You could even simplify any vec_select of a subreg of X to just a
> >> > vec_select of X, by changing the selection vector a bit (well, only
> >do
> >> 
> >> Yes, when SUBREG_BYTE of trueop0 is not 0, we need to add offset to
> >selection.
> >
> >Exactly.
> >
> >> > this if that is a constant vector, I suppose).  Not just for
> >paradoxical
> >> > subregs either, just for *all* subregs.
> >> 
> >> Yes, and only when X has the same inner mode and more elements.
> >
> >No, for *all*.  The mode of the first argument of vec_select does not
> >have to equal its result mode.
> 
> But IIRC the component mode needs to match. 

Yeah, good point, at least the i386 backend uses crazy subregs, which
is why validate_subreg does not test this :-(
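
To make the index adjustment concrete, here is a toy C++17 model of it. It
is a sketch only and not GCC code: reindex_selection is a hypothetical
function, lanes are plain integers, endianness is ignored, and the element
size is assumed to be the same on both sides (the "component mode needs to
match" condition from the thread).

```cpp
#include <optional>
#include <vector>

// Map lane numbers that select from (subreg X at byte offset subreg_byte)
// to lane numbers that select from X directly.  Returns std::nullopt when
// the rewrite does not apply in this toy model.
std::optional<std::vector<int>> reindex_selection(const std::vector<int>& sel,
                                                  int subreg_byte,
                                                  int elem_bytes,
                                                  int inner_lanes) {
  // The offset must be a whole number of elements.
  if (elem_bytes <= 0 || subreg_byte < 0 || subreg_byte % elem_bytes != 0)
    return std::nullopt;
  const int shift = subreg_byte / elem_bytes;

  std::vector<int> out;
  out.reserve(sel.size());
  for (int lane : sel) {
    const int mapped = lane + shift;
    // A lane outside X (for example the undefined upper part of a
    // paradoxical subreg) cannot be expressed as a selection from X.
    if (lane < 0 || mapped >= inner_lanes) return std::nullopt;
    out.push_back(mapped);
  }
  return out;
}
```

In this model, selecting lanes {0, 1} through a subreg at byte offset 0 of a
two-lane vector gives back {0, 1}, which is the identity case from the
original patch.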


Segher


[committed] libstdc++: Fix unspecified comparison to null pointer [PR 97415]

2020-10-14 Thread Jonathan Wakely via Gcc-patches
The standard doesn't guarantee that null pointers compare less than
non-null pointers. AddressSanitizer complains about the pptr() > egptr()
comparison in basic_stringbuf::str() when egptr() is null.
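
The shape of the fix can be shown in isolation with a small C++ sketch.
high_mark is a hypothetical free function, not the libstdc++ member. It
assumes pptr is non-null and that, when egptr is non-null, both pointers
point into the same buffer.

```cpp
// Return the end of the character sequence, given the put pointer and the
// end of the get area.  The null check comes first so that the relational
// comparison only ever sees two non-null pointers.
const char* high_mark(const char* pptr, const char* egptr) {
  if (!egptr || pptr > egptr)
    return pptr;   // nothing readable yet, or writes went past the get area
  return egptr;    // the get area already covers everything written
}
```

Where a total order over unrelated pointers is really needed, std::less
provides one; here the explicit null test is enough.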

libstdc++-v3/ChangeLog:

PR libstdc++/97415
* include/std/sstream (basic_stringbuf::str()): Check for
null egptr() before comparing to non-null pptr().

Tested powerpc64le-linux. Committed to trunk.


commit 78198b6021a9695054dab039340202170b88423c
Author: Jonathan Wakely 
Date:   Wed Oct 14 18:55:14 2020

libstdc++: Fix unspecified comparison to null pointer [PR 97415]

The standard doesn't guarantee that null pointers compare less than
non-null pointers. AddressSanitizer complains about the pptr() > egptr()
comparison in basic_stringbuf::str() when egptr() is null.

libstdc++-v3/ChangeLog:

PR libstdc++/97415
* include/std/sstream (basic_stringbuf::str()): Check for
null egptr() before comparing to non-null pptr().

diff --git a/libstdc++-v3/include/std/sstream b/libstdc++-v3/include/std/sstream
index 9cca54d17d1..06960e30bf2 100644
--- a/libstdc++-v3/include/std/sstream
+++ b/libstdc++-v3/include/std/sstream
@@ -178,13 +178,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   str() const
   {
__string_type __ret(_M_string.get_allocator());
-   if (this->pptr())
+   if (char_type* __pptr = this->pptr())
  {
+   char_type* __egptr = this->egptr();
// The current egptr() may not be the actual string end.
-   if (this->pptr() > this->egptr())
- __ret.assign(this->pbase(), this->pptr());
+   if (!__egptr || __pptr > __egptr)
+ __ret.assign(this->pbase(), __pptr);
else
- __ret.assign(this->pbase(), this->egptr());
+ __ret.assign(this->pbase(), __egptr);
  }
else
  __ret = _M_string;


Re: [patch] Add an if-exists-then-else spec function

2020-10-14 Thread Olivier Hainque
Hello,

Here’s an updated version of the patch originally proposed at

  https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555271.html

with an extra documentation bit, as suggested by Armin’s
comment quoted below.

Re-tested with a couple of VxWorks builds.

Ok to commit ?

Thanks in advance!

Best Regards,

2020-10-14  Douglas Rupp  

* gcc.c (if-exists-then-else): New built-in spec function.
* doc/invoke.texi: Document it.

>> On 1 Oct 2020, at 18:20, Armin Brauns via Gcc-patches 
>>  wrote:
>> 
>> could you please make sure to update the documentation around 
>> gcc/doc/invoke.texi:31574 accordingly?

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 47aa69530ab6..288792214e72 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -31665,6 +31665,19 @@ crt0%O%s %:if-exists(crti%O%s) \
 %:if-exists-else(crtbeginT%O%s crtbegin%O%s)
 @end smallexample
 
+@item @code{if-exists-then-else}
+The @code{if-exists-then-else} spec function takes at least two arguments
+and an optional third one. The first argument is an absolute pathname to a
+file.  If the file exists, the function returns the second argument.
+If the file does not exist, the function returns the third argument if there
+is one, or NULL otherwise. This can be used to expand one text, or optionally
+another, based on the existence of a file.  Here is a small example of its
+usage:
+
+@smallexample
+-l%:if-exists-then-else(%:getenv(VSB_DIR rtnet.h) rtnet net)
+@end smallexample
+
 @item @code{replace-outfile}
 The @code{replace-outfile} spec function takes two arguments.  It looks for the
 first argument in the outfiles array and replaces it with the second argument. 
 Here
diff --git a/gcc/gcc.c b/gcc/gcc.c
index ff7b6c4a3205..337c27442a39 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -416,6 +416,7 @@ static void try_generate_repro (const char **argv);
 static const char *getenv_spec_function (int, const char **);
 static const char *if_exists_spec_function (int, const char **);
 static const char *if_exists_else_spec_function (int, const char **);
+static const char *if_exists_then_else_spec_function (int, const char **);
 static const char *sanitize_spec_function (int, const char **);
 static const char *replace_outfile_spec_function (int, const char **);
 static const char *remove_outfile_spec_function (int, const char **);
@@ -1723,6 +1724,7 @@ static const struct spec_function static_spec_functions[] 
=
   { "getenv",   getenv_spec_function },
   { "if-exists",   if_exists_spec_function },
   { "if-exists-else",  if_exists_else_spec_function },
+  { "if-exists-then-else", if_exists_then_else_spec_function },
   { "sanitize",sanitize_spec_function },
   { "replace-outfile", replace_outfile_spec_function },
   { "remove-outfile",  remove_outfile_spec_function },
@@ -10087,6 +10089,29 @@ if_exists_else_spec_function (int argc, const char 
**argv)
   return argv[1];
 }
 
+/* if-exists-then-else built-in spec function.
+
+   Checks to see if the file specified by the absolute pathname in
+   the first arg exists.  Returns the second arg if so, otherwise returns
+   the third arg if it is present.  */
+
+static const char *
+if_exists_then_else_spec_function (int argc, const char **argv)
+{
+
+  /* Must have two or three arguments.  */
+  if (argc != 2 && argc != 3)
+return NULL;
+
+  if (IS_ABSOLUTE_PATH (argv[0]) && ! access (argv[0], R_OK))
+return argv[1];
+
+  if (argc == 3)
+return argv[2];
+
+  return NULL;
+}
+
 /* sanitize built-in spec function.
 
This returns non-NULL, if sanitizing address, thread or
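The selection logic of the new spec function can be tried outside the driver. The following is only a minimal sketch: pick_if_exists and is_absolute_path are hypothetical names, and is_absolute_path is a simplified POSIX-only stand-in for GCC's IS_ABSOLUTE_PATH macro, not the real thing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Simplified stand-in for IS_ABSOLUTE_PATH: POSIX-style paths only.
   This is an assumption of the sketch, not GCC's definition.  */
static int
is_absolute_path (const char *path)
{
  return path[0] == '/';
}

/* Same decision procedure as if_exists_then_else_spec_function:
   argv[0] is the file to probe, argv[1] the "then" value and the
   optional argv[2] the "else" value.  */
static const char *
pick_if_exists (int argc, const char **argv)
{
  assert (argv != NULL);

  /* Must have two or three arguments.  */
  if (argc != 2 && argc != 3)
    return NULL;

  /* access returns 0 when the file is readable.  */
  if (is_absolute_path (argv[0]) && access (argv[0], R_OK) == 0)
    return argv[1];

  if (argc == 3)
    return argv[2];

  return NULL;
}
```

Note that a relative path never takes the "then" branch, even if the file exists, because the absolute-path test comes first.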


Re: Aw: Re: [PATCH] PR fortran/97408 - Diagnose non-constant KIND argument to intrinsics

2020-10-14 Thread Tobias Burnus

Hi Harald,

On 10/14/20 8:25 PM, Harald Anlauf wrote:

Or worded differently: If
integer, parameter :: A(*) = [(i, i=1,5)]
is valid, which should
integer, parameter :: B(*) = [integer :: (int(i, kind=i), i=1,2)]
be invalid?

Well, my copy of the F2018-FDIS says about the KIND argument to INT:
"KIND (optional) shall be a scalar integer constant expression."

Which applies to "B". For "A" (PARAMETER) it states: "entity has the
value specified by its constant-expr,"

Are you saying that (int(i, kind=i), i=1,2) is legal?
It would be helpful if you explained why "i" in kind=i is a constant expression.

I only say that it might be valid. – It would be likewise helpful if you
could explain why "i" is a const-expr in "[(i, i=1,5)]" – which we agree
is valid, don't we? And what about "i" in "int(i)" for "[(kind(i),
i=1,5)]"?

I don't know whether it is valid – I just find it not obvious that [(i,
i=1,5)] is valid and [(int(1, kind=i), i=1,1)] is not.

Surely, if one first expands the array, it is valid: "[integer ::
(int(i, kind=i), i=1,2)]" → "[integer :: int(1, kind=1), int(2,kind=2)]"
→ "[integer :: 1_1, 2_2]" → "[1,2]".

In any case, gfc_check_init_expr is supposed to check for const-expr –
and if that does not work, gfc_check_init_expr should be fixed or at
least clearly understood when it can be used and when it cannot be used.

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter


Re: [PATCH 6/X] libsanitizer: Add hwasan pass and associated gimple changes

2020-10-14 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> @@ -133,6 +137,13 @@ enum asan_mark_flags
>  #undef DEF
>  };
>  
> +enum hwasan_mark_flags
> +{
> +#define DEF(X) HWASAN_MARK_##X
> +  IFN_ASAN_MARK_FLAGS
> +#undef DEF
> +};

Are these used anywhere?  It looks like expand_HWASAN_MARK uses the
plain asan versions.

> @@ -640,6 +684,85 @@ handle_builtin_alloca (gcall *call, gimple_stmt_iterator *iter)
>  = DECL_FUNCTION_CODE (callee) == BUILT_IN_ALLOCA
>? 0 : tree_to_uhwi (gimple_call_arg (call, 1));
>  
> +  if (hwasan_sanitize_allocas_p ())
> +{
> +  /*
> +  HWASAN needs a different expansion.
> +
> +  addr = __builtin_alloca (size, align);
> +
> +  should be replaced by
> +
> +  new_size = size rounded up to HWASAN_TAG_GRANULE_SIZE byte alignment;
> +  untagged_addr = __builtin_alloca (new_size, align);
> +  tag = __hwasan_choose_alloca_tag ();
> +  addr = __hwasan_tag_pointer (untagged_addr, tag);
> +  __hwasan_tag_memory (untagged_addr, tag, new_size);
> + */
> +  /* Ensure alignment at least HWASAN_TAG_GRANULE_SIZE bytes so we start on
> +  a tag granule.  */
> +  align = align > HWASAN_TAG_GRANULE_SIZE ? align : HWASAN_TAG_GRANULE_SIZE;
> +
> +  uint8_t tg_mask = HWASAN_TAG_GRANULE_SIZE - 1;
> +  /* tree new_size = (old_size + tg_mask) & ~tg_mask;  */
> +  tree old_size = gimple_call_arg (call, 0);
> +  tree tree_mask = build_int_cst (size_type_node, tg_mask);
> +  g = gimple_build_assign (make_ssa_name (size_type_node), PLUS_EXPR,
> +old_size, tree_mask);
> +  gsi_insert_before (iter, g, GSI_SAME_STMT);
> +  tree oversize = gimple_assign_lhs (g);
> +
> +  g = gimple_build_assign (make_ssa_name (size_type_node), BIT_NOT_EXPR,
> +tree_mask);
> +  tree mask = gimple_assign_lhs (g);
> +  gsi_insert_before (iter, g, GSI_SAME_STMT);

Seems simpler to use:

  tree mask = build_int_cst (size_type_node, -HWASAN_TAG_GRANULE_SIZE);
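
As a quick check of why the two masks agree: for a power-of-two granule size G, ~(G - 1) and -G are the same bit pattern in unsigned arithmetic. The sketch below is plain C rather than GIMPLE building code, and TAG_GRANULE_SIZE and both function names are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for HWASAN_TAG_GRANULE_SIZE (1 << 4); the name is an
   assumption of this sketch.  */
#define TAG_GRANULE_SIZE ((size_t) 16)

/* Rounding as written in the patch: build ~tg_mask separately.  */
static size_t
round_up_with_not (size_t size)
{
  size_t tg_mask = TAG_GRANULE_SIZE - 1;
  return (size + tg_mask) & ~tg_mask;
}

/* Rounding with the suggested constant mask -TAG_GRANULE_SIZE.
   Unsigned negation wraps, so the low four bits are clear and all
   higher bits are set.  */
static size_t
round_up_with_neg (size_t size)
{
  size_t mask = (size_t) 0 - TAG_GRANULE_SIZE;
  return (size + (TAG_GRANULE_SIZE - 1)) & mask;
}
```

With a constant mask there is no need for the separate BIT_NOT_EXPR statement.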

> +
> +  g = gimple_build_assign (make_ssa_name (size_type_node), BIT_AND_EXPR,
> +oversize, mask);
> +  gsi_insert_before (iter, g, GSI_SAME_STMT);
> +  tree new_size = gimple_assign_lhs (g);
> +
> +  /* emit the alloca call */
> +  tree fn = builtin_decl_implicit (BUILT_IN_ALLOCA_WITH_ALIGN);
> +  gg = gimple_build_call (fn, 2, new_size,
> +   build_int_cst (size_type_node, align));
> +  tree untagged_addr = make_ssa_name (ptr_type, gg);
> +  gimple_call_set_lhs (gg, untagged_addr);
> +  gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +  /* Insert code choosing the tag.
> +  Here we use an internal function so we can choose the tag at expand
> +  time.  We need the decision to be made after stack variables have been
> +  assigned their tag (i.e. once the tag_offset variable has been set to
> +  one after the last stack variables tag).  */
> +
> +  gg = gimple_build_call_internal (IFN_HWASAN_CHOOSE_TAG, 0);
> +  tree tag = make_ssa_name (unsigned_char_type_node, gg);
> +  gimple_call_set_lhs (gg, tag);
> +  gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +  /* Insert code adding tag to pointer.  */
> +  fn = builtin_decl_implicit (BUILT_IN_HWASAN_TAG_PTR);
> +  gg = gimple_build_call (fn, 2, untagged_addr, tag);
> +  tree addr = make_ssa_name (ptr_type, gg);
> +  gimple_call_set_lhs (gg, addr);
> +  gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +  /* Insert code tagging shadow memory.
> +  NOTE: require using `untagged_addr` here for libhwasan API.  */
> +  fn = builtin_decl_implicit (BUILT_IN_HWASAN_TAG_MEM);
> +  gg = gimple_build_call (fn, 3, untagged_addr, tag, new_size);
> +  gsi_insert_before (iter, gg, GSI_SAME_STMT);
> +
> +  /* Finally, replace old alloca ptr with NEW_ALLOCA.  */
> +  replace_call_with_value (iter, addr);
> +  return;
> +}
> +
> +  tree last_alloca = get_last_alloca_addr ();
> +  const HOST_WIDE_INT redzone_mask = ASAN_RED_ZONE_SIZE - 1;
> +
> +
>/* If ALIGN > ASAN_RED_ZONE_SIZE, we embed left redzone into first ALIGN
>   bytes of allocated space.  Otherwise, align alloca to ASAN_RED_ZONE_SIZE
>   manually.  */
> @@ -792,6 +915,31 @@ get_mem_refs_of_builtin_call (gcall *call,
>break;
>  
>  case BUILT_IN_STRLEN:
> +  /* Special case strlen here since its length is taken from its return
> +  value.
> +
> +  The approach taken by the sanitizers is to check a memory access
> +  before it's taken.  For ASAN strlen is intercepted by libasan, so no
> +  check is inserted by the compiler.
> +
> +  This function still returns `true` and provides a length to the rest
> +  of the ASAN pass in order to record what areas have been checked,
> +  avoiding superfluous checks later on.
> +
> +  HWASAN does not intercept any of these internal functions.
> +  This 

Aw: Re: [PATCH] PR fortran/97408 - Diagnose non-constant KIND argument to intrinsics

2020-10-14 Thread Harald Anlauf
Hi Tobias,

> > The KIND argument to intrinsics must be a compile-time argument.
> > Improve check so that the proper diagnostics is emitted.
> >
> >
> > -  if (!gfc_check_init_expr (k))
> > +  if (!gfc_check_init_expr (k) || k->expr_type == EXPR_VARIABLE)
> 
> I think the real question is why is the following regarded as initialization expression:
>t = true;
> …
>if (gfc_check_iter_variable (e))
>  break;

You completely lost me here.  Did you accidentally delete some context?

> Or worded differently: If
>integer, parameter :: A(*) = [(i, i=1,5)]
> is valid, which should
>integer, parameter :: B(*) = [integer :: (int(i, kind=i), i=1,2)]
> be invalid?

Well, my copy of the F2018-FDIS says about the KIND argument to INT:

"KIND (optional) shall be a scalar integer constant expression."

Are you saying that (int(i, kind=i), i=1,2) is legal?
It would be helpful if you explained why "i" in kind=i is a constant expression.

> Thus, the first question should be whether that is valid code
> according to the Fortran standard or not.

Indeed.

Harald



Re: [PATCH] Add if-chain to switch conversion pass.

2020-10-14 Thread Andrew MacLeod via Gcc-patches

On 10/12/20 8:39 AM, Martin Liška wrote:

On 10/6/20 4:12 PM, Jakub Jelinek wrote:

On Tue, Oct 06, 2020 at 03:48:38PM +0200, Martin Liška wrote:

On 10/6/20 9:47 AM, Richard Biener wrote:
But is it really extensible with the current implementation?  I doubt so.


I must agree with the statement. So let's make the pass properly.
I would need help with the algorithm, where I'm planning to do the following
steps:

1) for each BB ending with a gcond, parse index variable and its VR;
    I'll support:
    a) index == 123 ([123, 123])
    b) 1 <= index && index <= 9 ([1, 9])
    c) index == 123 || index == 12345 ([123, 123] [12345, 12345])
    d) index != 1 ([1, 1])
    e) index != 1 && index != 5 ([1, 1] [5, 5])


The fold_range_test created cases are essential to support, so
f) index - 123U < 456U ([123, 456+123])
g) (unsigned) index - 123U < 456U (ditto)
but the discovery should actually recurse on all of those forms, so it will
handle
(unsigned) index - 123U < 456U || (unsigned) index - 16384U <= 32711U
etc.
You can see what reassoc init_range_entry does and do something similar?


All right, I started to use init_range_entry in combination with
linearize_expr_tree.
One thing I have problem with is that linearize_expr_tree doesn't properly mark
all statements as visited for cases like:

   :
  index2.1_1 = (unsigned int) index2_16(D);
  _2 = index2.1_1 + 4294967196;
  _3 = _2 <= 100;
  _5 = index2.1_1 + 4294966996;
  _6 = _5 <= 33;
  _7 = _3 | _6;
  if (_7 != 0)
    goto ; [INV]
  else
    goto ; [INV]

As seen, all statements in this BB are used by the final _7 != 0 and it would
be handy for me to identify all statements that should be hoisted.


The ranger infrastructure includes definition chains for what can 
potentially have a range calculated on an outgoing edge.  It contains 
all the ssa-names defined in the block for which we have range-ops 
entries that allow us to potentially wind back thru a calculation.   ie, 
any name which is defined in the block whose value can be changed based 
on which edge is taken...



I created:

foo (int index)
{
 if (index - 123U < 456U || index - 16384U <= 32711U )
    foo (42);
}

the exports range list contains
 === BB 2 
index_9(D)  int VARYING
     :
    index.0_1 = (unsigned int) index_9(D);
    _2 = index.0_1 + 4294967173;
    _3 = _2 <= 455;
    _5 = index.0_1 + 4294950912;
    _6 = _5 <= 32711;
    _7 = _3 | _6;
    if (_7 != 0)
  goto ; [INV]
    else
  goto ; [INV]

2->3  (T) index.0_1 :   unsigned int [123, 578][16384, 49095]
2->3  (T) _7 :  _Bool [1, 1]
2->3  (T) index_9(D) :  int [123, 578][16384, 49095]
2->4  (F) index.0_1 :   unsigned int [0, 122][579, 16383][49096, +INF]
2->4  (F) _2 :  unsigned int [456, +INF]
2->4  (F) _3 :  _Bool [0, 0]
2->4  (F) _5 :  unsigned int [32712, +INF]
2->4  (F) _6 :  _Bool [0, 0]
2->4  (F) _7 :  _Bool [0, 0]
2->4  (F) index_9(D) :  int [-INF, 122][579, 16383][49096, +INF]
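
The ranges exported on the true edge can be sanity-checked against the source condition by brute force. This is only a sketch in plain C: cond_folded and cond_ranges are hypothetical names, and the second function simply spells out the two sub-ranges from the dump above.

```c
#include <assert.h>
#include <stdbool.h>

/* The condition as written in the test function.  The subtraction is
   done in unsigned arithmetic, so values below the lower bound wrap
   around to very large numbers and fail the comparison.  */
static bool
cond_folded (int index)
{
  return index - 123U < 456U || index - 16384U <= 32711U;
}

/* The same condition expressed as the ranges reported for the
   2->3 (T) edge: [123, 578] and [16384, 49095].  */
static bool
cond_ranges (int index)
{
  return (index >= 123 && index <= 578)
         || (index >= 16384 && index <= 49095);
}
```

The upper bounds follow from 123 + 456 - 1 = 578 for the strict comparison and 16384 + 32711 = 49095 for the non-strict one.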

and importantly, the defchain structure which lists names which are used 
to define this name  looks like:


DUMPING GORI MAP
bb2    index.0_1 : index_9(D)
   _2 : index.0_1  index_9(D)
   _3 : index.0_1  _2  index_9(D)
   _5 : index.0_1  index_9(D)
   _6 : index.0_1  _5  index_9(D)
   _7 : index.0_1  _2  _3  _5  _6  index_9(D)
   exports: index.0_1  _2  _3  _5  _6  _7  index_9(D)


This indicates that if you are using _7 as the control name of the branch,

_7 : index.0_1  _2  _3  _5  _6  index_9(D)

is the list of names that _7 uses in its definition chain...and we can 
calculate ranges for.      index_9 is not defined in this BB, so it is 
considered an import.  you'd probably be looking for all the names in 
this list whose SSA_NAME_DEF_STMT is in this block.. That looks a lot 
like the list of statements you want to hoist.
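
To make the idea concrete, here is a toy model of that filtering step. It is not the ranger API: the enum, the operand table and both functions are hypothetical, hand-written to mirror BB 2 from the dump above, and a real implementation would walk SSA_NAME_DEF_STMT instead of a static table.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the SSA names in BB 2.  */
enum ssa_name { INDEX_9, INDEX_0_1, N2, N3, N5, N6, N7, NUM_NAMES };

#define NONE (-1)

/* Direct SSA operands of each name's defining statement
   (constants are omitted).  */
static const int operands[NUM_NAMES][2] = {
  [INDEX_9]   = { NONE, NONE },      /* import, defined outside BB 2 */
  [INDEX_0_1] = { INDEX_9, NONE },   /* (unsigned int) index_9(D) */
  [N2]        = { INDEX_0_1, NONE }, /* index.0_1 + 4294967173 */
  [N3]        = { N2, NONE },        /* _2 <= 455 */
  [N5]        = { INDEX_0_1, NONE }, /* index.0_1 + 4294950912 */
  [N6]        = { N5, NONE },        /* _5 <= 32711 */
  [N7]        = { N3, N6 },          /* _3 | _6 */
};

/* True for names whose defining statement lives in BB 2.  */
static const bool defined_in_block[NUM_NAMES] = {
  [INDEX_0_1] = true, [N2] = true, [N3] = true,
  [N5] = true, [N6] = true, [N7] = true,
};

/* Mark every name reachable through the operands of NAME.  */
static void
collect_def_chain (int name, bool chain[NUM_NAMES])
{
  for (int i = 0; i < 2; i++)
    {
      int op = operands[name][i];
      if (op != NONE && !chain[op])
        {
          chain[op] = true;
          collect_def_chain (op, chain);
        }
    }
}

/* Count the names in CONTROL's def chain that are defined in the
   block; their statements are the hoisting candidates.  */
static int
count_hoist_candidates (int control)
{
  bool chain[NUM_NAMES] = { false };
  int count = 0;

  collect_def_chain (control, chain);
  for (int n = 0; n < NUM_NAMES; n++)
    if (chain[n] && defined_in_block[n])
      count++;
  return count;
}
```

For _7 this yields index.0_1, _2, _3, _5 and _6, with index_9(D) filtered out as an import.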


Caveats are that
  1) this is currently only used internally by the ranger, so there
are some minor warts that may currently limit its usefulness elsewhere
  2) it is limited to only processing statements for which we have
range-ops understanding, which means we know how to calculate ranges
for the statement.  Perhaps this is also not an issue, since if there are
statements we can't generate a range for, perhaps you don't care.


This might be more info than you need, but also

  3) before the next stage 1 I plan to rework the GORI component, and I
plan to split this into additional "parts".  In particular, this entire
export list will still exist, but there will be 3 subcomponents which
form it:
       a)  control names :   These are booleans which contribute to the 
TRUE/FALSEness of the edge
       b) direct exports  : These are ssanames which are directly 
affected by relations on the edge.. Ie, the edge gives them a range
       c) calculable exports  : These are other ssa_names which can be 
calculated based on the direct exports. Ie, the direct export is used in 
calculating this value

for the above block,
  control names :  _7, _6 and _3
  

Re: [PATCH] configure: Suppress output from multi-do recipes

2020-10-14 Thread Jonathan Wakely via Gcc-patches

On 14/10/20 17:29 +0100, Jonathan Wakely wrote:

The FIXME comment saying "Leave out until this is tested a bit more" is
from 1997. I think it's been sufficiently tested.

ChangeLog:

* config-ml.in (multi-do): Add @ to silence recipe. Remove FIXME
comment.

OK for trunk?

This removes 44 lines of irrelevant noise from various build targets,
such as the 'check' target that runs the libstdc++ testsuite.


Actually there are two instances of this FIXME in that file. This
revised patch deals with both.

It looks like this file is shared with binutils-gdb and newlib-cygwin,
I've only tested it for GCC.

commit e257e460d4241bfbb31fd1714ee1c3000b1a378b
Author: Jonathan Wakely 
Date:   Wed Oct 14 16:15:50 2020

config-ml.in: Suppress output from multi-do recipes

The FIXME comments saying "Leave out until this is tested a bit more"
are from 1997. I think they've been sufficiently tested.

ChangeLog:

* config-ml.in (multi-do, multi-clean): Add @ to silence recipes.
Remove FIXME comments.

diff --git a/config-ml.in b/config-ml.in
index 5720d38d23f..68854a4f16c 100644
--- a/config-ml.in
+++ b/config-ml.in
@@ -499,10 +499,8 @@ cat > Multi.tem <<\EOF
 
 PWD_COMMAND=$${PWDCMD-pwd}
 
-# FIXME: There should be an @-sign in front of the `if'.
-# Leave out until this is tested a bit more.
 multi-do:
-	if [ -z "$(MULTIDIRS)" ]; then \
+	@if [ -z "$(MULTIDIRS)" ]; then \
 	  true; \
 	else \
 	  rootpre=`${PWD_COMMAND}`/; export rootpre; \
@@ -547,10 +545,8 @@ multi-do:
 	  done; \
 	fi
 
-# FIXME: There should be an @-sign in front of the `if'.
-# Leave out until this is tested a bit more.
 multi-clean:
-	if [ -z "$(MULTIDIRS)" ]; then \
+	@if [ -z "$(MULTIDIRS)" ]; then \
 	  true; \
 	else \
 	  lib=`${PWD_COMMAND} | sed -e 's,^.*/\([^/][^/]*\)$$,\1,'`; \


Re: [PATCH] [PR rtl-optimization/97249]Simplify vec_select of paradoxical subreg.

2020-10-14 Thread Richard Biener via Gcc-patches
On October 14, 2020 7:35:32 PM GMT+02:00, Segher Boessenkool 
 wrote:
>Hi!
>
>On Wed, Oct 14, 2020 at 01:43:45PM +0800, Hongtao Liu wrote:
>> On Wed, Oct 14, 2020 at 4:01 AM Segher Boessenkool
>>  wrote:
>> > On Tue, Oct 13, 2020 at 04:40:53PM +0800, Hongtao Liu wrote:
>> > >   For rtx like
>> > >   (vec_select:V2SI (subreg:V4SI (inner:V2SI) 0)
>> > >(parallel [(const_int 0) (const_int 1)]))
>> > >  it could be simplified as inner.
>> >
>> > You could even simplify any vec_select of a subreg of X to just a
>> > vec_select of X, by changing the selection vector a bit (well, only do
>> 
>> Yes, when SUBREG_BYTE of trueop0 is not 0, we need to add offset to selection.
>
>Exactly.
>
>> > this if that is a constant vector, I suppose).  Not just for paradoxical
>> > subregs either, just for *all* subregs.
>> 
>> Yes, and only when X has the same inner mode and more elements.
>
>No, for *all*.  The mode of the first argument of vec_select does not
>have to equal its result mode.

But IIRC the component mode needs to match. 

>Any (constant indices) vec_select of a subreg can be written as just a
>vec_select.
>
>> +  /* Simplify vec_select of a subreg of X to just a vec_select of X
>> + when available.  */
>
>What does "when available" mean here?
>
>> +  int l2;
>> +  if (GET_CODE (trueop0) == SUBREG
>> +  && (GET_MODE_INNER (mode)
>> +  == GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0
>
>Don't use unnecessary parens here please, it makes it harder to read
>(there are quite enough parens already :-) )
>
>> +  gcc_assert (can_div_trunc_p (SUBREG_BYTE (trueop0),
>> +                               GET_MODE_SIZE (GET_MODE_INNER (mode)),
>> +                               &subreg_offset));
>
>Why is this needed?
>
>> +  bool success = true;
>> +  for (int i = 0;i != l1; i++)
>
>(space after ; )
>
>> +{
>> +  rtx j = XVECEXP (trueop1, 0, i);
>
>(i and j and k usually are integers, not rtx)
>
>> +  if (!CONST_INT_P (j)
>> +  || known_ge (UINTVAL (j), l2 - subreg_offset))
>> +{
>> +  success = false;
>> +  break;
>> +}
>> +}
>
>You don't have to test if the input RTL is valid.  You can assume it
>is.
>
>> +  if (success)
>> +{
>> +  rtx par = trueop1;
>> +  if (subreg_offset)
>> +{
>> +  rtvec vec = rtvec_alloc (l1);
>> +  for (int i = 0; i < l1; i++)
>> +RTVEC_ELT (vec, i)
>> +  = GEN_INT (INTVAL (XVECEXP (trueop1, 0, i)
>> + + subreg_offset));
>> +  par = gen_rtx_PARALLEL (VOIDmode, vec);
>> +}
>> +  return gen_rtx_VEC_SELECT (mode, XEXP (trueop0, 0), par);
>> +}
>> +}
>
>subreg_offset will differ in meaning if big-endian; is this correct
>there, do all the stars align so this code works out fine there as
>well?
>
>Looks fine otherwise, thanks :-)
>
>
>Segher



Re: [PATCH] [PR rtl-optimization/97249]Simplify vec_select of paradoxical subreg.

2020-10-14 Thread Segher Boessenkool
Hi!

On Wed, Oct 14, 2020 at 01:43:45PM +0800, Hongtao Liu wrote:
> On Wed, Oct 14, 2020 at 4:01 AM Segher Boessenkool
>  wrote:
> > On Tue, Oct 13, 2020 at 04:40:53PM +0800, Hongtao Liu wrote:
> > >   For rtx like
> > >   (vec_select:V2SI (subreg:V4SI (inner:V2SI) 0)
> > >(parallel [(const_int 0) (const_int 1)]))
> > >  it could be simplified as inner.
> >
> > You could even simplify any vec_select of a subreg of X to just a
> > vec_select of X, by changing the selection vector a bit (well, only do
> 
> Yes, when SUBREG_BYTE of trueop0 is not 0, we need to add offset to selection.

Exactly.

> > this if that is a constant vector, I suppose).  Not just for paradoxical
> > subregs either, just for *all* subregs.
> 
> Yes, and only when X has the same inner mode and more elements.

No, for *all*.  The mode of the first argument of vec_select does not
have to equal its result mode.

Any (constant indices) vec_select of a subreg can be written as just a
vec_select.
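
A toy model of that rewrite, for the simple case where the subreg is a window into a larger X: vectors are plain int arrays, a subreg at byte offset B is a view starting at element B / ELT_SIZE, and the layout is assumed little-endian. ELT_SIZE and both function names are hypothetical, and this says nothing about paradoxical subregs or big-endian targets.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per element in this toy model (think SImode).  */
#define ELT_SIZE 4

/* vec_select (subreg (x, subreg_byte), sel): take the view first,
   then select from it.  */
static void
select_via_subreg (const int *x, size_t subreg_byte,
                   const size_t *sel, size_t n, int *out)
{
  const int *view = x + subreg_byte / ELT_SIZE;
  for (size_t i = 0; i < n; i++)
    out[i] = view[sel[i]];
}

/* vec_select (x, sel + subreg_offset): skip the subreg and shift
   every selection index by the element offset instead.  */
static void
select_direct (const int *x, size_t subreg_byte,
               const size_t *sel, size_t n, int *out)
{
  size_t subreg_offset = subreg_byte / ELT_SIZE;
  for (size_t i = 0; i < n; i++)
    out[i] = x[sel[i] + subreg_offset];
}
```

Both functions produce the same elements for any in-range selection, which is the equivalence the simplification relies on.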

> +   /* Simplify vec_select of a subreg of X to just a vec_select of X
> +  when available.  */

What does "when available" mean here?

> +   int l2;
> +   if (GET_CODE (trueop0) == SUBREG
> +   && (GET_MODE_INNER (mode)
> +   == GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0

Don't use unnecessary parens here please, it makes it harder to read
(there are quite enough parens already :-) )

> +   gcc_assert (can_div_trunc_p (SUBREG_BYTE (trueop0),
> +                                GET_MODE_SIZE (GET_MODE_INNER (mode)),
> +                                &subreg_offset));

Why is this needed?

> +   bool success = true;
> +   for (int i = 0;i != l1; i++)

(space after ; )

> + {
> +   rtx j = XVECEXP (trueop1, 0, i);

(i and j and k usually are integers, not rtx)

> +   if (!CONST_INT_P (j)
> +   || known_ge (UINTVAL (j), l2 - subreg_offset))
> + {
> +   success = false;
> +   break;
> + }
> + }

You don't have to test if the input RTL is valid.  You can assume it is.

> +   if (success)
> + {
> +   rtx par = trueop1;
> +   if (subreg_offset)
> + {
> +   rtvec vec = rtvec_alloc (l1);
> +   for (int i = 0; i < l1; i++)
> + RTVEC_ELT (vec, i)
> +   = GEN_INT (INTVAL (XVECEXP (trueop1, 0, i)
> +  + subreg_offset));
> +   par = gen_rtx_PARALLEL (VOIDmode, vec);
> + }
> +   return gen_rtx_VEC_SELECT (mode, XEXP (trueop0, 0), par);
> + }
> + }

subreg_offset will differ in meaning if big-endian; is this correct
there, do all the stars align so this code works out fine there as well?

Looks fine otherwise, thanks :-)


Segher


c++: DECL_FRIEND_P cleanup

2020-10-14 Thread Nathan Sidwell


DECL_FRIEND_P's meaning has changed over time.  It now (almost) means
the friend function decl has not been met via an explicit decl.
This completes that transition, renaming it to DECL_UNIQUE_FRIEND_P,
so one doesn't think it is the sole indicator of friendliness (plenty
of friends do not have the flag set).  This allows reduction in the
complexity of managing the field -- all in duplicate_decls now.

gcc/cp/
* cp-tree.h (struct lang_decl_fn): Adjust context comment.
(DECL_FRIEND_P): Replace with ...
(DECL_UNIQUE_FRIEND_P): ... this.  Only for FUNCTION_DECLs.
(DECL_FRIEND_CONTEXT): Adjust.
* class.c (add_implicitly_declared_members): Detect friendly
spaceship from context.
* constraint.cc (remove_constraints): Use a checking assert.
(maybe_substitute_reqs_for): Use DECL_UNIQUE_FRIEND_P.
* decl.c (check_no_redeclaration_friend_default_args):
DECL_UNIQUE_FRIEND_P is significant, not hiddenness.
(duplicate_decls): Adjust DECL_UNIQUE_FRIEND_P clearing.
(redeclaration_error_message): Use DECL_UNIQUE_FRIEND_P.
(start_preparsed_function): Correct in-class friend processing.
Refactor some initializers.
(grokmethod): Directly check friend decl-spec.
* decl2.c (grokfield): Check DECL_UNIQUE_FRIEND_P.
* friend.c (do_friend): Set DECL_UNIQUE_FRIEND_P first, remove
extraneous conditions.  Don't reset it afterwards.
* name-lookup.c (lookup_elaborated_type_1): Simplify revealing
code.
(do_pushtag): Likewise.
* pt.c (optimize_specialization_lookup_p): Check
DECL_UNIQUE_FRIEND_P.
(push_template_decl): Likewise.  Drop unneeded friend setting.
(type_dependent_expression_p): Check DECL_UNIQUE_FRIEND_P.
libcc1/
* libcp1plugin.cc (plugin_add_friend): Set DECL_UNIQUE_FRIEND_P.

pushing to trunk

nathan

--
Nathan Sidwell
diff --git i/gcc/cp/class.c w/gcc/cp/class.c
index 01780fe8291..26f996b7f4b 100644
--- i/gcc/cp/class.c
+++ w/gcc/cp/class.c
@@ -3283,7 +3283,8 @@ add_implicitly_declared_members (tree t, tree* access_decls,
   {
 	tree eq = implicitly_declare_fn (sfk_comparison, t, false, space,
 	 NULL_TREE);
-	if (DECL_FRIEND_P (space))
+	bool is_friend = DECL_CONTEXT (space) != t;
+	if (is_friend)
 	  do_friend (NULL_TREE, DECL_NAME (eq), eq,
 		 NULL_TREE, NO_SPECIAL, true);
 	else
@@ -3292,7 +3293,7 @@ add_implicitly_declared_members (tree t, tree* access_decls,
 	DECL_CHAIN (eq) = TYPE_FIELDS (t);
 	TYPE_FIELDS (t) = eq;
 	  }
-	maybe_add_class_template_decl_list (t, eq, DECL_FRIEND_P (space));
+	maybe_add_class_template_decl_list (t, eq, is_friend);
   }
 
   while (*access_decls)
diff --git i/gcc/cp/constraint.cc w/gcc/cp/constraint.cc
index 050b55ce092..f4f5174eff3 100644
--- i/gcc/cp/constraint.cc
+++ w/gcc/cp/constraint.cc
@@ -1201,7 +1201,7 @@ set_constraints (tree t, tree ci)
 void
 remove_constraints (tree t)
 {
-  gcc_assert (DECL_P (t));
+  gcc_checking_assert (DECL_P (t));
   if (TREE_CODE (t) == TEMPLATE_DECL)
 t = DECL_TEMPLATE_RESULT (t);
 
@@ -1217,11 +1217,16 @@ maybe_substitute_reqs_for (tree reqs, const_tree decl_)
 {
   if (reqs == NULL_TREE)
 return NULL_TREE;
+
   tree decl = CONST_CAST_TREE (decl_);
   tree result = STRIP_TEMPLATE (decl);
-  if (DECL_FRIEND_P (result))
+
+  if (DECL_UNIQUE_FRIEND_P (result))
 {
-  tree tmpl = decl == result ? DECL_TI_TEMPLATE (result) : decl;
+  tree tmpl = decl;
+  if (TREE_CODE (decl) != TEMPLATE_DECL)
+	tmpl = DECL_TI_TEMPLATE (result);
+
   tree gargs = generic_targs_for (tmpl);
   processing_template_decl_sentinel s;
   if (uses_template_parms (gargs))
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 467256117ec..5c06ac3789e 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -2736,12 +2736,14 @@ struct GTY(()) lang_decl_fn {
  thunked to function decl.  */
   tree befriending_classes;
 
-  /* For a non-virtual FUNCTION_DECL, this is
- DECL_FRIEND_CONTEXT.  For a virtual FUNCTION_DECL for which
+  /* For a virtual FUNCTION_DECL for which
  DECL_THIS_THUNK_P does not hold, this is DECL_THUNKS. Both
  this pointer and result pointer adjusting thunks are
  chained here.  This pointer thunks to return pointer thunks
- will be chained on the return pointer thunk.  */
+ will be chained on the return pointer thunk.
+ For a DECL_CONSTUCTOR_P FUNCTION_DECL, this is the base from
+ whence we inherit.  Otherwise, it is the class in which a
+ (namespace-scope) friend is defined (if any).   */
   tree context;
 
   union lang_decl_u5
@@ -3088,10 +3090,14 @@ struct GTY(()) lang_decl {
   (DECL_LANG_SPECIFIC (VAR_OR_FUNCTION_DECL_CHECK (DECL)) \
->u.base.odr_used)
 
-/* Nonzero for DECL means that this decl is just a friend declaration,
-   and should not be added to the list of members for this class.  */
-#define DECL_FRIEND_P(NODE) \

Re: [PUSHED] operator_trunc_mod::wi_fold: Return VARYING for mod by zero.

2020-10-14 Thread Andrew MacLeod via Gcc-patches

On 10/13/20 2:10 AM, Richard Biener wrote:

On Mon, Oct 12, 2020 at 6:57 PM Aldy Hernandez via Gcc-patches
 wrote:

Division by zero should return VARYING, otherwise we propagate undefined all over the
ranger and cause bad things to happen :)

So we never should propagate UNDEFINED?


I added a comment in the PR.

The problem was that we were feeding an undefined into the branch at the
bottom, and the old ranger model was that undefined meant unreachable,
so it would make BOTH sides of the branch unreachable. That was
triggering some unpleasant side effects when interacting with
the subst-and-fold model and trying to calculate a range feeding into
the condition.


I had audited rangeops so that when we folded undefined values we used
varying as their value (as in we don't know what the value is) but
missed the mod code.  I plan to make UNDEFINED more consistent, I just
need to find the time to sit down and audit it.  I suspect there are
still a couple of lingering dark corners where undefined interactions
aren't quite right.


Andrew





[PATCH] configure: Suppress output from multi-do recipes

2020-10-14 Thread Jonathan Wakely via Gcc-patches
The FIXME comment saying "Leave out until this is tested a bit more" is
from 1997. I think it's been sufficiently tested.

ChangeLog:

* config-ml.in (multi-do): Add @ to silence recipe. Remove FIXME
comment.

OK for trunk?

This removes 44 lines of irrelevant noise from various build targets,
such as the 'check' target that runs the libstdc++ testsuite.

commit bf8497941453a68e7d1e79001ef7309e5adbad8b
Author: Jonathan Wakely 
Date:   Wed Oct 14 16:15:50 2020

configure: Suppress output from multi-do recipes

The FIXME comment saying "Leave out until this is tested a bit more" is
from 1997. I think it's been sufficiently tested.

ChangeLog:

* config-ml.in (multi-do): Add @ to silence recipe. Remove FIXME
comment.

diff --git a/config-ml.in b/config-ml.in
index 5720d38d23f..e799cd6a919 100644
--- a/config-ml.in
+++ b/config-ml.in
@@ -499,10 +499,8 @@ cat > Multi.tem <<\EOF
 
 PWD_COMMAND=$${PWDCMD-pwd}
 
-# FIXME: There should be an @-sign in front of the `if'.
-# Leave out until this is tested a bit more.
 multi-do:
-   if [ -z "$(MULTIDIRS)" ]; then \
+   @if [ -z "$(MULTIDIRS)" ]; then \
  true; \
else \
  rootpre=`${PWD_COMMAND}`/; export rootpre; \


Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2020-10-14 Thread François Dumont via Gcc-patches
After further testing, this version turned out to be buggy because ld considered
that the __create_backtrace/__render_backtrace symbols were defined several times
in the different linked .o files.

I tried making those inline but it failed: __render_backtrace was not
substituted anymore, only __create_backtrace was.

The correct (tested) fix was to make the _Error_formatter methods using
those symbols out-of-line. So here is the new patch.



    libstdc++: [_GLIBCXX_DEBUG] Integrate libbacktrace

  New _GLIBCXX_DEBUG_BACKTRACE macro to activate backtrace 
generation on

    _GLIBCXX_DEBUG assertions using libbacktrace.

    * config/abi/pre/gnu.ver: Add new symbols.
    * include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]:
    Include .
    [_GLIBCXX_DEBUG_BACKTRACE && BACKTRACE_SUPPORTED]:
    Include .
    [(!_GLIBCXX_DEBUG_BACKTRACE || !BACKTRACE_SUPPORTED) &&
    _GLIBCXX_USE_C99_STDINT_TR1]: Include .
    [BACKTRACE_SUPPORTED || _GLIBCXX_USE_C99_STDINT_TR1]
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__create_backtrace_state): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__render_backtrace): New.
[_GLIBCXX_DEBUG_USE_LIBBACKTRACE](_Error_formatter::_M_print_backtrace):
    New.
[_GLIBCXX_DEBUG_USE_LIBBACKTRACE](_Error_formatter::_M_backtrace_state):
    New.
    (_Error_formatter::_Error_formatter): Outline definition.
    * src/c++11/debug.cc: Include .
    (_Print_func_t): New.
    (print_word): Use '%.*s' format in fprintf to render only 
expected

    number of chars.
    (print_raw(PrintContext&, const char*, ptrdiff_t)): New.
    (print_function(PrintContext&, const char*, 
_Print_func_t)): New.

    (print_type): Use latter.
    (print_string(PrintContext&, const char*, const 
_Parameter*, size_t)):

    Change signature to...
    (print_string(PrintContext&, const char*, ptrdiff_t, const 
_Parameter*,
    size_t)): ...this and adapt. Remove intermediate buffer to 
render input

    string.
    (print_string(PrintContext&, const char*, ptrdiff_t)): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (print_backtrace(void*, uintptr_t, const char*, int, const 
char*)): New.

    (_Error_formatter::_M_error()): Adapt.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__create_backtrace_state): New, weak symbol.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__render_backtrace): New, weak symbol.
    * doc/xml/manual/debug_mode.xml: Document 
_GLIBCXX_DEBUG_BACKTRACE.

    * doc/xml/manual/using.xml: Likewise.

OK to commit once I have run the whole testsuite in _GLIBCXX_DEBUG mode with backtrace?

François


On 08/10/20 9:32 pm, François Dumont wrote:
I eventually considered your last remark about using weak symbols to
inject libbacktrace calls when _GLIBCXX_DEBUG_BACKTRACE is defined.


    libstdc++: [_GLIBCXX_DEBUG] Integrate libbacktrace

  Add _GLIBCXX_DEBUG_BACKTRACE macro to ask for a backtrace on 
_GLIBCXX_DEBUG

    assertions using libbacktrace.

    * config/abi/pre/gnu.ver: Add new symbols.
    * include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]:
    Include .
    [_GLIBCXX_DEBUG_BACKTRACE && BACKTRACE_SUPPORTED]:
    Include .
    [(!_GLIBCXX_DEBUG_BACKTRACE || !BACKTRACE_SUPPORTED) &&
    _GLIBCXX_USE_C99_STDINT_TR1]: Include .
    [BACKTRACE_SUPPORTED || _GLIBCXX_USE_C99_STDINT_TR1]
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__create_backtrace_state): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__render_backtrace): New.
[_GLIBCXX_DEBUG_USE_LIBBACKTRACE](_Error_formatter::_M_print_backtrace):
    New.
[_GLIBCXX_DEBUG_USE_LIBBACKTRACE](_Error_formatter::_M_backtrace_state):
    New.
    * src/c++11/debug.cc: Include .
    (_Print_func_t): New.
    (print_word): Use '%.*s' format in fprintf to render only 
expected

    number of chars.
    (print_raw(PrintContext&, const char*, ptrdiff_t)): New.
    (print_function(PrintContext&, const char*, 
_Print_func_t)): New.

    (print_type): Use latter.
    (print_string(PrintContext&, const char*, const 
_Parameter*, size_t)):

    Change signature to...
    (print_string(PrintContext&, const char*, ptrdiff_t, const 
_Parameter*,
    size_t)): ...this and adapt. Remove intermediate buffer to 
render input

    string.
    (print_string(PrintContext&, const char*, ptrdiff_t)): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (print_backtrace(void*, uintptr_t, const char*, int, const 
char*)): New.

    (_Error_formatter::_M_error()): Adapt.
    

Re: [PATCH 5/X] libsanitizer: mid-end: Introduce stack variable handling for HWASAN

2020-10-14 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> @@ -75,6 +89,31 @@ extern hash_set  *asan_used_labels;
>  
>  #define ASAN_USE_AFTER_SCOPE_ATTRIBUTE   "use after scope memory"
>  
> +/* NOTE: The values below define an ABI and are hard-coded to these values in
> +   libhwasan, hence they can't be changed independently here.  */
> +/* How many bits are used to store a tag in a pointer.
> +   HWASAN uses the entire top byte of a pointer (i.e. 8 bits).  */
> +#define HWASAN_TAG_SIZE 8
> +/* Tag Granule of HWASAN shadow stack.
> +   This is the size in real memory that each byte in the shadow memory refers
> +   to.  I.e. if a variable is X bytes long in memory then it's tag in shadow

s/it's/its/

> +   memory will span X / HWASAN_TAG_GRANULE_SIZE bytes.
> +   Most variables will need to be aligned to this amount since two variables
> +   that are neighbours in memory and share a tag granule would need to share

“neighbors” (alas)

> +   the same tag (the shared tag granule can only store one tag).  */
> +#define HWASAN_TAG_SHIFT_SIZE 4
> +#define HWASAN_TAG_GRANULE_SIZE (1ULL << HWASAN_TAG_SHIFT_SIZE)
> +/* Define the tag for the stack background.
> +   This defines what tag the stack pointer will be and hence what tag all
> +   variables that are not given special tags are (e.g. spilled registers,
> +   and parameters passed on the stack).  */
> +#define HWASAN_STACK_BACKGROUND 0
> +/* How many bits to shift in order to access the tag bits.
> +   The tag is stored in the top 8 bits of a pointer hence shifting 56 bits will
> +   leave just the tag.  */
> +#define HWASAN_SHIFT 56
> +#define HWASAN_SHIFT_RTX const_int_rtx[MAX_SAVED_CONST_INT + HWASAN_SHIFT]
> +
>  /* Various flags for Asan builtins.  */
>  enum asan_check_flags
>  {
> diff --git a/gcc/asan.c b/gcc/asan.c
> index 9c9aa4cae35832c1534a2cffac1d3d13eed0e687..f755a3290f1091be14fbe4c51d9579389e5eb245 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -257,6 +257,15 @@ hash_set *asan_handled_variables = NULL;
>  
>  hash_set  *asan_used_labels = NULL;
>  
> +/* Global variables for HWASAN stack tagging.  */
> +/* tag_offset records the offset from the frame base tag that the next object
> +   should have.  */
> +static uint8_t tag_offset = 0;
> +/* hwasan_base_ptr is a pointer with the same address as
> +   `virtual_stack_vars_rtx` for the current frame, and with the frame base tag
> +   stored in it.  */
> +static rtx hwasan_base_ptr = NULL_RTX;

Was initially surprised that this didn't need to be GTY, but I guess
there are no ggc_collect calls during the relevant parts of cfgexpand.

It's normally easy to tell when GTY is needed if all the users of the
state are in the same file, but for cases like this it's a little harder.
Might be worth a comment.

Might also be worth having “stack” or “frame” in these names, and similarly
for the public functions.
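
For readers following along, the tag constants quoted earlier in this review imply some simple pointer arithmetic. The sketch below reuses the macro values from the quoted patch; the helper names (pointer_tag, with_tag, shadow_bytes) are hypothetical, and rounding the shadow size up to a whole granule is an assumption, since the quoted comment only says X / HWASAN_TAG_GRANULE_SIZE.

```c
#include <stdint.h>

/* Values copied from the quoted asan.h hunk.  */
#define HWASAN_TAG_SHIFT_SIZE 4
#define HWASAN_TAG_GRANULE_SIZE (1ULL << HWASAN_TAG_SHIFT_SIZE)
#define HWASAN_SHIFT 56

/* Hypothetical helper: the tag lives in the top byte of the pointer.  */
static uint8_t
pointer_tag (uint64_t ptr)
{
  return (uint8_t) (ptr >> HWASAN_SHIFT);
}

/* Hypothetical helper: replace the top byte of PTR with TAG.  */
static uint64_t
with_tag (uint64_t ptr, uint8_t tag)
{
  uint64_t addr_mask = (1ULL << HWASAN_SHIFT) - 1;
  return (ptr & addr_mask) | ((uint64_t) tag << HWASAN_SHIFT);
}

/* Hypothetical helper: shadow bytes covering SIZE bytes of real memory.
   Rounding up is an assumption made for this sketch.  */
static uint64_t
shadow_bytes (uint64_t size)
{
  return (size + HWASAN_TAG_GRANULE_SIZE - 1) >> HWASAN_TAG_SHIFT_SIZE;
}
```

With a 16-byte granule, a 32-byte variable needs two shadow bytes, which is why neighbouring variables have to be granule-aligned to carry different tags.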

> @@ -1352,6 +1361,28 @@ asan_redzone_buffer::flush_if_full (void)
>  flush_redzone_payload ();
>  }
>  
> +/* Returns whether we are tagging pointers and checking those tags on memory
> +   access.  */
> +bool
> +hwasan_sanitize_p ()
> +{
> +return sanitize_flags_p (SANITIZE_HWADDRESS);

Nit: excess indentation.

> @@ -2901,6 +2932,11 @@ initialize_sanitizer_builtins (void)
>  = build_function_type_list (void_type_node, uint64_type_node,
>   ptr_type_node, NULL_TREE);
>  
> +  tree BT_FN_VOID_PTR_UINT8_SIZE
> += build_function_type_list (void_type_node, ptr_type_node,
> + unsigned_char_type_node, size_type_node,
> + NULL_TREE);

The function prototype seems to be:

  void __hwasan_tag_memory(uptr p, u8 tag, uptr sz)

and size_t doesn't have the same precision as pointers on all targets.
Maybe pointer_sized_int_node would be more accurate.  (Despite its name,
it's unsigned rather than signed.)

> @@ -3702,4 +3740,269 @@ make_pass_asan_O0 (gcc::context *ctxt)
>return new pass_asan_O0 (ctxt);
>  }
>  
> +/* For stack tagging:
> + Initialise tag of the base register.

“Initialize”

Very minor, but I've not seen this style of comment elsewhere in GCC,
where the main comment is indented by two extra spaces.  Think a blank
line and no extra indentation is more usual.

> + This has to be done as soon as the stack is getting expanded to ensure
> + anything emitted with `get_dynamic_stack_base` will use the value set here
> + instead of using a register without a tag.
> + Especially note that RTL expansion of large aligned values does that.  */
> +void
> +hwasan_record_base (rtx base)
> +{
> +  targetm.memtag.gentag (base, virtual_stack_vars_rtx);
> +  hwasan_base_ptr = base;
> +}
> +
> +/* For stack tagging:
> + Return the offset from the frame base tag that the "next" expanded object
> + should have.  */
> +uint8_t
> +hwasan_current_tag ()
> +{
> +  return tag_offset;
> +}
> +
> +/* For stack tagging:
> + Increment the tag offset modulo the size a tag can represent.  

[PATCH] c-family: Fix regression in location-overflow-test-1.c [PR97117]

2020-10-14 Thread Patrick Palka via Gcc-patches
The r11-3266 patch that added macro support to -Wmisleading-indentation
accidentally suppressed the column-tracking diagnostic in
get_visual_column in some cases, e.g. in the location-overflow-test-1.c
testcase.

More generally, when all three tokens are on the same line and we've run
out of locations with column info, then their location_t values will be
equal, and we exit early from should_warn_for_misleading_indentation due
to the new check

  /* Give up if the loci are not all distinct.  */
  if (guard_loc == body_loc || body_loc == next_stmt_loc)
return false;

before we ever call get_visual_column.

[ This new check is needed to detect and give up on analyzing code
  fragments where exactly two out of the three tokens come from the same
  macro expansion, e.g.

#define MACRO \
  if (a)  \
foo ();

MACRO; bar ();

  Here, guard_loc and body_loc will be equal and point to the macro
  expansion point.  The heuristics the warning uses are not really valid
  in scenarios like these.  ]

In order to restore the column-tracking diagnostic, this patch moves the
diagnostic code out from get_visual_column to earlier in
should_warn_for_misleading_indentation.  Moreover, it tests the three
location_t values for a zero column all at once, which I suppose should
make us issue the diagnostic more consistently.
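
The reordering can be pictured with a small standalone sketch. Everything here is hypothetical scaffolding rather than GCC code: struct loc stands in for a location_t plus its expanded column, and the enum replaces the real inform/return-false paths. The real patch also issues the note only once via a static flag, which this sketch leaves out.

```c
#include <stdbool.h>

/* Hypothetical stand-in: ID models the location_t value, COLUMN the
   expanded column, where 0 means "no column information".  */
struct loc { unsigned int id; unsigned int column; };

enum verdict { GIVE_UP_SILENTLY, GIVE_UP_WITH_NOTE, ANALYZE };

static enum verdict
classify (struct loc guard, struct loc body, struct loc next_stmt)
{
  /* Test all three columns first, so the note is still reached when the
     locations compare equal because column tracking ran out.  */
  if (!guard.column || !body.column || !next_stmt.column)
    return GIVE_UP_WITH_NOTE;

  /* Only afterwards give up if the loci are not all distinct.  */
  if (guard.id == body.id || body.id == next_stmt.id)
    return GIVE_UP_SILENTLY;

  return ANALYZE;
}
```

With the two checks in the opposite order, three equal locations with a zero column would return silently, which is the regression described above.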

Tested on x86_64-pc-linux-gnu, does this look OK to commit?

gcc/c-family/ChangeLog:

PR testsuite/97117
* c-indentation.c (get_visual_column): Remove location_t
parameter.  Move the column-tracking diagnostic code from here
to ...
(should_warn_for_misleading_indentation): ... here, before the
early exit for when the loci are not all distinct.  Don't pass a
location_t argument to get_visual_column.
(assert_get_visual_column_succeeds): Don't pass a location_t
argument to get_visual_column.
(assert_get_visual_column_fails): Likewise.
---
 gcc/c-family/c-indentation.c | 70 ++--
 1 file changed, 34 insertions(+), 36 deletions(-)

diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c
index 8b88a8adc7c..836a524f266 100644
--- a/gcc/c-family/c-indentation.c
+++ b/gcc/c-family/c-indentation.c
@@ -45,36 +45,11 @@ next_tab_stop (unsigned int vis_column, unsigned int tab_width)
on the line (up to or before EXPLOC).  */
 
 static bool
-get_visual_column (expanded_location exploc, location_t loc,
+get_visual_column (expanded_location exploc,
   unsigned int *out,
   unsigned int *first_nws,
   unsigned int tab_width)
 {
-  /* PR c++/68819: if the column number is zero, we presumably
- had a location_t > LINE_MAP_MAX_LOCATION_WITH_COLS, and so
- we have no column information.
- Act as if no conversion was possible, triggering the
- error-handling path in the caller.  */
-  if (!exploc.column)
-{
-  static bool issued_note = false;
-  if (!issued_note)
-   {
- /* Notify the user the first time this happens.  */
- issued_note = true;
- inform (loc,
- "%<-Wmisleading-indentation%> is disabled from this point"
- " onwards, since column-tracking was disabled due to"
- " the size of the code/headers");
- if (!flag_large_source_files)
-   inform (loc,
-   "adding %<-flarge-source-files%> will allow for more" 
-   " column-tracking support, at the expense of compilation"
-   " time and memory");
-   }
-  return false;
-}
-
   char_span line = location_get_source_line (exploc.file, exploc.line);
   if (!line)
 return false;
@@ -325,14 +300,37 @@ should_warn_for_misleading_indentation (const token_indent_info _tinfo,
NULL);
 }
 
-  /* Give up if the loci are not all distinct.  */
-  if (guard_loc == body_loc || body_loc == next_stmt_loc)
-return false;
-
   expanded_location body_exploc = expand_location (body_loc);
   expanded_location next_stmt_exploc = expand_location (next_stmt_loc);
   expanded_location guard_exploc = expand_location (guard_loc);
 
+  /* PR c++/68819: if the column number is zero, we presumably
+ had a location_t > LINE_MAP_MAX_LOCATION_WITH_COLS, and so
+ we have no column information.  */
+  if (!guard_exploc.column || !body_exploc.column || !next_stmt_exploc.column)
+{
+  static bool issued_note = false;
+  if (!issued_note)
+   {
+ /* Notify the user the first time this happens.  */
+ issued_note = true;
+ inform (guard_loc,
+ "%<-Wmisleading-indentation%> is disabled from this point"
+ " onwards, since column-tracking was disabled due to"
+ " the size of the code/headers");
+ if (!flag_large_source_files)
+   inform (guard_loc,
+  

Re: PING [PATCH] Enable GCC support for Intel Key Locker extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
Hello!

> This patch is about to support Intel Key Locker extension.
>
> Key Locker provides a mechanism to encrypt and decrypt data with an AES
> key without having access to the raw key value.
>
> For more details, please refer to
> https://software.intel.com/content/dam/develop/external/us/en/documents/343965-intel-key-locker-specification.pdf.
>
> Bootstrap ok, regression test on i386/x86 backend is ok.
>
> OK for master?

@@ -1414,6 +1418,13 @@ enum reg_class
   FP_TOP_REG, FP_SECOND_REG, /* %st(0) %st(1) */
   FLOAT_REGS,
   SSE_FIRST_REG,
+  SSE_SECOND_REG,
+  SSE_THIRD_REG,
+  SSE_FOURTH_REG,
+  SSE_FIFTH_REG,
+  SSE_SIXTH_REG,
+  SSE_SEVENTH_REG,
+  SSE_EIGHTH_REG,
   NO_REX_SSE_REGS,
   SSE_REGS,
   ALL_SSE_REGS,
@@ -1474,6 +1485,13 @@ enum reg_class
"FP_TOP_REG", "FP_SECOND_REG", \
"FLOAT_REGS", \
"SSE_FIRST_REG", \
+   "SSE_SECOND_REG", \
+   "SSE_THIRD_REG", \
+   "SSE_FOURTH_REG", \
+   "SSE_FIFTH_REG", \
+   "SSE_SIXTH_REG", \
+   "SSE_SEVENTH_REG", \
+   "SSE_EIGHTH_REG", \
"NO_REX_SSE_REGS", \
"SSE_REGS", \
"ALL_SSE_REGS", \
@@ -1513,6 +1531,13 @@ enum reg_class
  { 0x200,0x0,   0x0 }, /* FP_SECOND_REG */ \
 { 0xff00,0x0,   0x0 }, /* FLOAT_REGS */ \
   { 0x10,0x0,   0x0 }, /* SSE_FIRST_REG */ \
+  { 0x20,0x0,   0x0 }, /* SSE_SECOND_REG */ \
+  { 0x40,0x0,   0x0 }, /* SSE_THIRD_REG */ \
+  { 0x80,0x0,   0x0 }, /* SSE_FOURTH_REG */ \
+ { 0x100,0x0,   0x0 }, /* SSE_FIFTH_REG */ \
+ { 0x200,0x0,   0x0 }, /* SSE_SIXTH_REG*/ \
+ { 0x400,0x0,   0x0 }, /* SSE_SEVENTH_REG */ \
+ { 0x800,0x0,   0x0 }, /* SSE_EIGHTH_REG */ \
  { 0xff0,0x0,   0x0 }, /* NO_REX_SSE_REGS */ \
  { 0xff0,0xff000,   0x0 }, /* SSE_REGS */ \
  { 0xff0, 0xf000,   0xf }, /* ALL_SSE_REGS */ \

IIRC, adding a new regclass is O(n^2), so it should be avoided. I
think that the new patterns should follow the same path as vzeroall
and vzeroupper patterns, where we emit the pattern with explicit hard
regs.

BTW: We do have SSE_FIRST_REG class, but this class was added to solve
some reload problems in the past by marking %xmm0 as likely spilled.

Uros.


ping x3 [PATCH 0/5] MSP430: Implement macros to describe relative costs of operations

2020-10-14 Thread Jozef Lawrynowicz
3rd ping for below.

On Tue, Sep 15, 2020 at 09:30:22PM +0100, Jozef Lawrynowicz wrote:
> Ping x2 for below.
> 
> On Fri, Aug 07, 2020 at 12:02:59PM +0100, Jozef Lawrynowicz wrote:
> > Pinging for this series of patches.
> > Attached all patches to this mail with the amended patch 4 thanks to
> > Segher's review.
> > 
> > Thanks,
> > Jozef
> > 
> > On Thu, Jul 23, 2020 at 04:43:56PM +0100, Jozef Lawrynowicz wrote:
> > > The following series of patches for MSP430 implement some of the target
> > > macros used to determine the relative costs of operations.
> > > 
> > > To give an indication of the overall effect of these changes on
> > > codesize, below are some size statistics collected from all the
> > > executable files from execute.exp that are built at -Os.
> > > There are around 1470 such tests (depending on the configuration).
> > > 
> > > The percentage change (((new - old)/old) * 100) in text size is calculated
> > > for each test and the given metric is applied to that overall set of data.
> > > 
> > > > Configuration | Mean (%) | Median (%) | Delta < 0 (count) | Delta > 0 (count)
> > > > --------------|----------|------------|-------------------|------------------
> > > > -mcpu=msp430  |   -2.4   |    -2.7    |       1454        |        17
> > > > -mcpu=msp430x |   -2.3   |    -2.4    |       1460        |        10
> > > > -mlarge       |   -1.7   |    -1.9    |       1412        |        37
> > > Successfully regtested on trunk for msp430-elf, ok to apply?
> > > 
> > > Jozef Lawrynowicz (5):
> > >   MSP430: Implement TARGET_MEMORY_MOVE_COST
> > >   MSP430: Implement TARGET_RTX_COSTS
> > >   MSP430: Add defaulting to the insn length attribute
> > >   MSP430: Implement TARGET_INSN_COST
> > >   MSP430: Skip index-1.c test
> > > 
> > >  gcc/config/msp430/msp430-protos.h |   5 +-
> > >  gcc/config/msp430/msp430.c| 867 --
> > >  gcc/config/msp430/msp430.h|  13 +
> > >  gcc/config/msp430/msp430.md   | 439 +++--
> > >  gcc/config/msp430/msp430.opt  |   4 +
> > >  gcc/config/msp430/predicates.md   |  13 +
> > >  gcc/testsuite/gcc.c-torture/execute/index-1.c |   2 +
> > >  7 files changed, 1206 insertions(+), 137 deletions(-)
> > > 
> > > -- 
> > > 2.27.0
> > > 
> 
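
The size statistic described in the quoted cover letter (percentage change per test, then mean and median over all tests) can be sketched as follows. The function names are hypothetical and this is not the script used to produce the table; it only illustrates the arithmetic.

```c
#include <stdlib.h>

/* ((new - old) / old) * 100, as given in the cover letter.  */
static double
percent_change (double old_size, double new_size)
{
  return (new_size - old_size) / old_size * 100.0;
}

static int
compare_doubles (const void *a, const void *b)
{
  double x = *(const double *) a;
  double y = *(const double *) b;
  return (x > y) - (x < y);
}

/* Arithmetic mean of N values; N must be non-zero.  */
static double
mean (const double *values, size_t n)
{
  double sum = 0.0;
  for (size_t i = 0; i < n; i++)
    sum += values[i];
  return sum / (double) n;
}

/* Median of N values; sorts VALUES in place.  N must be non-zero.  */
static double
median (double *values, size_t n)
{
  qsort (values, n, sizeof values[0], compare_doubles);
  if (n % 2)
    return values[n / 2];
  return (values[n / 2 - 1] + values[n / 2]) / 2.0;
}
```

A negative result from percent_change means the text section shrank, which corresponds to the "Delta < 0" column.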
From e260de5a31e661afdfaaf2c8053b574a292d6826 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 16 Jul 2020 11:28:11 +0100
Subject: [PATCH 1/5] MSP430: Implement TARGET_MEMORY_MOVE_COST

The cycle and size cost of a MOV instruction in different addressing
modes can be used to calculate the TARGET_MEMORY_MOVE_COST relative to
TARGET_REGISTER_MOVE_COST.

gcc/ChangeLog:

* config/msp430/msp430.c (struct single_op_cost): New struct.
(struct double_op_cost): Likewise.
(TARGET_REGISTER_MOVE_COST): Don't define but add comment.
(TARGET_MEMORY_MOVE_COST): Define to...
(msp430_memory_move_cost): New function.
(BRANCH_COST): Don't define but add comment.
---
 gcc/config/msp430/msp430.c | 131 +
 1 file changed, 131 insertions(+)

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index c2b24974364..9e739233fa0 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1043,6 +1043,137 @@ msp430_legitimate_constant (machine_mode mode, rtx x)
 }
 
 
+/* Describing Relative Costs of Operations
+   To model the cost of an instruction, use the number of cycles when
+   optimizing for speed, and the number of words when optimizing for size.
+   The cheapest instruction will execute in one cycle and cost one word.
+   The cycle and size costs correspond to 430 ISA instructions, not 430X
+   instructions or 430X "address" instructions.  The relative costs of 430X
+   instructions is accurately modeled with the 430 costs.  The relative costs
+   of some "address" instructions can differ, but these are not yet handled.
+   Adding support for this could improve performance/code size.  */
+
+const int debug_rtx_costs = 0;
+
+struct single_op_cost
+{
+  const int reg;
+  /* Indirect register (@Rn) or indirect autoincrement (@Rn+).  */
+  const int ind;
+  const int mem;
+};
+
+static const struct single_op_cost cycle_cost_single_op =
+{
+  1, 3, 4
+};
+
+static const struct single_op_cost size_cost_single_op =
+{
+  1, 1, 2
+};
+
+/* When the destination of an insn is memory, the cost is always the same
+   regardless of whether that memory is accessed using indirect register,
+   indexed or absolute addressing.
+   When the source operand is memory, indirect register and post-increment have
+   the same cost, which is lower than indexed and absolute, which also have
+   the same cost.  */
+struct double_op_cost
+{
+  /* Source operand is a register.  */
+  const int r2r;
+  const int r2pc;
+  const int r2m;
+
+  /* Source operand is memory, using indirect register (@Rn) or indirect
+ autoincrement (@Rn+) addressing 
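
As a rough illustration of the cost model described in the comment above (cycles when optimizing for speed, words when optimizing for size), the single-operand tables could be consulted along these lines. The struct and table values are copied from the quoted hunk; enum operand_kind and single_op_cost_for are hypothetical names that do not appear in the patch.

```c
#include <stdbool.h>

struct single_op_cost
{
  const int reg;
  /* Indirect register (@Rn) or indirect autoincrement (@Rn+).  */
  const int ind;
  const int mem;
};

static const struct single_op_cost cycle_cost_single_op = { 1, 3, 4 };
static const struct single_op_cost size_cost_single_op = { 1, 1, 2 };

/* Hypothetical classification of the operand's addressing mode.  */
enum operand_kind { OP_REG, OP_IND, OP_MEM };

/* Hypothetical lookup: pick the cycle table when SPEED is true,
   otherwise the size table, then index it by operand kind.  */
static int
single_op_cost_for (enum operand_kind kind, bool speed)
{
  const struct single_op_cost *table
    = speed ? &cycle_cost_single_op : &size_cost_single_op;

  switch (kind)
    {
    case OP_REG:
      return table->reg;
    case OP_IND:
      return table->ind;
    case OP_MEM:
      return table->mem;
    }
  return table->mem;
}
```

Keeping cycles and words in separate tables lets one lookup serve both -O2 style and -Os style costing.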

[committed] libstdc++: Fix tests that fail with old std::string ABI

2020-10-14 Thread Jonathan Wakely via Gcc-patches
These two tests have started to fail with the old std::string ABI. The
scan-assembler-not checks fail because they match debug info, not code.

Adding -g0 to the test flags fixes them.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/modifiers/assign/char/move_assign_optim.cc:
Do not generate debug info.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign_optim.cc:
Likewise.

Tested powerpc64le-linux. Committed to trunk.

commit 2b9c09a78b048328e41419e6b941cf0207bfd6bc
Author: Jonathan Wakely 
Date:   Wed Oct 14 16:15:49 2020

libstdc++: Fix tests that fail with old std::string ABI

These two tests have started to fail with the old std::string ABI. The
scan-assembler-not checks fail because they match debug info, not code.

Adding -g0 to the test flags fixes them.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/modifiers/assign/char/move_assign_optim.cc:
Do not generate debug info.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign_optim.cc:
Likewise.

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign_optim.cc b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign_optim.cc
index 85584d68e47..9546ca68e4d 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign_optim.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/char/move_assign_optim.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-O1" }
+// { dg-options "-O1 -g0" }
 // { dg-do compile { target c++11 } }
 // { dg-final { scan-assembler-not "__throw_length_error" } }
 // { dg-final { scan-assembler-not "__throw_bad_alloc" } }
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign_optim.cc b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign_optim.cc
index 9f0a86f3dff..752856b800d 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign_optim.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/assign/wchar_t/move_assign_optim.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-O1" }
+// { dg-options "-O1 -g0" }
 // { dg-do compile { target c++11 } }
 // { dg-final { scan-assembler-not "__throw_length_error" } }
 // { dg-final { scan-assembler-not "__throw_bad_alloc" } }


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 5:06 PM Hongyu Wang  wrote:
>
> Hi Uros,
>
> Sorry for my misunderstanding. The test is for the correctness check
> of intrinsic header.
> I have added -muintr to x86gprintrin-{1,2,3,4,5}.c.
>
> UINTR is 64bit only, so I add them with dg-additional-option.
>
> Updated patch. If you agree, we will check-in the attached patch.

Yes, the patch is OK.

Thanks,
Uros.

> Thanks for your help.
>
> H.J. Lu  于2020年10月14日周三 下午9:35写道:
> >
> > On Wed, Oct 14, 2020 at 6:31 AM Hongyu Wang via Gcc-patches
> >  wrote:
> > >
> > > Uros Bizjak  于2020年10月14日周三 下午7:19写道:
> > > >
> > > > > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > > > > >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new
> > > intrinsics
> > > > > > >> > > header.
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> > Thanks for your review. We found that without adding -muintr,
> > > the intrinsics header could also be tested. Make-check for these file all
> > > get passed.
> > > > > > >> >
> > > > > > >> > And there is no intrinsic/builtin with const int parameter. So
> > > we remove -muintr from these files.
> > > > > > >>
> > > > > > >> Can your double check that relevant instructions are indeed
> > > generated?
> > > > > > >> Without -muintr, relevant patterns in i386.md are effectively
> > > blocked,
> > > > > > >> and perhaps a call to __builtin_ia32_* is generated instead.
> > > > > > >
> > > > > > >
> > > > > > > Yes, in sse-14.s we have
> > > > > > >
> > > > > > > _clui:
> > > > > > > .LFB136:
> > > > > > > .cfi_startproc
> > > > > > > pushq   %rbp
> > > > > > > .cfi_def_cfa_offset 16
> > > > > > > .cfi_offset 6, -16
> > > > > > > movq%rsp, %rbp
> > > > > > > .cfi_def_cfa_register 6
> > > > > > > clui
> > > > > > > nop
> > > > > > > popq%rbp
> > > > > > > .cfi_def_cfa 7, 8
> > > > > > > ret
> > > > > > > .cfi_endproc
> > > > > >
> > > > > > > Strange, without -muintr, it should not be generated, and some error
> > > > > > > about failed inlining due to target specific option mismatch should be
> > > > > > > emitted.
> > > > > >
> > > > > > Can you please investigate this a bit more?
> > > > > >
> > > > >
> > > > > Because of function target attribute?
> > > >
> > > > I don't think so. Please consider this similar testcase:
> > > >
> > > > --cut here--
> > > > #ifndef __SSE2__
> > > > #pragma GCC push_options
> > > > #pragma GCC target("sse2")
> > > > #define __DISABLE_SSE2__
> > > > #endif /* __SSE2__ */
> > > >
> > > > typedef double __v2df __attribute__ ((__vector_size__ (16)));
> > > > > typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
> > > >
> > > > extern __inline __m128d __attribute__((__gnu_inline__,
> > > > __always_inline__, __artificial__))
> > > > _mm_add_sd (__m128d __A, __m128d __B)
> > > > {
> > > >   return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B);
> > > > }
> > > >
> > > > #ifdef __DISABLE_SSE2__
> > > > #undef __DISABLE_SSE2__
> > > > #pragma GCC pop_options
> > > > #endif /* __DISABLE_SSE2__ */
> > > >
> > > >
> > > > __v2df foo (__v2df a, __v2df b)
> > > > {
> > > >   return _mm_add_sd (a, b);
> > > > }
> > > > --cut here--
> > > >
> > > > $ gcc -O2 -mno-sse2 -S -dp sse2.c
> > > > sse2.c: In function ‘foo’:
> > > > sse2.c:11:1: error: inlining failed in call to ‘always_inline’
> > > > ‘_mm_add_sd’: target specific option mismatch
> > > >   11 | _mm_add_sd (__m128d __A, __m128d __B)
> > > >  | ^~
> > > > sse2.c:24:10: note: called from here
> > > >   24 |   return _mm_add_sd (a, b);
> > > >  |  ^
> > > >
> > > > I'd expect some similar warning from missing -mumip.
> > > >
> > >
> > > For this case, I can confirm uintr could generate similar warning without
> > > -muintr. But
> > > sse-{12,13,14,22,23}.c will not test intrinsic call for uintr, since it
> > > doesn't have const
> > > int parameter intrinsics.
> > >
> > > sse-{13,14,22,23}.c has
> > >
> > > #define extern
> > > #define __inline
> > >
> > > So intrinsic will be treated as common call to builtin, then
> > >
> > > #pragma GCC push_options
> > > #pragma GCC target("uintr")
> > >
> > > ensures the builtin could be expanded correctly.
> > >
> > > I think the intrinsic call test should be in uintr-1.c, so it is redundant
> > > to add -muintr in sse-{12,13,14,22,23}.c
> > > or x86gprintrin-*.c.
> >
> > Please add UINTR intrinsic tests to x86gprintrin-*.c to cover such
> > usages.
> >
> > > >
> > > > Uros.
> >
> >
> >
> > --
> > H.J.


Re: [Patch] x86: Enable GCC support for Intel Hreset extension

2020-10-14 Thread Hongyu Wang via Gcc-patches
>
> The patch doesn't include all testsuite changes.
>

Yes, I have added -mhreset to x86gprintrin-{1,2,3,4,5}.c.

We will check-in the attached patch. Thanks.

Uros Bizjak  于2020年10月14日周三 下午2:26写道:
>
> On Tue, Oct 13, 2020 at 10:49 AM Hongyu Wang wrote:
> >
> > Hi:
> >
> > This patch is about to support Intel Hreset instruction.
> >
> > Hreset provides a hint to the processor to selectively reset the
> > prediction history of the current logical processor.
> >
> > For more details, please refer to
> > https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
> >
> > Bootstrap ok, regression test on i386/x86 backend is ok.
> >
> > OK for master?
> >
> > gcc/
> >
> > * common/config/i386/cpuinfo.h (get_available_features):
> > Detect HRESET.
> > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_HRESET_SET,
> > OPTION_MASK_ISA2_HRESET_UNSET): New macros.
> > (ix86_handle_option): Handle -mhreset.
> > * common/config/i386/i386-cpuinfo.h (enum processor_features):
> > Add FEATURE_HRESET.
> > * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
> > for hreset.
> > * config.gcc: Add hresetintrin.h
> > * config/i386/hresetintrin.h: New header file.
> > * config/i386/x86gprintrin.h: Include hresetintrin.h.
> > * config/i386/cpuid.h (bit_HRESET): New.
> > * config/i386/i386-builtin.def: Add new builtin.
> > * config/i386/i386-expand.c (ix86_expand_builtin):
> > Handle new builtin.
> > * config/i386/i386-c.c (ix86_target_macros_internal): Define
> > __HRESET__.
> > * config/i386/i386-options.c (isa2_opts): Add -mhreset.
> > (ix86_valid_target_attribute_inner_p): Handle hreset.
> > * config/i386/i386.h (TARGET_HRESET, TARGET_HRESET_P,
> > PTA_HRESET): New.
> > (PTA_ALDERLAKE): Add PTA_HRESET.
> > * config/i386/i386.opt: Add option -mhreset.
> > * config/i386/i386.md (UNSPECV_HRESET): New unspec.
> > (hreset): New define_insn.
> > * doc/invoke.texi: Document -mhreset.
> > * doc/extend.texi: Document hreset.
> >
> > gcc/testsuite/
> >
> > * gcc.target/i386/hreset-1.c: New test.
> > * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> > * gcc.target/i386/sse-12.c: Update -mhreset.
> > * gcc.target/i386/sse-13.c: Likewise.
> > * gcc.target/i386/sse-14.c: Likewise.
> > * gcc.target/i386/sse-22.c: Likewise.
> > * gcc.target/i386/sse-23.c: Likewise.
> > * g++.dg/other/i386-2.C: Likewise.
> > * g++.dg/other/i386-3.C: Likewise.
>
> The patch doesn't include all testsuite changes.
>
> Otherwise OK.
>
> Thanks,
> Uros.



-- 
Regards,

Hongyu, Wang
From 765e5e15a7e07d742653b13af1fc6d39b9f376c4 Mon Sep 17 00:00:00 2001
From: Hongyu Wang 
Date: Tue, 7 Apr 2020 18:39:53 +
Subject: [PATCH] Enable Intel HRESET Instruction

gcc/

	* common/config/i386/cpuinfo.h (get_available_features):
	Detect HRESET.
	* common/config/i386/i386-common.c (OPTION_MASK_ISA2_HRESET_SET,
	OPTION_MASK_ISA2_HRESET_UNSET): New macros.
	(ix86_handle_option): Handle -mhreset.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_HRESET.
	* common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
	for hreset.
	* config.gcc: Add hresetintrin.h
	* config/i386/hresetintrin.h: New header file.
	* config/i386/x86gprintrin.h: Include hresetintrin.h.
	* config/i386/cpuid.h (bit_HRESET): New.
	* config/i386/i386-builtin.def: Add new builtin.
	* config/i386/i386-expand.c (ix86_expand_builtin):
	Handle new builtin.
	* config/i386/i386-c.c (ix86_target_macros_internal): Define
	__HRESET__.
	* config/i386/i386-options.c (isa2_opts): Add -mhreset.
	(ix86_valid_target_attribute_inner_p): Handle hreset.
	* config/i386/i386.h (TARGET_HRESET, TARGET_HRESET_P,
	PTA_HRESET): New.
	(PTA_ALDERLAKE): Add PTA_HRESET.
	* config/i386/i386.opt: Add option -mhreset.
	* config/i386/i386.md (UNSPECV_HRESET): New unspec.
	(hreset): New define_insn.
	* doc/invoke.texi: Document -mhreset.
	* doc/extend.texi: Document hreset.

gcc/testsuite/

	* gcc.target/i386/hreset-1.c: New test.
	* gcc.target/i386/funcspec-56.inc: Add new target attribute.
	* gcc.target/i386/x86gprintrin-1.c: Add -mhreset.
	* gcc.target/i386/x86gprintrin-2.c: Ditto.
	* gcc.target/i386/x86gprintrin-3.c: Ditto.
	* gcc.target/i386/x86gprintrin-4.c: Add -mhreset.
	* gcc.target/i386/x86gprintrin-5.c: Ditto.
---
 gcc/common/config/i386/cpuinfo.h  |  3 ++
 gcc/common/config/i386/i386-common.c  | 15 ++
 gcc/common/config/i386/i386-cpuinfo.h |  1 +
 gcc/common/config/i386/i386-isas.h|  1 +
 gcc/config.gcc|  4 +-
 gcc/config/i386/cpuid.h   |  1 +
 gcc/config/i386/hresetintrin.h| 48 +++
 gcc/config/i386/i386-builtin.def  |  3 ++
 gcc/config/i386/i386-c.c  |  3 +-
 gcc/config/i386/i386-expand.c |  8 

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Hongyu Wang via Gcc-patches
Hi Uros,

Sorry for my misunderstanding. The test is for the correctness check
of intrinsic header.
I have added -muintr to x86gprintrin-{1,2,3,4,5}.c.

UINTR is 64bit only, so I add them with dg-additional-option.

Updated patch. If you agree, we will check-in the attached patch.

Thanks for your help.

H.J. Lu  于2020年10月14日周三 下午9:35写道:
>
> On Wed, Oct 14, 2020 at 6:31 AM Hongyu Wang via Gcc-patches
>  wrote:
> >
> > Uros Bizjak  于2020年10月14日周三 下午7:19写道:
> > >
> > > > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > > > >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new
> > intrinsics
> > > > > >> > > header.
> > > > > >> > >
> > > > > >> >
> > > > > >> > Thanks for your review. We found that without adding -muintr,
> > the intrinsics header could also be tested. Make-check for these file all
> > get passed.
> > > > > >> >
> > > > > >> > And there is no intrinsic/builtin with const int parameter. So
> > we remove -muintr from these files.
> > > > > >>
> > > > > >> Can your double check that relevant instructions are indeed
> > generated?
> > > > > >> Without -muintr, relevant patterns in i386.md are effectively
> > blocked,
> > > > > >> and perhaps a call to __builtin_ia32_* is generated instead.
> > > > > >
> > > > > >
> > > > > > Yes, in sse-14.s we have
> > > > > >
> > > > > > _clui:
> > > > > > .LFB136:
> > > > > > .cfi_startproc
> > > > > > pushq   %rbp
> > > > > > .cfi_def_cfa_offset 16
> > > > > > .cfi_offset 6, -16
> > > > > > movq%rsp, %rbp
> > > > > > .cfi_def_cfa_register 6
> > > > > > clui
> > > > > > nop
> > > > > > popq%rbp
> > > > > > .cfi_def_cfa 7, 8
> > > > > > ret
> > > > > > .cfi_endproc
> > > > >
> > > > > Strange, without -muintr, it should not be generated, and some error
> > > > > > about failed inlining due to target specific option mismatch should be
> > > > > emitted.
> > > > >
> > > > > Can you please investigate this a bit more?
> > > > >
> > > >
> > > > Because of function target attribute?
> > >
> > > I don't think so. Please consider this similar testcase:
> > >
> > > --cut here--
> > > #ifndef __SSE2__
> > > #pragma GCC push_options
> > > #pragma GCC target("sse2")
> > > #define __DISABLE_SSE2__
> > > #endif /* __SSE2__ */
> > >
> > > typedef double __v2df __attribute__ ((__vector_size__ (16)));
> > > > typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
> > >
> > > extern __inline __m128d __attribute__((__gnu_inline__,
> > > __always_inline__, __artificial__))
> > > _mm_add_sd (__m128d __A, __m128d __B)
> > > {
> > >   return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B);
> > > }
> > >
> > > #ifdef __DISABLE_SSE2__
> > > #undef __DISABLE_SSE2__
> > > #pragma GCC pop_options
> > > #endif /* __DISABLE_SSE2__ */
> > >
> > >
> > > __v2df foo (__v2df a, __v2df b)
> > > {
> > >   return _mm_add_sd (a, b);
> > > }
> > > --cut here--
> > >
> > > $ gcc -O2 -mno-sse2 -S -dp sse2.c
> > > sse2.c: In function ‘foo’:
> > > sse2.c:11:1: error: inlining failed in call to ‘always_inline’
> > > ‘_mm_add_sd’: target specific option mismatch
> > >   11 | _mm_add_sd (__m128d __A, __m128d __B)
> > >  | ^~
> > > sse2.c:24:10: note: called from here
> > >   24 |   return _mm_add_sd (a, b);
> > >  |  ^
> > >
> > > I'd expect some similar warning from missing -mumip.
> > >
> >
> > For this case, I can confirm uintr could generate similar warning without
> > -muintr. But
> > sse-{12,13,14,22,23}.c will not test intrinsic call for uintr, since it
> > doesn't have const
> > int parameter intrinsics.
> >
> > sse-{13,14,22,23}.c has
> >
> > #define extern
> > #define __inline
> >
> > So intrinsic will be treated as common call to builtin, then
> >
> > #pragma GCC push_options
> > #pragma GCC target("uintr")
> >
> > ensures the builtin could be expanded correctly.
> >
> > I think the intrinsic call test should be in uintr-1.c, so it is redundant
> > to add -muintr in sse-{12,13,14,22,23}.c
> > or x86gprintrin-*.c.
>
> Please add UINTR intrinsic tests to x86gprintrin-*.c to cover such
> usages.
>
> > >
> > > Uros.
>
>
>
> --
> H.J.
From 7b5ba74517bf58b9f985bfe7591372c64a313ad7 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Mon, 20 May 2019 17:56:41 +0800
Subject: [PATCH] Enable gcc support for UINTR

2020-05-20  Hongtao Liu  

gcc/
	* common/config/i386/cpuinfo.h (get_available_features):
	Detect UINTR.
	* common/config/i386/i386-common.c (OPTION_MASK_ISA2_UINTR_SET
	OPTION_MASK_ISA2_UINTR_UNSET): New.
	(ix86_handle_option): Handle -muintr.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_UINTR.
	* common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
	for uintr.
	* config.gcc: Add uintrintrin.h to extra_headers.
	* config/i386/uintrintrin.h: New.
	* config/i386/cpuid.h (bit_UINTR): New.
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect UINTR.
	* 

Re: [RFA,PATCH] Bail in bounds_of_var_in_loop if no step found.

2020-10-14 Thread Aldy Hernandez via Gcc-patches




On 10/14/20 4:31 PM, Richard Biener wrote:

On Wed, Oct 14, 2020 at 4:19 PM Aldy Hernandez  wrote:




On 10/14/20 9:43 AM, Richard Biener wrote:

On Tue, Oct 13, 2020 at 6:12 PM Aldy Hernandez  wrote:




On 10/13/20 6:02 PM, Richard Biener wrote:

On October 13, 2020 5:17:48 PM GMT+02:00, Aldy Hernandez via Gcc-patches 
 wrote:

[Neither Andrew nor I are familiar with the SCEV code.  We treat it as
a
black box :).  So we could use a SCEV expert here.]

In bounds_of_var_in_loop, evolution_part_in_loop_num is returning NULL:

 step = evolution_part_in_loop_num (chrec, loop->num);


(*)



It means that Var doesn't vary in the loop.
That is, chrec isn't a polynomial chrec.


That's what I thought, but it is:

(gdb) p chrec
$6 = 
(gdb) dd chrec
{0, +, 1}_2

evolution_part_in_loop_num() is returning NULL deep in
chrec_component_in_loop_num():

default:
=>if (right)
   return NULL_TREE;
 else
   return chrec;

Do you have any suggestions?


I can only guess (w/o a testcase) that loop->num at (*) is not 2 and thus that
chrec does not evolve in the loop we're asking.  But this doesn't make much
sense with the constraints we are calling this function (a loop header PHI
with loop == the loop and stmt a loop header PHI and var the PHIs lhs).

OK, so looking at the testcase you're doing

492   class loop *l = loop_containing_stmt (phi);
493   if (l)
494 {
495   range_of_ssa_name_with_loop_info (loop_range,
phi_def, l, phi);

but 'l' isn't a loop, it's the loop tree root.  Change to

if (l && loop_outer (l))


Woah.  I did not expect that.

A quick peek shows that all users of bounds_of_var_in_loop (through
adjust_range_with_scev) predicate the call with:

 && l->header == gimple_bb (phi))

If this check is similar to the loop_outer(l) you suggest, could we
perhaps push this check (and/or the loop_outer one) into
bounds_of_var_in_loop itself and remove it from all the callers?  It
seems cleaner to have this check in one place, than in three different
places.  That is, unless the l->header check is altogether a different
thing than loop_outer(l).


The code also works for stmts other than PHIs (or stmts in other blocks
than the loop header), so IMHO is not appropriate for bounds_of_var_in_loop.


Ahhh.   Makes perfect sense.

The patch below passes tests.  I've pushed it.

Thanks.
Aldy

gcc/ChangeLog:

PR tree-optimization/97396
* gimple-range.cc (gimple_ranger::range_of_phi): Do not call
range_of_ssa_name_with_loop_info with the loop tree root.

gcc/testsuite/ChangeLog:

* gcc.dg/pr97396.c: New test.
---
 gcc/gimple-range.cc            |  2 +-
 gcc/testsuite/gcc.dg/pr97396.c | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr97396.c

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 2ca86ed0e4c..999d631c5ee 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -490,7 +490,7 @@ gimple_ranger::range_of_phi (irange &r, gphi *phi)
 {
   value_range loop_range;
   class loop *l = loop_containing_stmt (phi);
-  if (l)
+  if (l && loop_outer (l))
 {
  range_of_ssa_name_with_loop_info (loop_range, phi_def, l, phi);
  if (!loop_range.varying_p ())
diff --git a/gcc/testsuite/gcc.dg/pr97396.c b/gcc/testsuite/gcc.dg/pr97396.c
new file mode 100644
index 00000000000..d992c11f238
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97396.c
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-O1 -ftree-vrp" }
+// { dg-additional-options "-m32" { target { i?86-*-* x86_64-*-* } } }
+
+unsigned int
+po (char *os, unsigned int al)
+{
+  for (;;)
+{
+  int qx = 0;
+
+  while (al < 1)
+{
+  char *cw;
+
+  cw = os + qx;
+  if (cw)
+return al + qx;
+
+  qx += sizeof *cw;
+}
+}
+}
--
2.26.2
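
For readers less familiar with the loop tree, the guard in the patch above can be modeled with a toy structure.  This is only a sketch: toy_loop, toy_loop_outer, toy_is_real_loop and toy_count_real_loops are hypothetical names, not GCC's, and the only property borrowed from the thread is that the loop tree root has no outer loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for GCC's 'class loop'; only the parent link is modeled.  */
struct toy_loop
{
  int num;                 /* 0 is used for the tree root in this sketch.  */
  struct toy_loop *outer;  /* NULL for the root, the parent loop otherwise.  */
};

/* Hypothetical analogue of loop_outer (): the enclosing loop, or NULL.  */
struct toy_loop *
toy_loop_outer (const struct toy_loop *l)
{
  assert (l != NULL);
  return l->outer;
}

/* Mirrors the patched condition 'l && loop_outer (l)'.  */
int
toy_is_real_loop (const struct toy_loop *l)
{
  return l != NULL && toy_loop_outer (l) != NULL;
}

/* Builds root -> loop 1 -> loop 2 and returns how many of the three
   nodes pass the guard.  */
int
toy_count_real_loops (void)
{
  struct toy_loop root = { 0, NULL };
  struct toy_loop l1 = { 1, &root };
  struct toy_loop l2 = { 2, &l1 };
  return toy_is_real_loop (&root) + toy_is_real_loop (&l1)
	 + toy_is_real_loop (&l2);
}
```

In this model the root is rejected by the guard while both nested loops are accepted, which is the distinction the one-line fix relies on.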



Re: [PATCH] x86: Add missing intrinsics [PR95483]

2020-10-14 Thread Uros Bizjak via Gcc-patches
> gcc/ChangeLog:
>
> * config/i386/avx2intrin.h (_mm_broadcastsi128_si256): New intrinsics.
> (_mm_broadcastsd_pd): Ditto.
> * config/i386/avx512bwintrin.h (_mm512_loadu_epi16): New intrinsics.
> (_mm512_storeu_epi16): Ditto.
> (_mm512_loadu_epi8): Ditto.
> (_mm512_storeu_epi8): Ditto.
> * config/i386/avx512dqintrin.h (_mm_reduce_round_sd): New intrinsics.
> (_mm_mask_reduce_round_sd): Ditto.
> (_mm_maskz_reduce_round_sd): Ditto.
> (_mm_reduce_round_ss): Ditto.
> (_mm_mask_reduce_round_ss): Ditto.
> (_mm_maskz_reduce_round_ss): Ditto.
> (_mm512_reduce_round_pd): Ditto.
> (_mm512_mask_reduce_round_pd): Ditto.
> (_mm512_maskz_reduce_round_pd): Ditto.
> (_mm512_reduce_round_ps): Ditto.
> (_mm512_mask_reduce_round_ps): Ditto.
> (_mm512_maskz_reduce_round_ps): Ditto.
> * config/i386/avx512erintrin.h
> (_mm_mask_rcp28_round_sd): New intrinsics.
> (_mm_maskz_rcp28_round_sd): Ditto.
> (_mm_mask_rcp28_round_ss): Ditto.
> (_mm_maskz_rcp28_round_ss): Ditto.
> (_mm_mask_rsqrt28_round_sd): Ditto.
> (_mm_maskz_rsqrt28_round_sd): Ditto.
> (_mm_mask_rsqrt28_round_ss): Ditto.
> (_mm_maskz_rsqrt28_round_ss): Ditto.
> (_mm_mask_rcp28_sd): Ditto.
> (_mm_maskz_rcp28_sd): Ditto.
> (_mm_mask_rcp28_ss): Ditto.
> (_mm_maskz_rcp28_ss): Ditto.
> (_mm_mask_rsqrt28_sd): Ditto.
> (_mm_maskz_rsqrt28_sd): Ditto.
> (_mm_mask_rsqrt28_ss): Ditto.
> (_mm_maskz_rsqrt28_ss): Ditto.
> * config/i386/avx512fintrin.h (_mm_mask_sqrt_sd): New intrinsics.
> (_mm_maskz_sqrt_sd): Ditto.
> (_mm_mask_sqrt_ss): Ditto.
> (_mm_maskz_sqrt_ss): Ditto.
> (_mm_mask_scalef_sd): Ditto.
> (_mm_maskz_scalef_sd): Ditto.
> (_mm_mask_scalef_ss): Ditto.
> (_mm_maskz_scalef_ss): Ditto.
> (_mm_mask_cvt_roundsd_ss): Ditto.
> (_mm_maskz_cvt_roundsd_ss): Ditto.
> (_mm_mask_cvt_roundss_sd): Ditto.
> (_mm_maskz_cvt_roundss_sd): Ditto.
> (_mm_mask_cvtss_sd): Ditto.
> (_mm_maskz_cvtss_sd): Ditto.
> (_mm_mask_cvtsd_ss): Ditto.
> (_mm_maskz_cvtsd_ss): Ditto.
> (_mm512_cvtsi512_si32): Ditto.
> (_mm_cvtsd_i32): Ditto.
> (_mm_cvtss_i32): Ditto.
> (_mm_cvti32_sd): Ditto.
> (_mm_cvti32_ss): Ditto.
> (_mm_cvtsd_i64): Ditto.
> (_mm_cvtss_i64): Ditto.
> (_mm_cvti64_sd): Ditto.
> (_mm_cvti64_ss): Ditto.
> * config/i386/avx512vlbwintrin.h (_mm256_storeu_epi8): New intrinsics.
> (_mm_storeu_epi8): Ditto.
> (_mm256_loadu_epi16): Ditto.
> (_mm_loadu_epi16): Ditto.
> (_mm256_loadu_epi8): Ditto.
> (_mm_loadu_epi8): Ditto.
> (_mm256_storeu_epi16): Ditto.
> (_mm_storeu_epi16): Ditto.
> * config/i386/avx512vlintrin.h (_mm256_load_epi64): New intrinsics.
> (_mm_load_epi64): Ditto.
> (_mm256_load_epi32): Ditto.
> (_mm_load_epi32): Ditto.
> (_mm256_store_epi32): Ditto.
> (_mm_store_epi32): Ditto.
> (_mm256_loadu_epi64): Ditto.
> (_mm_loadu_epi64): Ditto.
> (_mm256_loadu_epi32): Ditto.
> (_mm_loadu_epi32): Ditto.
> (_mm256_mask_cvt_roundps_ph): Ditto.
> (_mm256_maskz_cvt_roundps_ph): Ditto.
> (_mm_mask_cvt_roundps_ph): Ditto.
> (_mm_maskz_cvt_roundps_ph): Ditto.
> * config/i386/avxintrin.h (_mm256_cvtsi256_si32): New intrinsics.
> * config/i386/emmintrin.h (_mm_loadu_si32): New intrinsics.
> (_mm_loadu_si16): Ditto.
> (_mm_storeu_si32): Ditto.
> (_mm_storeu_si16): Ditto.
> * config/i386/i386-builtin-types.def
> (V8DF_FTYPE_V8DF_INT_V8DF_UQI_INT): Add new type.
> (V16SF_FTYPE_V16SF_INT_V16SF_UHI_INT): Ditto.
> (V4SF_FTYPE_V4SF_V2DF_V4SF_UQI_INT): Ditto.
> (V2DF_FTYPE_V2DF_V4SF_V2DF_UQI_INT): Ditto.
> * config/i386/i386-builtin.def
> (__builtin_ia32_cvtsd2ss_mask_round): New builtin.
> (__builtin_ia32_cvtss2sd_mask_round): Ditto.
> (__builtin_ia32_rcp28sd_mask_round): Ditto.
> (__builtin_ia32_rcp28ss_mask_round): Ditto.
> (__builtin_ia32_rsqrt28sd_mask_round): Ditto.
> (__builtin_ia32_rsqrt28ss_mask_round): Ditto.
> (__builtin_ia32_reducepd512_mask_round): Ditto.
> (__builtin_ia32_reduceps512_mask_round): Ditto.
> (__builtin_ia32_reducesd_mask_round): Ditto.
> (__builtin_ia32_reducess_mask_round): Ditto.
> * config/i386/i386-expand.c
> (ix86_expand_round_builtin): Expand round builtin for new type.
> (V8DF_FTYPE_V8DF_INT_V8DF_UQI_INT)
> (V16SF_FTYPE_V16SF_INT_V16SF_UHI_INT)
> (V4SF_FTYPE_V4SF_V2DF_V4SF_UQI_INT)
> (V2DF_FTYPE_V2DF_V4SF_V2DF_UQI_INT)
> * config/i386/mmintrin.h ()
> Define datatype __m32 and __m16.
> Define datatype __m32_u and __m16_u.
> * config/i386/sse.md: Adjust pattern.
> (reducep): Adjust.
> (reduces): Ditto.
> (sse2_cvtsd2ss): Ditto.
> (sse2_cvtss2sd): Ditto.
> (avx512er_vmrcp28): Ditto.
> (avx512er_vmrsqrt28): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx-1.c: Add test.
> * gcc.target/i386/avx2-vbroadcastsi128-1.c: Ditto.
> * gcc.target/i386/avx2-vbroadcastsi128-2.c: Ditto.
> * gcc.target/i386/avx512bw-vmovdqu16-1.c: Ditto.
> * gcc.target/i386/avx512bw-vmovdqu8-1.c: Ditto.
> * gcc.target/i386/avx512dq-vreducesd-1.c: Ditto.
> * gcc.target/i386/avx512dq-vreducesd-2.c: Ditto.
> * gcc.target/i386/avx512dq-vreducess-1.c: Ditto.
> * gcc.target/i386/avx512dq-vreducess-2.c: Ditto.
> * gcc.target/i386/avx512er-vrcp28sd-1.c: Ditto.
> * gcc.target/i386/avx512er-vrcp28sd-2.c: 

Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 3:49 PM Jakub Jelinek  wrote:
>
> On Wed, Oct 14, 2020 at 03:17:03PM +0200, Uros Bizjak wrote:
> > > +(define_insn_and_split "*setcc_qi_addqi3_cconly_overflow_1_"
> > > +  [(set (reg:CCC FLAGS_REG)
> > > +   (compare:CCC (neg:QI (geu:QI (reg:CC_CCC FLAGS_REG) (const_int 0)))
> > > +(ltu:QI (reg:CC_CCC FLAGS_REG) (const_int 0))))]
> > > +  "ix86_pre_reload_split ()"
> > > +  "#"
> > > +  "&& 1"
> > > +  [(const_int 0)])
> > >
> >
> > Hmm... does the above really represent a NOP?
>
> It is just what combine.c + simplify-rtx.c make out of this.
> We have:
> (insn 10 9 11 2 (set (reg:QI 88 [ _31 ])
> (ltu:QI (reg:CCC 17 flags)
> (const_int 0 [0]))) "include/adxintrin.h":69:10 785 {*setcc_qi}
>  (expr_list:REG_DEAD (reg:CCC 17 flags)
> (nil)))
> and
> (insn 17 15 18 2 (parallel [
> (set (reg:CCC 17 flags)
> (compare:CCC (plus:QI (reg:QI 88 [ _31 ])
> > (const_int -1 [0xffffffffffffffff]))
> (reg:QI 88 [ _31 ])))
> (clobber (scratch:QI))
> ]) "include/adxintrin.h":69:10 350 {*addqi3_cconly_overflow_1}
>  (expr_list:REG_DEAD (reg:QI 88 [ _31 ])
> (nil)))
> So when substituting (reg:QI 88) for (ltu:QI flags 0), we initially get:
> (compare:CCC (plus:QI (ltu:QI (reg:CCC 17 flags) (const_int [0]))
>   (const_int -1 [0xffffffffffffffff]))
>  (ltu:QI (reg:CCC 17 flags) (const_int [0])))
> This triggers the simplify_binary_operation_1 rule:
>   /* (plus (comparison A B) C) can become (neg (rev-comp A B)) if
>  C is 1 and STORE_FLAG_VALUE is -1 or if C is -1 and STORE_FLAG_VALUE
>  is 1.  */
>   if (COMPARISON_P (op0)
>   && ((STORE_FLAG_VALUE == -1 && trueop1 == const1_rtx)
>   || (STORE_FLAG_VALUE == 1 && trueop1 == constm1_rtx))
>   && (reversed = reversed_comparison (op0, mode)))
> return
>   simplify_gen_unary (NEG, mode, reversed, mode);
> As STORE_FLAG_VALUE is 1 on i386, it triggers for that -1.
>
> Now, in CCCmode we have just 2 possible values, 0 and 1, CF clear and CF set.
> So, either (ltu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 0, in that case
> (geu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 1 and so
> (compare:CCC (neg:QI (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))
>  (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
> is (compare:CCC (neg:QI (const_int 1 [0x1])) (const_int 0 [0]))
> which is (compare:CCC (const_int -1 [0xffffffffffffffff]) (const_int 0 [0]))
> Or (ltu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 1, in that case
> (geu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 0 and so
> (compare:CCC (neg:QI (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))
>  (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
> is (compare:CCC (neg:QI (const_int 0 [0x0])) (const_int 1 [0x1]))
> which is (compare:CCC (const_int 0 [0]) (const_int 1 [0x1]))
> As CCCmode flags can be only used in LTU or GEU comparisons, we are asking if
> 0xffU < 0 (false, CF clear) or 0 < 1 (true, CF set).
>
> So I think the pattern is meaningful and is really a nop.

Phew ... it took me some time to wrap my mind around the above logic.

The explanation clears up my concerns, and also answers the question of
whether this simplification should be implemented in some generic,
target-independent way. No, because of all the target-dependent details
mentioned in the explanation.

OK for mainline.

Thanks,
Uros.


Re: [RFA,PATCH] Bail in bounds_of_var_in_loop if no step found.

2020-10-14 Thread Richard Biener via Gcc-patches
On Wed, Oct 14, 2020 at 4:19 PM Aldy Hernandez  wrote:
>
>
>
> On 10/14/20 9:43 AM, Richard Biener wrote:
> > On Tue, Oct 13, 2020 at 6:12 PM Aldy Hernandez  wrote:
> >>
> >>
> >>
> >> On 10/13/20 6:02 PM, Richard Biener wrote:
> >>> On October 13, 2020 5:17:48 PM GMT+02:00, Aldy Hernandez via Gcc-patches 
> >>>  wrote:
>  [Neither Andrew nor I are familiar with the SCEV code.  We treat it as
>  a
>  black box :).  So we could use a SCEV expert here.]
> 
>  In bounds_of_var_in_loop, evolution_part_in_loop_num is returning NULL:
> 
>  step = evolution_part_in_loop_num (chrec, loop->num);
> >
> > (*)
> >
> >>>
> >>> It means that Var doesn't vary in the loop.
> >>> That is, chrec isn't a polynomial chrec.
> >>
> >> That's what I thought, but it is:
> >>
> >> (gdb) p chrec
> >> $6 = 
> >> (gdb) dd chrec
> >> {0, +, 1}_2
> >>
> >> evolution_part_in_loop_num() is returning NULL deep in
> >> chrec_component_in_loop_num():
> >>
> >>default:
> >> =>if (right)
> >>   return NULL_TREE;
> >> else
> >>   return chrec;
> >>
> >> Do you have any suggestions?
> >
> > I can only guess (w/o a testcase) that loop->num at (*) is not 2 and thus 
> > that
> > chrec does not evolve in the loop we're asking.  But this doesn't make much
> > sense with the constraints we are calling this function (a loop header PHI
> > with loop == the loop and stmt a loop header PHI and var the PHIs lhs).
> >
> > OK, so looking at the testcase you're doing
> >
> > 492   class loop *l = loop_containing_stmt (phi);
> > 493   if (l)
> > 494 {
> > 495   range_of_ssa_name_with_loop_info (loop_range,
> > phi_def, l, phi);
> >
> > but 'l' isn't a loop, it's the loop tree root.  Change to
> >
> >if (l && loop_outer (l))
>
> Woah.  I did not expect that.
>
> A quick peek shows that all users of bounds_of_var_in_loop (through
> adjust_range_with_scev) predicate the call with:
>
> && l->header == gimple_bb (phi))
>
> If this check is similar to the loop_outer(l) you suggest, could we
> perhaps push this check (and/or the loop_outer one) into
> bounds_of_var_in_loop itself and remove it from all the callers?  It
> seems cleaner to have this check in one place, than in three different
> places.  That is, unless the l->header check is altogether a different
> thing than loop_outer(l).

The code also works for stmts other than PHIs (or stmts in other blocks
than the loop header), so IMHO is not appropriate for bounds_of_var_in_loop.

Richard.

>
> Thanks so much for looking at this.
> Aldy
>


[PATCH] More vect_get_and_check_slp_defs refactoring

2020-10-14 Thread Richard Biener
This is another tiny piece in some bigger refactoring of
vect_get_and_check_slp_defs.  Split out a test that has nothing
to do with def types or commutation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-10-14  Richard Biener  

* tree-vect-slp.c (vect_get_and_check_slp_defs): Split out
test for compatible operand types.
---
 gcc/tree-vect-slp.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ba681fe6d5e..5e0a3608948 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -486,6 +486,14 @@ again:
}
   else
{
+ if (!types_compatible_p (oprnd_info->first_op_type, type))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"Build SLP failed: different operand types\n");
+ return 1;
+   }
+
  /* Not first stmt of the group, check that the def-stmt/s match
 the def-stmt/s of the first stmt.  Allow different definition
 types for reduction chains: the first stmt must be a
@@ -503,7 +511,6 @@ again:
 || oprnd_info->first_dt == vect_constant_def)
&& (dt == vect_external_def
|| dt == vect_constant_def)))
- || !types_compatible_p (oprnd_info->first_op_type, type)
  || (!STMT_VINFO_DATA_REF (stmt_info)
  && REDUC_GROUP_FIRST_ELEMENT (stmt_info)
  && ((!def_stmt_info
-- 
2.26.2


Re: [PATCH] IPA: fix profile handling in IRA

2020-10-14 Thread Jan Hubicka
> Hello.
> 
> There's a new version of the patch that fixes profile scaling
> in IRA.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
OK, thanks!
Honza
> Thanks,
> Martin

> From 052fe1cb256226108648b2fe93977beec7ca4209 Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Tue, 13 Oct 2020 16:44:47 +0200
> Subject: [PATCH] IPA: fix profile handling in IRA
> 
> gcc/ChangeLog:
> 
>   PR ipa/97295
>   * profile-count.c (profile_count::to_frequency): Move part of
>   gcc_assert to STATIC_ASSERT.
>   * regs.h (REG_FREQ_FROM_BB): Do not use count.to_frequency for
>   a function that does not have count_max initialized.
> ---
>  gcc/profile-count.c | 4 ++--
>  gcc/regs.h  | 3 ++-
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/profile-count.c b/gcc/profile-count.c
> index c89914ff8a0..aaefc11ff3d 100644
> --- a/gcc/profile-count.c
> +++ b/gcc/profile-count.c
> @@ -270,8 +270,8 @@ profile_count::to_frequency (struct function *fun) const
>  return BB_FREQ_MAX;
>if (*this == zero ())
>  return 0;
> -  gcc_assert (REG_BR_PROB_BASE == BB_FREQ_MAX
> -   && fun->cfg->count_max.initialized_p ());
> +  STATIC_ASSERT (REG_BR_PROB_BASE == BB_FREQ_MAX);
> +  gcc_assert (fun->cfg->count_max.initialized_p ());
>profile_probability prob = probability_in (fun->cfg->count_max);
>if (!prob.initialized_p ())
>  return REG_BR_PROB_BASE;
> diff --git a/gcc/regs.h b/gcc/regs.h
> index 1decd2c2d2a..11416c47f6f 100644
> --- a/gcc/regs.h
> +++ b/gcc/regs.h
> @@ -128,7 +128,8 @@ extern size_t reg_info_p_size;
> or profile driven feedback is available and the function is never 
> executed,
> frequency is always equivalent.  Otherwise rescale the basic block
> frequency.  */
> -#define REG_FREQ_FROM_BB(bb) (optimize_function_for_size_p (cfun)  \
> +#define REG_FREQ_FROM_BB(bb) ((optimize_function_for_size_p (cfun) \
> +|| !cfun->cfg->count_max.initialized_p ()) \
> ? REG_FREQ_MAX  \
> : ((bb)->count.to_frequency (cfun)  \
>   * REG_FREQ_MAX / BB_FREQ_MAX) \
> -- 
> 2.28.0
> 



[PATCH] IPA: fix profile handling in IRA

2020-10-14 Thread Martin Liška

Hello.

There's a new version of the patch that fixes profile scaling
in IRA.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From 052fe1cb256226108648b2fe93977beec7ca4209 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 13 Oct 2020 16:44:47 +0200
Subject: [PATCH] IPA: fix profile handling in IRA

gcc/ChangeLog:

	PR ipa/97295
	* profile-count.c (profile_count::to_frequency): Move part of
	gcc_assert to STATIC_ASSERT.
	* regs.h (REG_FREQ_FROM_BB): Do not use count.to_frequency for
	a function that does not have count_max initialized.
---
 gcc/profile-count.c | 4 ++--
 gcc/regs.h  | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/profile-count.c b/gcc/profile-count.c
index c89914ff8a0..aaefc11ff3d 100644
--- a/gcc/profile-count.c
+++ b/gcc/profile-count.c
@@ -270,8 +270,8 @@ profile_count::to_frequency (struct function *fun) const
 return BB_FREQ_MAX;
   if (*this == zero ())
 return 0;
-  gcc_assert (REG_BR_PROB_BASE == BB_FREQ_MAX
-	  && fun->cfg->count_max.initialized_p ());
+  STATIC_ASSERT (REG_BR_PROB_BASE == BB_FREQ_MAX);
+  gcc_assert (fun->cfg->count_max.initialized_p ());
   profile_probability prob = probability_in (fun->cfg->count_max);
   if (!prob.initialized_p ())
 return REG_BR_PROB_BASE;
diff --git a/gcc/regs.h b/gcc/regs.h
index 1decd2c2d2a..11416c47f6f 100644
--- a/gcc/regs.h
+++ b/gcc/regs.h
@@ -128,7 +128,8 @@ extern size_t reg_info_p_size;
or profile driven feedback is available and the function is never executed,
frequency is always equivalent.  Otherwise rescale the basic block
frequency.  */
-#define REG_FREQ_FROM_BB(bb) (optimize_function_for_size_p (cfun)	  \
+#define REG_FREQ_FROM_BB(bb) ((optimize_function_for_size_p (cfun)	  \
+			   || !cfun->cfg->count_max.initialized_p ()) \
 			  ? REG_FREQ_MAX  \
 			  : ((bb)->count.to_frequency (cfun)	  \
 * REG_FREQ_MAX / BB_FREQ_MAX)		  \
-- 
2.28.0
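
The two changes in the patch above can be illustrated outside of GCC.  The following is a sketch under stated assumptions: the TOY_* constants, struct toy_fn and toy_reg_freq_from_bb are hypothetical stand-ins, the constant values are chosen for illustration only, and the real macro has a tail that is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed scale constants for this sketch; GCC defines its own.  */
#define TOY_REG_FREQ_MAX 1000
#define TOY_BB_FREQ_MAX 10000

/* A condition on constants can be checked at compile time, which is the
   idea behind moving part of a gcc_assert into a STATIC_ASSERT.  */
_Static_assert (TOY_REG_FREQ_MAX <= TOY_BB_FREQ_MAX,
		"register frequency scale must not exceed the BB scale");

/* Only the two properties the patched macro tests are modeled.  */
struct toy_fn
{
  bool optimize_for_size;
  bool count_max_initialized;
};

/* Models the patched condition: fall back to the maximum frequency when
   optimizing for size or when no profile maximum is available;
   otherwise rescale the basic block frequency.  */
int
toy_reg_freq_from_bb (const struct toy_fn *fn, int bb_freq)
{
  assert (bb_freq >= 0 && bb_freq <= TOY_BB_FREQ_MAX);
  if (fn->optimize_for_size || !fn->count_max_initialized)
    return TOY_REG_FREQ_MAX;
  return bb_freq * TOY_REG_FREQ_MAX / TOY_BB_FREQ_MAX;
}
```

The point of the extra test is that the rescaling path is never entered for a function whose maximum count was not set up, so the run-time assertion in the conversion routine cannot fire for it.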



Re: [RFA,PATCH] Bail in bounds_of_var_in_loop if no step found.

2020-10-14 Thread Aldy Hernandez via Gcc-patches




On 10/14/20 9:43 AM, Richard Biener wrote:

On Tue, Oct 13, 2020 at 6:12 PM Aldy Hernandez  wrote:




On 10/13/20 6:02 PM, Richard Biener wrote:

On October 13, 2020 5:17:48 PM GMT+02:00, Aldy Hernandez via Gcc-patches 
 wrote:

[Neither Andrew nor I are familiar with the SCEV code.  We treat it as
a
black box :).  So we could use a SCEV expert here.]

In bounds_of_var_in_loop, evolution_part_in_loop_num is returning NULL:

step = evolution_part_in_loop_num (chrec, loop->num);


(*)



It means that Var doesn't vary in the loop.
That is, chrec isn't a polynomial chrec.


That's what I thought, but it is:

(gdb) p chrec
$6 = 
(gdb) dd chrec
{0, +, 1}_2

evolution_part_in_loop_num() is returning NULL deep in
chrec_component_in_loop_num():

   default:
=>if (right)
  return NULL_TREE;
else
  return chrec;

Do you have any suggestions?


I can only guess (w/o a testcase) that loop->num at (*) is not 2 and thus that
chrec does not evolve in the loop we're asking.  But this doesn't make much
sense with the constraints we are calling this function (a loop header PHI
with loop == the loop and stmt a loop header PHI and var the PHIs lhs).

OK, so looking at the testcase you're doing

492   class loop *l = loop_containing_stmt (phi);
493   if (l)
494 {
495   range_of_ssa_name_with_loop_info (loop_range,
phi_def, l, phi);

but 'l' isn't a loop, it's the loop tree root.  Change to

   if (l && loop_outer (l))


Woah.  I did not expect that.

A quick peek shows that all users of bounds_of_var_in_loop (through 
adjust_range_with_scev) predicate the call with:


&& l->header == gimple_bb (phi))

If this check is similar to the loop_outer(l) you suggest, could we 
perhaps push this check (and/or the loop_outer one) into 
bounds_of_var_in_loop itself and remove it from all the callers?  It 
seems cleaner to have this check in one place, than in three different 
places.  That is, unless the l->header check is altogether a different 
thing than loop_outer(l).


Thanks so much for looking at this.
Aldy



Re: Fix possible overflow in ipa-fnsummary

2020-10-14 Thread Martin Jambor
Hi,

On Wed, Oct 14 2020, Jan Hubicka wrote:
> Hi,
> while looking into jump functions I noticed that offset_map in
> ipa-fnsummary is an array of integers, while everywhere else the offsets are
> HOST_WIDE_INTs (for good reason, since the offsets are pointer
> adjustments that are moreover multiplied by UNIT_SIZE).
>
> Bootstrapped/regtested x86_64-linux, will commit it shortly.
>
> gcc/ChangeLog:
>
> 2020-10-14  Jan Hubicka  
>
>   * ipa-fnsummary.c (remap_edge_summaries): Make offset_map HOST_WIDE_INT.
>   (remap_freqcounting_predicate): Likewise.
>   (ipa_merge_fn_summary_after_inlining): Likewise.
>   * ipa-predicate.c (predicate::remap_after_inlining): Likewise
>   * ipa-predicate.h (remap_after_inlining): Update.
>
>
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index 771f432ebec..9e3eda4d3cb 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -3896,7 +3896,7 @@ remap_edge_summaries (struct cgraph_edge *inlined_edge,
> class ipa_node_params *params_summary,
> class ipa_fn_summary *callee_info,
> vec<int> operand_map,
> -   vec<int> offset_map,
> +   vec<HOST_WIDE_INT> offset_map,
> clause_t possible_truths,
> predicate *toplev_predicate)
>  {
> @@ -3957,7 +3957,7 @@ remap_freqcounting_predicate (class ipa_fn_summary 
> *info,
> class ipa_fn_summary *callee_info,
> vec *v,
> vec<int> operand_map,
> -   vec<int> offset_map,
> +   vec<HOST_WIDE_INT> offset_map,
> clause_t possible_truths,
> predicate *toplev_predicate)
>  
> @@ -3987,7 +3987,7 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge 
> *edge)
>clause_t clause = 0;   /* not_inline is known to be false.  */
>size_time_entry *e;
>auto_vec<int, 8> operand_map;
> -  auto_vec<int, 8> offset_map;
> +  auto_vec<HOST_WIDE_INT, 8> offset_map;


if you want to do this, I suppose you also want to remove the INT_MAX
check from:

  if (offset >= 0 && offset < INT_MAX)
{
  map = ipa_get_jf_ancestor_formal_id (jfunc);
  if (!ipa_get_jf_ancestor_agg_preserved (jfunc))
offset = -1;
  offset_map[i] = offset;
}

further down in this function.  

Martin



Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Oct 14, 2020 at 03:17:03PM +0200, Uros Bizjak wrote:
> > +(define_insn_and_split "*setcc_qi_addqi3_cconly_overflow_1_"
> > +  [(set (reg:CCC FLAGS_REG)
> > +   (compare:CCC (neg:QI (geu:QI (reg:CC_CCC FLAGS_REG) (const_int 0)))
> > +(ltu:QI (reg:CC_CCC FLAGS_REG) (const_int 0))))]
> > +  "ix86_pre_reload_split ()"
> > +  "#"
> > +  "&& 1"
> > +  [(const_int 0)])
> >
> 
> Hmm... does the above really represent a NOP?

It is just what combine.c + simplify-rtx.c make out of this.
We have:
(insn 10 9 11 2 (set (reg:QI 88 [ _31 ])
(ltu:QI (reg:CCC 17 flags)
(const_int 0 [0]))) "include/adxintrin.h":69:10 785 {*setcc_qi}
 (expr_list:REG_DEAD (reg:CCC 17 flags)
(nil)))
and
(insn 17 15 18 2 (parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:QI (reg:QI 88 [ _31 ])
(const_int -1 [0xffffffffffffffff]))
(reg:QI 88 [ _31 ])))
(clobber (scratch:QI))
]) "include/adxintrin.h":69:10 350 {*addqi3_cconly_overflow_1}
 (expr_list:REG_DEAD (reg:QI 88 [ _31 ])
(nil)))
So when substituting (reg:QI 88) for (ltu:QI flags 0), we initially get:
(compare:CCC (plus:QI (ltu:QI (reg:CCC 17 flags) (const_int [0]))
  (const_int -1 [0xffffffffffffffff]))
 (ltu:QI (reg:CCC 17 flags) (const_int [0])))
This triggers the simplify_binary_operation_1 rule:
  /* (plus (comparison A B) C) can become (neg (rev-comp A B)) if
 C is 1 and STORE_FLAG_VALUE is -1 or if C is -1 and STORE_FLAG_VALUE
 is 1.  */
  if (COMPARISON_P (op0)
  && ((STORE_FLAG_VALUE == -1 && trueop1 == const1_rtx)
  || (STORE_FLAG_VALUE == 1 && trueop1 == constm1_rtx))
  && (reversed = reversed_comparison (op0, mode)))
return
  simplify_gen_unary (NEG, mode, reversed, mode);
As STORE_FLAG_VALUE is 1 on i386, it triggers for that -1.
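
The identity behind that rule can be sanity-checked with plain C integers.  This is a minimal sketch, assuming STORE_FLAG_VALUE is 1 so that a comparison yields 0 or 1; plus_form and neg_rev_form are made-up helper names, not anything from simplify-rtx.c.

```c
#include <assert.h>

/* (plus (comparison A B) -1), with the comparison result passed in
   as 0 or 1.  */
int
plus_form (int cmp)
{
  assert (cmp == 0 || cmp == 1);
  return cmp + (-1);
}

/* (neg (rev-comp A B)): negate the result of the reversed comparison.  */
int
neg_rev_form (int cmp)
{
  assert (cmp == 0 || cmp == 1);
  return -(!cmp);
}
```

Both helpers return -1 for a false comparison and 0 for a true one, so the two forms agree on the only two inputs that can occur.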

Now, in CCCmode we have just 2 possible values, 0 and 1, CF clear and CF set.
So, either (ltu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 0, in that case
(geu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 1 and so
(compare:CCC (neg:QI (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))
 (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
is (compare:CCC (neg:QI (const_int 1 [0x1])) (const_int 0 [0]))
which is (compare:CCC (const_int -1 [0xffffffffffffffff]) (const_int 0 [0]))
Or (ltu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 1, in that case
(geu:QI (reg:CCC 17 flags) (const_int 0 [0])) is 0 and so
(compare:CCC (neg:QI (geu:QI (reg:CCC FLAGS_REG) (const_int 0)))
 (ltu:QI (reg:CCC FLAGS_REG) (const_int 0)))
is (compare:CCC (neg:QI (const_int 0 [0x0])) (const_int 1 [0x1]))
which is (compare:CCC (const_int 0 [0]) (const_int 1 [0x1]))
As CCCmode flags can be only used in LTU or GEU comparisons, we are asking if
0xffU < 0 (false, CF clear) or 0 < 1 (true, CF set).

So I think the pattern is meaningful and is really a nop.
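
The two cases above can also be walked through mechanically.  The sketch below models the QImode operands as uint8_t and the resulting carry as an unsigned less-than; carry_after_pattern is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Given the incoming carry flag (0 or 1), evaluate
   (compare:CCC (neg:QI (geu ...)) (ltu ...)) in 8-bit unsigned
   arithmetic and return the carry the comparison would produce,
   i.e. whether lhs <u rhs.  */
int
carry_after_pattern (int cf)
{
  assert (cf == 0 || cf == 1);
  uint8_t ltu = (uint8_t) cf;          /* ltu is true exactly when CF is set.  */
  uint8_t geu = (uint8_t) !cf;         /* geu is the reversed comparison.  */
  uint8_t lhs = (uint8_t) (0u - geu);  /* neg:QI, 0xff when geu is 1.  */
  uint8_t rhs = ltu;
  return lhs < rhs;
}
```

With CF clear the comparison is 0xff < 0, and with CF set it is 0 < 1, so the carry that comes out equals the carry that went in.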

Jakub



[PATCH][testsuite] Don't overwrite compiler_flags in check_compile

2020-10-14 Thread Tom de Vries
Hi,

Consider the test-case gcc.c-torture/compile/pr42717.c, which has:
...
/* { dg-xfail-if "ptxas crashes" { nvptx-*-* } { "-O0" } { "" } } */
...

When running make check-gcc, I get:
...
XPASS: gcc.c-torture/compile/pr42717.c   -O0  (test for excess errors)
...
but when forcing to run only that test-case using
RUNTESTFLAGS=compile.exp=pr42717.c I get instead:
...
PASS: gcc.c-torture/compile/pr42717.c   -O0  (test for excess errors)
...

Using RUNTESTFLAGS="-v -v -v" we can see what happens:
...
check_cached_effective_target exceptions_enabled: \
  returning 1 for nvptx-none-run
Limited to targets: *-*-*
Will search for options  "-O0"
Will exclude for options  ""
Compiler flags are: exceptions_enabled9848.cc -fdiagnostics-plain-output \
  --sysroot=/home/vries/nvptx/trunk/install/nvptx-none -S  -isystem \
  /home/vries/nvptx/trunk/build-gcc/nvptx-none/./newlib/targ-include \
  -isystem /home/vries/nvptx/trunk/source-gcc/newlib/libc/include \
  -o exceptions_enabled9848.s
Checking "*-*-*" against "nvptx-unknown-none"
Looking for -O0 to include in the compiler flags
Looking for  to exclude in the compiler flags
This is not a conditional match
PASS: gcc.c-torture/compile/pr42717.c   -O0  (test for excess errors)
...

The effective target exceptions_enabled is tested from gcc-dg-prune, but
the calculation overwrites $compiler_flags, which is subsequently tested for
-O0.

Fix this by saving and restoring $compiler_flags when calling
${tool}_target_compile in check_compile.
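
The save-and-restore idea is independent of Tcl.  As a rough model in C (the names toy_target_compile and toy_check_compile and the fixed-size buffer are assumptions made for this sketch, not part of the testsuite):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical global, standing in for the testsuite's $compiler_flags.  */
char compiler_flags[128] = "";

/* Stand-in for ${tool}_target_compile: as a side effect it overwrites
   the global with the flags of the compilation it just ran.  */
void
toy_target_compile (const char *flags)
{
  assert (strlen (flags) < sizeof compiler_flags);
  strcpy (compiler_flags, flags);
}

/* Stand-in for check_compile with the fix applied: the caller's view of
   compiler_flags is saved before the nested compile and restored after.  */
void
toy_check_compile (const char *probe_flags)
{
  char saved[sizeof compiler_flags];
  strcpy (saved, compiler_flags);
  toy_target_compile (probe_flags);
  strcpy (compiler_flags, saved);
}
```

After a probe compilation through toy_check_compile, a later search for -O0 in compiler_flags still sees the flags of the test being run rather than those of the probe.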

Tested on nvptx.

OK for trunk?

Thanks,
- Tom

[testsuite] Don't overwrite compiler_flags in check_compile

gcc/testsuite/ChangeLog:

2020-10-14  Tom de Vries  

* lib/target-supports.exp (check_compile): Save and restore
$compiler_flags when calling ${tool}_target_compile.

---
 gcc/testsuite/lib/target-supports.exp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index ecf8be3e567..8439720baea 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -86,7 +86,10 @@ proc check_compile {basename type contents args} {
 set f [open $src "w"]
 puts $f $contents
 close $f
+global compiler_flags
+set save_compiler_flags $compiler_flags
 set lines [${tool}_target_compile $src $output $compile_type "$options"]
+set compiler_flags $save_compiler_flags 
 file delete $src
 
 set scan_output $output


Fix possible overflow in ipa-fnsummary

2020-10-14 Thread Jan Hubicka
Hi,
while looking into jump functions I noticed that offset_map in
ipa-fnsummary is an array of integers, while everywhere else the offsets are
HOST_WIDE_INTs (for good reason, since the offsets are pointer
adjustments that are moreover multiplied by UNIT_SIZE).

Bootstrapped/regtested x86_64-linux, will commit it shortly.
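
To make the overflow concern concrete, here is a small sketch.  It assumes a unit size of 8 bits and models HOST_WIDE_INT as int64_t; TOY_BITS_PER_UNIT, toy_bit_offset and toy_overflows_int are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed unit size for this sketch: 8 bits per byte.  */
#define TOY_BITS_PER_UNIT 8

/* Convert a byte offset to a bit offset in 64-bit arithmetic, the way a
   wide map entry could hold it.  */
int64_t
toy_bit_offset (int64_t byte_offset)
{
  assert (byte_offset >= 0 && byte_offset <= INT64_MAX / TOY_BITS_PER_UNIT);
  return byte_offset * TOY_BITS_PER_UNIT;
}

/* Nonzero when the bit offset would not fit in a plain int.  */
int
toy_overflows_int (int64_t byte_offset)
{
  return toy_bit_offset (byte_offset) > INT_MAX;
}
```

With a 32-bit int, a byte offset of 2^28 already produces a bit offset of 2^31, one more than INT_MAX, which is why an int-typed map entry is too narrow once offsets are scaled.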

gcc/ChangeLog:

2020-10-14  Jan Hubicka  

* ipa-fnsummary.c (remap_edge_summaries): Make offset_map HOST_WIDE_INT.
(remap_freqcounting_predicate): Likewise.
(ipa_merge_fn_summary_after_inlining): Likewise.
* ipa-predicate.c (predicate::remap_after_inlining): Likewise
* ipa-predicate.h (remap_after_inlining): Update.


diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 771f432ebec..9e3eda4d3cb 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3896,7 +3896,7 @@ remap_edge_summaries (struct cgraph_edge *inlined_edge,
  class ipa_node_params *params_summary,
  class ipa_fn_summary *callee_info,
  vec<int> operand_map,
- vec<int> offset_map,
+ vec<HOST_WIDE_INT> offset_map,
  clause_t possible_truths,
  predicate *toplev_predicate)
 {
@@ -3957,7 +3957,7 @@ remap_freqcounting_predicate (class ipa_fn_summary *info,
  class ipa_fn_summary *callee_info,
  vec *v,
  vec<int> operand_map,
- vec<int> offset_map,
+ vec<HOST_WIDE_INT> offset_map,
  clause_t possible_truths,
  predicate *toplev_predicate)
 
@@ -3987,7 +3987,7 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge 
*edge)
   clause_t clause = 0; /* not_inline is known to be false.  */
   size_time_entry *e;
   auto_vec<int, 8> operand_map;
-  auto_vec<int, 8> offset_map;
+  auto_vec<HOST_WIDE_INT, 8> offset_map;
   int i;
   predicate toplev_predicate;
   class ipa_call_summary *es = ipa_call_summaries->get (edge);
diff --git a/gcc/ipa-predicate.c b/gcc/ipa-predicate.c
index 27dabf2dc6a..605da912d26 100644
--- a/gcc/ipa-predicate.c
+++ b/gcc/ipa-predicate.c
@@ -508,7 +508,7 @@ predicate::remap_after_inlining (class ipa_fn_summary *info,
 class ipa_node_params *params_summary,
 class ipa_fn_summary *callee_info,
 vec<int> operand_map,
-vec<int> offset_map,
+vec<HOST_WIDE_INT> offset_map,
 clause_t possible_truths,
 const predicate &toplev_predicate)
 {
diff --git a/gcc/ipa-predicate.h b/gcc/ipa-predicate.h
index 05e37073817..34a0d239d2a 100644
--- a/gcc/ipa-predicate.h
+++ b/gcc/ipa-predicate.h
@@ -243,7 +243,8 @@ public:
   predicate remap_after_inlining (class ipa_fn_summary *,
  class ipa_node_params *params_summary,
  class ipa_fn_summary *,
- vec<int>, vec<int>, clause_t, const predicate &);
+ vec<int>, vec<HOST_WIDE_INT>,
+ clause_t, const predicate &);
 
   void stream_in (class lto_input_block *);
   void stream_out (struct output_block *);


Handle POINTER_PLUS_EXPR in jump functions

2020-10-14 Thread Jan Hubicka
Hi,
this patch adds logic to handle POINTER_PLUS_EXPR in compute_parm_map,
which I originally left out since I thought that all such adjustments
are handled by ancestor jump functions.
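
The decision the patch adds can be sketched in plain C as below. The enum, both structs and the function are hypothetical simplifications; only the three-way case split (NOP, POINTER_PLUS with a constant operand, anything else) follows the diff further down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pass-through operation codes.  */
enum sketch_op { SKETCH_NOP, SKETCH_POINTER_PLUS, SKETCH_OTHER };

/* Hypothetical pass-through jump function: an operation plus an operand
   that may or may not be a compile-time constant.  */
struct sketch_jump_func
{
  enum sketch_op op;
  bool operand_is_constant;
  ptrdiff_t operand;
};

/* Hypothetical result, modelled on parm_offset_known/parm_offset.  */
struct sketch_parm_map
{
  bool offset_known;
  ptrdiff_t offset;
};

static struct sketch_parm_map
sketch_compute_parm_map (const struct sketch_jump_func *jf)
{
  struct sketch_parm_map map = { false, 0 };

  assert (jf != NULL);
  if (jf->op == SKETCH_NOP)
    /* The pointer is passed through unchanged: offset 0 is known.  */
    map.offset_known = true;
  else if (jf->op == SKETCH_POINTER_PLUS && jf->operand_is_constant)
    {
      /* Pointer plus a constant: the offset is that constant.  */
      map.offset_known = true;
      map.offset = jf->operand;
    }
  /* Any other operation leaves the offset unknown.  */
  return map;
}
```

Before the patch, only the first case produced a known offset; a call such as `a (ptr + 1, ...)` fell into the unknown case.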

Bootstrapped/regtested x86_64-linux, will commit it shortly.
Honza

gcc/ChangeLog:

2020-10-14  Jan Hubicka  

* ipa-modref.c (compute_parm_map): Handle POINTER_PLUS_EXPR in
PASSTHROUGH.

gcc/testsuite/ChangeLog:

2020-10-14  Jan Hubicka  

* gcc.dg/ipa/modref-1.c: New test.
* gcc.dg/tree-ssa/modref-4.c: New test.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index a6dfe1fc401..8e6a87643ec 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -1682,9 +1682,18 @@ compute_parm_map (cgraph_edge *callee_edge, 
vec *parm_map)
{
  (*parm_map)[i].parm_index
= ipa_get_jf_pass_through_formal_id (jf);
- (*parm_map)[i].parm_offset_known
-   = ipa_get_jf_pass_through_operation (jf) == NOP_EXPR;
- (*parm_map)[i].parm_offset = 0;
+ if (ipa_get_jf_pass_through_operation (jf) == NOP_EXPR)
+   {
+ (*parm_map)[i].parm_offset_known = true;
+ (*parm_map)[i].parm_offset = 0;
+   }
+ else if (ipa_get_jf_pass_through_operation (jf)
+  == POINTER_PLUS_EXPR
+  && ptrdiff_tree_p (ipa_get_jf_pass_through_operand (jf),
+ &(*parm_map)[i].parm_offset))
+   (*parm_map)[i].parm_offset_known = true;
+ else
+   (*parm_map)[i].parm_offset_known = false;
  continue;
}
  if (jf && jf->type == IPA_JF_ANCESTOR)
diff --git a/gcc/testsuite/gcc.dg/ipa/modref-1.c 
b/gcc/testsuite/gcc.dg/ipa/modref-1.c
new file mode 100644
index 000..46eb78ccebf
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/modref-1.c
@@ -0,0 +1,23 @@
+/* { dg-options "-O2 -fdump-ipa-modref"  } */
+/* { dg-do compile } */
+__attribute__((noinline))
+void a(char *ptr, char *ptr2)
+{
+  (*ptr)++;
+  (*ptr2)++;
+}
+
+__attribute__((noinline))
+b(char *ptr)
+{
+  a(ptr+1,&ptr[2]);
+}
+main()
+{
+  char c[3]={0,1,0};
+  b(c);
+  return c[0]+c[2];
+}
+/* Check that both param offsets are determined correctly.  */
+/* { dg-final { scan-ipa-dump "param offset: 1" "modref"  } } */
+/* { dg-final { scan-ipa-dump "param offset: 2" "modref"  } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c 
b/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c
new file mode 100644
index 000..776f46ed687
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-4.c
@@ -0,0 +1,25 @@
+/* { dg-options "-O2 -fdump-tree-modref1"  } */
+/* { dg-do compile } */
+__attribute__((noinline))
+void a(char *ptr, char *ptr2)
+{
+  (*ptr)++;
+  (*ptr2)++;
+}
+
+__attribute__((noinline))
+b(char *ptr)
+{
+  a(ptr+1,&ptr[2]);
+}
+main()
+{
+  char c[3]={0,1,0};
+  b(c);
+  return c[0]+c[2];
+}
+/* Check that both param offsets are determined correctly and the computation
+   is optimized out.  */
+/* { dg-final { scan-tree-dump "param offset: 1" "modref1"  } } */
+/* { dg-final { scan-tree-dump "param offset: 2" "modref2"  } } */
+/* { dg-final { scan-tree-dump "return 0" "modref2"  } } */


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread H.J. Lu via Gcc-patches
On Wed, Oct 14, 2020 at 6:31 AM Hongyu Wang via Gcc-patches
 wrote:
>
> Uros Bizjak  wrote on Wed, Oct 14, 2020 at 7:19 PM:
> >
> > > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > > >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new
> intrinsics
> > > > >> > > header.
> > > > >> > >
> > > > >> >
> > > > >> > Thanks for your review. We found that without adding -muintr,
> the intrinsics header could also be tested. Make-check for these file all
> get passed.
> > > > >> >
> > > > >> > And there is no intrinsic/builtin with const int parameter. So
> we remove -muintr from these files.
> > > > >>
> > > > >> Can you double check that relevant instructions are indeed
> generated?
> > > > >> Without -muintr, relevant patterns in i386.md are effectively
> blocked,
> > > > >> and perhaps a call to __builtin_ia32_* is generated instead.
> > > > >
> > > > >
> > > > > Yes, in sse-14.s we have
> > > > >
> > > > > _clui:
> > > > > .LFB136:
> > > > > .cfi_startproc
> > > > > pushq   %rbp
> > > > > .cfi_def_cfa_offset 16
> > > > > .cfi_offset 6, -16
> > > > > movq%rsp, %rbp
> > > > > .cfi_def_cfa_register 6
> > > > > clui
> > > > > nop
> > > > > popq%rbp
> > > > > .cfi_def_cfa 7, 8
> > > > > ret
> > > > > .cfi_endproc
> > > >
> > > > Strange, without -muintr, it should not be generated, and some error
> > > > about failed inlining due to target specific option mismatch should be
> > > > emitted.
> > > >
> > > > Can you please investigate this a bit more?
> > > >
> > >
> > > Because of function target attribute?
> >
> > I don't think so. Please consider this similar testcase:
> >
> > --cut here--
> > #ifndef __SSE2__
> > #pragma GCC push_options
> > #pragma GCC target("sse2")
> > #define __DISABLE_SSE2__
> > #endif /* __SSE2__ */
> >
> > typedef double __v2df __attribute__ ((__vector_size__ (16)));
> > typedef double __m128d __attribute__ ((__vector_size__ (16),
> __may_alias__));
> >
> > extern __inline __m128d __attribute__((__gnu_inline__,
> > __always_inline__, __artificial__))
> > _mm_add_sd (__m128d __A, __m128d __B)
> > {
> >   return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B);
> > }
> >
> > #ifdef __DISABLE_SSE2__
> > #undef __DISABLE_SSE2__
> > #pragma GCC pop_options
> > #endif /* __DISABLE_SSE2__ */
> >
> >
> > __v2df foo (__v2df a, __v2df b)
> > {
> >   return _mm_add_sd (a, b);
> > }
> > --cut here--
> >
> > $ gcc -O2 -mno-sse2 -S -dp sse2.c
> > sse2.c: In function ‘foo’:
> > sse2.c:11:1: error: inlining failed in call to ‘always_inline’
> > ‘_mm_add_sd’: target specific option mismatch
> >   11 | _mm_add_sd (__m128d __A, __m128d __B)
> >  | ^~
> > sse2.c:24:10: note: called from here
> >   24 |   return _mm_add_sd (a, b);
> >  |  ^
> >
> > I'd expect some similar warning from missing -mumip.
> >
>
> For this case, I can confirm uintr could generate similar warning without
> -muintr. But
> sse-{12,13,14,22,23}.c will not test intrinsic call for uintr, since it
> doesn't have const
> int parameter intrinsics.
>
> sse-{13,14,22,23}.c has
>
> #define extern
> #define __inline
>
> So intrinsic will be treated as common call to builtin, then
>
> #pragma GCC push_options
> #pragma GCC target("uintr")
>
> ensures the builtin could be expanded correctly.
>
> I think the intrinsic call test should be in uintr-1.c, so it is redundant
> to add -muintr in sse-{12,13,14,22,23}.c
> or x86gprintrin-*.c.

Please add UINTR intrinsic tests to x86gprintrin-*.c to cover such
usages.

> >
> > Uros.



-- 
H.J.


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Hongyu Wang via Gcc-patches
Uros Bizjak  wrote on Wed, Oct 14, 2020 at 7:19 PM:
>
> > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new
intrinsics
> > > >> > > header.
> > > >> > >
> > > >> >
> > > >> > Thanks for your review. We found that without adding -muintr,
the intrinsics header could also be tested. Make-check for these file all
get passed.
> > > >> >
> > > >> > And there is no intrinsic/builtin with const int parameter. So
we remove -muintr from these files.
> > > >>
> > > >> Can you double check that relevant instructions are indeed
generated?
> > > >> Without -muintr, relevant patterns in i386.md are effectively
blocked,
> > > >> and perhaps a call to __builtin_ia32_* is generated instead.
> > > >
> > > >
> > > > Yes, in sse-14.s we have
> > > >
> > > > _clui:
> > > > .LFB136:
> > > > .cfi_startproc
> > > > pushq   %rbp
> > > > .cfi_def_cfa_offset 16
> > > > .cfi_offset 6, -16
> > > > movq%rsp, %rbp
> > > > .cfi_def_cfa_register 6
> > > > clui
> > > > nop
> > > > popq%rbp
> > > > .cfi_def_cfa 7, 8
> > > > ret
> > > > .cfi_endproc
> > >
> > > Strange, without -muintr, it should not be generated, and some error
> > > about failed inlining due to target specific option mismatch should be
> > > emitted.
> > >
> > > Can you please investigate this a bit more?
> > >
> >
> > Because of function target attribute?
>
> I don't think so. Please consider this similar testcase:
>
> --cut here--
> #ifndef __SSE2__
> #pragma GCC push_options
> #pragma GCC target("sse2")
> #define __DISABLE_SSE2__
> #endif /* __SSE2__ */
>
> typedef double __v2df __attribute__ ((__vector_size__ (16)));
> typedef double __m128d __attribute__ ((__vector_size__ (16),
__may_alias__));
>
> extern __inline __m128d __attribute__((__gnu_inline__,
> __always_inline__, __artificial__))
> _mm_add_sd (__m128d __A, __m128d __B)
> {
>   return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B);
> }
>
> #ifdef __DISABLE_SSE2__
> #undef __DISABLE_SSE2__
> #pragma GCC pop_options
> #endif /* __DISABLE_SSE2__ */
>
>
> __v2df foo (__v2df a, __v2df b)
> {
>   return _mm_add_sd (a, b);
> }
> --cut here--
>
> $ gcc -O2 -mno-sse2 -S -dp sse2.c
> sse2.c: In function ‘foo’:
> sse2.c:11:1: error: inlining failed in call to ‘always_inline’
> ‘_mm_add_sd’: target specific option mismatch
>   11 | _mm_add_sd (__m128d __A, __m128d __B)
>  | ^~
> sse2.c:24:10: note: called from here
>   24 |   return _mm_add_sd (a, b);
>  |  ^
>
> I'd expect some similar warning from missing -mumip.
>

For this case, I can confirm that uintr generates a similar warning without
-muintr. But sse-{12,13,14,22,23}.c will not test intrinsic calls for uintr,
since uintr has no intrinsics with a const int parameter.

sse-{13,14,22,23}.c has

#define extern
#define __inline

So intrinsic will be treated as common call to builtin, then

#pragma GCC push_options
#pragma GCC target("uintr")

ensures the builtin could be expanded correctly.

I think the intrinsic call test should be in uintr-1.c, so it is redundant
to add -muintr to sse-{12,13,14,22,23}.c or x86gprintrin-*.c.

>
> Uros.


Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 11:01 AM Jakub Jelinek  wrote:
>
> Hi!
>
> These builtins have two known issues and this patch fixes one of them.
>
> One issue is that the builtins effectively return two results and
> they make the destination addressable until expansion, which means
> a stack slot is allocated for them and e.g. with -fstack-protector*
> DSE isn't able to optimize that away.  I think for that we want to use
> the technique of returning complex value; the patch doesn't handle that
> though.  See PR93990 for that.
>
> The other problem is optimization of successive uses of the builtin
> e.g. for arbitrary precision arithmetic additions/subtractions.
> As shown in PR93990, combine is able to optimize the case when the first
> argument to these builtins is 0 (the first instance when several are used
> together), and also the last one if the last one ignores its result (i.e.
> the carry/borrow is dead and thrown away in that case).
> As shown in this PR, combiner refuses to optimize the rest, where it sees:
> (insn 10 9 11 2 (set (reg:QI 88 [ _31 ])
> (ltu:QI (reg:CCC 17 flags)
> (const_int 0 [0]))) "include/adxintrin.h":69:10 785 {*setcc_qi}
>  (expr_list:REG_DEAD (reg:CCC 17 flags)
> (nil)))
> - set pseudo 88 to CF from flags, then some uninteresting insns that
> don't modify flags, and finally:
> (insn 17 15 18 2 (parallel [
> (set (reg:CCC 17 flags)
> (compare:CCC (plus:QI (reg:QI 88 [ _31 ])
> (const_int -1 [0xffffffffffffffff]))
> (reg:QI 88 [ _31 ])))
> (clobber (scratch:QI))
> ]) "include/adxintrin.h":69:10 350 {*addqi3_cconly_overflow_1}
>  (expr_list:REG_DEAD (reg:QI 88 [ _31 ])
> (nil)))
> to set CF in flags back to what we saved earlier.  The combiner just punts
> trying to combine the 10, 17 and following addcarrydi (etc.) instruction,
> because
>   if (i1 && !can_combine_p (i1, i3, i0, NULL, i2, NULL, &i1dest, &i1src))
> {
>   if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, "Can't combine i1 into i3\n");
>   undo_all ();
>   return 0;
> }
> fails - the 3 insns aren't all adjacent and
>   || (! all_adjacent
>   && (((!MEM_P (src)
> || ! find_reg_note (insn, REG_EQUIV, src))
>&& modified_between_p (src, insn, i3))
> src (flags hard register) is modified between the first and third insn - in
> the second insn.
>
> The following patch optimizes this by optimizing just the two insns,
> 10 and 17 above, i.e. save CF into pseudo, set CF from that pseudo, into
> a nop.  The new define_insn_and_split matches how combine simplifies those
> two together (except without the ix86_cc_mode change it was choosing CCmode
> for the destination instead of CCCmode, so had to change that function too,
> and also adjust costs so that combiner understand it is beneficial).
>
> With this, all the testcases are optimized, so that the:
> setc%dl
> ...
> addb$-1, %dl
> insns in between the ad[dc][lq] or s[ub]b[lq] instructions are all optimized
> away (sure, if something would clobber flags in between they wouldn't, but
> there is nothing that can be done about that).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2020-10-14  Jakub Jelinek  
>
> PR target/97387
> * config/i386/i386.md (CC_CCC): New mode iterator.
> (*setcc_qi_addqi3_cconly_overflow_1_): New
> define_insn_and_split.
> * config/i386/i386.c (ix86_cc_mode): Return CCCmode for
> *setcc_qi_addqi3_cconly_overflow_1_ pattern operands.
> (ix86_rtx_costs): Return true and *total = 0; for
> *setcc_qi_addqi3_cconly_overflow_1_ pattern.
>
> * gcc.target/i386/pr97387-1.c: New test.
> * gcc.target/i386/pr97387-2.c: New test.
>
> --- gcc/config/i386/i386.md.jj  2020-10-01 10:40:09.955758167 +0200
> +++ gcc/config/i386/i386.md 2020-10-13 13:38:24.644980815 +0200
> @@ -7039,6 +7039,20 @@ (define_expand "subborrow_0"
>(set (match_operand:SWI48 0 "register_operand")
>(minus:SWI48 (match_dup 1) (match_dup 2)))])]
>"ix86_binary_operator_ok (MINUS, mode, operands)")
> +
> +(define_mode_iterator CC_CCC [CC CCC])
> +
> +;; Pre-reload splitter to optimize
> +;; *setcc_qi followed by *addqi3_cconly_overflow_1 with the same QI
> +;; operand and no intervening flags modifications into nothing.
> +(define_insn_and_split "*setcc_qi_addqi3_cconly_overflow_1_"
> +  [(set (reg:CCC FLAGS_REG)
> +   (compare:CCC (neg:QI (geu:QI (reg:CC_CCC FLAGS_REG) (const_int 0)))
> +(ltu:QI (reg:CC_CCC FLAGS_REG) (const_int 0))))]
> +  "ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)])
>

Hmm... does the above really represent a NOP?

This is a compare of a *negation* of a reversed condition with itself, e.g:

cmp (neg (reversed cond)), (cond)

we are sure that (reversed cond) and (cond) 

[PATCH] adjust BB SLP build from scalars heuristics

2020-10-14 Thread Richard Biener
We can end up with { _1, 1.0 } * { 3.0, _2 } which isn't really
profitable.  The following adjusts things so we reject more than
one possibly expensive (non-constant and not uniform) vector CTOR
and instead build a CTOR for the scalar operation results.
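
The adjusted decision can be sketched in isolation as follows. The enum, struct and function names are hypothetical; the loop body mirrors the new hunk in vect_build_slp_tree_2 further down, while the surrounding conditions from the real code (BB vectorization only, not a grouped access, and so on) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of an SLP child's definition.  */
enum sketch_def_type { SKETCH_INTERNAL, SKETCH_EXTERNAL, SKETCH_CONSTANT };

struct sketch_child
{
  enum sketch_def_type def_type;
  bool uniform;   /* all lanes hold the same scalar */
};

/* Return true if the node should be thrown away and rebuilt from
   scalars: either every child is a uniform non-internal operand, or
   more than one child would need a possibly expensive vector
   construction from non-uniform external scalars.  */
static bool
sketch_build_from_scalars_p (const struct sketch_child *children, size_t n)
{
  bool all_uniform_p = true;
  unsigned n_vector_builds = 0;

  assert (children != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (children[i].def_type == SKETCH_INTERNAL)
        all_uniform_p = false;
      else if (!children[i].uniform)
        {
          all_uniform_p = false;
          if (children[i].def_type == SKETCH_EXTERNAL)
            n_vector_builds++;
        }
    }
  return all_uniform_p || n_vector_builds > 1;
}
```

In this model, the { _1, 1.0 } * { 3.0, _2 } case from the description has two non-uniform external children, so the function returns true and a single CTOR of the scalar results is built instead.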

This also moves a check in vect_get_and_check_slp_defs to a better
place.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-10-14  Richard Biener  

* tree-vect-slp.c (vect_get_and_check_slp_defs): Move
check for duplicate/interleave of variable size constants
to a place done once and early.
(vect_build_slp_tree_2): Adjust heuristics when to build
a BB SLP node from scalars.
---
 gcc/tree-vect-slp.c | 51 +++--
 1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index ff0ecda801b..ba681fe6d5e 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -457,8 +457,23 @@ again:
   if (def_stmt_info && is_pattern_stmt_p (def_stmt_info))
oprnd_info->any_pattern = true;
 
+  tree type = TREE_TYPE (oprnd);
   if (first)
{
+ if ((dt == vect_constant_def
+  || dt == vect_external_def)
+ && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
+ && (TREE_CODE (type) == BOOLEAN_TYPE
+ || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
+ type)))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"Build SLP failed: invalid type of def "
+"for variable-length SLP %T\n", oprnd);
+ return -1;
+   }
+
  /* For the swapping logic below force vect_reduction_def
 for the reduction op in a SLP reduction group.  */
  if (!STMT_VINFO_DATA_REF (stmt_info)
@@ -467,7 +482,7 @@ again:
  && def_stmt_info)
dt = vect_reduction_def;
  oprnd_info->first_dt = dt;
- oprnd_info->first_op_type = TREE_TYPE (oprnd);
+ oprnd_info->first_op_type = type;
}
   else
{
@@ -476,7 +491,6 @@ again:
 types for reduction chains: the first stmt must be a
 vect_reduction_def (a phi node), and the rest
 end in the reduction chain.  */
- tree type = TREE_TYPE (oprnd);
  if ((oprnd_info->first_dt != dt
   && !(oprnd_info->first_dt == vect_reduction_def
&& !STMT_VINFO_DATA_REF (stmt_info)
@@ -514,19 +528,6 @@ again:
 
  return 1;
}
- if ((dt == vect_constant_def
-  || dt == vect_external_def)
- && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
- && (TREE_CODE (type) == BOOLEAN_TYPE
- || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
- type)))
-   {
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"Build SLP failed: invalid type of def "
-"for variable-length SLP %T\n", oprnd);
- return -1;
-   }
}
 
   /* Check the types of the definitions.  */
@@ -1568,7 +1569,8 @@ fail:
   vect_free_oprnd_info (oprnds_info);
 
   /* If we have all children of a child built up from uniform scalars
- then just throw that away, causing it built up from scalars.
+ or does more than one possibly expensive vector construction then
+ just throw that away, causing it built up from scalars.
  The exception is the SLP node for the vector store.  */
   if (is_a <bb_vec_info> (vinfo)
   && !STMT_VINFO_GROUPED_ACCESS (stmt_info)
@@ -1579,11 +1581,20 @@ fail:
 {
   slp_tree child;
   unsigned j;
+  bool all_uniform_p = true;
+  unsigned n_vector_builds = 0;
   FOR_EACH_VEC_ELT (children, j, child)
-   if (SLP_TREE_DEF_TYPE (child) == vect_internal_def
-   || !vect_slp_tree_uniform_p (child))
- break;
-  if (!child)
+   {
+ if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
+   all_uniform_p = false;
+ else if (!vect_slp_tree_uniform_p (child))
+   {
+ all_uniform_p = false;
+ if (SLP_TREE_DEF_TYPE (child) == vect_external_def)
+   n_vector_builds++;
+   }
+   }
+  if (all_uniform_p || n_vector_builds > 1)
{
  /* Roll back.  */
  matches[0] = false;
-- 
2.26.2


[PATCH] Fix up plugin header install

2020-10-14 Thread Jakub Jelinek via Gcc-patches
Hi!

Jeff has noticed and I've confirmed that config/i386/i386.h header which is
installed on x86 in plugin/include/ directory newly in GCC 11 has
#include "common/config/i386/i386-cpuinfo.h"
which breaks all plugins that include tm.h etc. because that header is not
shipped.
The following patch seems to fix that.  Unfortunately it isn't just a matter
of TM_H += t-i386 change, because the header has full path and therefore
needs to be installed in its full path.
Additionally, I've noticed that the b-header-vars generation is completely
broken, it will just throw many of the dependencies away, because it
incorrectly removed everything from first ... remaining till the last /,
while what it clearly wants to do is remove each ... till last / in the same
header path (i.e. instead of .* should have used [^ ]* and g modifier).
I've also noticed that some other headers mentioned in #include of other
headers aren't included (gomp-constants.h as dependency of omp-general.h
and various dependencies of expr.h (where omp-general.h and expr.h were
previously installed)).

Tested on x86_64-linux with make install, ok for trunk?

2020-10-14  Jakub Jelinek  

* Makefile.in (PLUGIN_HEADERS): Add gomp-constants.h and $(EXPR_H).
(s-header-vars): Accept not just spaces but also tabs between *_H name
and =.  Handle common/config/ headers similarly to config.  Don't
throw away everything from first ... to last / on the remaining
string, instead skip just ... to corresponding last / without
intervening spaces and tabs.
(install-plugin): Treat common/config headers like config headers.
* config/i386/t-i386 (TM_H): Add
$(srcdir)/common/config/i386/i386-cpuinfo.h.

--- gcc/Makefile.in.jj  2020-10-06 23:35:52.616444535 +0200
+++ gcc/Makefile.in 2020-10-14 14:15:43.179074337 +0200
@@ -3594,7 +3594,8 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $
   tree-ssa-threadupdate.h inchash.h wide-int.h signop.h hash-map.h \
   hash-set.h dominance.h cfg.h cfgrtl.h cfganal.h cfgbuild.h cfgcleanup.h \
   lcm.h cfgloopmanip.h file-prefix-map.h builtins.def $(INSN_ATTR_H) \
-  pass-instances.def params.list
+  pass-instances.def params.list $(srcdir)/../include/gomp-constants.h \
+  $(EXPR_H)
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile
@@ -3604,7 +3605,7 @@ s-header-vars: Makefile
 # more portable than a trailing "-e d" to filter out the uninteresting lines,
 # in particular on ia64-hpux where "s/.../p" only prints if -n was requested
 # as well.
-   $(foreach header_var,$(shell sed < Makefile -n -e 's/^\([A-Z0-9_]*_H\)[ 
 ]*=.*/\1/p'),echo $(header_var)=$(shell echo 
$($(header_var):$(srcdir)/%=.../%) | sed -e 's~\.\.\./config/~config/~' -e 
's~\.\.\..*/~~') >> tmp-header-vars;) \
+   $(foreach header_var,$(shell sed < Makefile -n -e 's/^\([A-Z0-9_]*_H\)[ 
]*=.*/\1/p'),echo $(header_var)=$(shell echo 
$($(header_var):$(srcdir)/%=.../%) | sed -e 's~\.\.\./config/~config/~' -e 
's~\.\.\./common/config/~common/config/~' -e 's~\.\.\.[^]*/~~g') >> 
tmp-header-vars;)
$(SHELL) $(srcdir)/../move-if-change tmp-header-vars b-header-vars
$(STAMP) s-header-vars
 
@@ -3630,7 +3631,8 @@ install-plugin: installdirs lang.install
  else continue; \
  fi; \
  case $$path in \
- "$(srcdir)"/config/* | "$(srcdir)"/c-family/* | "$(srcdir)"/*.def ) \
+ "$(srcdir)"/config/* | "$(srcdir)"/common/config/* \
+ | "$(srcdir)"/c-family/* | "$(srcdir)"/*.def ) \
base=`echo "$$path" | sed -e "s|$$srcdirstrip/||"`;; \
  *) base=`basename $$path` ;; \
  esac; \
--- gcc/config/i386/t-i386.jj   2020-01-14 20:02:45.861623645 +0100
+++ gcc/config/i386/t-i386  2020-10-14 13:41:38.503905771 +0200
@@ -17,7 +17,8 @@
 # .
 
 OPTIONS_H_EXTRA += $(srcdir)/config/i386/stringop.def
-TM_H += $(srcdir)/config/i386/x86-tune.def
+TM_H += $(srcdir)/config/i386/x86-tune.def \
+   $(srcdir)/common/config/i386/i386-cpuinfo.h
 PASSES_EXTRA += $(srcdir)/config/i386/i386-passes.def
 
 i386-c.o: $(srcdir)/config/i386/i386-c.c

Jakub



Re: [RFC][gimple] Move can_duplicate_bb_p to gimple_can_duplicate_bb_p

2020-10-14 Thread Richard Biener
On Wed, 14 Oct 2020, Tom de Vries wrote:

> On 10/14/20 8:15 AM, Richard Biener wrote:
> >> I've tried to address this by merging can_duplicate_stmt_p and
> >> can_duplicate_last_stmt_p, and adding a default parameter.
> >>
> >> Better like this?
> > Sorry for iterating again but since we now would appropriately
> > handle things in the CFG hook there's no need for tracer.c to
> > do this on its own via the _stmt calls.  So I suggest to
> > remove the unifying of the stmt counting loop and use
> > the can_duplicate_block_p CFG hook directly instead (but of course
> > still cache its outcome).  That way we can simplify what is
> > exported.
> 
> Np :) . Fully retested, OK for trunk?

OK.

Thanks,
Richard.


Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Richard Biener via Gcc-patches
On Wed, Oct 14, 2020 at 11:49 AM Jakub Jelinek  wrote:
>
> On Wed, Oct 14, 2020 at 11:22:48AM +0200, Richard Biener wrote:
> > > +  if (mode == CCCmode
> > > + && GET_CODE (XEXP (x, 0)) == NEG
> > > + && GET_CODE (XEXP (XEXP (x, 0), 0)) == GEU
> > > + && REG_P (XEXP (XEXP (XEXP (x, 0), 0), 0))
> > > + && (GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == CCCmode
> > > + || GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == CCmode)
> > > + && REGNO (XEXP (XEXP (XEXP (x, 0), 0), 0)) == FLAGS_REG
> > > + && XEXP (XEXP (XEXP (x, 0), 0), 1) == const0_rtx
> > > + && GET_CODE (XEXP (x, 1)) == LTU
> > > + && REG_P (XEXP (XEXP (x, 1), 0))
> > > + && (GET_MODE (XEXP (XEXP (x, 1), 0))
> > > + == GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)))
> > > + && REGNO (XEXP (XEXP (x, 1), 0)) == FLAGS_REG
> > > + && XEXP (XEXP (x, 1), 1) == const0_rtx)
> >
> > Meh ;)  templates to the rescue?
> >
> >   rtx_match < un > ().matches (x)
> >
> > and with fancy metaprogramming expand it to above?  Not sure if it's easier
> > to read that way.  Maybe
>
> It would certainly not match the style used elsewhere in the backend.
> >
> >   rtx neg, geu;
> >   if (mode == CCCmode
> >   && (neg = XEXP (x, 0), GET_CODE (neg) == NEG)
> >   && (geu = XEXP (neg, 0), GET_CODE (geu) == GEU)
> > ...
> >
> > or
> >
> >   if (mode == CCCmode
> >   && GET_CODE (neg = XEXP (x, 0)) == NEG
> >
> > thus some manual CSE and naming in this matching would help?
>
> Attached are two incremental patches, one just adds op0 and op1 for the
> COMPARE operand of all the costs COMPARE handling, which replaces all the
> XEXP (x, 0) with op0 and XEXP (x, 1) with op1, the other is that plus
> the geu you've suggested.

OK, so the visual appearance is not very much improved.  I guess
the main issue is that the checks do not "align" with a view of the
expression, but I can't see how to improve that.  When one actually looks
at the tests, the CSEd variants are easier to match up, and I'd pick the
geu one.

Perhaps intermixed comments would help the casual reader?

+  rtx geu;
   /* (neg:CCC */
   if (mode == CCCmode
+ && GET_CODE (op0) == NEG
   /* (geu (reg:CC[C] cc) const0)) */
+ && GET_CODE (geu = XEXP (op0, 0)) == GEU
+ && REG_P (XEXP (geu, 0))
+ && (GET_MODE (XEXP (geu, 0)) == CCCmode
+ || GET_MODE (XEXP (geu, 0)) == CCmode)
+ && REGNO (XEXP (geu, 0)) == FLAGS_REG
+ && XEXP (geu, 1) == const0_rtx
   /* (LTU (reg:CCCmode cc) const0)) */
+ && GET_CODE (op1) == LTU
+ && REG_P (XEXP (op1, 0))
+ && GET_MODE (XEXP (op1, 0)) == GET_MODE (XEXP (geu, 0))
+ && REGNO (XEXP (op1, 0)) == FLAGS_REG
+ && XEXP (op1, 1) == const0_rtx)


I'll leave the actual review to Uros of course.

Richard.

>
> Jakub


c++: Instantiation with local extern [PR97395]

2020-10-14 Thread Nathan Sidwell

It turns out that pushdecl_with_scope has somewhat strange behaviour,
which probably made more sense way back.  Unfortunately making it
somewhat saner turned into a rathole.  Instead use a
push_nested_namespace around pushing the alias -- this is similar to
some of the friend handling we already have.

gcc/cp/
* name-lookup.c (push_local_extern_decl_alias): Push into alias's
namespace and use pushdecl.
(do_pushdecl_with_scope): Clarify behaviour.
gcc/testsuite/
* g++.dg/lookup/extern-redecl2.C: New.


pushing to trunk

nathan

--
Nathan Sidwell
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index e3f3712b1f0..5dcaab4d1df 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -38,7 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 
 static cxx_binding *cxx_binding_make (tree value, tree type);
 static cp_binding_level *innermost_nonclass_level (void);
-static tree do_pushdecl_with_scope (tree x, cp_binding_level *, bool hiding);
+static tree do_pushdecl (tree decl, bool hiding);
 static void set_identifier_type_value_with_scope (tree id, tree decl,
 		  cp_binding_level *b);
 static name_hint maybe_suggest_missing_std_header (location_t location,
@@ -2975,8 +2975,9 @@ push_local_extern_decl_alias (tree decl)
 
 	  /* Expected default linkage is from the namespace.  */
 	  TREE_PUBLIC (alias) = TREE_PUBLIC (ns);
-	  alias = do_pushdecl_with_scope (alias, NAMESPACE_LEVEL (ns),
-	  /* hiding= */true);
+	  push_nested_namespace (ns);
+	  alias = do_pushdecl (alias, /* hiding= */true);
+	  pop_nested_namespace (ns);
 	}
 }
 
@@ -3848,10 +3849,17 @@ constructor_name_p (tree name, tree type)
 /* Same as pushdecl, but define X in binding-level LEVEL.  We rely on the
caller to set DECL_CONTEXT properly.
 
-   Note that this must only be used when X will be the new innermost
-   binding for its name, as we tack it onto the front of IDENTIFIER_BINDING
-   without checking to see if the current IDENTIFIER_BINDING comes from a
-   closer binding level than LEVEL.  */
+   Warning: For class and block-scope this must only be used when X
+   will be the new innermost binding for its name, as we tack it onto
+   the front of IDENTIFIER_BINDING without checking to see if the
+   current IDENTIFIER_BINDING comes from a closer binding level than
+   LEVEL.
+
+   Warning: For namespace scope, this will look in LEVEL for an
+   existing binding to match, but if not found will push the decl into
+   CURRENT_NAMESPACE.  Use push_nested_namespace/pushdecl/
+   pop_nested_namespace if you really need to push it into a foreign
+   namespace.  */
 
 static tree
 do_pushdecl_with_scope (tree x, cp_binding_level *level, bool hiding = false)
diff --git c/gcc/testsuite/g++.dg/lookup/extern-redecl2.C w/gcc/testsuite/g++.dg/lookup/extern-redecl2.C
new file mode 100644
index 000..9c5caa6b677
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/extern-redecl2.C
@@ -0,0 +1,18 @@
+// PR 97395
+// ICE injecting hidden decl in wrong namespace
+
+namespace pr {
+  template<typename WW>
+  void
+  kp ()
+  {
+extern WW hz;
+  }
+
+  void
+  n5 ()
+  {
+kp ();
+kp ();
+  }
+}


[committed] libstdc++: Improve comments in std::string tests

2020-10-14 Thread Jonathan Wakely via Gcc-patches
The COW std::string does support some features of C++11 allocators, just
not propagation. Change some comments in the tests to be more precise
about that.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/char/copy.cc: Make
comment more precise about what isn't supported by COW strings.
* testsuite/21_strings/basic_string/allocator/char/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/move.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/operator_plus.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/swap.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/operator_plus.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/swap.cc:
Likewise.

Tested powerpc64le-linux. Committed to trunk.

commit 5e961dba46a84b5c4ba5086d05db6ff449d8682f
Author: Jonathan Wakely 
Date:   Wed Oct 14 12:07:31 2020

libstdc++: Improve comments in std::string tests

The COW std::string does support some features of C++11 allocators, just
not propagation. Change some comments in the tests to be more precise
about that.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/char/copy.cc: Make
comment more precise about what isn't supported by COW strings.
* testsuite/21_strings/basic_string/allocator/char/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/move.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/operator_plus.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/char/swap.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/copy_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/move_assign.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/noexcept.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/operator_plus.cc:
Likewise.
* testsuite/21_strings/basic_string/allocator/wchar_t/swap.cc:
Likewise.

diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy.cc
index b5c9b5b5315..9972cfbdc46 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy.cc
@@ -5,24 +5,24 @@
 // terms of the GNU General Public License as published by the
 // Free Software Foundation; either version 3, or (at your option)
 // any later version.
- 
+
 // This library is distributed in the hope that it will be useful,
 // but WITHOUT ANY WARRANTY; without even the implied warranty of
 // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 // GNU General Public License for more details.
- 
+
 // You should have received a copy of the GNU General Public License along
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
 // { dg-do run { target c++11 } }
-// COW strings don't support C++11 allocators:
+// COW strings don't support C++11 allocator propagation:
 // { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
 #include 
- 
+
 using C = char;
 const C c = 'a';
 using traits = std::char_traits<C>;
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy_assign.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy_assign.cc
index e776f37542b..038af4d8e05 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/copy_assign.cc
+++ 

[committed] libstdc++: Improve comments for check_effective_target_cxx11-abi

2020-10-14 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp (check_effective_target_cxx11-abi):
Add comments about which test flags get used by the check.

Tested powerpc64le-linux. Committed to trunk.
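
For readers unfamiliar with what the effective target distinguishes: the sketch below shows, in C++, the kind of macro test involved. It assumes only that libstdc++ defines _GLIBCXX_USE_CXX11_ABI once a library header is included; the helper name using_cxx11_abi is made up for illustration and is not part of the testsuite. Since the macro can be overridden on the command line (for example with -D_GLIBCXX_USE_CXX11_ABI=0), the answer depends on which flags the compile actually sees, which is the point of the new comments.

```cpp
#include <cassert>
#include <string>   // any libstdc++ header makes the ABI macro visible

// Hypothetical helper: true when std::string is the "cxx11" ABI string,
// false for the old COW string or for a non-libstdc++ library.
constexpr bool using_cxx11_abi()
{
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
  return true;
#else
  return false;
#endif
}
```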

commit a1b6b013615082f0837ea34c5a65136822523be7
Author: Jonathan Wakely 
Date:   Wed Oct 14 12:09:27 2020

libstdc++: Improve comments for check_effective_target_cxx11-abi

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp (check_effective_target_cxx11-abi):
Add comments about which test flags get used by the check.

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp 
b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 78484f7c9af..fc1e8f242fd 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -1524,6 +1524,8 @@ proc check_v3_target_filesystem_ts { } {
 }
 
 # Return 1 if the "cxx11" ABI is in use using the current flags, 0 otherwise.
+# Any flags provided by RUNTESTFLAGS or a target board will be used here.
+# Flags added in the test by dg-options or dg-add-options will not be used.
 proc check_effective_target_cxx11-abi { } {
 global cxxflags
 


[committed] libstdc++: Enable tests that incorrectly require cxx11-abi

2020-10-14 Thread Jonathan Wakely via Gcc-patches
These tests were not being run when -D_GLIBCXX_USE_CXX11_ABI=0 was added
to the test flags, but they actually work OK with the old string.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/char/minimal.cc:
Do not require cxx11-abi effective target.
* testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc:
Likewise.
* testsuite/27_io/basic_fstream/cons/base.cc: Likewise.

Tested powerpc64le-linux. Committed to trunk.

commit 5ae9ddd480f97ba16b9b9d11d333e1252b820166
Author: Jonathan Wakely 
Date:   Wed Oct 14 12:05:57 2020

libstdc++: Enable tests that incorrectly require cxx11-abi

These tests were not being run when -D_GLIBCXX_USE_CXX11_ABI=0 was added
to the test flags, but they actually work OK with the old string.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/char/minimal.cc:
Do not require cxx11-abi effective target.
* testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc:
Likewise.
* testsuite/27_io/basic_fstream/cons/base.cc: Likewise.

diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/minimal.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/minimal.cc
index 2e7c7e2d4a4..3493f630920 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/minimal.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/char/minimal.cc
@@ -5,25 +5,23 @@
 // terms of the GNU General Public License as published by the
 // Free Software Foundation; either version 3, or (at your option)
 // any later version.
- 
+
 // This library is distributed in the hope that it will be useful,
 // but WITHOUT ANY WARRANTY; without even the implied warranty of
 // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 // GNU General Public License for more details.
- 
+
 // You should have received a copy of the GNU General Public License along
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
 // { dg-do run { target c++11 } }
-// COW strings don't support C++11 allocators:
-// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
 #include 
 #include 
- 
+
 using C = char;
 const C c = 'a';
 using traits = std::char_traits<C>;
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc
index ed791747df3..5f057e84dbf 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/wchar_t/minimal.cc
@@ -5,25 +5,23 @@
 // terms of the GNU General Public License as published by the
 // Free Software Foundation; either version 3, or (at your option)
 // any later version.
- 
+
 // This library is distributed in the hope that it will be useful,
 // but WITHOUT ANY WARRANTY; without even the implied warranty of
 // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 // GNU General Public License for more details.
- 
+
 // You should have received a copy of the GNU General Public License along
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
 // { dg-do run { target c++11 } }
-// COW strings don't support C++11 allocators:
-// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
 #include 
 #include 
- 
+
 using C = wchar_t;
 const C c = L'a';
 using traits = std::char_traits<C>;
diff --git a/libstdc++-v3/testsuite/27_io/basic_fstream/cons/base.cc 
b/libstdc++-v3/testsuite/27_io/basic_fstream/cons/base.cc
index 50a45faacdc..1e7be126212 100644
--- a/libstdc++-v3/testsuite/27_io/basic_fstream/cons/base.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_fstream/cons/base.cc
@@ -17,7 +17,6 @@
 
 // { dg-options "-O0" }
 // { dg-do link { target c++11 } }
-// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 


[committed] libstdc++: Define some std::string constructors inline

2020-10-14 Thread Jonathan Wakely via Gcc-patches
There are a lot of very simple constructors for the old string which are
not defined inline. I don't see any reason for this and it probably
makes them less likely to be optimized away. Move the definitions into
the class body.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string(const Alloc&))
(basic_string(const basic_string&))
(basic_string(const CharT*, size_type, const Alloc&))
(basic_string(const CharT*, const Alloc&))
(basic_string(size_type, CharT, const Alloc&))
(basic_string(initializer_list, const Alloc&))
(basic_string(InputIterator, InputIterator, const Alloc&)):
Define inline in class body.
* include/bits/basic_string.tcc (basic_string(const Alloc&))
(basic_string(const basic_string&))
(basic_string(const CharT*, size_type, const Alloc&))
(basic_string(const CharT*, const Alloc&))
(basic_string(size_type, CharT, const Alloc&))
(basic_string(initializer_list, const Alloc&))
(basic_string(InputIterator, InputIterator, const Alloc&)):
Move definitions into class body.

Tested powerpc64le-linux. Committed to trunk.

commit 252c9967ba785aedf3b39e2cd50237d0f32fe3bd
Author: Jonathan Wakely 
Date:   Wed Oct 14 12:10:26 2020

libstdc++: Define some std::string constructors inline

There are a lot of very simple constructors for the old string which are
not defined inline. I don't see any reason for this and it probably
makes them less likely to be optimized away. Move the definitions into
the class body.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string(const Alloc&))
(basic_string(const basic_string&))
(basic_string(const CharT*, size_type, const Alloc&))
(basic_string(const CharT*, const Alloc&))
(basic_string(size_type, CharT, const Alloc&))
(basic_string(initializer_list, const Alloc&))
(basic_string(InputIterator, InputIterator, const Alloc&)):
Define inline in class body.
* include/bits/basic_string.tcc (basic_string(const Alloc&))
(basic_string(const basic_string&))
(basic_string(const CharT*, size_type, const Alloc&))
(basic_string(const CharT*, const Alloc&))
(basic_string(size_type, CharT, const Alloc&))
(basic_string(initializer_list, const Alloc&))
(basic_string(InputIterator, InputIterator, const Alloc&)):
Move definitions into class body.

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index 4b3722bdbf1..372302ba6a1 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -548,7 +548,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*
*  The newly-created string contains the exact contents of @a __str.
*  @a __str is a valid, but unspecified string.
-   **/
+   */
   basic_string(basic_string&& __str) noexcept
   : _M_dataplus(_M_local_data(), std::move(__str._M_get_allocator()))
   {
@@ -696,7 +696,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*
*  The contents of @a str are moved into this string (without copying).
*  @a str is a valid, but unspecified string.
-   **/
+   */
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2063. Contradictory requirements for string move assignment
   basic_string&
@@ -3563,14 +3563,20 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  @brief  Construct an empty string using allocator @a a.
*/
   explicit
-  basic_string(const _Alloc& __a);
+  basic_string(const _Alloc& __a)
+  : _M_dataplus(_S_construct(size_type(), _CharT(), __a), __a)
+  { }
 
   // NB: per LWG issue 42, semantics different from IS:
   /**
*  @brief  Construct string with copy of value of @a str.
*  @param  __str  Source string.
*/
-  basic_string(const basic_string& __str);
+  basic_string(const basic_string& __str)
+  : _M_dataplus(__str._M_rep()->_M_grab(_Alloc(__str.get_allocator()),
+   __str.get_allocator()),
+   __str.get_allocator())
+  { }
 
   // _GLIBCXX_RESOLVE_LIB_DEFECTS
   // 2583. no way to supply an allocator for basic_string(str, pos)
@@ -3611,7 +3617,9 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  has no special meaning.
*/
   basic_string(const _CharT* __s, size_type __n,
-  const _Alloc& __a = _Alloc());
+  const _Alloc& __a = _Alloc())
+  : _M_dataplus(_S_construct(__s, __s + __n, __a), __a)
+  { }
 
   /**
*  @brief  Construct string as copy of a C string.
@@ -3623,7 +3631,10 @@ _GLIBCXX_END_NAMESPACE_CXX11
   // 3076. basic_string CTAD ambiguity
   template<typename = _RequireAllocator<_Alloc>>
 #endif
-  basic_string(const _CharT* __s, const _Alloc& __a = 

[committed] libstdc++: Implement LWG 3076 for COW strings

2020-10-14 Thread Jonathan Wakely via Gcc-patches
The basic_string deduction guides are defined for the old ABI, but the
tests are currently disabled. This is because a single case fails when
using the old ABI, which is just because LWG 3076 isn't implemented for
the old ABI. That can be done easily, and the tests can be enabled.
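
As an illustration of the CTAD ambiguity and the style of fix described above, here is a self-contained sketch using a toy class rather than basic_string. The names toy_string and is_allocator_like are hypothetical; is_allocator_like only approximates what an "allocator-like type" check might look like and is not the library's own helper. The constructor taking (pointer, allocator) is constrained so that it drops out of class template argument deduction when the deduced "allocator" is not allocator-like, for example an int.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical detector: a type is "allocator-like" if it has a
// value_type and an allocate(size_t) member.
template<typename A, typename = void>
struct is_allocator_like : std::false_type { };

template<typename A>
struct is_allocator_like<A,
  std::void_t<typename A::value_type,
              decltype(std::declval<A&>().allocate(std::size_t{}))>>
: std::true_type { };

// Toy stand-in for a string class; it only records a length.
template<typename CharT, typename Alloc = std::allocator<CharT>>
struct toy_string
{
  std::size_t len;

  // (pointer, count[, allocator])
  toy_string(const CharT* s, std::size_t n, const Alloc& = Alloc())
  : len(n) { (void) s; }

  // (pointer[, allocator]), constrained so that deduction ignores it
  // when Alloc would be deduced as a non-allocator type.
  template<typename = std::enable_if_t<is_allocator_like<Alloc>::value>>
  toy_string(const CharT* s, const Alloc& = Alloc())
  : len(0) { while (s[len]) ++len; }
};
```

Without the constraint, toy_string("cat", 1) would also consider the second constructor with Alloc deduced as int. With it, only the (pointer, count) constructor takes part in deduction and the default allocator is used.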

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
(basic_string(const _CharT*, const _Alloc&)): Constrain to
require an allocator-like type to fix CTAD ambiguity (LWG 3076).
* testsuite/21_strings/basic_string/cons/char/deduction.cc:
Remove dg-skip-if.
* testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc:
Likewise.

Tested powerpc64le-linux. Committed to trunk.

commit dc38e255242192303ae463a913c060b426eb06c0
Author: Jonathan Wakely 
Date:   Wed Oct 14 11:52:26 2020

libstdc++: Implement LWG 3076 for COW strings

The basic_string deduction guides are defined for the old ABI, but the
tests are currently disabled. This is because a single case fails when
using the old ABI, which is just because LWG 3076 isn't implemented for
the old ABI. That can be done easily, and the tests can be enabled.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h [!_GLIBCXX_USE_CXX11_ABI]
(basic_string(const _CharT*, const _Alloc&)): Constrain to
require an allocator-like type to fix CTAD ambiguity (LWG 3076).
* testsuite/21_strings/basic_string/cons/char/deduction.cc:
Remove dg-skip-if.
* testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc:
Likewise.

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index a9fe09f2069..4b3722bdbf1 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3612,12 +3612,19 @@ _GLIBCXX_END_NAMESPACE_CXX11
*/
   basic_string(const _CharT* __s, size_type __n,
   const _Alloc& __a = _Alloc());
+
   /**
*  @brief  Construct string as copy of a C string.
*  @param  __s  Source C string.
*  @param  __a  Allocator to use (default is default allocator).
*/
+#if __cpp_deduction_guides && ! defined _GLIBCXX_DEFINING_STRING_INSTANTIATIONS
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 3076. basic_string CTAD ambiguity
+  template<typename = _RequireAllocator<_Alloc>>
+#endif
   basic_string(const _CharT* __s, const _Alloc& __a = _Alloc());
+
   /**
*  @brief  Construct string as multiple characters.
*  @param  __n  Number of characters.
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc
index 6484ed43453..d05c4b776c3 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/deduction.cc
@@ -17,7 +17,6 @@
 
 // { dg-options "-std=gnu++17" }
 // { dg-do compile { target c++17 } }
-// { dg-xfail-if "COW string missing deduction guides" { ! cxx11-abi } }
 
 #include 
 #include 
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc
index 373b2b24bdd..1773be28e37 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/deduction.cc
@@ -17,7 +17,6 @@
 
 // { dg-options "-std=gnu++17" }
 // { dg-do compile { target c++17 } }
-// { dg-xfail-if "COW string missing deduction guides" { ! cxx11-abi } }
 
 #include 
 #include 


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
> > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> > >> > > header.
> > >> > >
> > >> >
> > >> > Thanks for your review. We found that without adding -muintr, the 
> > >> > intrinsics header could also be tested. Make-check for these file all 
> > >> > get passed.
> > >> >
> > >> > And there is no intrinsic/builtin with const int parameter. So we 
> > >> > remove -muintr from these files.
> > >>
> > >> Can you double check that relevant instructions are indeed generated?
> > >> Without -muintr, relevant patterns in i386.md are effectively blocked,
> > >> and perhaps a call to __builtin_ia32_* is generated instead.
> > >
> > >
> > > Yes, in sse-14.s we have
> > >
> > > _clui:
> > > .LFB136:
> > > .cfi_startproc
> > > pushq   %rbp
> > > .cfi_def_cfa_offset 16
> > > .cfi_offset 6, -16
> > > movq    %rsp, %rbp
> > > .cfi_def_cfa_register 6
> > > clui
> > > nop
> > > popq    %rbp
> > > .cfi_def_cfa 7, 8
> > > ret
> > > .cfi_endproc
> >
> > Strange, without -muintr, it should not be generated, and some error
> > about failed inlining due to target specific option mismatch should be
> > emitted.
> >
> > Can you please investigate this a bit more?
> >
>
> Because of function target attribute?

I don't think so. Please consider this similar testcase:

--cut here--
#ifndef __SSE2__
#pragma GCC push_options
#pragma GCC target("sse2")
#define __DISABLE_SSE2__
#endif /* __SSE2__ */

typedef double __v2df __attribute__ ((__vector_size__ (16)));
typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));

extern __inline __m128d __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
_mm_add_sd (__m128d __A, __m128d __B)
{
  return (__m128d)__builtin_ia32_addsd ((__v2df)__A, (__v2df)__B);
}

#ifdef __DISABLE_SSE2__
#undef __DISABLE_SSE2__
#pragma GCC pop_options
#endif /* __DISABLE_SSE2__ */


__v2df foo (__v2df a, __v2df b)
{
  return _mm_add_sd (a, b);
}
--cut here--

$ gcc -O2 -mno-sse2 -S -dp sse2.c
sse2.c: In function ‘foo’:
sse2.c:11:1: error: inlining failed in call to ‘always_inline’
‘_mm_add_sd’: target specific option mismatch
  11 | _mm_add_sd (__m128d __A, __m128d __B)
 | ^~
sse2.c:24:10: note: called from here
  24 |   return _mm_add_sd (a, b);
 |  ^

I'd expect a similar error when -muintr is missing.

Uros.


Re: [patch] Rework CPP_BUILTINS_SPEC for powerpc-vxworks

2020-10-14 Thread Olivier Hainque



> On 13 Oct 2020, at 17:38, Segher Boessenkool 
>> 
>> Same ChangeLog. Patch hopefully quotable if needed now.
> 
> It is, thank you!

Sure.

> The patch looks fine to me now.

Great, thanks!

>  Not that you need my approval :-)

Always happy to get constructive suggestions for
improvements :)

Your comments encouraged me to look at the set of
predefined macros slightly differently, which proved
very useful.

Glad to see that the adjustments lead to something
that looks better in your opinion as well.

Best Regards,

Olivier



[RFC] Automatic linking of libatomic via gcc.c or ...? [PR81358] (dependency for libgomp on nvptx dep, configure overriddable, ...)

2020-10-14 Thread Tobias Burnus

Hi all, hi Joseph & Jakub,

BACKGROUND:

The main reason I am interested in this is offloading where
OpenACC/OpenMP code might run into:
unresolved symbol __atomic_compare_exchange_16
and it is nontrivial to find out that the solution is:
-foffload=nvptx-none=-latomic
And the atomic use can appear for innocent looking code.
(The dependency comes via libgomp/config/nvptx/atomic.c
for __sync_val_compare_and_swap_16.)

However, as PR81358 shows, also for normal C (and C++) code
it would be nice as atomics are part of the language.
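
To illustrate how innocent-looking code can end up needing libatomic, here is a small C++17 sketch. The type pair16 and the helper may_call_libatomic are hypothetical names for illustration. The helper only inspects std::atomic<T>::is_always_lock_free at compile time, so this snippet itself does not need -latomic. When the answer is true, operations on std::atomic<T> may be compiled as calls into a support library (such as the __atomic_compare_exchange_16 symbol mentioned above) rather than inline instructions; whether that happens depends on the target and flags.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical 16-byte payload, the size for which the unresolved
// __atomic_compare_exchange_16 symbol shows up.
struct pair16
{
  std::uint64_t lo;
  std::uint64_t hi;
};

// True when the compiler cannot promise inline lock-free atomics for T,
// i.e. when atomic operations on T may become library calls.
template<typename T>
constexpr bool may_call_libatomic()
{
  return !std::atomic<T>::is_always_lock_free;
}
```

For example, may_call_libatomic<pair16>() can be used in a static_assert or a diagnostic message to flag types whose atomic use would pull in the library.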

Target specific: for RISC-V (with -as-needed support only
with pthread, otherwise unconditionally) – and seemingly for RTEMS –
-latomic is already linked via LIB_SPEC.


My initial plan was to add -latomic to libgomp.spec - if built
for that target and using -as-needed if available, cf. last but one
email in the thread
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/thread.html#555298
(my initial RFC question, quoted in the first email, was in September)

However, adding it there was opposed by Jakub as:
"I think for libgomp.spec we should add it solely for the offloading targets,
neither GCC generated code for OpenMP construct nor libgomp itself needs
-latomic on the hosts."
(Albeit for nvptx, see above.)


SOLUTION PART 1: Add configure option --enable-autolink-libatomic

[See attached patch (only configure check)]

Example for an explicit use: enable it for offloading targets
(which enables it for nvptx but not for amdgcn which does not
build libatomic). — And for the host, you may want to enable if
you always install libatomic and -as-needed is supported and
otherwise, you might want to disable it.

QUESTION:
* Should --enable-autolink-libatomic default
  to 'yes' or 'no'?
* Should the default additionally depend on having -as-needed support?
  (Which would exclude nvptx with default settings.)


SOLUTION PART 2: Actually linking libatomic

?

QUESTION: I have no idea where to add the -latomic.

(a) Still do it in libgomp.spec but now with that configure
flag to be able to disable it?

Pro:
+ Solves problem for nvptx, which adds atomic code to
  libgomp/config/nvptx/atomic.c
+ rather minimally invasive, and by tuning the config option
  it could be used when needed
Con:
- While OpenMP has atomics, libgomp by itself in general
  does not depend on libatomic for most targets
  - this can be mitigated by using the configure flag
  but it is not ideal, either.
- C code can use atomic directly and thus it would be nice
  if it could be automatically linked. (Especially if the
  target supports the -as-needed linker flag.)


(b) Do it in the driver

Pro:
+ will automatically work for C/C++ atomic code
+ with -as-needed it will only be linked if really needed
  (caveat: lib file has to be present at link time.)
Con:
- If -as-needed is not supported and it is always linked, this
  adds unnecessary dependencies (shared lib) or file size (static lib)
- Adding files for the linker to analyze does not help with
  the compile size
- The file has to be present, even if -as-needed is supported,
  adding a hard dependency on the libatomic library for Linux
  distros

For doing it in the driver, I am not sure when to add it.
Used:
- Direct consumer is C/C++ using atomics.
- For Fortran, it (currently) is only used with
  nvptx offloading as described above.
- For offloading, the compilation/linking handling is
  slightly different and it needs to work there as well.
- No idea about Ada, D, or other direct or indirect dependencies


RISC-V + RTEMS use LIB_SPEC to add it.

One possibility would be to add it to the init_gcc_specs call,
but that feels like a sledge hammer solution.

Question: Where do you think should it be in the driver?

Other thoughts?

Or is solution (a) better? (That is: previous patch +
new --enable-autolink-libatomic for libgomp, only.)
Which is kind of a complicated nvptx-offloading-only solution?

Tobias

PS: I assume -static-libatomic then has to be added as well when we
add -latomic in the driver.

 gcc/config.in|  6 ++
 gcc/configure| 38 +++---
 gcc/configure.ac | 26 +-
 3 files changed, 66 insertions(+), 4 deletions(-)
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 26a5d8e3619..55e29773f98 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1102,6 +1102,31 @@ AC_ARG_WITH(multilib-list,
 :,
 with_multilib_list=default)
 
+# If libatomic is available, whether it should be linked automatically
+AC_ARG_ENABLE(autolink-libatomic,
+[AS_HELP_STRING([--enable-autolink-libatomic],
+		[enable or disable the automatic linking of libatomic
+ if it is available (enabled by default)])],
+[
+  case $enable_autolink_libatomic in
+yes | no) ;;
+*) 

Re: Support ofsetted parameters in local modref

2020-10-14 Thread Jan Hubicka
> Hi,
> 
> On Wed, Oct 14 2020, Jan Hubicka wrote:
> > Hi,
> > here is updated patch with cap on number of iterations.
> > I set the limit to 8 and bootstrapped it with additional assert that the
> > limit is not met, it did not fire.
> >
> > Bootstrapped/regtested x86_64-linux, OK?
> >
> > gcc/ChangeLog:
> >
> > 2020-10-14  Jan Hubicka  
> >
> > * doc/invoke.texi: (ipa-jump-function-lookups): Document param.
> > * ipa-modref.c (merge_call_side_effects): Use
> > unadjusted_ptr_and_unit_offset.
> > * ipa-prop.c (unadjusted_ptr_and_unit_offset): New function.
> > * ipa-prop.h (unadjusted_ptr_and_unit_offset): Declare.
> > * params.opt: (-param-ipa-jump-function-lookups): New.
> >
> > diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> > index 2d09d913051..cf3da6a6568 100644
> > --- a/gcc/ipa-prop.c
> > +++ b/gcc/ipa-prop.c
> > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "domwalk.h"
> >  #include "builtins.h"
> >  #include "tree-cfgcleanup.h"
> > +#include "options.h"
> >  
> >  /* Function summary where the parameter infos are actually stored. */
> >  ipa_node_params_t *ipa_node_params_sum = NULL;
> > @@ -1222,6 +1223,73 @@ load_from_unmodified_param_or_agg (struct 
> > ipa_func_body_info *fbi,
> >return index;
> >  }
> >  
> > +/* Walk pointer adjustments from OP (such as POINTER_PLUS and ADDR_EXPR)
> > +   to find original pointer.  Initialize RET to the pointer which results from
> > +   the walk.
> > +   If offset is known return true and initialize OFFSET_RET.  */
> > +
> > +bool
> > +unadjusted_ptr_and_unit_offset (tree op, tree *ret, poly_int64 *offset_ret)
> > +{
> > +  poly_int64 offset = 0;
> > +  bool offset_known = true;
> > +  int i;
> > +
> > +  for (i = 0; i < param_ipa_jump_function_lookups; i++)
> > +{
> > +  if (TREE_CODE (op) == ADDR_EXPR)
> > +   {
> > + poly_int64 extra_offset = 0;
> > + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
> > +   &extra_offset);
> > + if (!base)
> > +   {
> > + base = get_base_address (TREE_OPERAND (op, 0));
> > + if (TREE_CODE (base) != MEM_REF)
> > +   break;
> > + offset_known = false;
> 
> Umm, did you really intend to clear offset_known only after the break?
> (I may not understand the nuances of the get_base... functions but it
> strikes me as odd.)

Yes, offset_known relates to op.

We try to look up the base, and if it is a MEM_REF we will be able to update op.
If it is a decl or something else, we need to give up (since we
are required to return a pointer). So in the first case we return the current
op, for which offset_known still applies; however, if we found a MEM_REF we
update op but lose track of the offset, since
get_addr_base_and_unit_offset failed.
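
The invariant can be shown on a toy model (a sketch only, not the GCC code). The names node, kind, walk_result and unadjusted_ptr are hypothetical. offset_known always describes the relation between the pointer we are about to return and the original one. Breaking out before stepping keeps it valid. Stepping across an adjustment of unknown size clears it.

```cpp
#include <cassert>
#include <optional>

// Hypothetical toy IR: a pointer is either a parameter, or another
// pointer plus a constant, or another pointer plus an unknown amount.
enum class kind { param, plus_const, plus_unknown };

struct node
{
  kind k;
  const node* base;   // null for kind::param
  long delta;         // only meaningful for kind::plus_const
};

struct walk_result
{
  const node* ptr;               // pointer the walk ended at
  std::optional<long> offset;    // empty when the offset is unknown
};

// Walk at most MAX_LOOKUPS adjustments back from OP.
walk_result unadjusted_ptr(const node* op, int max_lookups)
{
  long offset = 0;
  bool offset_known = true;

  for (int i = 0; i < max_lookups; ++i)
    {
      if (op->k == kind::plus_const)
        offset += op->delta;       // step with a known adjustment
      else if (op->k == kind::plus_unknown)
        offset_known = false;      // step, but lose track of the offset
      else
        break;                     // cannot step: current op stays valid
      op = op->base;
    }

  walk_result result{op, std::nullopt};
  if (offset_known)
    result.offset = offset;
  return result;
}
```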

Honza
> 
> Thanks,
> 
> Martin


Re: Support ofsetted parameters in local modref

2020-10-14 Thread Martin Jambor
Hi,

On Wed, Oct 14 2020, Jan Hubicka wrote:
> Hi,
> here is updated patch with cap on number of iterations.
> I set the limit to 8 and bootstrapped it with additional assert that the
> limit is not met, it did not fire.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> gcc/ChangeLog:
>
> 2020-10-14  Jan Hubicka  
>
>   * doc/invoke.texi: (ipa-jump-function-lookups): Document param.
>   * ipa-modref.c (merge_call_side_effects): Use
>   unadjusted_ptr_and_unit_offset.
>   * ipa-prop.c (unadjusted_ptr_and_unit_offset): New function.
>   * ipa-prop.h (unadjusted_ptr_and_unit_offset): Declare.
>   * params.opt: (-param-ipa-jump-function-lookups): New.
>
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 2d09d913051..cf3da6a6568 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "domwalk.h"
>  #include "builtins.h"
>  #include "tree-cfgcleanup.h"
> +#include "options.h"
>  
>  /* Function summary where the parameter infos are actually stored. */
>  ipa_node_params_t *ipa_node_params_sum = NULL;
> @@ -1222,6 +1223,73 @@ load_from_unmodified_param_or_agg (struct 
> ipa_func_body_info *fbi,
>return index;
>  }
>  
> +/* Walk pointer adjustments from OP (such as POINTER_PLUS and ADDR_EXPR)
> +   to find original pointer.  Initialize RET to the pointer which results from
> +   the walk.
> +   If offset is known return true and initialize OFFSET_RET.  */
> +
> +bool
> +unadjusted_ptr_and_unit_offset (tree op, tree *ret, poly_int64 *offset_ret)
> +{
> +  poly_int64 offset = 0;
> +  bool offset_known = true;
> +  int i;
> +
> +  for (i = 0; i < param_ipa_jump_function_lookups; i++)
> +{
> +  if (TREE_CODE (op) == ADDR_EXPR)
> + {
> +   poly_int64 extra_offset = 0;
> +   tree base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
> +  &extra_offset);
> +   if (!base)
> + {
> +   base = get_base_address (TREE_OPERAND (op, 0));
> +   if (TREE_CODE (base) != MEM_REF)
> + break;
> +   offset_known = false;

Umm, did you really intend to clear offset_known only after the break?
(I may not understand the nuances of the get_base... functions but it
strikes me as odd.)

Thanks,

Martin


Re: [RFC][gimple] Move can_duplicate_bb_p to gimple_can_duplicate_bb_p

2020-10-14 Thread Tom de Vries
On 10/14/20 8:15 AM, Richard Biener wrote:
>> I've tried to address this by merging can_duplicate_stmt_p and
>> can_duplicate_last_stmt_p, and adding a default parameter.
>>
>> Better like this?
> Sorry for iterating again but since we now would appropriately
> handle things in the CFG hook there's no need for tracer.c to
> do this on its own via the _stmt calls.  So I suggest to
> remove the unifying of the stmt counting loop and use
> the can_duplicate_block_p CFG hook directly instead (but of course
> still cache its outcome).  That way we can simplify what is
> exported.

Np :) . Fully retested, OK for trunk?

Thanks,
- Tom
[gimple] Move can_duplicate_bb_p to gimple_can_duplicate_bb_p

The function gimple_can_duplicate_bb_p currently always returns true.

The presence of can_duplicate_bb_p in tracer.c however suggests that
there are cases when bb's indeed cannot be duplicated.

Move the implementation of can_duplicate_bb_p to gimple_can_duplicate_bb_p.
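
For illustration only, here is a toy model of the combined check and of caching its outcome per block. The types and names (stmt_kind, block, can_duplicate_block, duplicate_cache) are hypothetical stand-ins and not GCC's data structures. The three rules loosely mirror the ones being moved: fixed blocks are never duplicated, a block ending in a transaction or a "unique" call is not, and a block containing a call that belongs to a group is not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical statement kinds standing in for the gimple cases.
enum class stmt_kind { plain, transaction, unique_call, simt_group_call };

struct block
{
  int index;
  std::vector<stmt_kind> stmts;
};

constexpr int num_fixed_blocks = 2;   // entry and exit blocks

// Block-level predicate: one place that knows all the rules.
bool can_duplicate_block(const block& bb)
{
  if (bb.index < num_fixed_blocks)
    return false;

  if (!bb.stmts.empty())
    {
      stmt_kind last = bb.stmts.back();
      if (last == stmt_kind::transaction || last == stmt_kind::unique_call)
        return false;
    }

  for (stmt_kind s : bb.stmts)
    if (s == stmt_kind::simt_group_call)
      return false;

  return true;
}

// Per-block cache so the statement walk happens at most once per block.
class duplicate_cache
{
  std::vector<signed char> state_;   // -1 unknown, 0 no, 1 yes

public:
  explicit duplicate_cache(std::size_t n_blocks) : state_(n_blocks, -1) { }

  bool query(const block& bb)
  {
    signed char& slot = state_.at(bb.index);
    if (slot < 0)
      slot = can_duplicate_block(bb) ? 1 : 0;
    return slot == 1;
  }
};
```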

Bootstrapped and reg-tested on x86_64-linux.

Built x86_64-linux with nvptx accelerator and tested libgomp.

No issues found.

As corner-case check, bootstrapped and reg-tested a patch that makes
gimple_can_duplicate_bb_p always return false, resulting in
PR97333 - "[gimple_can_duplicate_bb_p == false, tree-ssa-threadupdate]
ICE in duplicate_block, at cfghooks.c:1093".

gcc/ChangeLog:

2020-10-09  Tom de Vries  

	* tracer.c (cached_can_duplicate_bb_p, analyze_bb): Use
	can_duplicate_block_p.
	(can_duplicate_insn_p, can_duplicate_bb_no_insn_iter_p)
	(can_duplicate_bb_p): Move and merge ...
	* tree-cfg.c (gimple_can_duplicate_bb_p): ... here.

---
 gcc/tracer.c   | 66 +++---
 gcc/tree-cfg.c | 38 -
 2 files changed, 40 insertions(+), 64 deletions(-)

diff --git a/gcc/tracer.c b/gcc/tracer.c
index e1c2b9527e5..2f9daf92d79 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -84,65 +84,6 @@ bb_seen_p (basic_block bb)
   return bitmap_bit_p (bb_seen, bb->index);
 }
 
-/* Return true if gimple stmt G can be duplicated.  */
-static bool
-can_duplicate_insn_p (gimple *g)
-{
-  /* An IFN_GOMP_SIMT_ENTER_ALLOC/IFN_GOMP_SIMT_EXIT call must be
- duplicated as part of its group, or not at all.
- The IFN_GOMP_SIMT_VOTE_ANY and IFN_GOMP_SIMT_XCHG_* are part of such a
- group, so the same holds there.  */
-  if (is_gimple_call (g)
-  && (gimple_call_internal_p (g, IFN_GOMP_SIMT_ENTER_ALLOC)
-	  || gimple_call_internal_p (g, IFN_GOMP_SIMT_EXIT)
-	  || gimple_call_internal_p (g, IFN_GOMP_SIMT_VOTE_ANY)
-	  || gimple_call_internal_p (g, IFN_GOMP_SIMT_XCHG_BFLY)
-	  || gimple_call_internal_p (g, IFN_GOMP_SIMT_XCHG_IDX)))
-return false;
-
-  return true;
-}
-
-/* Return true if BB can be duplicated.  Avoid iterating over the insns.  */
-static bool
-can_duplicate_bb_no_insn_iter_p (const_basic_block bb)
-{
-  if (bb->index < NUM_FIXED_BLOCKS)
-return false;
-
-  if (gimple *g = last_stmt (CONST_CAST_BB (bb)))
-{
-  /* A transaction is a single entry multiple exit region.  It
-	 must be duplicated in its entirety or not at all.  */
-  if (gimple_code (g) == GIMPLE_TRANSACTION)
-	return false;
-
-  /* An IFN_UNIQUE call must be duplicated as part of its group,
-	 or not at all.  */
-  if (is_gimple_call (g)
-	  && gimple_call_internal_p (g)
-	  && gimple_call_internal_unique_p (g))
-	return false;
-}
-
-  return true;
-}
-
-/* Return true if BB can be duplicated.  */
-static bool
-can_duplicate_bb_p (const_basic_block bb)
-{
-  if (!can_duplicate_bb_no_insn_iter_p (bb))
-return false;
-
-  for (gimple_stmt_iterator gsi = gsi_start_bb (CONST_CAST_BB (bb));
-   !gsi_end_p (gsi); gsi_next (&gsi))
-if (!can_duplicate_insn_p (gsi_stmt (gsi)))
-  return false;
-
-  return true;
-}
-
 static sbitmap can_duplicate_bb;
 
 /* Cache VAL as value of can_duplicate_bb_p for BB.  */
@@ -167,7 +108,7 @@ cached_can_duplicate_bb_p (const_basic_block bb)
   return false;
 }
 
-  return can_duplicate_bb_p (bb);
+  return can_duplicate_block_p (bb);
 }
 
 /* Return true if we should ignore the basic block for purposes of tracing.  */
@@ -190,16 +131,15 @@ analyze_bb (basic_block bb, int *count)
   gimple_stmt_iterator gsi;
   gimple *stmt;
   int n = 0;
-  bool can_duplicate = can_duplicate_bb_no_insn_iter_p (bb);
 
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 {
   stmt = gsi_stmt (gsi);
   n += estimate_num_insns (stmt, &eni_size_weights);
-  can_duplicate = can_duplicate && can_duplicate_insn_p (stmt);
 }
   *count = n;
-  cache_can_duplicate_bb_p (bb, can_duplicate);
+
+  cache_can_duplicate_bb_p (bb, can_duplicate_block_p (CONST_CAST_BB (bb)));
 }
 
 /* Return true if E1 is more frequent than E2.  */
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5caf3b62d69..002560d9370 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -6211,8 +6211,44 @@ gimple_split_block_before_cond_jump (basic_block bb)
 /* Return 

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Hongtao Liu via Gcc-patches
On Wed, Oct 14, 2020 at 5:21 PM Uros Bizjak  wrote:
>
> On Wed, Oct 14, 2020 at 11:04 AM Hongyu Wang  wrote:
> >
> >
> >
> > On Wed, Oct 14, 2020 at 4:42 PM Uros Bizjak wrote:
> >>
> >> On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang  
> >> wrote:
> >> >
> >> > >
> >> > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> >> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> >> > > header.
> >> > >
> >> >
> >> > Thanks for your review. We found that the intrinsics header is also
> >> > tested without adding -muintr. Make check passes for all of these
> >> > files.
> >> >
> >> > And there is no intrinsic/builtin with const int parameter. So we remove 
> >> > -muintr from these files.
> >>
> >> Can you double check that relevant instructions are indeed generated?
> >> Without -muintr, relevant patterns in i386.md are effectively blocked,
> >> and perhaps a call to __builtin_ia32_* is generated instead.
> >
> >
> > Yes, in sse-14.s we have
> >
> > _clui:
> > .LFB136:
> > .cfi_startproc
> > pushq   %rbp
> > .cfi_def_cfa_offset 16
> > .cfi_offset 6, -16
> > movq%rsp, %rbp
> > .cfi_def_cfa_register 6
> > clui
> > nop
> > popq%rbp
> > .cfi_def_cfa 7, 8
> > ret
> > .cfi_endproc
>
> Strange, without -muintr, it should not be generated, and some error
> about failed inlining due to target specific option mismatch should be
> emitted.
>
> Can you please investigate this a bit more?
>

Because of function target attribute?

> Uros.



-- 
BR,
Hongtao


Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Oct 14, 2020 at 11:22:48AM +0200, Richard Biener wrote:
> > +  if (mode == CCCmode
> > + && GET_CODE (XEXP (x, 0)) == NEG
> > + && GET_CODE (XEXP (XEXP (x, 0), 0)) == GEU
> > + && REG_P (XEXP (XEXP (XEXP (x, 0), 0), 0))
> > + && (GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == CCCmode
> > + || GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == CCmode)
> > + && REGNO (XEXP (XEXP (XEXP (x, 0), 0), 0)) == FLAGS_REG
> > + && XEXP (XEXP (XEXP (x, 0), 0), 1) == const0_rtx
> > + && GET_CODE (XEXP (x, 1)) == LTU
> > + && REG_P (XEXP (XEXP (x, 1), 0))
> > + && (GET_MODE (XEXP (XEXP (x, 1), 0))
> > + == GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)))
> > + && REGNO (XEXP (XEXP (x, 1), 0)) == FLAGS_REG
> > + && XEXP (XEXP (x, 1), 1) == const0_rtx)
> 
> Meh ;)  templates to the rescue?
> 
>   rtx_match < un > ().matches (x)
> 
> and with fancy metaprogramming expand it to above?  Not sure if it's easier
> to read that way.  Maybe

It would certainly not match the style used elsewhere in the backend.
> 
>   rtx neg, geu;
>   if (mode == CCCmode
>   && (neg = XEXP (x, 0), GET_CODE (neg) == NEG)
>   && (geu = XEXP (neg, 0), GET_CODE (geu) == GEU)
> ...
> 
> or
> 
>   if (mode == CCCmode
>   && GET_CODE (neg = XEXP (x, 0)) == NEG
> 
> thus some manual CSE and naming in this matching would help?

Attached are two incremental patches, one just adds op0 and op1 for the
COMPARE operand of all the costs COMPARE handling, which replaces all the
XEXP (x, 0) with op0 and XEXP (x, 1) with op1, the other is that plus
the geu you've suggested.
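
As a standalone illustration of the naming idea (not the i386.c code itself), the same kind of match can be written against a toy expression tree. Everything here is an assumption made for the sketch: struct toy_expr, the TOY_* codes, is_flags_test and matches_carry_roundtrip are hypothetical and only mimic the shape of the rtx test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; hypothetical stand-ins for rtx codes.  */
enum toy_code { TOY_REG, TOY_CONST0, TOY_NEG, TOY_GEU, TOY_LTU, TOY_COMPARE };

struct toy_expr
{
  enum toy_code code;
  int regno;			/* Only meaningful for TOY_REG.  */
  const struct toy_expr *op[2];	/* Operands, NULL when unused.  */
};

#define TOY_FLAGS_REG 17

/* Return true if E is (CODE (reg flags) (const 0)).  */
static bool
is_flags_test (const struct toy_expr *e, enum toy_code code)
{
  return (e != NULL
	  && e->code == code
	  && e->op[0] != NULL
	  && e->op[0]->code == TOY_REG
	  && e->op[0]->regno == TOY_FLAGS_REG
	  && e->op[1] != NULL
	  && e->op[1]->code == TOY_CONST0);
}

/* Match (compare (neg (geu flags 0)) (ltu flags 0)), naming each
   sub-expression once instead of repeating nested operand accesses.  */
static bool
matches_carry_roundtrip (const struct toy_expr *x)
{
  assert (x != NULL);
  if (x->code != TOY_COMPARE)
    return false;

  const struct toy_expr *op0 = x->op[0];
  const struct toy_expr *op1 = x->op[1];
  if (op0 == NULL || op0->code != TOY_NEG)
    return false;

  const struct toy_expr *geu = op0->op[0];
  return is_flags_test (geu, TOY_GEU) && is_flags_test (op1, TOY_LTU);
}
```

The point is only readability: each nested operand is fetched once and given a name, so the conditions that follow stay short.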

Jakub
--- gcc/config/i386/i386.c.jj   2020-10-14 11:31:38.0 +0200
+++ gcc/config/i386/i386.c  2020-10-14 11:37:02.843258215 +0200
@@ -19765,44 +19765,44 @@ ix86_rtx_costs (rtx x, machine_mode mode
   return false;
 
 case COMPARE:
-  if (GET_CODE (XEXP (x, 0)) == ZERO_EXTRACT
- && XEXP (XEXP (x, 0), 1) == const1_rtx
- && CONST_INT_P (XEXP (XEXP (x, 0), 2))
- && XEXP (x, 1) == const0_rtx)
+  rtx op0, op1;
+  op0 = XEXP (x, 0);
+  op1 = XEXP (x, 1);
+  if (GET_CODE (op0) == ZERO_EXTRACT
+ && XEXP (op0, 1) == const1_rtx
+ && CONST_INT_P (XEXP (op0, 2))
+ && op1 == const0_rtx)
{
  /* This kind of construct is implemented using test[bwl].
 Treat it as if we had an AND.  */
- mode = GET_MODE (XEXP (XEXP (x, 0), 0));
+ mode = GET_MODE (XEXP (op0, 0));
  *total = (cost->add
-   + rtx_cost (XEXP (XEXP (x, 0), 0), mode, outer_code,
+   + rtx_cost (XEXP (op0, 0), mode, outer_code,
opno, speed)
+ rtx_cost (const1_rtx, mode, outer_code, opno, speed));
  return true;
}
 
-  if (GET_CODE (XEXP (x, 0)) == PLUS
- && rtx_equal_p (XEXP (XEXP (x, 0), 0), XEXP (x, 1)))
+  if (GET_CODE (op0) == PLUS && rtx_equal_p (XEXP (op0, 0), op1))
{
  /* This is an overflow detection, count it as a normal compare.  */
- *total = rtx_cost (XEXP (x, 0), GET_MODE (XEXP (x, 0)),
-COMPARE, 0, speed);
+ *total = rtx_cost (op0, GET_MODE (op0), COMPARE, 0, speed);
  return true;
}
 
   if (mode == CCCmode
- && GET_CODE (XEXP (x, 0)) == NEG
- && GET_CODE (XEXP (XEXP (x, 0), 0)) == GEU
- && REG_P (XEXP (XEXP (XEXP (x, 0), 0), 0))
- && (GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == CCCmode
- || GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == CCmode)
- && REGNO (XEXP (XEXP (XEXP (x, 0), 0), 0)) == FLAGS_REG
- && XEXP (XEXP (XEXP (x, 0), 0), 1) == const0_rtx
- && GET_CODE (XEXP (x, 1)) == LTU
- && REG_P (XEXP (XEXP (x, 1), 0))
- && (GET_MODE (XEXP (XEXP (x, 1), 0))
- == GET_MODE (XEXP (XEXP (XEXP (x, 0), 0), 0)))
- && REGNO (XEXP (XEXP (x, 1), 0)) == FLAGS_REG
- && XEXP (XEXP (x, 1), 1) == const0_rtx)
+ && GET_CODE (op0) == NEG
+ && GET_CODE (XEXP (op0, 0)) == GEU
+ && REG_P (XEXP (XEXP (op0, 0), 0))
+ && (GET_MODE (XEXP (XEXP (op0, 0), 0)) == CCCmode
+ || GET_MODE (XEXP (XEXP (op0, 0), 0)) == CCmode)
+ && REGNO (XEXP (XEXP (op0, 0), 0)) == FLAGS_REG
+ && XEXP (XEXP (op0, 0), 1) == const0_rtx
+ && GET_CODE (op1) == LTU
+ && REG_P (XEXP (op1, 0))
+ && GET_MODE (XEXP (op1, 0)) == GET_MODE (XEXP (XEXP (op0, 0), 0))
+ && REGNO (XEXP (op1, 0)) == FLAGS_REG
+ && XEXP (op1, 1) == const0_rtx)
{
  /* This is *setcc_qi_addqi3_cconly_overflow_1_* patterns, a nop.  */
  *total = 0;
@@ -19810,8 +19810,7 @@ ix86_rtx_costs (rtx x, machine_mode mode
}
 
   /* The embedded comparison operand is completely free.  */
-  if (!general_operand (XEXP (x, 

Re: [PATCH 4/X] libsanitizer: options: Add hwasan flags and argument parsing

2020-10-14 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> ### Attachment also inlined for ease of reply
> ###
>
>
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index 
> 372148315389db6671dfd943fd1a68670fcb1cbc..f8bf165aa48b5709c26f4e8245e5ab929b44fca6
>  100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -54,6 +54,8 @@ static tree handle_cold_attribute (tree *, tree, tree, int, 
> bool *);
>  static tree handle_no_sanitize_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_no_sanitize_address_attribute (tree *, tree, tree,
> int, bool *);
> +static tree handle_no_sanitize_hwaddress_attribute (tree *, tree, tree,
> + int, bool *);
>  static tree handle_no_sanitize_thread_attribute (tree *, tree, tree,
>int, bool *);
>  static tree handle_no_address_safety_analysis_attribute (tree *, tree, tree,
> @@ -412,6 +414,8 @@ const struct attribute_spec c_common_attribute_table[] =
> handle_no_sanitize_attribute, NULL },
>{ "no_sanitize_address",0, 0, true, false, false, false,
> handle_no_sanitize_address_attribute, NULL },
> +  { "no_sanitize_hwaddress",0, 0, true, false, false, false,
> +   handle_no_sanitize_hwaddress_attribute, NULL },
>{ "no_sanitize_thread", 0, 0, true, false, false, false,
> handle_no_sanitize_thread_attribute, NULL },
>{ "no_sanitize_undefined",  0, 0, true, false, false, false,
> @@ -946,6 +950,22 @@ handle_no_sanitize_address_attribute (tree *node, tree 
> name, tree, int,
>return NULL_TREE;
>  }
>  
> +/* Handle a "no_sanitize_hwaddress" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_no_sanitize_hwaddress_attribute (tree *node, tree name, tree, int,
> +   bool *no_add_attrs)
> +{
> +  *no_add_attrs = true;
> +  if (TREE_CODE (*node) != FUNCTION_DECL)
> +warning (OPT_Wattributes, "%qE attribute ignored", name);
> +  else
> +add_no_sanitize_value (*node, SANITIZE_HWADDRESS);
> +
> +  return NULL_TREE;
> +}
> +
>  /* Handle a "no_sanitize_thread" attribute; arguments as in
> struct attribute_spec.handler.  */
>  

Although the Clang design page mentions this attribute, it doesn't look
like it was ever committed upstream (unless I'm missing something).
Clang instead seems to require:

  __attribute__((no_sanitize("hwaddress")))
  __attribute__((no_sanitize("kernel-hwaddress")))

That seems more scalable than adding an extra attribute for each new
sanitiser, and of course is what this patch also supports.

So it might be better to drop this and just stick with what Clang provides.
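
For reference, a minimal sketch of how user code would opt a function out with the generic spelling. The function name count_nonzero_bytes is made up for the example, and it is an assumption that the compiler in use knows the "hwaddress" sanitizer name; compilers that do not may diagnose the attribute argument and ignore it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper that should not be instrumented by HWASAN.
   The attribute only affects instrumentation, not behaviour.  */
__attribute__((no_sanitize("hwaddress")))
size_t
count_nonzero_bytes (const unsigned char *buf, size_t len)
{
  size_t n = 0;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] != 0)
      n++;
  return n;
}
```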

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 
> ba18e05fb1abd0034afb73fd4a20feac27133149..97a5a532e31a9cea20955863bf6d2c8911a8e869
>  100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13898,13 +13898,34 @@ more details.  The run-time behavior can be 
> influenced using the
>  the available options are shown at startup of the instrumented program.  See
>  
> @url{https://github.com/google/sanitizers/wiki/AddressSanitizerFlags#run-time-flags}
>  for a list of supported options.
> -The option cannot be combined with @option{-fsanitize=thread}.
> +The option cannot be combined with @option{-fsanitize=thread} or
> +@option{-fsanitize=hwaddress}.
>  
>  @item -fsanitize=kernel-address
>  @opindex fsanitize=kernel-address
>  Enable AddressSanitizer for Linux kernel.
>  See @uref{https://github.com/google/kasan/wiki} for more details.
>  
> +@item -fsanitize=hwaddress
> +@opindex fsanitize=hwaddress
> +Enable HardwareAddressSanitizer, a fast memory error detector.

Is HardwareAddressSanitizer an established shorthand?  All the references
I could see instead referred to it as “Hardware-assisted AddressSanitizer”,
which seems a bit more descriptive.

It would be good to expand on “a fast memory error detector“ a bit.
I was wondering about something like “, which uses dedicated features
of the target hardware to accelerate the detection of memory errors”.

> +Memory access instructions are instrumented to detect out-of-bounds and
> +use-after-free bugs.
> +The option enables @option{-fsanitize-address-use-after-scope}.
> +See
> +@uref{https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html}
> +for more details.  The run-time behavior can be influenced using the
> +@env{HWASAN_OPTIONS} environment variable.  When set to @code{help=1},
> +the available options are shown at startup of the instrumented program.
> +The option cannot be combined with @option{-fsanitize=thread} or
> +@option{-fsanitize=address}.

I think it would be worth emphasising that this option is only supported
on AArch64.

> +
> +@item -fsanitize=kernel-hwaddress
> 

Re: Support ofsetted parameters in local modref

2020-10-14 Thread Richard Biener
On Wed, 14 Oct 2020, Jan Hubicka wrote:

> Hi,
> here is an updated patch with a cap on the number of iterations.
> I set the limit to 8 and bootstrapped with an additional assert that the
> limit is never reached; it did not fire.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Richard.

> gcc/ChangeLog:
> 
> 2020-10-14  Jan Hubicka  
> 
>   * doc/invoke.texi: (ipa-jump-function-lookups): Document param.
>   * ipa-modref.c (merge_call_side_effects): Use
>   unadjusted_ptr_and_unit_offset.
>   * ipa-prop.c (unadjusted_ptr_and_unit_offset): New function.
>   * ipa-prop.h (unadjusted_ptr_and_unit_offset): Declare.
>   * params.opt: (-param-ipa-jump-function-lookups): New.
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c8281ecf502..47aa69530ab 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13456,6 +13456,9 @@ loop in the loop nest by a given number of 
> iterations.  The strip
>  length can be changed using the @option{loop-block-tile-size}
>  parameter.
>  
> +@item ipa-jump-function-lookups
> +Specifies number of statements visited during jump function offset discovery.
> +
>  @item ipa-cp-value-list-size
>  IPA-CP attempts to track all possible values and types passed to a function's
>  parameter in order to propagate them and perform devirtualization.
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index 771a0a88f9a..a6dfe1fc401 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -531,6 +531,10 @@ merge_call_side_effects (modref_summary *cur_summary,
>for (unsigned i = 0; i < gimple_call_num_args (stmt); i++)
>  {
>tree op = gimple_call_arg (stmt, i);
> +  bool offset_known;
> +  poly_int64 offset;
> +
> +  offset_known = unadjusted_ptr_and_unit_offset (op, &op, &offset);
>if (TREE_CODE (op) == SSA_NAME
> && SSA_NAME_IS_DEFAULT_DEF (op)
> && TREE_CODE (SSA_NAME_VAR (op)) == PARM_DECL)
> @@ -547,15 +551,23 @@ merge_call_side_effects (modref_summary *cur_summary,
> index++;
>   }
> parm_map[i].parm_index = index;
> -   parm_map[i].parm_offset_known = true;
> -   parm_map[i].parm_offset = 0;
> +   parm_map[i].parm_offset_known = offset_known;
> +   parm_map[i].parm_offset = offset;
>   }
>else if (points_to_local_or_readonly_memory_p (op))
>   parm_map[i].parm_index = -2;
>else
>   parm_map[i].parm_index = -1;
>if (dump_file)
> - fprintf (dump_file, " %i", parm_map[i].parm_index);
> + {
> +   fprintf (dump_file, " %i", parm_map[i].parm_index);
> +   if (parm_map[i].parm_offset_known)
> + {
> +   fprintf (dump_file, " offset:");
> +   print_dec ((poly_int64_pod)parm_map[i].parm_offset,
> +  dump_file, SIGNED);
> + }
> + }
>  }
>if (dump_file)
>  fprintf (dump_file, "\n");
> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> index 2d09d913051..cf3da6a6568 100644
> --- a/gcc/ipa-prop.c
> +++ b/gcc/ipa-prop.c
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "domwalk.h"
>  #include "builtins.h"
>  #include "tree-cfgcleanup.h"
> +#include "options.h"
>  
>  /* Function summary where the parameter infos are actually stored. */
>  ipa_node_params_t *ipa_node_params_sum = NULL;
> @@ -1222,6 +1223,73 @@ load_from_unmodified_param_or_agg (struct 
> ipa_func_body_info *fbi,
>return index;
>  }
>  
> +/* Walk pointer adjustments from OP (such as POINTER_PLUS and ADDR_EXPR)
> +   to find original pointer.  Initialize RET to the pointer which results from
> +   the walk.
> +   If offset is known return true and initialize OFFSET_RET.  */
> +
> +bool
> +unadjusted_ptr_and_unit_offset (tree op, tree *ret, poly_int64 *offset_ret)
> +{
> +  poly_int64 offset = 0;
> +  bool offset_known = true;
> +  int i;
> +
> +  for (i = 0; i < param_ipa_jump_function_lookups; i++)
> +{
> +  if (TREE_CODE (op) == ADDR_EXPR)
> + {
> +   poly_int64 extra_offset = 0;
> +   tree base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
> +  &extra_offset);
> +   if (!base)
> + {
> +   base = get_base_address (TREE_OPERAND (op, 0));
> +   if (TREE_CODE (base) != MEM_REF)
> + break;
> +   offset_known = false;
> + }
> +   else
> + {
> +   if (TREE_CODE (base) != MEM_REF)
> + break;
> +   offset += extra_offset;
> + }
> +   op = TREE_OPERAND (base, 0);
> +   if (mem_ref_offset (base).to_shwi (&extra_offset))
> + offset += extra_offset;
> +   else
> + offset_known = false;
> + }
> +  else if (TREE_CODE (op) == SSA_NAME
> +&& !SSA_NAME_IS_DEFAULT_DEF (op))
> + {
> +   gimple *pstmt = SSA_NAME_DEF_STMT (op);
> +
> +   if (gimple_assign_single_p (pstmt))
> + op = gimple_assign_rhs1 (pstmt);
> +   else if 

Re: [PATCH] i386: Improve chaining of _{addcarry, subborrow}_u{32, 64} [PR97387]

2020-10-14 Thread Richard Biener via Gcc-patches
On Wed, Oct 14, 2020 at 11:01 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> These builtins have two known issues and this patch fixes one of them.
>
> One issue is that the builtins effectively return two results and
> they make the destination addressable until expansion, which means
> a stack slot is allocated for them and e.g. with -fstack-protector*
> DSE isn't able to optimize that away.  I think for that we want to use
> the technique of returning complex value; the patch doesn't handle that
> though.  See PR93990 for that.
>
> The other problem is optimization of successive uses of the builtin
> e.g. for arbitrary precision arithmetic additions/subtractions.
> As shown PR93990, combine is able to optimize the case when the first
> argument to these builtins is 0 (the first instance when several are used
> together), and also the last one if the last one ignores its result (i.e.
> the carry/borrow is dead and thrown away in that case).
> As shown in this PR, combiner refuses to optimize the rest, where it sees:
> (insn 10 9 11 2 (set (reg:QI 88 [ _31 ])
> (ltu:QI (reg:CCC 17 flags)
> (const_int 0 [0]))) "include/adxintrin.h":69:10 785 {*setcc_qi}
>  (expr_list:REG_DEAD (reg:CCC 17 flags)
> (nil)))
> - set pseudo 88 to CF from flags, then some uninteresting insns that
> don't modify flags, and finally:
> (insn 17 15 18 2 (parallel [
> (set (reg:CCC 17 flags)
> (compare:CCC (plus:QI (reg:QI 88 [ _31 ])
> (const_int -1 [0xffffffffffffffff]))
> (reg:QI 88 [ _31 ])))
> (clobber (scratch:QI))
> ]) "include/adxintrin.h":69:10 350 {*addqi3_cconly_overflow_1}
>  (expr_list:REG_DEAD (reg:QI 88 [ _31 ])
> (nil)))
> to set CF in flags back to what we saved earlier.  The combiner just punts
> trying to combine the 10, 17 and following addcarrydi (etc.) instruction,
> because
>   if (i1 && !can_combine_p (i1, i3, i0, NULL, i2, NULL, &i1dest, &i1src))
> {
>   if (dump_file && (dump_flags & TDF_DETAILS))
> fprintf (dump_file, "Can't combine i1 into i3\n");
>   undo_all ();
>   return 0;
> }
> fails - the 3 insns aren't all adjacent and
>   || (! all_adjacent
>   && (((!MEM_P (src)
> || ! find_reg_note (insn, REG_EQUIV, src))
>&& modified_between_p (src, insn, i3))
> src (flags hard register) is modified between the first and third insn - in
> the second insn.
>
> The following patch optimizes this by optimizing just the two insns,
> 10 and 17 above, i.e. save CF into pseudo, set CF from that pseudo, into
> a nop.  The new define_insn_and_split matches how combine simplifies those
> two together (except without the ix86_cc_mode change it was choosing CCmode
> for the destination instead of CCCmode, so had to change that function too,
> and also adjust costs so that combiner understand it is beneficial).
>
> With this, all the testcases are optimized, so that the:
> setc%dl
> ...
> addb$-1, %dl
> insns in between the ad[dc][lq] or s[ub]b[lq] instructions are all optimized
> away (sure, if something would clobber flags in between they wouldn't, but
> there is nothing that can be done about that).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2020-10-14  Jakub Jelinek  
>
> PR target/97387
> * config/i386/i386.md (CC_CCC): New mode iterator.
> (*setcc_qi_addqi3_cconly_overflow_1_<mode>): New
> define_insn_and_split.
> * config/i386/i386.c (ix86_cc_mode): Return CCCmode for
> *setcc_qi_addqi3_cconly_overflow_1_<mode> pattern operands.
> (ix86_rtx_costs): Return true and *total = 0; for
> *setcc_qi_addqi3_cconly_overflow_1_<mode> pattern.
>
> * gcc.target/i386/pr97387-1.c: New test.
> * gcc.target/i386/pr97387-2.c: New test.
>
> --- gcc/config/i386/i386.md.jj  2020-10-01 10:40:09.955758167 +0200
> +++ gcc/config/i386/i386.md 2020-10-13 13:38:24.644980815 +0200
> @@ -7039,6 +7039,20 @@ (define_expand "subborrow<mode>_0"
>(set (match_operand:SWI48 0 "register_operand")
>(minus:SWI48 (match_dup 1) (match_dup 2)))])]
>"ix86_binary_operator_ok (MINUS, <MODE>mode, operands)")
> +
> +(define_mode_iterator CC_CCC [CC CCC])
> +
> +;; Pre-reload splitter to optimize
> +;; *setcc_qi followed by *addqi3_cconly_overflow_1 with the same QI
> +;; operand and no intervening flags modifications into nothing.
> +(define_insn_and_split "*setcc_qi_addqi3_cconly_overflow_1_<mode>"
> +  [(set (reg:CCC FLAGS_REG)
> +   (compare:CCC (neg:QI (geu:QI (reg:CC_CCC FLAGS_REG) (const_int 0)))
> +(ltu:QI (reg:CC_CCC FLAGS_REG) (const_int 0))))]
> +  "ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)])
>
>  ;; Overflow setting add instructions
>
> --- gcc/config/i386/i386.c.jj   2020-10-01 10:40:09.951758225 +0200
> +++ gcc/config/i386/i386.c  2020-10-13 13:40:20.471300518 +0200
> @@ 

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 11:04 AM Hongyu Wang  wrote:
>
>
>
> On Wed, Oct 14, 2020 at 4:42 PM Uros Bizjak wrote:
>>
>> On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang  wrote:
>> >
>> > >
>> > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
>> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
>> > > header.
>> > >
>> >
>> > Thanks for your review. We found that the intrinsics header is also
>> > tested without adding -muintr. Make check passes for all of these
>> > files.
>> >
>> > And there is no intrinsic/builtin with const int parameter. So we remove 
>> > -muintr from these files.
>>
>> Can you double check that relevant instructions are indeed generated?
>> Without -muintr, relevant patterns in i386.md are effectively blocked,
>> and perhaps a call to __builtin_ia32_* is generated instead.
>
>
> Yes, in sse-14.s we have
>
> _clui:
> .LFB136:
> .cfi_startproc
> pushq   %rbp
> .cfi_def_cfa_offset 16
> .cfi_offset 6, -16
> movq%rsp, %rbp
> .cfi_def_cfa_register 6
> clui
> nop
> popq%rbp
> .cfi_def_cfa 7, 8
> ret
> .cfi_endproc

Strange, without -muintr, it should not be generated, and some error
about failed inlining due to target specific option mismatch should be
emitted.

Can you please investigate this a bit more?

Uros.


RE: [PATCH][Arm] Auto-vectorization for MVE: vmin/vmax

2020-10-14 Thread Kyrylo Tkachov via Gcc-patches
Hi Dennis,

> -Original Message-
> From: Dennis Zhang 
> Sent: 06 October 2020 17:59
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; nd ;
> Richard Earnshaw ; Ramana Radhakrishnan
> 
> Subject: [PATCH][Arm] Auto-vectorization for MVE: vmin/vmax
> 
> Hi all,
> 
> This patch enables MVE vmin/vmax instructions for auto-vectorization.
> MVE target is included in expander smin<mode>3, umin<mode>3, smax<mode>3
> and umax<mode>3 for vectorization.
> Related insns for vmin/vmax in mve.md are modified to use smin, umin,
> smax and umax expressions instead of unspec to support the expanders.
> 
> Regression tested on arm-none-eabi and bootstrapped on
> arm-none-linux-gnueabihf.
> 
> Is it OK for trunk please?

Ok.
Thanks,
Kyrill

> 
> Thanks
> Dennis
> 
> gcc/ChangeLog:
> 
> 2020-10-02  Dennis Zhang  
> 
> * config/arm/mve.md (mve_vmaxq_<supf><mode>): Replace with ...
> (mve_vmaxq_s<mode>, mve_vmaxq_u<mode>): ... these new insns to
> use smax/umax instead of VMAXQ.
> (mve_vminq_<supf><mode>): Replace with ...
> (mve_vminq_s<mode>, mve_vminq_u<mode>): ... these new insns to
> use smin/umin instead of VMINQ.
> (mve_vmaxnmq_f<mode>): Use smax instead of VMAXNMQ_F.
> (mve_vminnmq_f<mode>): Use smin instead of VMINNMQ_F.
> * config/arm/vec-common.md (smin<mode>3): Use the new mode macros
> ARM_HAVE_<MODE>_ARITH.
> (umin<mode>3, smax<mode>3, umax<mode>3): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-10-02  Dennis Zhang  
> 
> * gcc.target/arm/simd/mve-vminmax_1.c: New test.



RE: [PATCH][Arm] Auto-vectorization for MVE: vmul

2020-10-14 Thread Kyrylo Tkachov via Gcc-patches
Hi Dennis,

> -Original Message-
> From: Dennis Zhang 
> Sent: 06 October 2020 17:55
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; nd ;
> Richard Earnshaw ; Ramana Radhakrishnan
> 
> Subject: [PATCH][Arm] Auto-vectorization for MVE: vmul
> 
> Hi all,
> 
> This patch enables MVE vmul instructions for auto-vectorization.
> It includes MVE in expander mul<mode>3 to enable vectorization for MVE
> and modifies related vmul insns to support the expander by using 'mult'
> instead of unspec.
> The mul<mode>3 for vectorization in vec-common.md uses mode iterator
> VDQWH instead of VALLW to cover all supported modes.
> The macros ARM_HAVE_<MODE>_ARITH are used to select supported modes for
> different targets. The redundant mul<mode>3 in neon.md is removed.
> 
> Regression tested on arm-none-eabi and bootstrapped on
> arm-none-linux-gnueabihf.
> 
> Is it OK for trunk please?

Ok, thank you for your patience.
Kyrill

> 
> Thanks
> Dennis
> 
> gcc/ChangeLog:
> 
> 2020-10-02  Dennis Zhang  
> 
> * config/arm/mve.md (mve_vmulq): New entry for vmul instruction
> using expression 'mult'.
> (mve_vmulq_f): Use mult instead of VMULQ_F.
> * config/arm/neon.md (mul<mode>3): Removed.
> * config/arm/vec-common.md (mul<mode>3): Use the new mode macros
> ARM_HAVE_<MODE>_ARITH. Use mode iterator VDQWH instead of VALLW.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-10-02  Dennis Zhang  
> 
> * gcc.target/arm/simd/mve-vmul_1.c: New test.



Re: Support ofsetted parameters in local modref

2020-10-14 Thread Jan Hubicka
Hi,
here is an updated patch with a cap on the number of iterations.
I set the limit to 8 and bootstrapped with an additional assert that the
limit is never reached; it did not fire.

Bootstrapped/regtested x86_64-linux, OK?
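
To make the capped walk easier to review, here is a simplified model of the idea: follow pointer definitions for at most a fixed number of steps, summing the offsets that are known. It is a sketch under stated assumptions, not the ipa-prop.c code: struct toy_ptr and toy_unadjusted_ptr_and_offset are hypothetical, and max_lookups plays the role of the new ipa-jump-function-lookups param.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pointer definition: either an original
   pointer (def == NULL) or "def + delta", where delta may be unknown.  */
struct toy_ptr
{
  const struct toy_ptr *def;
  long delta;
  bool delta_known;
};

/* Follow at most MAX_LOOKUPS definitions starting from P.  Store the
   pointer reached in *RET.  Return true and set *OFFSET_RET if every
   step taken had a known delta; otherwise return false and leave
   *OFFSET_RET untouched.  */
static bool
toy_unadjusted_ptr_and_offset (const struct toy_ptr *p, int max_lookups,
			       const struct toy_ptr **ret, long *offset_ret)
{
  long offset = 0;
  bool offset_known = true;

  assert (p != NULL);
  for (int i = 0; i < max_lookups && p->def != NULL; i++)
    {
      if (p->delta_known)
	offset += p->delta;
      else
	offset_known = false;
      p = p->def;
    }

  *ret = p;
  if (offset_known)
    *offset_ret = offset;
  return offset_known;
}
```

When the cap is hit early the result is still consistent: the returned pointer is simply an intermediate one, and the offset is relative to it.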

gcc/ChangeLog:

2020-10-14  Jan Hubicka  

* doc/invoke.texi: (ipa-jump-function-lookups): Document param.
* ipa-modref.c (merge_call_side_effects): Use
unadjusted_ptr_and_unit_offset.
* ipa-prop.c (unadjusted_ptr_and_unit_offset): New function.
* ipa-prop.h (unadjusted_ptr_and_unit_offset): Declare.
* params.opt: (-param-ipa-jump-function-lookups): New.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c8281ecf502..47aa69530ab 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13456,6 +13456,9 @@ loop in the loop nest by a given number of iterations.  
The strip
 length can be changed using the @option{loop-block-tile-size}
 parameter.
 
+@item ipa-jump-function-lookups
+Specifies number of statements visited during jump function offset discovery.
+
 @item ipa-cp-value-list-size
 IPA-CP attempts to track all possible values and types passed to a function's
 parameter in order to propagate them and perform devirtualization.
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 771a0a88f9a..a6dfe1fc401 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -531,6 +531,10 @@ merge_call_side_effects (modref_summary *cur_summary,
   for (unsigned i = 0; i < gimple_call_num_args (stmt); i++)
 {
   tree op = gimple_call_arg (stmt, i);
+  bool offset_known;
+  poly_int64 offset;
+
+  offset_known = unadjusted_ptr_and_unit_offset (op, &op, &offset);
   if (TREE_CODE (op) == SSA_NAME
  && SSA_NAME_IS_DEFAULT_DEF (op)
  && TREE_CODE (SSA_NAME_VAR (op)) == PARM_DECL)
@@ -547,15 +551,23 @@ merge_call_side_effects (modref_summary *cur_summary,
  index++;
}
  parm_map[i].parm_index = index;
- parm_map[i].parm_offset_known = true;
- parm_map[i].parm_offset = 0;
+ parm_map[i].parm_offset_known = offset_known;
+ parm_map[i].parm_offset = offset;
}
   else if (points_to_local_or_readonly_memory_p (op))
parm_map[i].parm_index = -2;
   else
parm_map[i].parm_index = -1;
   if (dump_file)
-   fprintf (dump_file, " %i", parm_map[i].parm_index);
+   {
+ fprintf (dump_file, " %i", parm_map[i].parm_index);
+ if (parm_map[i].parm_offset_known)
+   {
+ fprintf (dump_file, " offset:");
+ print_dec ((poly_int64_pod)parm_map[i].parm_offset,
+dump_file, SIGNED);
+   }
+   }
 }
   if (dump_file)
 fprintf (dump_file, "\n");
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 2d09d913051..cf3da6a6568 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "domwalk.h"
 #include "builtins.h"
 #include "tree-cfgcleanup.h"
+#include "options.h"
 
 /* Function summary where the parameter infos are actually stored. */
 ipa_node_params_t *ipa_node_params_sum = NULL;
@@ -1222,6 +1223,73 @@ load_from_unmodified_param_or_agg (struct 
ipa_func_body_info *fbi,
   return index;
 }
 
+/* Walk pointer adjustments from OP (such as POINTER_PLUS and ADDR_EXPR)
+   to find original pointer.  Initialize RET to the pointer which results from
+   the walk.
+   If offset is known return true and initialize OFFSET_RET.  */
+
+bool
+unadjusted_ptr_and_unit_offset (tree op, tree *ret, poly_int64 *offset_ret)
+{
+  poly_int64 offset = 0;
+  bool offset_known = true;
+  int i;
+
+  for (i = 0; i < param_ipa_jump_function_lookups; i++)
+{
+  if (TREE_CODE (op) == ADDR_EXPR)
+   {
+ poly_int64 extra_offset = 0;
+ tree base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
+&extra_offset);
+ if (!base)
+   {
+ base = get_base_address (TREE_OPERAND (op, 0));
+ if (TREE_CODE (base) != MEM_REF)
+   break;
+ offset_known = false;
+   }
+ else
+   {
+ if (TREE_CODE (base) != MEM_REF)
+   break;
+ offset += extra_offset;
+   }
+ op = TREE_OPERAND (base, 0);
+ if (mem_ref_offset (base).to_shwi (&extra_offset))
+   offset += extra_offset;
+ else
+   offset_known = false;
+   }
+  else if (TREE_CODE (op) == SSA_NAME
+  && !SSA_NAME_IS_DEFAULT_DEF (op))
+   {
+ gimple *pstmt = SSA_NAME_DEF_STMT (op);
+
+ if (gimple_assign_single_p (pstmt))
+   op = gimple_assign_rhs1 (pstmt);
+ else if (is_gimple_assign (pstmt)
+  && gimple_assign_rhs_code (pstmt) == POINTER_PLUS_EXPR)
+   {
+ poly_int64 extra_offset = 0;
+ if (ptrdiff_tree_p (gimple_assign_rhs2 (pstmt),
+   

Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Hongyu Wang via Gcc-patches
On Wed, Oct 14, 2020 at 4:53 PM Uros Bizjak wrote:
>
> On Wed, Oct 14, 2020 at 10:42 AM Uros Bizjak  wrote:
> >
> > On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang  wrote:
> > >
> > > >
> > > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> > > > header.
> > > >
> > >
> > > Thanks for your review. We found that the intrinsics header is also
> > > tested without adding -muintr. Make check passes for all of these
> > > files.
> > >
> > > And there is no intrinsic/builtin with const int parameter. So we remove 
> > > -muintr from these files.
> >
> > Can you double check that relevant instructions are indeed generated?
> > Without -muintr, relevant patterns in i386.md are effectively blocked,
> > and perhaps a call to __builtin_ia32_* is generated instead.
>
> Ah, I see the issue.
>
> The new header should be tested via x86gprintrin-* test cases.
>
> Uros.

Also for x86gprintrin-3.s, without -muintr:

_clui:
.LFB128:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        clui
        nop
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

-- 
Regards,

Hongyu, Wang


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Hongyu Wang via Gcc-patches
On Wed, Oct 14, 2020 at 4:42 PM, Uros Bizjak wrote:

> On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang 
> wrote:
> >
> > >
> > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> > > header.
> > >
> >
> > Thanks for your review. We found that without adding -muintr, the
> intrinsics header could also be tested. Make-check for these file all get
> passed.
> >
> > And there is no intrinsic/builtin with const int parameter. So we remove
> -muintr from these files.
>
> Can you double check that relevant instructions are indeed generated?
> Without -muintr, relevant patterns in i386.md are effectively blocked,
> and perhaps a call to __builtin_ia32_* is generated instead.
>

Yes, in sse-14.s we have

_clui:
.LFB136:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        clui
        nop
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc


>
> Uros.
>


-- 
Regards,

Hongyu, Wang


[PATCH] i386: Improve chaining of _{addcarry,subborrow}_u{32,64} [PR97387]

2020-10-14 Thread Jakub Jelinek via Gcc-patches
Hi!

These builtins have two known issues and this patch fixes one of them.

One issue is that the builtins effectively return two results and
they make the destination addressable until expansion, which means
a stack slot is allocated for them and e.g. with -fstack-protector*
DSE isn't able to optimize that away.  I think for that we want to use
the technique of returning a complex value; the patch doesn't handle that
though.  See PR93990 for that.

The other problem is optimization of successive uses of the builtin
e.g. for arbitrary precision arithmetic additions/subtractions.
As shown in PR93990, combine is able to optimize the case when the first
argument to these builtins is 0 (the first instance when several are used
together), and also the last one if the last one ignores its result (i.e.
the carry/borrow is dead and thrown away in that case).
As shown in this PR, combiner refuses to optimize the rest, where it sees:
(insn 10 9 11 2 (set (reg:QI 88 [ _31 ])
(ltu:QI (reg:CCC 17 flags)
(const_int 0 [0]))) "include/adxintrin.h":69:10 785 {*setcc_qi}
 (expr_list:REG_DEAD (reg:CCC 17 flags)
(nil)))
- set pseudo 88 to CF from flags, then some uninteresting insns that
don't modify flags, and finally:
(insn 17 15 18 2 (parallel [
(set (reg:CCC 17 flags)
(compare:CCC (plus:QI (reg:QI 88 [ _31 ])
(const_int -1 [0xffffffffffffffff]))
(reg:QI 88 [ _31 ])))
(clobber (scratch:QI))
]) "include/adxintrin.h":69:10 350 {*addqi3_cconly_overflow_1}
 (expr_list:REG_DEAD (reg:QI 88 [ _31 ])
(nil)))
to set CF in flags back to what we saved earlier.  The combiner just punts
trying to combine the 10, 17 and following addcarrydi (etc.) instruction,
because
  if (i1 && !can_combine_p (i1, i3, i0, NULL, i2, NULL, , ))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "Can't combine i1 into i3\n");
  undo_all ();
  return 0;
}
fails - the 3 insns aren't all adjacent and
  || (! all_adjacent
  && (((!MEM_P (src)
|| ! find_reg_note (insn, REG_EQUIV, src))
   && modified_between_p (src, insn, i3))
src (flags hard register) is modified between the first and third insn - in
the second insn.

The following patch optimizes this by optimizing just the two insns,
10 and 17 above, i.e. save CF into pseudo, set CF from that pseudo, into
a nop.  The new define_insn_and_split matches how combine simplifies those
two together (except without the ix86_cc_mode change it was choosing CCmode
for the destination instead of CCCmode, so had to change that function too,
and also adjust costs so that the combiner understands it is beneficial).
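
To see why that pair of insns leaves CF unchanged, here is a small portable
C model.  It is not GCC code and the function names are made up for
illustration.  It assumes the saved byte only ever holds 0 or 1, as produced
by setc, and computes the carry out of the 8-bit addition of -1:

```c
#include <stdint.h>

/* Carry out of the 8-bit addition x + 0xff, i.e. the value that
   "addb $-1, reg" leaves in CF for an 8-bit register holding x.  */
static unsigned
carry_of_add_minus_one (uint8_t x)
{
  unsigned wide = (unsigned) x + 0xffu;	/* Widen so bit 8 is the carry.  */
  return wide >> 8;
}

/* Model of the round trip: save CF into a byte (setc), then recreate CF
   from that byte (addb $-1).  The result equals the incoming CF.  */
static unsigned
roundtrip_carry (unsigned cf)
{
  uint8_t saved = (uint8_t) (cf != 0);		/* setc */
  return carry_of_add_minus_one (saved);	/* addb $-1 */
}
```

For both possible inputs the model returns the carry it was given, which is
why the two insns together can be treated as a nop as long as nothing in
between modifies the flags.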

With this, all the testcases are optimized, so that the:
        setc    %dl
        ...
        addb    $-1, %dl
insns in between the ad[dc][lq] or s[ub]b[lq] instructions are all optimized
away (sure, if something would clobber flags in between they wouldn't, but
there is nothing that can be done about that).
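
For reference, the kind of code that benefits is a multi-limb addition that
threads the carry from one step into the next.  The sketch below uses a
hypothetical portable stand-in, addcarry_u64_model, with the same argument
order as _addcarry_u64 (carry in, two addends, pointer to the result, carry
out returned).  Real code would call the intrinsic instead; the helper names
here are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable model of _addcarry_u64: computes a + b + carry_in,
   stores the low 64 bits in *out and returns the carry out (0 or 1).  */
static unsigned char
addcarry_u64_model (unsigned char carry_in, uint64_t a, uint64_t b,
		    uint64_t *out)
{
  uint64_t sum = a + b;
  unsigned char carry = sum < a;	/* Carry from a + b.  */
  uint64_t res = sum + (carry_in != 0);
  carry |= res < sum;			/* Carry from adding carry_in.  */
  *out = res;
  return carry;
}

/* Add two n-limb numbers stored least significant limb first.
   Each iteration consumes the carry produced by the previous one,
   which is the chaining the patch is about.  Returns the final carry.  */
static unsigned char
add_limbs (uint64_t *dst, const uint64_t *a, const uint64_t *b, size_t n)
{
  unsigned char carry = 0;
  for (size_t i = 0; i < n; i++)
    carry = addcarry_u64_model (carry, a[i], b[i], &dst[i]);
  return carry;
}
```

With the intrinsic in place of the model, the first call has a zero carry in
and the last call's carry out may be unused, which are the two cases already
handled; the middle iterations are the ones discussed above.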

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-10-14  Jakub Jelinek  

PR target/97387
* config/i386/i386.md (CC_CCC): New mode iterator.
(*setcc_qi_addqi3_cconly_overflow_1_<mode>): New
define_insn_and_split.
* config/i386/i386.c (ix86_cc_mode): Return CCCmode for
*setcc_qi_addqi3_cconly_overflow_1_<mode> pattern operands.
(ix86_rtx_costs): Return true and *total = 0; for
*setcc_qi_addqi3_cconly_overflow_1_<mode> pattern.

* gcc.target/i386/pr97387-1.c: New test.
* gcc.target/i386/pr97387-2.c: New test.

--- gcc/config/i386/i386.md.jj  2020-10-01 10:40:09.955758167 +0200
+++ gcc/config/i386/i386.md 2020-10-13 13:38:24.644980815 +0200
@@ -7039,6 +7039,20 @@ (define_expand "subborrow_0"
   (set (match_operand:SWI48 0 "register_operand")
   (minus:SWI48 (match_dup 1) (match_dup 2)))])]
   "ix86_binary_operator_ok (MINUS, mode, operands)")
+
+(define_mode_iterator CC_CCC [CC CCC])
+
+;; Pre-reload splitter to optimize
+;; *setcc_qi followed by *addqi3_cconly_overflow_1 with the same QI
+;; operand and no intervening flags modifications into nothing.
+(define_insn_and_split "*setcc_qi_addqi3_cconly_overflow_1_<mode>"
+  [(set (reg:CCC FLAGS_REG)
+       (compare:CCC (neg:QI (geu:QI (reg:CC_CCC FLAGS_REG) (const_int 0)))
+                    (ltu:QI (reg:CC_CCC FLAGS_REG) (const_int 0))))]
+  "ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)])
 
 ;; Overflow setting add instructions
 
--- gcc/config/i386/i386.c.jj   2020-10-01 10:40:09.951758225 +0200
+++ gcc/config/i386/i386.c  2020-10-13 13:40:20.471300518 +0200
@@ -15136,6 +15136,22 @@ ix86_cc_mode (enum rtx_code code, rtx op
  && (rtx_equal_p (op1, XEXP (op0, 0))
   || rtx_equal_p (op1, XEXP (op0, 1))))
return CCCmode;
+  /* Similarly for *setcc_qi_addqi3_cconly_overflow_1_* patterns.  */
+  else if 

[PATCH v3] arm: subdivide the type attribute "alu_shfit_imm"

2020-10-14 Thread Qian, Jianhua
Hi Richard

Thanks for reviewing again.
I have updated the patch to v3.

Regards
Qian

> -Original Message-
> From: Richard Sandiford 
> Sent: Tuesday, October 13, 2020 4:00 PM
> To: Qian, Jianhua/钱 建华 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v2] arm: subdivide the type attribute
> "alu_shfit_imm"
> 
> Thanks, the new patch looks great.  One minor suggestion:
> 
> "Qian, Jianhua"  writes:
> > @@ -1106,7 +1125,45 @@
> >mve_move,\
> >mve_store,\
> >mve_load"
> > -   (const_string "untyped"))
> > +   (cond [(eq_attr "autodetect_type" "alu_shift_lsr_op2")
> > +(const_string "alu_shift_imm_other")
> > +  (eq_attr "autodetect_type" "alu_shift_asr_op2")
> > +(const_string "alu_shift_imm_other")
> 
> This can be combined into:
> 
> +   (cond [(eq_attr "autodetect_type" "alu_shift_lsr_op2,alu_shift_asr_op2")
> +(const_string "alu_shift_imm_other")
> 
> But I think the patch is good to go as-is.
> 
> Thanks,
> Richard
> 





0001-arm-aarch64-subdivide-the-type-attribute-alu_shfit_i.patch
Description: 0001-arm-aarch64-subdivide-the-type-attribute-alu_shfit_i.patch


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 10:42 AM Uros Bizjak  wrote:
>
> On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang  wrote:
> >
> > >
> > > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> > > header.
> > >
> >
> > Thanks for your review. We found that without adding -muintr, the 
> > intrinsics header could also be tested. Make-check for these file all get 
> > passed.
> >
> > And there is no intrinsic/builtin with const int parameter. So we remove 
> > -muintr from these files.
>
> Can you double check that relevant instructions are indeed generated?
> Without -muintr, relevant patterns in i386.md are effectively blocked,
> and perhaps a call to __builtin_ia32_* is generated instead.

Ah, I see the issue.

The new header should be tested via x86gprintrin-* test cases.

Uros.


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 14, 2020 at 10:34 AM Hongyu Wang  wrote:
>
> >
> > Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> > gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> > header.
> >
>
> Thanks for your review. We found that without adding -muintr, the intrinsics 
> header could also be tested. Make-check for these file all get passed.
>
> And there is no intrinsic/builtin with const int parameter. So we remove 
> -muintr from these files.

Can you double check that relevant instructions are indeed generated?
Without -muintr, relevant patterns in i386.md are effectively blocked,
and perhaps a call to __builtin_ia32_* is generated instead.

Uros.


Fix SCC discovery in ipa-modref

2020-10-14 Thread Jan Hubicka
Hi,
this patch fixes SCC discovery in ipa-modref which is causing misoptimization
of gnat bootstrapped with LTO, PGO and -O3.

I also improved the debug output and fixed a wrong parameter to ignore_stores_p
(which is probably quite harmless since we only inline matching functions, but
it is better to be consistent).

Bootstrapped/regtested x86_64-linux, will commit it shortly.

gcc/ChangeLog:

2020-10-14  Jan Hubicka  

PR bootstrap/97350
* ipa-modref.c (ignore_edge): Do not ignore inlined edges.
(ipa_merge_modref_summary_after_inlining): Improve debug output and
fix parameter of ignore_stores_p.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 4f86b9ccea1..771a0a88f9a 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -1603,6 +1603,11 @@ make_pass_ipa_modref (gcc::context *ctxt)
 static bool
 ignore_edge (struct cgraph_edge *e)
 {
+  /* We merge summaries of inline clones into summaries of functions they
+ are inlined to.  For that reason the complete function bodies must
+ act as unit.  */
+  if (!e->inline_failed)
+return false;
   enum availability avail;
   cgraph_node *callee = e->callee->function_or_virtual_thunk_symbol
   (&avail, e->caller);
@@ -1723,7 +1728,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge *edge)
 
   if (!callee_info && to_info)
 {
-  if (ignore_stores_p (edge->callee->decl, flags))
+  if (ignore_stores_p (edge->caller->decl, flags))
to_info->loads->collapse ();
   else
{
@@ -1733,7 +1738,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge *edge)
 }
   if (!callee_info_lto && to_info_lto)
 {
-  if (ignore_stores_p (edge->callee->decl, flags))
+  if (ignore_stores_p (edge->caller->decl, flags))
to_info_lto->loads->collapse ();
   else
{
@@ -1747,7 +1752,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge *edge)
 
   compute_parm_map (edge, _map);
 
-  if (!ignore_stores_p (edge->callee->decl, flags))
+  if (!ignore_stores_p (edge->caller->decl, flags))
{
  if (to_info && callee_info)
to_info->stores->merge (callee_info->stores, _map);
@@ -1762,14 +1767,38 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge *edge)
   if (summaries)
 {
   if (to_info && !to_info->useful_p (flags))
-   summaries->remove (to);
+   {
+ if (dump_file)
+   fprintf (dump_file, "Removed mod-ref summary for %s\n",
+to->dump_name ());
+ summaries->remove (to);
+   }
+  else if (to_info && dump_file)
+   {
+ if (dump_file)
+   fprintf (dump_file, "Updated mod-ref summary for %s\n",
+to->dump_name ());
+ to_info->dump (dump_file);
+   }
   if (callee_info)
summaries->remove (edge->callee);
 }
   if (summaries_lto)
 {
   if (to_info_lto && !to_info_lto->useful_p (flags))
-   summaries_lto->remove (to);
+   {
+ if (dump_file)
+   fprintf (dump_file, "Removed mod-ref summary for %s\n",
+to->dump_name ());
+ summaries_lto->remove (to);
+   }
+  else if (to_info_lto && dump_file)
+   {
+ if (dump_file)
+   fprintf (dump_file, "Updated mod-ref summary for %s\n",
+to->dump_name ());
+ to_info_lto->dump (dump_file);
+   }
   if (callee_info_lto)
summaries_lto->remove (edge->callee);
 }


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Hongyu Wang via Gcc-patches
>
> Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> header.
>

Thanks for your review. We found that the intrinsics header is also tested
without adding -muintr; make check passes for all of these files.

Also, no intrinsic/builtin takes a const int parameter, so we removed
-muintr from these files.


On Wed, Oct 14, 2020 at 2:18 PM, Uros Bizjak wrote:

> On Tue, Oct 13, 2020 at 10:30 AM Hongyu Wang 
> wrote:
> >
> > Hi:
> >
> > This patch is about to support User Interrupt (UINTR) instructions.
> >
> > This feature defines user interrupts as new events in the architecture.
> They are delivered to software operating in 64-bit mode with CPL = 3
> without any change to segmentation state.
> >
> > For more details, please refer to
> https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
> >
> > Bootstrap ok, regression test on i386/x86 backend is ok.
> >
> > OK for master?
> >
> > gcc/
> > * common/config/i386/cpuinfo.h (get_available_features):
> > Detect UINTR.
> > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_UINTR_SET
> > OPTION_MASK_ISA2_UINTR_UNSET): New.
> > (ix86_handle_option): Handle -muintr.
> > * common/config/i386/i386-cpuinfo.h (enum processor_features):
> > Add FEATURE_UINTR.
> > * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
> > for uintr.
> > * config.gcc: Add uintrintrin.h to extra_headers.
> > * config/i386/uintrintrin.h: New.
> > * config/i386/cpuid.h (bit_UINTR): New.
> > * config/i386/driver-i386.c (host_detect_local_cpu): Detect UINTR.
> > * config/i386/i386-builtin-types.def: Add new types.
> > * config/i386/i386-builtin.def: Add new builtins.
> > * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): Add
> > __builtin_ia32_testui.
> > * config/i386/i386-builtins.h (ix86_builtins): Add
> > IX86_BUILTIN_TESTUI.
> > * config/i386/i386-c.c (ix86_target_macros_internal): Define
> > __UINTR__.
> > * config/i386/i386-expand.c (ix86_expand_special_args_builtin):
> > Handle UINT8_FTYPE_VOID.
> > (ix86_expand_builtin): Handle IX86_BUILTIN_TESTUI.
> > * config/i386/i386-options.c (isa2_opts): Add -muintr.
> > (ix86_valid_target_attribute_inner_p): Handle UINTR.
> > (ix86_option_override_internal): Add TARGET_64BIT check for UINTR.
> > * config/i386/i386.h (TARGET_UINTR, TARGET_UINTR_P, PTA_UINTR): New.
> > (PTA_SAPPHIRRAPIDS): Add PTA_UINTR.
> > * config/i386/i386.opt: Add -muintr.
> > * config/i386/i386.md
> > (define_int_iterator UINTR_UNSPECV): New.
> > (define_int_attr uintr_unspecv): New.
> > (uintr_, uintr_senduipi, testui):
> > New define_insn patterns.
> > * config/i386/x86gprintrin.h: Include uintrintrin.h
> > * doc/invoke.texi: Document -muintr.
> > * doc/extend.texi: Document uintr.
> >
> > gcc/testsuite/
> >
> > * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> > * gcc.target/i386/uintr-1.c: New test.
> > * gcc.target/i386/uintr-2.c: Ditto.
> > * gcc.target/i386/uintr-3.c: Ditto.
> > * gcc.target/i386/uintr-4.c: Ditto.
> > * gcc.target/i386/uintr-5.c: Ditto.
>
> Please also add -muintr to g++.dg/other/i386-{2,3}.C and
> gcc.target/i386-sse-{12,13,14,22,23}.c. This will test new intrinsics
> header.
>
> OK with the above change.
>
> Thanks,
> Uros.
>


Re: [PATCH] openmp: Add support for omp_get_supported_active_levels

2020-10-14 Thread Jakub Jelinek via Gcc-patches
On Tue, Oct 13, 2020 at 07:05:10PM +0100, Kwok Cheung Yeung wrote:
> +* omp_get_supported_active_levels:: Maxiumum number of active levels 
> supported

Sorry for not catching it during review, but there is a typo above.  Fixed
with patch below, committed to trunk.

> +@node omp_get_supported_active_levels
> +@section @code{omp_get_supported_active_levels} -- Maximum number of active 
> regions supported

I also wonder about the different wording between the above two places,
don't you want the same wording as earlier here?

> +@table @asis
> +@item @emph{Description}:
> +This function returns the maximum number of nested, active parallel regions
> +supported by this implementation.

2020-10-14  Jakub Jelinek  

* libgomp.texi (omp_get_supported_active_levels): Fix a typo.

--- libgomp/libgomp.texi.jj 2020-10-13 22:29:22.215958176 +0200
+++ libgomp/libgomp.texi    2020-10-13 22:29:52.816516414 +0200
@@ -177,7 +177,7 @@ linkage, and do not throw exceptions.
 * omp_get_num_threads:: Size of the active team
 * omp_get_proc_bind::   Whether theads may be moved between CPUs
 * omp_get_schedule::Obtain the runtime scheduling method
-* omp_get_supported_active_levels:: Maxiumum number of active levels supported
+* omp_get_supported_active_levels:: Maximum number of active levels supported
 * omp_get_team_num::Get team number
 * omp_get_team_size::   Number of threads in a team
 * omp_get_thread_limit::Maximum number of threads


Jakub



Re: [PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN

2020-10-14 Thread Stott Graham via Gcc-patches
I'm fine,  thanks Sara

On Wed, 14 Oct 2020, 08:53 Richard Sandiford via Gcc-patches, <
gcc-patches@gcc.gnu.org> wrote:

> Matthew Malcomson  writes:
> > diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> > index
> d581a34653f61a440b3c3b832836fe109e2fbd08..25d041fcbb1f7c16f7ac47b7b5d4ea8308c6f69c
> 100644
> > --- a/gcc/doc/install.texi
> > +++ b/gcc/doc/install.texi
> > @@ -2767,6 +2767,11 @@ the build tree.
> >  Compiles GCC itself using Address Sanitization in order to catch
> invalid memory
> >  accesses within the GCC code.
> >
> > +@item @samp{bootstrap-hwasan}
> > +Compiles GCC itself using HWAddress Sanitization in order to catch
> invalid
> > +memory accesses within the GCC code.  This option is only available on
> AArch64
> > +targets running a Linux kernel that supports the required ABI (5.4 or
> later).
>
> I'd suggest rewording the last sentence, since it isn't clear whether 5.4
> is an ABI version or a Linux kernel version.  Maybe something like:
>
>   This option is only available on AArch64 systems that are running Linux
>   kernel version 5.4 or later.
>
> Not sure how good that is, suggestions for something better welcome.
>
> Looks good otherwise.
>
> Thanks,
> Richard
>


Re: [PATCH 3/X] libsanitizer: Add option to bootstrap using HWASAN

2020-10-14 Thread Richard Sandiford via Gcc-patches
Matthew Malcomson  writes:
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 
> d581a34653f61a440b3c3b832836fe109e2fbd08..25d041fcbb1f7c16f7ac47b7b5d4ea8308c6f69c
>  100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -2767,6 +2767,11 @@ the build tree.
>  Compiles GCC itself using Address Sanitization in order to catch invalid 
> memory
>  accesses within the GCC code.
>  
> +@item @samp{bootstrap-hwasan}
> +Compiles GCC itself using HWAddress Sanitization in order to catch invalid
> +memory accesses within the GCC code.  This option is only available on 
> AArch64
> +targets running a Linux kernel that supports the required ABI (5.4 or later).

I'd suggest rewording the last sentence, since it isn't clear whether 5.4
is an ABI version or a Linux kernel version.  Maybe something like:

  This option is only available on AArch64 systems that are running Linux
  kernel version 5.4 or later.

Not sure how good that is, suggestions for something better welcome.

Looks good otherwise.

Thanks,
Richard


Re: [RFA,PATCH] Bail in bounds_of_var_in_loop if no step found.

2020-10-14 Thread Richard Biener via Gcc-patches
On Tue, Oct 13, 2020 at 6:12 PM Aldy Hernandez  wrote:
>
>
>
> On 10/13/20 6:02 PM, Richard Biener wrote:
> > On October 13, 2020 5:17:48 PM GMT+02:00, Aldy Hernandez via Gcc-patches 
> >  wrote:
> >> [Neither Andrew nor I are familiar with the SCEV code.  We treat it as
> >> a
> >> black box :).  So we could use a SCEV expert here.]
> >>
> >> In bounds_of_var_in_loop, evolution_part_in_loop_num is returning NULL:
> >>
> >>step = evolution_part_in_loop_num (chrec, loop->num);

(*)

> >
> > It means that Var doesn't vary in the loop.
> > That is, chrec isn't a polynomial chrec.
>
> That's what I thought, but it is:
>
> (gdb) p chrec
> $6 = 
> (gdb) dd chrec
> {0, +, 1}_2
>
> evolution_part_in_loop_num() is returning NULL deep in
> chrec_component_in_loop_num():
>
>   default:
> =>if (right)
>  return NULL_TREE;
>else
>  return chrec;
>
> Do you have any suggestions?

I can only guess (w/o a testcase) that loop->num at (*) is not 2 and thus that
chrec does not evolve in the loop we're asking about.  But this doesn't make
much sense given the constraints under which we call this function (a loop
header PHI with loop == the loop, stmt a loop header PHI and var the PHI's lhs).

OK, so looking at the testcase you're doing

  class loop *l = loop_containing_stmt (phi);
  if (l)
    {
      range_of_ssa_name_with_loop_info (loop_range, phi_def, l, phi);

but 'l' isn't a loop, it's the loop tree root.  Change to

  if (l && loop_outer (l))
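
A toy model of that check, for readers unfamiliar with the loop tree.  This
is not GCC code: struct toy_loop and the toy_* helpers are invented here,
assuming only that the loop tree root is the one node without an outer loop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a loop tree node; only the fields needed here.  */
struct toy_loop
{
  int num;
  struct toy_loop *outer;	/* NULL for the loop tree root.  */
};

/* Toy counterpart of loop_outer: the enclosing loop, or NULL.  */
static struct toy_loop *
toy_loop_outer (const struct toy_loop *l)
{
  return l->outer;
}

/* True only for a real loop, i.e. a non-NULL node that is not the
   root of the loop tree.  Mirrors "l && loop_outer (l)".  */
static bool
toy_is_real_loop (const struct toy_loop *l)
{
  return l != NULL && toy_loop_outer (l) != NULL;
}
```

A statement outside any loop still has a containing node (the root), so
testing the pointer alone is not enough to decide whether loop information
applies.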

Richard.

>
> Thanks.
> Aldy
>


Re: [Patch] x86: Enable GCC support for Intel Hreset extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 13, 2020 at 10:49 AM Hongyu Wang  wrote:
>
> Hi:
>
> This patch is about to support Intel Hreset instruction.
>
> Hreset provides a hint to the processor to selectively reset the prediction 
> history of the current logical processor.
>
> For more details, please refer to 
> https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
>
> Bootstrap ok, regression test on i386/x86 backend is ok.
>
> OK for master?
>
> gcc/
>
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect HRESET.
> * common/config/i386/i386-common.c (OPTION_MASK_ISA2_HRESET_SET,
> OPTION_MASK_ISA2_HRESET_UNSET): New macros.
> (ix86_handle_option): Handle -mhreset.
> * common/config/i386/i386-cpuinfo.h (enum processor_features):
> Add FEATURE_HRESET.
> * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
> for hreset.
> * config.gcc: Add hresetintrin.h
> * config/i386/hresetintrin.h: New header file.
> * config/i386/x86gprintrin.h: Include hresetintrin.h.
> * config/i386/cpuid.h (bit_HRESET): New.
> * config/i386/i386-builtin.def: Add new builtin.
> * config/i386/i386-expand.c (ix86_expand_builtin):
> Handle new builtin.
> * config/i386/i386-c.c (ix86_target_macros_internal): Define
> __HRESET__.
> * config/i386/i386-options.c (isa2_opts): Add -mhreset.
> (ix86_valid_target_attribute_inner_p): Handle hreset.
> * config/i386/i386.h (TARGET_HRESET, TARGET_HRESET_P,
> PTA_HRESET): New.
> (PTA_ALDERLAKE): Add PTA_HRESET.
> * config/i386/i386.opt: Add option -mhreset.
> * config/i386/i386.md (UNSPECV_HRESET): New unspec.
> (hreset): New define_insn.
> * doc/invoke.texi: Document -mhreset.
> * doc/extend.texi: Document hreset.
>
> gcc/testsuite/
>
> * gcc.target/i386/hreset-1.c: New test.
> * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> * gcc.target/i386/sse-12.c: Update -mhreset.
> * gcc.target/i386/sse-13.c: Likewise.
> * gcc.target/i386/sse-14.c: Likewise.
> * gcc.target/i386/sse-22.c: Likewise.
> * gcc.target/i386/sse-23.c: Likewise.
> * g++.dg/other/i386-2.C: Likewise.
> * g++.dg/other/i386-3.C: Likewise.

The patch doesn't include all testsuite changes.

Otherwise OK.

Thanks,
Uros.


Re: [Patch] x86: Enable support for Intel UINTR extension

2020-10-14 Thread Uros Bizjak via Gcc-patches
On Tue, Oct 13, 2020 at 10:30 AM Hongyu Wang  wrote:
>
> Hi:
>
> This patch is about to support User Interrupt (UINTR) instructions.
>
> This feature defines user interrupts as new events in the architecture.  They 
> are delivered to software operating in 64-bit mode with CPL = 3 without any 
> change to segmentation state.
>
> For more details, please refer to 
> https://software.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
>
> Bootstrap ok, regression test on i386/x86 backend is ok.
>
> OK for master?
>
> gcc/
> * common/config/i386/cpuinfo.h (get_available_features):
> Detect UINTR.
> * common/config/i386/i386-common.c (OPTION_MASK_ISA2_UINTR_SET
> OPTION_MASK_ISA2_UINTR_UNSET): New.
> (ix86_handle_option): Handle -muintr.
> * common/config/i386/i386-cpuinfo.h (enum processor_features):
> Add FEATURE_UINTR.
> * common/config/i386/i386-isas.h: Add ISA_NAMES_TABLE_ENTRY
> for uintr.
> * config.gcc: Add uintrintrin.h to extra_headers.
> * config/i386/uintrintrin.h: New.
> * config/i386/cpuid.h (bit_UINTR): New.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect UINTR.
> * config/i386/i386-builtin-types.def: Add new types.
> * config/i386/i386-builtin.def: Add new builtins.
> * config/i386/i386-builtins.c (ix86_init_mmx_sse_builtins): Add
> __builtin_ia32_testui.
> * config/i386/i386-builtins.h (ix86_builtins): Add
> IX86_BUILTIN_TESTUI.
> * config/i386/i386-c.c (ix86_target_macros_internal): Define
> __UINTR__.
> * config/i386/i386-expand.c (ix86_expand_special_args_builtin):
> Handle UINT8_FTYPE_VOID.
> (ix86_expand_builtin): Handle IX86_BUILTIN_TESTUI.
> * config/i386/i386-options.c (isa2_opts): Add -muintr.
> (ix86_valid_target_attribute_inner_p): Handle UINTR.
> (ix86_option_override_internal): Add TARGET_64BIT check for UINTR.
> * config/i386/i386.h (TARGET_UINTR, TARGET_UINTR_P, PTA_UINTR): New.
> (PTA_SAPPHIRRAPIDS): Add PTA_UINTR.
> * config/i386/i386.opt: Add -muintr.
> * config/i386/i386.md
> (define_int_iterator UINTR_UNSPECV): New.
> (define_int_attr uintr_unspecv): New.
> (uintr_, uintr_senduipi, testui):
> New define_insn patterns.
> * config/i386/x86gprintrin.h: Include uintrintrin.h
> * doc/invoke.texi: Document -muintr.
> * doc/extend.texi: Document uintr.
>
> gcc/testsuite/
>
> * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> * gcc.target/i386/uintr-1.c: New test.
> * gcc.target/i386/uintr-2.c: Ditto.
> * gcc.target/i386/uintr-3.c: Ditto.
> * gcc.target/i386/uintr-4.c: Ditto.
> * gcc.target/i386/uintr-5.c: Ditto.

Please also add -muintr to g++.dg/other/i386-{2,3}.C and
gcc.target/i386/sse-{12,13,14,22,23}.c. This will test the new intrinsics
header.

OK with the above change.

Thanks,
Uros.


Re: [RFC][gimple] Move can_duplicate_bb_p to gimple_can_duplicate_bb_p

2020-10-14 Thread Richard Biener
On Tue, 13 Oct 2020, Tom de Vries wrote:

> On 10/12/20 9:15 AM, Richard Biener wrote:
> > On Fri, 9 Oct 2020, Tom de Vries wrote:
> > 
> >> Hi,
> >>
> >> The function gimple_can_duplicate_bb_p currently always returns true.
> >>
> >> The presence of can_duplicate_bb_p in tracer.c however suggests that
> >> there are cases when bb's indeed cannot be duplicated.
> >>
> >> Move the implementation of can_duplicate_bb_p to gimple_can_duplicate_bb_p.
> >>
> >> Bootstrapped and reg-tested on x86_64-linux.
> >>
> >> Build x86_64-linux with nvptx accelerator and tested libgomp.
> >>
> >> No issues found.
> >>
> >> As corner-case check, bootstrapped and reg-tested a patch that makes
> >> gimple_can_duplicate_bb_p always return false, resulting in
> >> PR97333 - "[gimple_can_duplicate_bb_p == false, tree-ssa-threadupdate]
> >> ICE in duplicate_block, at cfghooks.c:1093".
> >>
> >> Any comments?
> > 
> > In principle it's correct to move this to the CFG hook since there
> > now seem to be stmts that cannot be duplicated and thus we need
> > to implement can_duplicate_bb_p.
> > 
> > Some minor things below...
> > 
> >> Thanks,
> >> - Tom
> >>
> >> [gimple] Move can_duplicate_bb_p to gimple_can_duplicate_bb_p
> >>
> >> gcc/ChangeLog:
> >>
> >> 2020-10-09  Tom de Vries  
> >>
> >>* tracer.c (cached_can_duplicate_bb_p): Use can_duplicate_block_p
> >>instead of can_duplicate_bb_p.
> >>(can_duplicate_insn_p, can_duplicate_bb_no_insn_iter_p): Move ...
> >>* tree-cfg.c: ... here.
> >>* tracer.c (can_duplicate_bb_p): Move ...
> >>* tree-cfg.c (gimple_can_duplicate_bb_p): here.
> >>* tree-cfg.h (can_duplicate_insn_p, can_duplicate_bb_no_insn_iter_p):
> >>Declare.
> >>
> >> ---
> >>  gcc/tracer.c   | 61 
> >> +-
> >>  gcc/tree-cfg.c | 54 ++-
> >>  gcc/tree-cfg.h |  2 ++
> >>  3 files changed, 56 insertions(+), 61 deletions(-)
> >>
> >> diff --git a/gcc/tracer.c b/gcc/tracer.c
> >> index e1c2b9527e5..16b46c65b14 100644
> >> --- a/gcc/tracer.c
> >> +++ b/gcc/tracer.c
> >> @@ -84,65 +84,6 @@ bb_seen_p (basic_block bb)
> >>return bitmap_bit_p (bb_seen, bb->index);
> >>  }
> >>  
> >> -/* Return true if gimple stmt G can be duplicated.  */
> >> -static bool
> >> -can_duplicate_insn_p (gimple *g)
> >> -{
> >> -  /* An IFN_GOMP_SIMT_ENTER_ALLOC/IFN_GOMP_SIMT_EXIT call must be
> >> - duplicated as part of its group, or not at all.
> >> - The IFN_GOMP_SIMT_VOTE_ANY and IFN_GOMP_SIMT_XCHG_* are part of such 
> >> a
> >> - group, so the same holds there.  */
> >> -  if (is_gimple_call (g)
> >> -  && (gimple_call_internal_p (g, IFN_GOMP_SIMT_ENTER_ALLOC)
> >> -|| gimple_call_internal_p (g, IFN_GOMP_SIMT_EXIT)
> >> -|| gimple_call_internal_p (g, IFN_GOMP_SIMT_VOTE_ANY)
> >> -|| gimple_call_internal_p (g, IFN_GOMP_SIMT_XCHG_BFLY)
> >> -|| gimple_call_internal_p (g, IFN_GOMP_SIMT_XCHG_IDX)))
> >> -return false;
> >> -
> >> -  return true;
> >> -}
> >> -
> >> -/* Return true if BB can be duplicated.  Avoid iterating over the insns.  
> >> */
> >> -static bool
> >> -can_duplicate_bb_no_insn_iter_p (const_basic_block bb)
> >> -{
> >> -  if (bb->index < NUM_FIXED_BLOCKS)
> >> -return false;
> >> -
> >> -  if (gimple *g = last_stmt (CONST_CAST_BB (bb)))
> >> -{
> >> -  /* A transaction is a single entry multiple exit region.  It
> >> -   must be duplicated in its entirety or not at all.  */
> >> -  if (gimple_code (g) == GIMPLE_TRANSACTION)
> >> -  return false;
> >> -
> >> -  /* An IFN_UNIQUE call must be duplicated as part of its group,
> >> -   or not at all.  */
> >> -  if (is_gimple_call (g)
> >> -&& gimple_call_internal_p (g)
> >> -&& gimple_call_internal_unique_p (g))
> >> -  return false;
> >> -}
> >> -
> >> -  return true;
> >> -}
> >> -
> >> -/* Return true if BB can be duplicated.  */
> >> -static bool
> >> -can_duplicate_bb_p (const_basic_block bb)
> >> -{
> >> -  if (!can_duplicate_bb_no_insn_iter_p (bb))
> >> -return false;
> >> -
> >> -  for (gimple_stmt_iterator gsi = gsi_start_bb (CONST_CAST_BB (bb));
> >> -   !gsi_end_p (gsi); gsi_next (&gsi))
> >> -if (!can_duplicate_insn_p (gsi_stmt (gsi)))
> >> -  return false;
> >> -
> >> -  return true;
> >> -}
> >> -
> >>  static sbitmap can_duplicate_bb;
> >>  
> >>  /* Cache VAL as value of can_duplicate_bb_p for BB.  */
> >> @@ -167,7 +108,7 @@ cached_can_duplicate_bb_p (const_basic_block bb)
> >>return false;
> >>  }
> >>  
> >> -  return can_duplicate_bb_p (bb);
> >> +  return can_duplicate_block_p (bb);
> >>  }
> >>  
> >>  /* Return true if we should ignore the basic block for purposes of 
> >> tracing.  */
> >> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> >> index 5caf3b62d69..a5677859ffc 100644
> >> --- a/gcc/tree-cfg.c
> >> +++ b/gcc/tree-cfg.c
> >> @@ -6208,11 +6208,63 @@ gimple_split_block_before_cond_jump (basic_block 
> >> bb)
> >>  }
> >>  

Re: [Patch] collect-utils.c, lto-wrapper + mkoffload: Improve -save-temps filename

2020-10-14 Thread Tom de Vries
On 10/13/20 9:37 PM, Tobias Burnus wrote:
> This patch avoids putting some more files to /tmp/cc* when
> -save-temps has been specified.
> 

Very nice.

> For my testcase, it now generates:
> a.lto_wrapper_args
> a.offload_args

> a.xnvptx-none.args
> a.xnvptx-none.gcc_args
> a.xamdgcn-amdhsa.gcc_args
> a.xamdgcn-amdhsa.gccnative_args

I'd prefer it if nvptx had the same suffixes as gcn, that is, gcc_args
and gccnative_args.  The ".args" is a bit too non-descript for me.

Thanks,
- Tom

> a.xamdgcn-amdhsa.ld_args
> 
> 
> This patch adds an additional argument to collect-utils.c's
> collect_execute (and its wrapper fork_execute) which, if not NULL,
> is used in 'concat (dumppfx, atsuffix, NULL);'.
> 
> This patch adds a suffix to gcc/config/gcn/mkoffload.c,
> gcc/config/nvptx/mkoffload.c and gcc/lto-wrapper.c.
> 
> It does not (yet) add a suffix to gcc/collect2.c and
> gcc/config/i386/intelmic-mkoffload.c but just passes
> NULL; for intelmic it is not a work item as it does
> not use '@' files at all.
> 
> Hopefully, there is no file which is written twice
> with the same name (or otherwise overridden) and
> the file names make sense.
> 
> OK?
> 
> Tobias
> 
> PS: There is still cceBdzZk.ofldlist (via lto-plugin/lto-plugin.c),
> and @/tmp/cc* in calls to lto1 and collect2. And collect2.c
> passes NULL also when use_atfile is true.
> 
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
> Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> Alexander Walter
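
As a rough illustration of the naming scheme described in the quoted
message, here is a standalone sketch of building the '@' file name from a
dump prefix and a suffix.  It is not the patch's code: make_atfile_name is a
hypothetical helper that uses plain malloc rather than concat, and the
prefix/suffix split shown in the note below is an assumption.  Returning
NULL for a NULL suffix stands in for "fall back to a temporary file", which
is also an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string holding DUMPPFX
   followed by ATSUFFIX, or NULL if either is NULL or allocation fails.
   The caller frees the result.  */
static char *
make_atfile_name (const char *dumppfx, const char *atsuffix)
{
  if (dumppfx == NULL || atsuffix == NULL)
    return NULL;

  size_t plen = strlen (dumppfx);
  size_t slen = strlen (atsuffix);
  char *name = malloc (plen + slen + 1);
  if (name == NULL)
    return NULL;

  memcpy (name, dumppfx, plen);
  memcpy (name + plen, atsuffix, slen + 1);	/* Copies the NUL too.  */
  return name;
}
```

Under these assumptions, a prefix of "a.xamdgcn-amdhsa" with a suffix of
".ld_args" gives "a.xamdgcn-amdhsa.ld_args", matching one of the file names
listed in the message.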

