Re: [patch, libfortran] Use macros for dtype in array intrinsics

2018-01-07 Thread Thomas Koenig
Am 06.01.2018 um 18:45 schrieb Thomas Koenig: Hello world, the attached patch removes explicit use of dtype in the array intrinsics, replacing them by macros instead. Functionally, this patch changes nothing. Steve pointed out on IRC that +#define GFC_DTYPE_IS_UNSET(a) (unlikely((a)->dtype)

Re: [PATCH, PR82428] Add __builtin_goacc_{gang,worker,vector}_{id,size}

2018-01-07 Thread Tom de Vries
On 01/06/2018 12:36 PM, Jakub Jelinek wrote: On Sat, Jan 06, 2018 at 09:21:59AM +0100, Tom de Vries wrote: this patch adds the following builtins in C/C++: - __builtin_goacc_gang_id - __builtin_goacc_worker_id - __builtin_goacc_vector_id - __builtin_goacc_gang_size - __builtin_goacc_worker_size

[PATCH] PR 78534, 83704 Handle large formatted I/O

2018-01-07 Thread Janne Blomqvist
In order to handle large characters when doing formatted I/O, use size_t and ptrdiff_t for lengths. Compared to the previous patch, based on discussions on IRC use size_t for sizes that don't need to be negative rather than ptrdiff_t everywhere. Regtested on x86_64-pc-linux-gnu, approved as part

New Spanish PO file for 'gcc' (version 7.2.0)

2018-01-07 Thread Translation Project Robot
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Spanish team of translators. The file is available at: http://translationproject.org/latest/gcc/es.po (This file, 'gcc-7.2.0.es.po', has

Re: [patch, libfortran] Use macros for dtype in array intrinsics

2018-01-07 Thread Paul Richard Thomas
Hi Thomas, That was well spotted by Steve! OK for trunk. Thanks Paul On 7 January 2018 at 12:23, Thomas Koenig wrote: > Am 06.01.2018 um 18:45 schrieb Thomas Koenig: >> >> Hello world, >> >> the attached patch removes explicit use of dtype in the >> array intrinsics,

Mostly revert r254296

2018-01-07 Thread Richard Sandiford
r254296 added support for (const ...) wrappers around vectors, but in the end the agreement was to use a variable-length encoding of CONST_VECTOR (and VECTOR_CST) instead. This patch therefore reverts the bits that are no longer needed. The rtl.texi part isn't a full revert, since r254296 also

Re: SLP reductions with variable-length vectors

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:43:11AM +, Jeff Law wrote: > On 11/22/2017 11:10 AM, Richard Sandiford wrote: > > Richard Sandiford writes: > >> Two things stopped us using SLP reductions with variable-length vectors: > >> > >> (1) We didn't have a way of constructing

Re: Add support for bitwise reductions

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:36:58AM +, Jeff Law wrote: > On 11/22/2017 11:12 AM, Richard Sandiford wrote: > > Richard Sandiford writes: > >> This patch adds support for the SVE bitwise reduction instructions > >> (ANDV, ORV and EORV). It's a fairly mechanical
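As a rough illustration of the kind of source loops the bitwise reduction support is about, here is a minimal sketch. The function names are hypothetical and are not taken from the patch, and whether a given compiler actually emits ANDV, ORV or EORV for them depends on target and options.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers (not from the patch): each loop folds an array
// into one scalar with a bitwise operator, which is the reduction shape
// a vectoriser could map to a single reduction instruction.
std::uint32_t and_reduce(const std::uint32_t* a, std::size_t n) {
  std::uint32_t acc = ~std::uint32_t{0};  // all ones is the identity for AND
  for (std::size_t i = 0; i < n; ++i)
    acc &= a[i];
  return acc;
}

std::uint32_t or_reduce(const std::uint32_t* a, std::size_t n) {
  std::uint32_t acc = 0;  // zero is the identity for OR
  for (std::size_t i = 0; i < n; ++i)
    acc |= a[i];
  return acc;
}

std::uint32_t xor_reduce(const std::uint32_t* a, std::size_t n) {
  std::uint32_t acc = 0;  // zero is the identity for XOR
  for (std::size_t i = 0; i < n; ++i)
    acc ^= a[i];
  return acc;
}
```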

Re: [SFN+LVU+IEPM v4 6/9] [SFN] Introduce -gstatement-frontiers option, enable debug markers

2018-01-07 Thread H.J. Lu
On Mon, Dec 11, 2017 at 6:42 PM, Alexandre Oliva wrote: > On Dec 7, 2017, Jeff Law wrote: > >> On 11/09/2017 07:34 PM, Alexandre Oliva wrote: >>> Introduce a command line option to enable statement frontiers, enabled >>> by default in optimized builds with

Re: [PATCH] [PR testsuite/81010] Fix PPC test

2018-01-07 Thread Segher Boessenkool
Hi! On Sun, Jan 07, 2018 at 12:58:25AM -0700, Jeff Law wrote: > As you note in the comments, the code we generate now is actually more > efficient so the test needs to be tweaked. > > Rather than checking the form in doloop, I check the form in .combine > and look for > > (compare:CC

Re: [3/4] [AArch64] SVE tests

2018-01-07 Thread James Greenhalgh
On Sat, Jan 06, 2018 at 07:13:22PM +, Richard Sandiford wrote: > James Greenhalgh writes: > > On Fri, Nov 03, 2017 at 05:50:54PM +, Richard Sandiford wrote: > >> This patch adds gcc.target/aarch64 tests for SVE, and forces some > >> existing Advanced SIMD tests

Re: Add support for fully-predicated loops

2018-01-07 Thread James Greenhalgh
On Mon, Dec 18, 2017 at 07:40:00PM +, Jeff Law wrote: > On 11/17/2017 07:56 AM, Richard Sandiford wrote: > > This patch adds support for using a single fully-predicated loop instead > > of a vector loop and a scalar tail. An SVE WHILELO instruction generates > > the predicate for each

Re: Add an empty_mask_is_expensive hook

2018-01-07 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 06:12:49PM +, Jeff Law wrote: > On 11/17/2017 08:15 AM, Richard Sandiford wrote: > > This patch adds a hook to control whether we avoid executing masked > > (predicated) stores when the mask is all false. We don't want to do > > that by default for SVE. > > > > Tested

Re: Add support for vectorising live-out values using SVE LASTB

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:36:47PM +, Jeff Law wrote: > On 11/17/2017 08:24 AM, Richard Sandiford wrote: > > This patch uses the SVE LASTB instruction to optimise cases in which > > a value produced by the final scalar iteration of a vectorised loop is > > live outside the loop. Previously
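A minimal sketch of a loop with a live-out value of the sort described above, assuming a hypothetical function name; it is not a testcase from the patch. The variable `last` is written on every iteration, and its value from the final iteration is used after the loop.

```cpp
// Hypothetical example: 'last' holds the value computed by the final
// iteration and is live after the loop, so a vectorised version has to
// extract the last active element of the final vector.
int scale_and_return_last(const int* in, int* out, int n, int factor) {
  int last = 0;
  for (int i = 0; i < n; ++i) {
    last = in[i] * factor;
    out[i] = last;
  }
  return last;
}
```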

Re: Add support for reductions in fully-masked loops

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:34:34PM +, Jeff Law wrote: > On 11/17/2017 07:59 AM, Richard Sandiford wrote: > > This patch removes the restriction that fully-masked loops cannot > > have reductions. The key thing here is to make sure that the > > reduction accumulator doesn't include any values

Re: Add support for masked load/store_lanes

2018-01-07 Thread James Greenhalgh
On Tue, Dec 12, 2017 at 12:59:33AM +, Jeff Law wrote: > On 11/17/2017 02:36 AM, Richard Sandiford wrote: > > Richard Sandiford writes: > >> This patch adds support for vectorising groups of IFN_MASK_LOADs > >> and IFN_MASK_STOREs using conditional

Re: Allow gather loads to be used for grouped accesses

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:53:27PM +, Jeff Law wrote: > On 11/17/2017 03:04 PM, Richard Sandiford wrote: > > Following on from the previous patch for strided accesses, this patch > > allows gather loads to be used with grouped accesses, if we otherwise > > would need to fall back to

[PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-07 Thread H.J. Lu
This set of patches for GCC 8 mitigates variant #2 of the speculative execution vulnerabilities on x86 processors identified by CVE-2017-5715, aka Spectre. They convert indirect branches to call and return thunks to avoid speculative execution via indirect call and jmp. H.J. Lu (5): x86:

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-07 Thread Jeff Law
On 01/07/2018 03:58 PM, H.J. Lu wrote: > This set of patches for GCC 8 mitigates variant #2 of the speculative > execution > vulnerabilities on x86 processors identified by CVE-2017-5715, aka Spectre. > They > convert indirect branches to call and return thunks to avoid speculative > execution

Re: [PATCH 1/3] [builtins] Generic support for __builtin_load_no_speculate()

2018-01-07 Thread Bill Schmidt
Hi Richard, Unfortunately, I don't see any way that this will be useful for the ppc targets. We don't have a way to force resolution of a condition prior to continuing speculation, so this will just introduce another comparison that we would speculate past. For our mitigation we will have to

[PATCH] Implementation of RANDOM_INIT from F2018

2018-01-07 Thread Steve Kargl
I have attached my current implementation for RANDOM_INIT. For programs compiled without -fcoarray= or with -fcoarray=single, one gets, % cat random_init_2.f90 program foo real x(2) call random_init(.false., .false.) call random_number(x) print *, x call random_init(.false.,

Re: Add support for SVE gather loads

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 01:16:26AM +, Jeff Law wrote: > On 11/17/2017 02:58 PM, Richard Sandiford wrote: > > This patch adds support for SVE gather loads. It uses the basically > > the same analysis code as the AVX gather support, but after that > > there are two major differences: > > > > -

Re: Add support for SVE scatter stores

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:34:54AM +, Jeff Law wrote: > On 11/17/2017 03:10 PM, Richard Sandiford wrote: > > This is mostly a mechanical extension of the previous gather load > > support to scatter stores. The internal functions in this case are: > > > > IFN_SCATTER_STORE (base, offsets,

Re: Handle peeling for alignment with masking

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 12:12:01AM +, Jeff Law wrote: > On 11/17/2017 08:13 AM, Richard Sandiford wrote: > > This patch adds support for aligning vectors by using a partial > > first iteration. E.g. if the start pointer is 3 elements beyond > > an aligned address, the first iteration will

[PATCH 4/5] x86: Add -mindirect-branch-register

2018-01-07 Thread H.J. Lu
Add -mindirect-branch-register to force indirect branch via register. This is implemented by disabling patterns of indirect branch via memory, similar to TARGET_X32. -mindirect-branch= and -mfunction-return= tests are updated with -mno-indirect-branch-register to avoid false test failures when

Re: [PATCH][AArch64] Use LDP/STP in shrinkwrapping

2018-01-07 Thread Segher Boessenkool
Hi Wilco, On Fri, Jan 05, 2018 at 12:22:44PM +, Wilco Dijkstra wrote: > An example epilog in a shrinkwrapped function before: > > ldp x21, x22, [sp,#16] > ldr x23, [sp,#32] > ldr x24, [sp,#40] > ldp x25, x26, [sp,#48] > ldr x27, [sp,#64] > ldr x28, [sp,#72] > ldr x30,

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-07 Thread H.J. Lu
On Sun, Jan 7, 2018 at 3:36 PM, Jeff Law wrote: > On 01/07/2018 03:58 PM, H.J. Lu wrote: >> This set of patches for GCC 8 mitigates variant #2 of the speculative >> execution >> vulnerabilities on x86 processors identified by CVE-2017-5715, aka Spectre. >> They >> convert

PING^2 [PATCH] i386: Avoid PLT when shadow stack is enabled directly

2018-01-07 Thread H.J. Lu
On Fri, Dec 8, 2017 at 5:02 AM, H.J. Lu wrote: > On Tue, Oct 24, 2017 at 5:31 AM, H.J. Lu wrote: >> PLT should be avoided with shadow stack in 32-bit mode if more than 2 >> parameters are passed in registers since only 2 parameters can be passed >> in

Re: Handle more SLP constant and extern definitions for variable VF

2018-01-07 Thread James Greenhalgh
On Mon, Dec 11, 2017 at 11:04:28PM +, Jeff Law wrote: > On 11/09/2017 07:20 AM, Richard Sandiford wrote: > > This patch adds support for vectorising SLP definitions that are > > constant or external (i.e. from outside the loop) when the vectorisation > > factor isn't known at compile time. It

Re: Support for aliasing with variable strides

2018-01-07 Thread James Greenhalgh
On Thu, Dec 14, 2017 at 11:00:36AM +, Richard Biener wrote: > On Fri, Nov 17, 2017 at 11:17 PM, Richard Sandiford > wrote: > > This patch adds runtime alias checks for loops with variable strides, > > so that we can vectorise them even without a restrict

Re: [0/4] [AArch64] Add SVE support

2018-01-07 Thread James Greenhalgh
(Resending; this bounced) On Sat, Jan 06, 2018 at 07:39:46PM +, Richard Sandiford wrote: > James Greenhalgh writes: > > On Fri, Nov 24, 2017 at 03:59:58PM +, Richard Sandiford wrote: > >> Richard Sandiford writes: > >> > This

[patch, fortran] Make ABI ready for BACK argument of MINLOC and MAXLOC

2018-01-07 Thread Thomas Koenig
Hello world, the attached patch is a step towards the implementation of the BACK argument for the MINLOC and MAXLOC intrinsics, a part of F2008. This readies the ABI for a later date. In order to avoid combinatorial explosion in the library, I have opted to always add the BACK argument to the

[wwwdocs] readings.html - libre.adacore.com is gone

2018-01-07 Thread Gerald Pfeifer
...so adjust to where it redirects. Applied. Gerald Index: readings.html === RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v retrieving revision 1.287 diff -u -r1.287 readings.html --- readings.html 14 Dec 2017 01:10:53

Re: Rework the legitimize_address_displacement hook

2018-01-07 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 07:45:28PM +, Jeff Law wrote: > On 11/17/2017 09:03 AM, Richard Sandiford wrote: > > This patch: > > > > - tweaks the handling of legitimize_address_displacement > > so that it gets called before rather than after the address has > > been expanded. This means that

Re: Use single-iteration epilogues when peeling for gaps

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:42:02PM +, Jeff Law wrote: > On 11/17/2017 08:38 AM, Richard Sandiford wrote: > > This patch adds support for fully-masking loops that require peeling > > for gaps. It peels exactly one scalar iteration and uses the masked > > loop to handle the rest. Previously we

[PATCH 2/5] libstdc++ futex: Use FUTEX_CLOCK_REALTIME for wait

2018-01-07 Thread Mike Crowe
The futex system call supports waiting for an absolute time if FUTEX_WAIT_BITSET is used rather than FUTEX_WAIT. Doing so provides two benefits: 1. The call to gettimeofday is not required in order to calculate a relative timeout. 2. If someone changes the system clock during the wait then
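The absolute-timeout wait described above can be sketched directly against the Linux futex system call. This is an assumption-laden illustration, not the code in the patch: the wrapper name `futex_wait_until` is hypothetical, and it only works on Linux.

```cpp
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cerrno>
#include <ctime>

// Hypothetical wrapper (not from the patch): block while *addr == expected,
// until the absolute CLOCK_REALTIME deadline passes. FUTEX_WAIT_BITSET takes
// an absolute timeout, unlike FUTEX_WAIT, which takes a relative one.
// Returns 0 when woken, or -1 with errno set (EAGAIN, ETIMEDOUT, ...).
long futex_wait_until(unsigned* addr, unsigned expected,
                      const timespec& abs_deadline) {
  return syscall(SYS_futex, addr,
                 FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME,
                 expected, &abs_deadline, nullptr,
                 FUTEX_BITSET_MATCH_ANY);
}
```

Because the deadline is absolute, no call to gettimeofday is needed to turn it into a relative timeout first.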

[PATCH 0/5] Make std::future::wait_* use std::chrono::steady_clock when required

2018-01-07 Thread Mike Crowe
This patch series was originally submitted back in September at https://gcc.gnu.org/ml/libstdc++/2017-09/msg00083.html which ended up as https://patchwork.ozlabs.org/cover/817379/ . The patches received no comments at all, which may mean that they are perfect or that they are so bad that no-one

Re: Use gather loads for strided accesses

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:47:40PM +, Jeff Law wrote: > On 11/17/2017 03:02 PM, Richard Sandiford wrote: > > This patch tries to use gather loads for strided accesses, > > rather than falling back to VMAT_ELEMENTWISE. > > > > Tested on aarch64-linux-gnu (with and without SVE),

[PATCH] fold strlen of constant aggregates (PR 83693)

2018-01-07 Thread Martin Sebor
GCC is able to fold references to members of global aggregate constants in many expressions but it doesn't know how to do it for strlen() arguments. As a result, strlen calls with such arguments, such as the one below, are not optimized: const struct { char a[4]; } s = { "123" }; void f (void)
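For reference, a self-contained sketch built around the aggregate quoted above. The function `length_of_member` is hypothetical, since the body of `f` is cut off in the excerpt; the point is only that the result is knowable at compile time.

```cpp
#include <cstddef>
#include <cstring>

// The constant aggregate from the message above.
const struct { char a[4]; } s = { "123" };

// Hypothetical caller (not the elided body of f): the argument is a member
// of a constant object, so the call could in principle be folded to 3.
std::size_t length_of_member() {
  return std::strlen(s.a);
}
```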

Re: Add support for conditional reductions using SVE CLASTB

2018-01-07 Thread James Greenhalgh
On Wed, Dec 13, 2017 at 04:59:00PM +, Jeff Law wrote: > On 11/17/2017 08:29 AM, Richard Sandiford wrote: > > This patch uses SVE CLASTB to optimise conditional reductions. It means > > that we no longer need to maintain a separate index vector to record > > the most recent valid value, and no
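A minimal sketch of a conditional reduction of the kind discussed above, with a hypothetical function name; it is not taken from the patch or its tests. The accumulator is only updated on iterations where the condition holds, so its final value is the most recent matching element.

```cpp
// Hypothetical example: return the last element below 'limit', or
// 'fallback' if no element qualifies. 'last' is overwritten only on
// iterations where the condition is true.
int last_below(const int* a, int n, int limit, int fallback) {
  int last = fallback;
  for (int i = 0; i < n; ++i) {
    if (a[i] < limit)
      last = a[i];
  }
  return last;
}
```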

Re: Allow single-element interleaving for non-power-of-2 strides

2018-01-07 Thread James Greenhalgh
On Fri, Nov 17, 2017 at 06:40:13PM +, Jeff Law wrote: > On 11/17/2017 08:33 AM, Richard Sandiford wrote: > > This allows LD3 to be used for isolated a[i * 3] accesses, in a similar > > way to the current a[i * 2] and a[i * 4] for LD2 and LD4 respectively. > > Given the problems with the cost
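An isolated a[i * 3] access of the sort mentioned above might look like the following sketch. The function name is hypothetical, and whether LD3 is actually chosen depends on the target and the cost model.

```cpp
// Hypothetical example: only every third element is read, so a vectoriser
// can treat this as a single-element interleaved access with stride 3.
int sum_every_third(const int* a, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i)
    sum += a[i * 3];
  return sum;
}
```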

[PATCH 1/5] x86: Add -mindirect-branch=

2018-01-07 Thread H.J. Lu
Add -mindirect-branch= option to convert indirect call and jump to call and return thunks. The default is 'keep', which keeps indirect call and jump unmodified. 'thunk' converts indirect call and jump to call and return thunk. 'thunk-inline' converts indirect call and jump to inlined call and

[PATCH 2/5] x86: Add -mindirect-branch-loop=

2018-01-07 Thread H.J. Lu
Add -mindirect-branch-loop= option to control loop filler in call and return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" as loop filler. The default is 'lfence'. gcc/ * config/i386/i386-opts.h

[PATCH 3/5] x86: Add -mfunction-return=

2018-01-07 Thread H.J. Lu
Add -mfunction-return= option to convert function return to call and return thunks. The default is 'keep', which keeps function return unmodified. 'thunk' converts function return to call and return thunk. 'thunk-inline' converts function return to inlined call and return thunk. 'thunk-extern'

[PATCH 5/5] x86: Add 'V' register operand modifier

2018-01-07 Thread H.J. Lu
Add 'V', a special modifier which prints the name of the full integer register without '%'. For extern void (*func_p) (void); void foo (void) { asm ("call __x86_indirect_thunk_%V0" : : "a" (func_p)); } it generates: foo: movq func_p(%rip), %rax call

[wwwdocs] benchmarks/index.html - adjust second reference to gcc.opensuse.org

2018-01-07 Thread Gerald Pfeifer
A bit ago I updated the first reference, but missed the second on that page. Fixed thusly. Gerald Index: index.html === RCS file: /cvs/gcc/wwwdocs/htdocs/benchmarks/index.html,v retrieving revision 1.39 diff -u -r1.39 index.html

PING: [PATCH] i386: Insert ENDBR before the profiling counter call

2018-01-07 Thread H.J. Lu
On Tue, Oct 24, 2017 at 10:58 AM, H.J. Lu wrote: > On Tue, Oct 24, 2017 at 10:40 AM, Andi Kleen wrote: >> "H.J. Lu" writes: >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/i386/pr82699-4.c >>> @@ -0,0 +1,11 @@ >>> +/* { dg-do

Re: Allow the number of iterations to be smaller than VF

2018-01-07 Thread James Greenhalgh
On Mon, Nov 20, 2017 at 12:12:38AM +, Jeff Law wrote: > On 11/17/2017 08:11 AM, Richard Sandiford wrote: > > Fully-masked loops can be profitable even if the iteration > > count is smaller than the vectorisation factor. In this case > > we're effectively doing a complete unroll followed by

[PATCH 5/5] Extra async tests, not for merging

2018-01-07 Thread Mike Crowe
These tests show that changing the system clock has an effect on std::future::wait_until when using std::chrono::system_clock but not when using std::chrono::steady_clock. Unfortunately these tests have a number of downsides: 1. Nothing that is attempting to keep the clock set correctly (ntpd,

[PATCH 3/5] libstdc++ futex: Support waiting on std::chrono::steady_clock directly

2018-01-07 Thread Mike Crowe
The user-visible effect of this change is for std::future::wait_until to use CLOCK_MONOTONIC when passed a timeout of std::chrono::steady_clock type. This makes it immune to any changes made to the system clock CLOCK_REALTIME. Add an overload of
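From the caller's side, the case this change targets is a deadline expressed on std::chrono::steady_clock. A minimal sketch using only standard library calls follows; the helper name `ready_within` is hypothetical.

```cpp
#include <chrono>
#include <future>

// Hypothetical helper: wait on a future against a steady_clock deadline.
// With a time_point of this clock type, adjustments to the system clock
// should not lengthen or shorten the wait.
bool ready_within(std::future<int>& f, std::chrono::milliseconds timeout) {
  auto deadline = std::chrono::steady_clock::now() + timeout;
  return f.wait_until(deadline) == std::future_status::ready;
}
```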

[PATCH 1/5] Improve libstdc++-v3 async test

2018-01-07 Thread Mike Crowe
Add tests for waiting for the future using both std::chrono::steady_clock and std::chrono::system_clock in preparation for dealing with those clocks properly in futex.cc. --- libstdc++-v3/testsuite/30_threads/async/async.cc | 36 1 file changed, 36 insertions(+) diff

[PATCH 4/5] libstdc++ atomic_futex: Use std::chrono::steady_clock as reference clock

2018-01-07 Thread Mike Crowe
The user-visible effect of this change is that std::future::wait_for now uses std::chrono::steady_clock to determine the timeout. This makes it immune to changes made to the system clock. It also means that anyone using their own clock types with std::future::wait_until will have the timeout

Re: [PATCH] [PR testsuite/81010] Fix PPC test

2018-01-07 Thread Segher Boessenkool
Hi, On Sun, Jan 07, 2018 at 11:09:33AM -0600, Segher Boessenkool wrote: > > > /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { > > "-mcpu=power7" } } */ > > And you could delete this line, since nothing in the testcase _needs_ the > -mcpu=power7. Scratch that, those

Re: [PATCH 0/3] Add __builtin_load_no_speculate

2018-01-07 Thread Bernd Edlinger
Hi Richard, I wonder how to use this builtin correctly. Frankly it sounds way too complicated. Do you expect we need to use this stuff on every array access now, or is that just a theoretical thing that can only happen with Java byte code interpreters? If I assume there is an array of int,

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-07 Thread Sandra Loosemore
On 01/07/2018 03:58 PM, H.J. Lu wrote: This set of patches for GCC 8 mitigates variant #2 of the speculative execution vulnerabilities on x86 processors identified by CVE-2017-5715, aka Spectre. They convert indirect branches to call and return thunks to avoid speculative execution via

[PATCH][PR rtl-optimization/81308] Conditionally cleanup the CFG after insn splitting

2018-01-07 Thread Jeff Law
This patch fixes the original problem reported in 81308. Namely that g++.dg/pr62079.C will trigger a checking failure on 32bit x86. As Jakub noted in the BZ the problem is we had an insn with an EH region note. That insn gets split and the split insns do not have an EH region note (nor do they

[PATCH][PR rtl-optimization/81308] Conditionally cleanup the CFG after switch conversion

2018-01-07 Thread Jeff Law
This patch fixes the second testcase in 81308 and the duplicate in 83724. For those cases we have a switch statement where one or more case labels are marked as __builtin_unreachable. Switch conversion calls group_case_labels which can drop the edges from the switch to the case labels that are
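The shape of input described above, a switch in which a case label leads straight to __builtin_unreachable, can be sketched as follows. This is a hypothetical example, not one of the testcases from the bug reports.

```cpp
// Hypothetical example: case 2 is marked unreachable, so the compiler may
// drop the edge from the switch to that label when grouping case labels.
// Calling classify(2) would be undefined behaviour.
int classify(int x) {
  switch (x) {
    case 0:
      return 10;
    case 1:
      return 20;
    case 2:
      __builtin_unreachable();
    default:
      return 0;
  }
}
```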

Re: [PATCH 0/5] x86: CVE-2017-5715, aka Spectre

2018-01-07 Thread Markus Trippelsdorf
On 2018.01.07 at 21:07 -0700, Sandra Loosemore wrote: > On 01/07/2018 03:58 PM, H.J. Lu wrote: > > This set of patches for GCC 8 mitigates variant #2 of the speculative > > execution > > vulnerabilities on x86 processors identified by CVE-2017-5715, aka Spectre. > > They > > convert indirect

Re: [PATCH 1/3] [builtins] Generic support for __builtin_load_no_speculate()

2018-01-07 Thread Jeff Law
On 01/07/2018 07:20 PM, Bill Schmidt wrote: > Hi Richard, > > Unfortunately, I don't see any way that this will be useful for the ppc > targets. We don't > have a way to force resolution of a condition prior to continuing > speculation, so this > will just introduce another comparison that we