[committed] libstdc++: Fix error in experimental::strand

2020-10-26 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog: * include/experimental/executor (strand::_State): Fix thinko. Tested powerpc64le-linux. Committed to trunk. commit b784bbbe45414084551a824504650f21cb653ca1 Author: Jonathan Wakely Date: Mon Oct 26 21:00:06 2020 libstdc++: Fix error in experimental::strand

[PATCH,rs6000] Add patterns for combine to support p10 fusion

2020-10-26 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey This patch adds the first couple patterns to support p10 fusion. These will allow combine to create a single insn for a pair of instructions that power10 can fuse and execute. These particular ones have the requirement that only cr0 can be used when fusing a load with a

Re: [RS6000] Non-pcrel tests when power10

2020-10-26 Thread Alan Modra via Gcc-patches
On Mon, Oct 26, 2020 at 03:18:39PM -0500, Segher Boessenkool wrote: > Hi! > > On Thu, Oct 22, 2020 at 05:28:17PM +1030, Alan Modra wrote: > > These tests require -mno-pcrel because they are testing features > > of the non-pcrel ABI. > > > --- a/gcc/testsuite/gcc.target/powerpc/cprophard.c > >

Re: PowerPC: Map IEEE 128-bit long double built-in functions

2020-10-26 Thread will schmidt via Gcc-patches
On Thu, 2020-10-22 at 18:03 -0400, Michael Meissner via Gcc-patches wrote: > PowerPC: Map IEEE 128-bit long double built-in functions > > This patch is revised from the first and second versions of the patch posted. > It now uses the names that are not in the user's namespace (i.e. __sinieee128 >

Re: PowerPC: Add -mno-gnu-attributes to ibm-ldouble.o

2020-10-26 Thread will schmidt via Gcc-patches
On Thu, 2020-10-22 at 18:05 -0400, Michael Meissner via Gcc-patches wrote: > PowerPC: Add -mno-gnu-attributes to ibm-ldouble.o. > > I have split all of these patches into separate patches to hopefully get them > into the tree. > > This patch is split off from the patch adding __float128 <->

Re: PowerPC: Allow C/C++ to change long double type on GLIBC 2.32.

2020-10-26 Thread will schmidt via Gcc-patches
On Thu, 2020-10-22 at 18:15 -0400, Michael Meissner via Gcc-patches wrote: > PowerPC: Allow C/C++ to change long double type on GLIBC 2.32. > > This is a new patch. It turns off the warning about switching the long double > type via compile line if the GLIBC is 2.32 or newer. It only does this

Re: [RFC] Add support for the "retain" attribute utilizing SHF_GNU_RETAIN

2020-10-26 Thread Jozef Lawrynowicz
On Mon, Oct 26, 2020 at 07:08:06PM +, Pedro Alves via Gcc-patches wrote: > On 10/6/20 12:10 PM, Jozef Lawrynowicz wrote: > > > Should "used" apply SHF_GNU_RETAIN? > > === > > Another talking point is whether the existing "used" attribute should > > apply the

[PATCH] libstdc++: Add C++2a synchronization support

2020-10-26 Thread Thomas Rodgers
From: Thomas Rodgers Add support for - * atomic_flag::wait/notify_one/notify_all * atomic::wait/notify_one/notify_all * counting_semaphore * binary_semaphore * latch libstdc++-v3/ChangeLog: * include/Makefile.am (bits_headers): Add new header. * include/Makefile.in:

Add string builtins to builtin_fnspec

2020-10-26 Thread Jan Hubicka
Hi, this patch adds missing string builtins to builtin_fnspec. Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: 2020-10-26 Jan Hubicka * builtins.c (builtin_fnspec): Add bzero, memcmp, memcmp_eq, bcmp, strncmp, strncmp_eq, strncasecmp, rindex, strlen, strnlen,

Fix builtin decls generated in tree.c

2020-10-26 Thread Jan Hubicka
Hi, tree.c still produces "fn spec" attribute for memcpy, memmove and memset. This is not desirable since "1" is less informative than the fnspec that builtin_fnspec returns. Also the builtin would fire the checker, since it misses the second character, so probably the whole logic is unused.

Re: [PATCH] rs6000: Don't split constant operator add before reload, move to temp register for future optimization

2020-10-26 Thread Segher Boessenkool
On Wed, Oct 21, 2020 at 03:25:29AM -0500, Xionghu Luo wrote: > Don't split code from add3 for SDI to allow a later pass to split. This is very problematic. > This allows later logic to hoist out constant load in add instructions. Later logic should be able to do that any way (I do not say that

[PATCH, rs6000] improve vec_ctf invalid parameter handling. (pr91903)

2020-10-26 Thread will schmidt via Gcc-patches
[PATCH, rs6000] improve vec_ctf invalid parameter handling. Hi, Per PR91903, GCC ICEs when we attempt to pass a variable (or out of range value) into the vec_ctf() builtin. Per investigation, the parameter checking exists for this builtin with the int types, but was missing for the long long

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
> On Oct 26, 2020, at 3:33 PM, Uros Bizjak wrote: > > On Mon, Oct 26, 2020 at 9:05 PM Uros Bizjak wrote: >> >> On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote: >>> >>> >>> On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote:

Fix fnspecs for math builtins

2020-10-26 Thread Jan Hubicka
Hi, this patch makes us use ".C" and ".P" fnspecs where applicable. I also noticed that gamma and variants are declared as storing to memory while they are not (gamma_r does). Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: 2020-10-26 Jan Hubicka * builtin-attrs.def

[PATCH v3] builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Raoni Fassina Firmino via Gcc-patches
Changes since v2[1]: - Added documentation for the new optabs; - Remove use of non portable __builtin_clz; - Changed feclearexcept and feraiseexcept to accept all 4 valid flags at the same time and added more test for that case; - Extended feclearexcept and feraiseexcept testcases to

[PATCH] c++: Check constraints before instantiation from mark_used [PR95132]

2020-10-26 Thread Patrick Palka via Gcc-patches
This makes mark_used check constraints of a function _before_ calling maybe_instantiate_decl, so that we don't try instantiating a function (as part of return type deduction) with unsatisfied constraints. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps

[PATCH] c++: Check constraints only on candidate conversion functions

2020-10-26 Thread Patrick Palka via Gcc-patches
In the testcase below, we're overeagerly checking the constraints on the conversion function B::operator bool() as part of finding an implicit conversion sequence from B to const A&. This behavior seems to be nonconforming because according to [over.match.copy] and [over.match.conv], only those

Re: PowerPC: Use __float128 instead of __ieee128 in tests.

2020-10-26 Thread will schmidt via Gcc-patches
On Thu, 2020-10-22 at 18:12 -0400, Michael Meissner via Gcc-patches wrote: > PowerPC: Use __float128 instead of __ieee128 in tests. > > I have split all of these patches into separate patches to hopefully get them > into the tree. > > Two of the tests used the __ieee128 keyword instead of

Re: [PATCH] Re: error: ‘EVRP_MODE_DEBUG’ was not declared – was: [PUSHED] Ranger classes.

2020-10-26 Thread Maciej W. Rozycki
On Mon, 26 Oct 2020, Andrew MacLeod wrote: > > It is still broken at `-O0', does not build with `--enable-werror-always' > > (which IMO should be on by default except for releases, just as we do with > > binutils AFAIK, so as to make sure people do not introduce build problems > > too easily):

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote: > > > > > On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: > > > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: > >> > >> > >> The following is the current change in i386.c, could you check whether the > >> logic is good? > > > > x87 handling

Re: [RS6000] dimode_off.c test

2020-10-26 Thread Segher Boessenkool
On Thu, Oct 22, 2020 at 05:29:49PM +1030, Alan Modra wrote: > This tests behaviour near the limit of 16-bit signed offsets. If > power10 prefix instructions are enabled, no such testing occurs. > > * gcc.target/powerpc/dimode_off.c: Add -mno-prefixed to options. > > Regstrapped

[PATCH] PR fortran/97491 - Wrong restriction for VALUE arguments of pure procedures

2020-10-26 Thread Harald Anlauf
As found/reported by Thomas, the redefinition of dummy arguments with the VALUE attribute was erroneously rejected for pure procedures. A related purity check did not take VALUE into account and was therefore adjusted. Regtested on x86_64-pc-linux-gnu. OK for master? Thanks, Harald PR

Re: [RFC] Add support for the "retain" attribute utilizing SHF_GNU_RETAIN

2020-10-26 Thread Pedro Alves via Gcc-patches
On 10/6/20 12:10 PM, Jozef Lawrynowicz wrote: > The changes would also only affect targets > that support the GNU ELF OSABI, which would lead to inconsistent > behavior between non-GNU OS's. Well, a separate __attribute__((retain)) will necessarily only work on GNU ELF targets, so that just

Re: [RS6000] Non-pcrel tests when power10

2020-10-26 Thread Segher Boessenkool
Hi! On Thu, Oct 22, 2020 at 05:28:17PM +1030, Alan Modra wrote: > These tests require -mno-pcrel because they are testing features > of the non-pcrel ABI. > --- a/gcc/testsuite/gcc.target/powerpc/cprophard.c > +++ b/gcc/testsuite/gcc.target/powerpc/cprophard.c > @@ -1,6 +1,6 @@ > /* { dg-do

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 26, 2020 at 9:05 PM Uros Bizjak wrote: > > On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote: > > > > > > > > > On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: > > > > > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: > > >> > > >> > > >> The following is the current change in

Re: [RS6000] Link power10 testcases

2020-10-26 Thread Segher Boessenkool
On Thu, Oct 22, 2020 at 05:31:15PM +1030, Alan Modra wrote: > Running the assembler and linker catches more errors. > > * gcc.target/powerpc/cfuged-1.c, > * gcc.target/powerpc/cntlzdm-1.c, There should be no star on the second and next line of one entry. Okay for trunk. Thanks!

Re: [PATCH] libstdc++: Implement C++20 features for <sstream>

2020-10-26 Thread Jonathan Wakely via Gcc-patches
On 26/10/20 13:47 -0700, Thomas Rodgers wrote: From: Thomas Rodgers New ctors and ::view() accessor for - * basic_stringbuf * basic_istringstream * basic_ostringstream * basic_stringstream New ::get_allocator() accessor for basic_stringbuf. libstdc++-v3/ChangeLog: * acinclude.m4

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
> On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: >> >> >> The following is the current change in i386.c, could you check whether the >> logic is good? > > x87 handling looks good to me. > > One remaining question: If the function

Re: [PATCH] rs6000, Power 10 testsuite fixes

2020-10-26 Thread Segher Boessenkool
Hi! On Fri, Oct 23, 2020 at 02:43:40PM -0700, Carl Love wrote: > The following patch fixes a few issues with the tests. The DEBUG is > defined in each of the files thus the #ifdef DEBUG should just be #if > DEBUG. The other issue is that some of the line lengths for the error > prints exceed 80
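
The #ifdef versus #if point quoted above can be sketched as follows. This is a minimal illustration, not code from the patch: the function names are hypothetical, and it assumes DEBUG is always defined and set to 0 when debugging is off, as the message describes.

```cpp
#include <cassert>

// Assumption: the test files always define DEBUG, using 0 to mean "off".
#define DEBUG 0

// Hypothetical helper: reports whether an #ifdef-guarded block is compiled in.
inline int enabled_with_ifdef() {
#ifdef DEBUG  // true whenever DEBUG is defined, even when its value is 0
  return 1;
#else
  return 0;
#endif
}

// Hypothetical helper: reports whether an #if-guarded block is compiled in.
inline int enabled_with_if() {
#if DEBUG  // true only when DEBUG evaluates to a nonzero value
  return 1;
#else
  return 0;
#endif
}

// Small self-check contrasting the two guards.
inline void check_guards() {
  assert(enabled_with_ifdef() == 1);  // debug code built despite DEBUG being 0
  assert(enabled_with_if() == 0);     // debug code correctly left out
}
```

With DEBUG defined as 0, only the #if form leaves the debug code out, which is why the guard matters once the macro is defined unconditionally.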

[PATCH] Combine logical OR ranges properly. pr97567

2020-10-26 Thread Andrew MacLeod via Gcc-patches
In the core of gori_compute::logical_combine we are supposed to combine the calculated true and false ranges on each side of the operation. When encountering [0,0] = c_3 | c_4 we know we only need to consider the FALSE values of the range carried by c_3 and c_4, but it can be EITHER of

libgo patch committed: Additional BSD-specific syscall wrappers

2020-10-26 Thread Ian Lance Taylor via Gcc-patches
This libgo patch by Nikhil Benesch imports additional code from upstream for handling system calls on BSD systems. This makes the syscall package on NetBSD complete enough to compile the standard library. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed to mainline. Ian

Re: Fix fnspecs for math builtins

2020-10-26 Thread Joseph Myers
On Mon, 26 Oct 2020, Jan Hubicka wrote: > Hi, > this patch makes us to use ".C" and ".P" fnspecs where > applicable. I also noticed that gamma and variants are > declared as storing to memory while they are not (gamma_r does) I think the point is that they store to the global signgam. --

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
Hi, Uros, Could you please check the change compared to the previous version for i386.c as following: Let me know any issue there. Thanks a lot. Qing --- gcc/config/i386/i386.c | 136 ++--- .../gcc.target/i386/zero-scratch-regs-28.c | 17

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Mon, Oct 26, 2020 at 07:33:49AM -0500, Segher Boessenkool wrote: > On Sun, Oct 25, 2020 at 09:50:01PM +1030, Alan Modra wrote: > > All these tests fail with -m32 due to lack of int128 support, > > Is there any good reason __int128 is not enabled for rs6000 -m32, btw? Lack of addti3 and subti3

Re: [RS6000] Unsupported test options for -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Mon, Oct 26, 2020 at 04:28:20PM +, Iain Sandoe wrote: > David Edelsohn via Gcc-patches wrote: > > > FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors) > > Excess errors: > > cc1: error: '-mcmodel' not supported in this configuration > > > > *

Re: [RFC] Add support for the "retain" attribute utilizing SHF_GNU_RETAIN

2020-10-26 Thread Pedro Alves via Gcc-patches
On 10/6/20 12:10 PM, Jozef Lawrynowicz wrote: > Should "used" apply SHF_GNU_RETAIN? > === > Another talking point is whether the existing "used" attribute should > apply the SHF_GNU_RETAIN flag to the containing section. > > It seems unlikely that a user applies

[PATCH] libstdc++: Implement C++20 features for <sstream>

2020-10-26 Thread Thomas Rodgers
From: Thomas Rodgers New ctors and ::view() accessor for - * basic_stringbuf * basic_istringstream * basic_ostringstream * basic_stringstream New ::get_allocator() accessor for basic_stringbuf. libstdc++-v3/ChangeLog: * acinclude.m4 (glibcxx_SUBDIRS): Add src/c++20. *

Use EAF_RETURN_ARG in tree-ssa-ccp.c

2020-10-26 Thread Jan Hubicka
Hi, while looking for special cases of builtins I noticed that tree-ssa-ccp can use EAF_RETURNS_ARG. I wonder if same should be done by value numbering and other propagators. Bootstrapped/regtested x86_64-linux, OK? Honza * tree-ssa-ccp.c (evaluate_stmt): Use EAF_RETURNS_ARG; do not

Re: Materialize clones on demand

2020-10-26 Thread Richard Biener
On Fri, 23 Oct 2020, Jan Hubicka wrote: > > Hi, > > > > On Thu, Oct 22 2020, Jan Hubicka wrote: > > > Hi, > > > this patch removes the pass to materialize all clones and instead this > > > is now done on demand. The motivation is to reduce lifetime of function > > > bodies in ltrans that should

Re: [PATCH] Add debug_bb_details and debug_bb_n_details

2020-10-26 Thread Richard Biener
On Mon, 26 Oct 2020, Xionghu Luo wrote: > > On 2020/10/23 18:18, Richard Biener wrote: > > On Fri, 23 Oct 2020, Xiong Hu Luo wrote: > > > >> Sometimes debug_bb_slim_bb_n_slim is not enough, how about adding > >> this debug_bb_details_bb_n_details? Or any other similar call > >> existed? > >

Re: [PATCH, OpenMP 5.0] Implement structure element mapping changes in 5.0

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Sat, Oct 24, 2020 at 01:43:26AM +0800, Chung-Lin Tang wrote: > On 2020/10/23 8:13 PM, Jakub Jelinek wrote: > > > In general, upon encountering a construct, we can't statically determine > > > and insert alloc/release maps > > > for each element of a structure variable, since we don't really

Re: [committed] libstdc++: Simplify std::shared_ptr construction from std::weak_ptr

2020-10-26 Thread Stephan Bergmann via Gcc-patches
On 21/10/2020 22:14, Jonathan Wakely via Gcc-patches wrote: The _M_add_ref_lock() and _M_add_ref_lock_nothrow() members of _Sp_counted_base are very similar, except that the former throws an exception when the use count is zero and the latter returns false. The former (and its callers) can be
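
The difference between the throwing and non-throwing paths described above is visible through the public std::weak_ptr API. The sketch below uses only standard library calls; the helper names are hypothetical, and it says nothing about how _Sp_counted_base is implemented internally.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: returns a weak_ptr whose owner has already been destroyed.
inline std::weak_ptr<int> make_expired() {
  std::weak_ptr<int> w;
  {
    auto owner = std::make_shared<int>(42);
    w = owner;
  }  // owner goes out of scope here, so the use count drops to zero
  return w;
}

// Throwing path: the shared_ptr(const weak_ptr&) constructor throws
// std::bad_weak_ptr when the use count is zero.
inline bool ctor_throws(const std::weak_ptr<int>& w) {
  try {
    std::shared_ptr<int> p(w);
    (void)p;
    return false;
  } catch (const std::bad_weak_ptr&) {
    return true;
  }
}

// Non-throwing path: lock() returns an empty shared_ptr instead of throwing.
inline bool lock_is_empty(const std::weak_ptr<int>& w) {
  return w.lock() == nullptr;
}

// Small self-check covering an expired and a live weak_ptr.
inline void check_weak_ptr_paths() {
  std::weak_ptr<int> dead = make_expired();
  assert(ctor_throws(dead));
  assert(lock_is_empty(dead));

  auto live_owner = std::make_shared<int>(7);
  std::weak_ptr<int> live = live_owner;
  assert(!ctor_throws(live));
  assert(!lock_is_empty(live));
}
```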

[RS6000] Separate dg-require-effective-target options

2020-10-26 Thread Alan Modra via Gcc-patches
Subject was "[RS6000] Tests that use int128_t and -m32" I meant to make this change before committing too. Pushed. * gcc.target/powerpc/vsx_mask-count-runnable.c: Separate options passed to dg-require-effective-target. * gcc.target/powerpc/vsx_mask-expand-runnable.c:

float.h: C2x NaN and Inf macros

2020-10-26 Thread Joseph Myers
C2x adds macros for NaNs and infinities to <float.h>, some of them previously in <math.h> (and some still in <math.h> as well in C2x as an obsolescent feature). Add these macros to GCC's implementation. This omits the macros for DFP signaling NaNs, leaving those to be added in a separate patch which will also need to

Re: [PATCH v7] genemit.c (main): split insn-emit.c for compiling parallelly

2020-10-26 Thread Jojo R
Ping …. …. Jojo On 24 Oct 2020, 2:02 PM +0800, Jojo R wrote: > Hi, > > Has this patch been merged ? > > I track this some weeks and the patch has reviewed still on the way ... > > Could someone help me ? > > Thanks so much. > > Jojo > On 8 Oct 2020, 10:01 AM +0800, Jojo R wrote: > >

Re: move sincos after pre

2020-10-26 Thread Alexandre Oliva
On Oct 23, 2020, Richard Biener wrote: > Can you move it one pass further after sink please? I did, but it didn't solve the recip regressions that my first attempt brought about. > Also I don't > remember exactly but does pass_sincos only handle sin/cos unifying? It rearranges some powi

[RS6000] Adjust testcases for power10 instructions V2

2020-10-26 Thread Alan Modra via Gcc-patches
Revised version. * gcc.dg/pr56727-2.c, gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c, gcc.target/powerpc/fold-vec-load-builtin_vec_xl-char.c, gcc.target/powerpc/fold-vec-load-builtin_vec_xl-double.c,

Re: [RS6000] Unsupported test options for -m32

2020-10-26 Thread David Edelsohn via Gcc-patches
On Mon, Oct 26, 2020 at 7:35 PM Alan Modra wrote: > $ grep mcmodel gcc/config/rs6000/*.opt > gcc/config/rs6000/aix64.opt:mcmodel= > gcc/config/rs6000/aix64.opt:Known code models (for use with the -mcmodel= > option): > gcc/config/rs6000/linux64.opt:mcmodel= > gcc/config/rs6000/linux64.opt:Known

[PATCH V2] aarch64: Add vstN_lane_bf16 + vstNq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second version of the patch here implementing the bfloat16_t neon related store intrinsics: vst2_lane_bf16, vst2q_lane_bf16, vst3_lane_bf16, vst3q_lane_bf16 vst4_lane_bf16, vst4q_lane_bf16. Please refer to: ACLE ISA

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
Hi Segher, On 22/10/2020 15:39, Segher Boessenkool wrote: > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > Currently, make_extraction() identifies where we can emit an ASHIFT of > > an extend in place of an extraction, but fails to make the corresponding > >

Re: [committed] libstdc++: Simplify std::shared_ptr construction from std::weak_ptr

2020-10-26 Thread Jonathan Wakely via Gcc-patches
On 26/10/20 08:07 +0100, Stephan Bergmann wrote: On 21/10/2020 22:14, Jonathan Wakely via Gcc-patches wrote: The _M_add_ref_lock() and _M_add_ref_lock_nothrow() members of _Sp_counted_base are very similar, except that the former throws an exception when the use count is zero and the latter

[Committed] IBM Z: Add vcond_mask expander

2020-10-26 Thread Andreas Krebbel via Gcc-patches
After adding vec_cmp expanders we have seen various performance related regression in the testsuite. These appear to be caused by a missing vcond_mask definition in the backend. Fixed with this patch. The patch fixes the following testsuite fails: FAIL: gcc.dg/vect/vect-21.c -flto

Re: Materialize clones on demand

2020-10-26 Thread Richard Biener
On Mon, 26 Oct 2020, Jan Hubicka wrote: > > > We seem to leak some hashtables: > > > dwarf2out.c:28850 (dwarf2out_init) 31M: 23.8% > > > 47M 19 : 0.0% ggc > > > > that one likely keeps quite some memory live... > > Yep, having in-memory dwarf2out for

Re: [PATCH] arm: Fix multiple inheritance thunks for thumb-1 with -mpure-code

2020-10-26 Thread Christophe Lyon via Gcc-patches
On Thu, 22 Oct 2020 at 17:22, Richard Earnshaw wrote: > > On 22/10/2020 09:45, Christophe Lyon via Gcc-patches wrote: > > On Wed, 21 Oct 2020 at 19:36, Richard Earnshaw > > wrote: > >> > >> On 21/10/2020 17:11, Christophe Lyon via Gcc-patches wrote: > >>> On Wed, 21 Oct 2020 at 18:07, Richard

Re: [PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 26, 2020 at 11:32:43AM +, Kyrylo Tkachov wrote: > Thanks, that makes sense. > Is the attached patch ok? --- a/gcc/gimple-ssa-store-merging.c +++ b/gcc/gimple-ssa-store-merging.c @@ -851,12 +851,16 @@ find_bswap_or_nop_finalize (struct symbolic_number *n, uint64_t *cmpxchg,

[PATCH V2] aarch64: Add bfloat16 vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second version of the patch here implementing the bfloat16_t neon related load intrinsics: vld2_lane_bf16, vld2q_lane_bf16, vld3_lane_bf16, vld3q_lane_bf16 vld4_lane_bf16, vld4q_lane_bf16. This better narrows testcases so they do not cause regressions for the arm backend where these

[PATCH V2] aarch64: Add vcopy(q)__lane(q)_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second version of the patch here implementing the bfloat16_t neon related copy intrinsics: vcopy_lane_bf16, vcopyq_lane_bf16, vcopyq_laneq_bf16, vcopy_laneq_bf16. This better narrows testcases so they do not cause regressions for the arm backend where these intrinsics are not yet

Re: Materialize clones on demand

2020-10-26 Thread Jan Hubicka
> > > > > > > cselib.c:3137 (cselib_init) 34M: 25.9% > > > > 34M 1514k: 17.3% heap > > > > tree-scalar-evolution.c:2984 (scev_initialize) 37M: 27.6% > > > > 50M 228k: 2.6% ggc > > > > > > Hmm, so we do > > > > > >

[ping*n] aarch64: move and adjust PROBE_STACK_*_REG

2020-10-26 Thread Olivier Hainque
Ping, please ? Thanks in advance, Olivier > On 15 Oct 2020, at 08:38, Olivier Hainque wrote: > > Ping, please ? > > Patch re-attached for convenience. > > Thanks in advance! > > Best Regards, > > Olivier > >> On 24 Sep 2020, at 11:46, Olivier Hainque wrote: >> >> Re-proposing this

Re: Materialize clones on demand

2020-10-26 Thread Jan Hubicka
> > We seem to leak some hashtables: > > dwarf2out.c:28850 (dwarf2out_init) 31M: 23.8% > > 47M 19 : 0.0% ggc > > that one likely keeps quite some memory live... Yep, having in-memory dwarf2out for whole cc1plus eats a lot of memory quite naturally. > > >

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 05:48, Segher Boessenkool wrote: > Hi! > > On Mon, Oct 26, 2020 at 10:09:41AM +, Alex Coplan wrote: > > On 22/10/2020 15:39, Segher Boessenkool wrote: > > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > > > Currently, make_extraction() identifies where we can

Re: [PATCH]AArch64 Fix overflow in memcopy expansion on aarch64.

2020-10-26 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >/* We can't do anything smart if the amount to copy is not constant. */ >if (!CONST_INT_P (operands[2])) > return false; > > - n = INTVAL (operands[2]); > + /* This may get truncated but that's fine as it would be above our maximum > + memset inline

[PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch fixes the ICE in the PR by bailing out of find_bswap_or_nop on poly_int sizes. I don't think it intends to handle them and from my reading of the code it's the most appropriate place to reject them here rather than in the callers. Bootstrapped and tested on

Re: [PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 26, 2020 at 09:20:42AM +, Kyrylo Tkachov via Gcc-patches wrote: > This patch fixes the ICE in the PR by bailing out of find_bswap_or_nop on > poly_int sizes. > I don't think it intends to handle them and from my reading of the code it's > the most appropriate place to reject them

[PATCH] middle-end/97554 - avoid overflow in alloc size compute

2020-10-26 Thread Richard Biener
This avoids overflow in the allocation size computations in sbitmap_vector_alloc when the result exceeds 2GB. Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed. 2020-10-26 Richard Biener * sbitmap.c (sbitmap_vector_alloc): Use size_t for byte quantities to avoid
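
A minimal sketch of the kind of overflow being avoided, assuming a 32-bit int. The helper names are hypothetical and this is not the code in sbitmap.c; a fixed 64-bit type stands in for size_t on a 64-bit host.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: total byte count for n_vecs vectors of bytes_per_vec
// bytes each. Widening one operand before multiplying keeps the product exact.
inline std::uint64_t total_bytes(unsigned n_vecs, unsigned bytes_per_vec) {
  return static_cast<std::uint64_t>(n_vecs) * bytes_per_vec;
}

// Hypothetical helper: true if a byte count is still representable in an int.
inline bool fits_in_int(std::uint64_t bytes) {
  return bytes <=
         static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// Small self-check: 3 vectors of 1 GiB each need 3 GiB, which is more than a
// 32-bit int can hold, so the same product computed in int would overflow.
inline void check_alloc_size() {
  const unsigned one_gib = 1u << 30;
  assert(total_bytes(3, one_gib) == 3221225472ULL);
  assert(!fits_in_int(total_bytes(3, one_gib)));
  assert(fits_in_int(total_bytes(1, one_gib)));
}
```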

[PATCH] tree-optimization/97539 - reset out-of-loop debug uses before peeling

2020-10-26 Thread Richard Biener
This makes sure to reset out-of-loop debug uses before vectorizer loop peeling as we cannot make sure to retain the use-def dominance relationship when there are no LC SSA nodes. Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed. 2020-10-26 Richard Biener PR

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
Hi! On Mon, Oct 26, 2020 at 10:09:41AM +, Alex Coplan wrote: > On 22/10/2020 15:39, Segher Boessenkool wrote: > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > > Currently, make_extraction() identifies where we can emit an ASHIFT of > > > an extend in place of an

RE: [PATCH 6/6] ipa-cp: Separate and increase the large-unit parameter

2020-10-26 Thread Tamar Christina via Gcc-patches
Hi Martin, I have been playing with --param ipa-cp-large-unit-insns but it doesn't seem to have any meaningful effect on exchange2 and I still can't recover the 12% regression vs GCC 10. Do I need to use another parameter here? Thanks, Tamar > -Original Message- > From: Gcc-patches

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 11:06, Alex Coplan via Gcc-patches wrote: > Well, only the low 32 bits of the subreg are valid. But because those > low 32 bits are shifted left 2 times, the low 34 bits of the ashift are > valid: the bottom 2 bits of the ashift are zeros, and the 32 bits above > those are from the
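
The bit-counting argument quoted above can be checked with plain integer arithmetic. This is an illustrative sketch only: the names are hypothetical and a 64-bit integer stands in for the wider register; it does not model RTL or the combine pass.

```cpp
#include <cassert>
#include <cstdint>

// Mask selecting the low 34 bits of a 64-bit value.
constexpr std::uint64_t kLow34 = (std::uint64_t{1} << 34) - 1;

// Hypothetical stand-in for the shift: move every bit up by two positions.
inline std::uint64_t shift_left_2(std::uint64_t reg) { return reg << 2; }

// Small self-check: two registers that agree only in their low 32 bits.
inline void check_low_34_bits() {
  const std::uint64_t with_garbage = 0xDEADBEEF12345678ULL;  // upper half undefined
  const std::uint64_t clean = 0x0000000012345678ULL;         // same low 32 bits

  // After the shift, the low 34 bits depend only on the low 32 input bits.
  assert((shift_left_2(with_garbage) & kLow34) ==
         (shift_left_2(clean) & kLow34));
  // The bottom two bits of the result are always zero.
  assert((shift_left_2(with_garbage) & 3u) == 0);
  // Above bit 33 the results differ, so only 34 bits can be relied on.
  assert(shift_left_2(with_garbage) != shift_left_2(clean));
}
```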

RE: [PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Jakub Jelinek > Sent: 26 October 2020 09:32 > To: Kyrylo Tkachov > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] PR tree-optimization/97546 Bail out of > find_bswap_or_nop on non-INTEGER_CST sizes > > On Mon, Oct 26, 2020 at 09:20:42AM +, Kyrylo

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 11:06:22AM +, Alex Coplan wrote: > Well, only the low 32 bits of the subreg are valid. But because those > low 32 bits are shifted left 2 times, the low 34 bits of the ashift are > valid: the bottom 2 bits of the ashift are zeros, and the 32 bits above > those are from

Fix simdclones pass

2020-10-26 Thread Jan Hubicka
Hi, this patch makes cleaning of stmt pointers in references more robust so late IPA passes do not break. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2020-10-26 Jan Hubicka PR ipa/97576 * cgraphclones.c (cgraph_node::materialize_clone): Clear stmt

[PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2020-10-26 Thread Julian Brown
Hi, This patch adds caching for the stack block allocated for offloaded OpenMP kernel launches on NVPTX. This is a performance optimisation -- we observed an average 11% or so performance improvement with this patch across a set of accelerated GPU benchmarks on one machine (results vary according

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> On 10/26/20 3:22 PM, Jan Hubicka wrote: > > Hi, > > this patch implements three-state optimize_for_size predicates, so with > > -Os > > and with profile feedback for never executed code it returns > > OPTIMIZE_SIZE_MAX > > while in cases we decide to optimize for size based on branch

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 10:34:20PM +1030, Alan Modra wrote: > On Sun, Oct 25, 2020 at 10:43:12AM -0400, David Edelsohn wrote: > > Another problem with all of the vsx_mask test cases is that they use > > -mcpu=power10 instead of -mdejagnu-cpu=power10. Can you follow up > > with that fix or do you

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 06:51, Segher Boessenkool wrote: > On Mon, Oct 26, 2020 at 11:06:22AM +, Alex Coplan wrote: > > Well, only the low 32 bits of the subreg are valid. But because those > > low 32 bits are shifted left 2 times, the low 34 bits of the ashift are > > valid: the bottom 2 bits of the

Re: [PATCH] g++, libstdc++: implement __is_nothrow_{constructible, assignable}

2020-10-26 Thread Jonathan Wakely via Gcc-patches
On 24/10/20 02:32 +0300, Ville Voutilainen via Libstdc++ wrote: @@ -1118,15 +1080,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; template -struct __is_nt_assignable_impl -: public integral_constant() = declval<_Up>())> -{ }; - - template -struct __is_nothrow_assignable_impl -

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Segher Boessenkool
Hi Alan, On Sun, Oct 25, 2020 at 09:50:01PM +1030, Alan Modra wrote: > All these tests fail with -m32 due to lack of int128 support, Is there any good reason __int128 is not enabled for rs6000 -m32, btw? > in some > cases with what I thought was not the best error message. For example >

Re: [PATCH]AArch64 Fix overflow in memcopy expansion on aarch64.

2020-10-26 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi Richard, > > The 10/26/2020 11:29, Richard Sandiford wrote: >> Tamar Christina writes: >> >/* We can't do anything smart if the amount to copy is not constant. */ >> >if (!CONST_INT_P (operands[2])) >> > return false; >> > >> > - n = INTVAL

Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-10-26 Thread Richard Sandiford via Gcc-patches
xiezhiheng writes: >> -Original Message- >> From: Richard Sandiford [mailto:richard.sandif...@arm.com] >> Sent: Wednesday, October 21, 2020 12:54 AM >> To: xiezhiheng >> Cc: Richard Biener ; gcc-patches@gcc.gnu.org >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions

[PATCH] cp/decl.c: Set DECL_INITIAL before attribute processing

2020-10-26 Thread Jozef Lawrynowicz
Attribute handlers may want to examine DECL_INITIAL for a decl, to validate the attribute being applied. For C++, DECL_INITIAL is currently not set until cp_finish_decl, by which time attribute validation has already been performed. For msp430-elf this causes the "persistent" attribute to always

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 07:12, Segher Boessenkool wrote: > Hi! > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > @@ -7650,20 +7650,27 @@ make_extraction (machine_mode mode, rtx inner, > > HOST_WIDE_INT pos, > > is_mode = GET_MODE (SUBREG_REG (inner)); > >inner = SUBREG_REG

Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 26, 2020 at 07:14:48AM -0700, Julian Brown wrote: > This patch adds caching for the stack block allocated for offloaded > OpenMP kernel launches on NVPTX. This is a performance optimisation -- > we observed an average 11% or so performance improvement with this patch > across a set of

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread H.J. Lu via Gcc-patches
On Mon, Oct 26, 2020 at 7:23 AM Jan Hubicka wrote: > > Hi, > this patch implements three-state optimize_for_size predicates, so with -Os > and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX > while in cases we decide to optimize for size based on branch prediction

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Martin Liška
On 10/26/20 3:22 PM, Jan Hubicka wrote: Hi, this patch implements three-state optimize_for_size predicates, so with -Os and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX while in cases we decide to optimize for size based on branch prediction logic it returns

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Sun, Oct 25, 2020 at 10:43:12AM -0400, David Edelsohn wrote: > On Sun, Oct 25, 2020 at 7:20 AM Alan Modra wrote: > > > > All these tests fail with -m32 due to lack of int128 support, in some > > cases with what I thought was not the best error message. For example > >

Re: [PATCH]AArch64 Fix overflow in memcopy expansion on aarch64.

2020-10-26 Thread Tamar Christina via Gcc-patches
Hi Richard, The 10/26/2020 11:29, Richard Sandiford wrote: > Tamar Christina writes: > >/* We can't do anything smart if the amount to copy is not constant. */ > >if (!CONST_INT_P (operands[2])) > > return false; > > > > - n = INTVAL (operands[2]); > > + /* This may get

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
Hi! On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > @@ -7650,20 +7650,27 @@ make_extraction (machine_mode mode, rtx inner, > HOST_WIDE_INT pos, > is_mode = GET_MODE (SUBREG_REG (inner)); >inner = SUBREG_REG (inner); > } > + else if ((GET_CODE (inner) == ASHIFT

[committed] libstdc++: Fix declarations of memalign etc. for freestanding [PR 97570]

2020-10-26 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog: PR libstdc++/97570 * libsupc++/new_opa.cc: Declare size_t in global namespace. Remove unused header. Tested x86_64-linux. Successfully built for avr cross (with avr-libc 2.0). Committed to trunk. commit 93e9a7bcd5434a24c945de33cd7fa01a25f68418

Re: [PATCH] cp/decl.c: Set DECL_INITIAL before attribute processing

2020-10-26 Thread Jozef Lawrynowicz
On Mon, Oct 26, 2020 at 01:30:29PM +, Jozef Lawrynowicz wrote: > Attribute handlers may want to examine DECL_INITIAL for a decl, to > validate the attribute being applied. For C++, DECL_INITIAL is currently > not set until cp_finish_decl, by which time attribute validation has > already been

Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
Hi, this patch implements three-state optimize_for_size predicates, so with -Os and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX while in cases we decide to optimize for size based on branch prediction logic it returns OPTIMIZE_SIZE_BALLANCED. The idea is that for

[PATCH] Re: error: ‘EVRP_MODE_DEBUG’ was not declared – was: [PUSHED] Ranger classes.

2020-10-26 Thread Andrew MacLeod via Gcc-patches
On 10/25/20 8:37 PM, Maciej W. Rozycki wrote: On Tue, 6 Oct 2020, Andrew MacLeod via Gcc-patches wrote: Build fails here now with: gimple-range.h:168:59: error: ‘EVRP_MODE_DEBUG’ was not declared in this scope And now builds – as the "Hybrid EVRP and testcases" was pushed as well, a bit more

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> On Mon, Oct 26, 2020 at 7:23 AM Jan Hubicka wrote: > > > > Hi, > > this patch implements three-state optimize_for_size predicates, so with > > -Os > > and with profile feedback for never executed code it returns > > OPTIMIZE_SIZE_MAX > > while in cases we decide to optimize for size based

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread H.J. Lu via Gcc-patches
On Mon, Oct 26, 2020 at 10:14 AM Jan Hubicka wrote: > > > > > > > For example you had patch that limited "rep cmpsb" expansion for > > > -minline-all-stringops. Now the conditions could be > > > -minline-all-stringops || optimize_insn_for_size () == OPTIMIZE_SIZE_MAX > > > since it is still

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
The following is the current change in i386.c, could you check whether the logic is good? thanks. Qing /* Check whether the register REGNO should be zeroed on X86. When ALL_SSE_ZEROED is true, all SSE registers have been zeroed together, no need to zero it again. When

Re: [PATCH V2] aarch64: Add vcopy(q)__lane(q)_bf16 intrinsics

2020-10-26 Thread Richard Sandiford via Gcc-patches
Andrea Corallo via Gcc-patches writes: > diff --git > a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcopy_lane_bf16_indices_1.c > > b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcopy_lane_bf16_indices_1.c > new file mode 100644 > index 000..9cbb5ea8110 > --- /dev/null

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 01:28:42PM +, Alex Coplan wrote: > On 26/10/2020 07:12, Segher Boessenkool wrote: > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > Can you instead replace the mult by a shift somewhere earlier in > > make_extract? That would make a lot more sense

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 01:18:54PM +, Alex Coplan wrote: > - else if (GET_CODE (inner) == ASHIFT > + else if ((GET_CODE (inner) == ASHIFT || GET_CODE (inner) == MULT) As I wrote in the other mail, write this as two cases. Write something in the comment for the mult one that this is for the

Re: [PATCH v2] builtins: rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 01:05:00PM -0300, Raoni Fassina Firmino wrote: > On Mon, Sep 28, 2020 at 11:42:13AM -0500, will schmidt wrote: > > > +;; FE_INEXACT, FE_DIVBYZERO, FE_UNDERFLOW and FE_OVERFLOW flags. > > > +;; It doesn't handle values out of range, and always returns 0. > > > +;; Note that

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> > > > For example you had patch that limited "rep cmpsb" expansion for > > -minline-all-stringops. Now the conditions could be > > -minline-all-stringops || optimize_insn_for_size () == OPTIMIZE_SIZE_MAX > > since it is still useful size optimization. > > > > I am not sure if you had other
