Re: [committed] libstdc++: Simplify std::shared_ptr construction from std::weak_ptr

2020-10-26 Thread Stephan Bergmann via Gcc-patches
On 21/10/2020 22:14, Jonathan Wakely via Gcc-patches wrote: The _M_add_ref_lock() and _M_add_ref_lock_nothrow() members of _Sp_counted_base are very similar, except that the former throws an exception when the use count is zero and the latter returns false. The former (and its callers) can be imp
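The throw-versus-return-false split described in this message is also visible through the public std::weak_ptr interface. The sketch below is only an illustration using standard library calls; the helper names are hypothetical and it does not touch the internal _Sp_counted_base members or reflect the actual patch.

```cpp
#include <memory>

// Hypothetical helper: true when constructing a shared_ptr from wp throws.
// The shared_ptr(weak_ptr) constructor is the throwing path: it raises
// std::bad_weak_ptr when the use count has already dropped to zero.
bool throwing_path_throws(const std::weak_ptr<int>& wp) {
    try {
        std::shared_ptr<int> sp(wp);
        return false;
    } catch (const std::bad_weak_ptr&) {
        return true;
    }
}

// Hypothetical helper: true when the non-throwing path obtains ownership.
// weak_ptr::lock() never throws; it returns an empty shared_ptr instead.
bool nothrow_path_succeeds(const std::weak_ptr<int>& wp) {
    return static_cast<bool>(wp.lock());
}
```

Both helpers observe the same condition (an expired weak_ptr) and differ only in how the failure is reported.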

Re: Materialize clones on demand

2020-10-26 Thread Richard Biener
On Fri, 23 Oct 2020, Jan Hubicka wrote: > > Hi, > > > > On Thu, Oct 22 2020, Jan Hubicka wrote: > > > Hi, > > > this patch removes the pass to materialize all clones and instead this > > > is now done on demand. The motivation is to reduce lifetime of function > > > bodies in ltrans that should

Re: [PATCH] Add debug_bb_details and debug_bb_n_details

2020-10-26 Thread Richard Biener
On Mon, 26 Oct 2020, Xionghu Luo wrote: > > On 2020/10/23 18:18, Richard Biener wrote: > > On Fri, 23 Oct 2020, Xiong Hu Luo wrote: > > > >> Sometimes debug_bb_slim&debug_bb_n_slim is not enough, how about adding > >> this debug_bb_details&debug_bb_n_details? Or any other similar call > >> exist

Re: [PATCH, OpenMP 5.0] Implement structure element mapping changes in 5.0

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Sat, Oct 24, 2020 at 01:43:26AM +0800, Chung-Lin Tang wrote: > On 2020/10/23 8:13 PM, Jakub Jelinek wrote: > > > In general, upon encountering a construct, we can't statically determine > > > and insert alloc/release maps > > > for each element of a structure variable, since we don't really kno

[PATCH V2] aarch64: Add bfloat16 vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second version of the patch here implementing the bfloat16_t neon related load intrinsics: vld2_lane_bf16, vld2q_lane_bf16, vld3_lane_bf16, vld3q_lane_bf16 vld4_lane_bf16, vld4q_lane_bf16. This better narrows testcases so they do not cause regressions for the arm backend where these intri

[PATCH V2] aarch64: Add vstN_lane_bf16 + vstNq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second version of the patch here implementing the bfloat16_t neon related store intrinsics: vst2_lane_bf16, vst2q_lane_bf16, vst3_lane_bf16, vst3q_lane_bf16 vst4_lane_bf16, vst4q_lane_bf16. Please refer to: ACLE ISA

[PATCH V2] aarch64: Add vcopy(q)__lane(q)_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second version of the patch here implementing the bfloat16_t neon related copy intrinsics: vcopy_lane_bf16, vcopyq_lane_bf16, vcopyq_laneq_bf16, vcopy_laneq_bf16. This better narrows testcases so they do not cause regressions for the arm backend where these intrinsics are not yet present.

[PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch fixes the ICE in the PR by bailing out of find_bswap_or_nop on poly_int sizes. I don't think it intends to handle them and from my reading of the code it's the most appropriate place to reject them here rather than in the callers. Bootstrapped and tested on aarch64-none-linux

Re: [PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 26, 2020 at 09:20:42AM +, Kyrylo Tkachov via Gcc-patches wrote: > This patch fixes the ICE in the PR by bailing out of find_bswap_or_nop on > poly_int sizes. > I don't think it intends to handle them and from my reading of the code it's > the most appropriate place to reject them

Re: Materialize clones on demand

2020-10-26 Thread Jan Hubicka
> > We seem to leak some hashtables: > > dwarf2out.c:28850 (dwarf2out_init) 31M: 23.8% > > 47M 19 : 0.0% ggc > > that one likely keeps quite some memory live... Yep, having in-memory dwarf2out for whole cc1plus eats a lot of memory quite naturally. > > > c

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
Hi Segher, On 22/10/2020 15:39, Segher Boessenkool wrote: > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > Currently, make_extraction() identifies where we can emit an ASHIFT of > > an extend in place of an extraction, but fails to make the corresponding > > canonicalization/simp

Re: [committed] libstdc++: Simplify std::shared_ptr construction from std::weak_ptr

2020-10-26 Thread Jonathan Wakely via Gcc-patches
On 26/10/20 08:07 +0100, Stephan Bergmann wrote: On 21/10/2020 22:14, Jonathan Wakely via Gcc-patches wrote: The _M_add_ref_lock() and _M_add_ref_lock_nothrow() members of _Sp_counted_base are very similar, except that the former throws an exception when the use count is zero and the latter retu

Re: Materialize clones on demand

2020-10-26 Thread Richard Biener
On Mon, 26 Oct 2020, Jan Hubicka wrote: > > > We seem to leak some hashtables: > > > dwarf2out.c:28850 (dwarf2out_init) 31M: 23.8% > > > 47M 19 : 0.0% ggc > > > > that one likely keeps quite some memory live... > > Yep, having in-memory dwarf2out for whole

[PATCH] middle-end/97554 - avoid overflow in alloc size compute

2020-10-26 Thread Richard Biener
This avoids overflow in the allocation size computations in sbitmap_vector_alloc when the result exceeds 2GB. Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed. 2020-10-26 Richard Biener * sbitmap.c (sbitmap_vector_alloc): Use size_t for byte quantities to avoid overfl
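The class of bug described here can be sketched as follows. This is not the sbitmap.c code: the function names are hypothetical, unsigned 32-bit arithmetic stands in for the narrow type so that the wraparound is well defined, and the wide variant only helps on hosts where size_t is 64 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: byte count computed in 32-bit arithmetic.
// The product wraps modulo 2^32 once it no longer fits.
std::uint32_t total_bytes_32(std::uint32_t n_vecs, std::uint32_t bytes_per_vec) {
    return n_vecs * bytes_per_vec;
}

// Hypothetical helper: widen one operand to size_t before multiplying,
// so the product is computed in the wider type.
std::size_t total_bytes_wide(std::uint32_t n_vecs, std::uint32_t bytes_per_vec) {
    return static_cast<std::size_t>(n_vecs) * bytes_per_vec;
}
```

With 70000 vectors of 70000 bytes each, the 32-bit version yields a value far smaller than the true size, which is how an undersized allocation can arise.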

[PATCH] tree-optimization/97539 - reset out-of-loop debug uses before peeling

2020-10-26 Thread Richard Biener
This makes sure to reset out-of-loop debug uses before vectorizer loop peeling as we cannot make sure to retain the use-def dominance relationship when there are no LC SSA nodes. Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed. 2020-10-26 Richard Biener PR tree-optimization/

Re: Materialize clones on demand

2020-10-26 Thread Jan Hubicka
> > > > > > > cselib.c:3137 (cselib_init) 34M: 25.9% > > > > 34M 1514k: 17.3% heap > > > > tree-scalar-evolution.c:2984 (scev_initialize) 37M: 27.6% > > > > 50M 228k: 2.6% ggc > > > > > > Hmm, so we do > > > > > > scalar_e

[Committed] IBM Z: Add vcond_mask expander

2020-10-26 Thread Andreas Krebbel via Gcc-patches
After adding vec_cmp expanders we have seen various performance-related regressions in the testsuite. These appear to be caused by a missing vcond_mask definition in the backend. Fixed with this patch. The patch fixes the following testsuite failures: FAIL: gcc.dg/vect/vect-21.c -flto -ffat-lto-obj

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
Hi! On Mon, Oct 26, 2020 at 10:09:41AM +, Alex Coplan wrote: > On 22/10/2020 15:39, Segher Boessenkool wrote: > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > > Currently, make_extraction() identifies where we can emit an ASHIFT of > > > an extend in place of an extraction,

Re: [PATCH] arm: Fix multiple inheritance thunks for thumb-1 with -mpure-code

2020-10-26 Thread Christophe Lyon via Gcc-patches
On Thu, 22 Oct 2020 at 17:22, Richard Earnshaw wrote: > > On 22/10/2020 09:45, Christophe Lyon via Gcc-patches wrote: > > On Wed, 21 Oct 2020 at 19:36, Richard Earnshaw > > wrote: > >> > >> On 21/10/2020 17:11, Christophe Lyon via Gcc-patches wrote: > >>> On Wed, 21 Oct 2020 at 18:07, Richard Ear

RE: [PATCH 6/6] ipa-cp: Separate and increase the large-unit parameter

2020-10-26 Thread Tamar Christina via Gcc-patches
Hi Martin, I have been playing with --param ipa-cp-large-unit-insns but it doesn't seem to have any meaningful effect on exchange2 and I still can't recover the 12% regression vs GCC 10. Do I need to use another parameter here? Thanks, Tamar > -Original Message- > From: Gcc-patches On

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 05:48, Segher Boessenkool wrote: > Hi! > > On Mon, Oct 26, 2020 at 10:09:41AM +, Alex Coplan wrote: > > On 22/10/2020 15:39, Segher Boessenkool wrote: > > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > > > Currently, make_extraction() identifies where we can e

[ping*n] aarch64: move and adjust PROBE_STACK_*_REG

2020-10-26 Thread Olivier Hainque
Ping, please ? Thanks in advance, Olivier > On 15 Oct 2020, at 08:38, Olivier Hainque wrote: > > Ping, please ? > > Patch re-attached for convenience. > > Thanks in advance! > > Best Regards, > > Olivier > >> On 24 Sep 2020, at 11:46, Olivier Hainque wrote: >> >> Re-proposing this patch

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 11:06, Alex Coplan via Gcc-patches wrote: > Well, only the low 32 bits of the subreg are valid. But because those > low 32 bits are shifted left 2 times, the low 34 bits of the ashift are > valid: the bottom 2 bits of the ashift are zeros, and the 32 bits above > those are from the i
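The bit-validity argument quoted here can be checked with plain integers. The sketch below models the register as a 64-bit value whose upper 32 bits are arbitrary; the helper names are hypothetical and nothing here is RTL or combine code.

```cpp
#include <cstdint>

// Keep only the low n bits of v (valid for 0 < n < 64).
constexpr std::uint64_t low_bits(std::uint64_t v, unsigned n) {
    return v & ((std::uint64_t{1} << n) - 1);
}

// Shift a 64-bit "register" left by 2 and keep the low 34 bits.
// Those bits depend only on the low 32 bits of the input: the bottom
// two are zeros and the next 32 are the input's low 32 bits.
constexpr std::uint64_t shifted_low34(std::uint64_t reg) {
    return low_bits(reg << 2, 34);
}
```

Two inputs that agree in their low 32 bits give the same result regardless of what sits in the upper half.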

Re: [PATCH]AArch64 Fix overflow in memcopy expansion on aarch64.

2020-10-26 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >/* We can't do anything smart if the amount to copy is not constant. */ >if (!CONST_INT_P (operands[2])) > return false; > > - n = INTVAL (operands[2]); > + /* This may get truncated but that's fine as it would be above our maximum > + memset inline li

RE: [PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Jakub Jelinek > Sent: 26 October 2020 09:32 > To: Kyrylo Tkachov > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] PR tree-optimization/97546 Bail out of > find_bswap_or_nop on non-INTEGER_CST sizes > > On Mon, Oct 26, 2020 at 09:20:42AM +, Kyrylo Tk

Re: [PATCH] PR tree-optimization/97546 Bail out of find_bswap_or_nop on non-INTEGER_CST sizes

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 26, 2020 at 11:32:43AM +, Kyrylo Tkachov wrote: > Thanks, that makes sense. > Is the attached patch ok? --- a/gcc/gimple-ssa-store-merging.c +++ b/gcc/gimple-ssa-store-merging.c @@ -851,12 +851,16 @@ find_bswap_or_nop_finalize (struct symbolic_number *n, uint64_t *cmpxchg, gimple

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 11:06:22AM +, Alex Coplan wrote: > Well, only the low 32 bits of the subreg are valid. But because those > low 32 bits are shifted left 2 times, the low 34 bits of the ashift are > valid: the bottom 2 bits of the ashift are zeros, and the 32 bits above > those are from t

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Alan Modra via Gcc-patches
On Sun, Oct 25, 2020 at 10:43:12AM -0400, David Edelsohn wrote: > On Sun, Oct 25, 2020 at 7:20 AM Alan Modra wrote: > > > > All these tests fail with -m32 due to lack of int128 support, in some > > cases with what I thought was not the best error message. For example > > vsx_mask-move-runnable.c:

Re: [PATCH]AArch64 Fix overflow in memcopy expansion on aarch64.

2020-10-26 Thread Tamar Christina via Gcc-patches
Hi Richard, The 10/26/2020 11:29, Richard Sandiford wrote: > Tamar Christina writes: > >/* We can't do anything smart if the amount to copy is not constant. */ > >if (!CONST_INT_P (operands[2])) > > return false; > > > > - n = INTVAL (operands[2]); > > + /* This may get truncated

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
Hi! On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > @@ -7650,20 +7650,27 @@ make_extraction (machine_mode mode, rtx inner, > HOST_WIDE_INT pos, > is_mode = GET_MODE (SUBREG_REG (inner)); >inner = SUBREG_REG (inner); > } > + else if ((GET_CODE (inner) == ASHIFT |

Re: [PATCH] g++, libstdc++: implement __is_nothrow_{constructible, assignable}

2020-10-26 Thread Jonathan Wakely via Gcc-patches
On 24/10/20 02:32 +0300, Ville Voutilainen via Libstdc++ wrote: @@ -1118,15 +1080,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION }; template -struct __is_nt_assignable_impl -: public integral_constant() = declval<_Up>())> -{ }; - - template -struct __is_nothrow_assignable_impl -

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Segher Boessenkool
Hi Alan, On Sun, Oct 25, 2020 at 09:50:01PM +1030, Alan Modra wrote: > All these tests fail with -m32 due to lack of int128 support, Is there any good reason __int128 is not enabled for rs6000 -m32, btw? > in some > cases with what I thought was not the best error message. For example > vsx_mas

Re: [RS6000] Tests that use int128_t and -m32

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 10:34:20PM +1030, Alan Modra wrote: > On Sun, Oct 25, 2020 at 10:43:12AM -0400, David Edelsohn wrote: > > Another problem with all of the vsx_mask test cases is that they use > > -mcpu=power10 instead of -mdejagnu-cpu=power10. Can you follow up > > with that fix or do you

[committed] libstdc++: Fix declarations of memalign etc. for freestanding [PR 97570]

2020-10-26 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog: PR libstdc++/97570 * libsupc++/new_opa.cc: Declare size_t in global namespace. Remove unused header. Tested x86_64-linux. Successfully built for avr cross (with avr-libc 2.0). Committed to trunk. commit 93e9a7bcd5434a24c945de33cd7fa01a25f68418 Aut

Re: [PATCH]AArch64 Fix overflow in memcopy expansion on aarch64.

2020-10-26 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi Richard, > > The 10/26/2020 11:29, Richard Sandiford wrote: >> Tamar Christina writes: >> >/* We can't do anything smart if the amount to copy is not constant. */ >> >if (!CONST_INT_P (operands[2])) >> > return false; >> > >> > - n = INTVAL (operands[2

Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-10-26 Thread Richard Sandiford via Gcc-patches
xiezhiheng writes: >> -Original Message- >> From: Richard Sandiford [mailto:richard.sandif...@arm.com] >> Sent: Wednesday, October 21, 2020 12:54 AM >> To: xiezhiheng >> Cc: Richard Biener ; gcc-patches@gcc.gnu.org >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions >

Fix simdclones pass

2020-10-26 Thread Jan Hubicka
Hi, this patch makes cleaning of stmt pointers in references more robust so late IPA passes do not break. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2020-10-26 Jan Hubicka PR ipa/97576 * cgraphclones.c (cgraph_node::materialize_clone): Clear stmt

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 06:51, Segher Boessenkool wrote: > On Mon, Oct 26, 2020 at 11:06:22AM +, Alex Coplan wrote: > > Well, only the low 32 bits of the subreg are valid. But because those > > low 32 bits are shifted left 2 times, the low 34 bits of the ashift are > > valid: the bottom 2 bits of the ash

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Alex Coplan via Gcc-patches
On 26/10/2020 07:12, Segher Boessenkool wrote: > Hi! > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > @@ -7650,20 +7650,27 @@ make_extraction (machine_mode mode, rtx inner, > > HOST_WIDE_INT pos, > > is_mode = GET_MODE (SUBREG_REG (inner)); > >inner = SUBREG_REG (i

[PATCH] cp/decl.c: Set DECL_INITIAL before attribute processing

2020-10-26 Thread Jozef Lawrynowicz
Attribute handlers may want to examine DECL_INITIAL for a decl, to validate the attribute being applied. For C++, DECL_INITIAL is currently not set until cp_finish_decl, by which time attribute validation has already been performed. For msp430-elf this causes the "persistent" attribute to always b

Re: [PATCH] cp/decl.c: Set DECL_INITIAL before attribute processing

2020-10-26 Thread Jozef Lawrynowicz
On Mon, Oct 26, 2020 at 01:30:29PM +, Jozef Lawrynowicz wrote: > Attribute handlers may want to examine DECL_INITIAL for a decl, to > validate the attribute being applied. For C++, DECL_INITIAL is currently > not set until cp_finish_decl, by which time attribute validation has > already been pe

[PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2020-10-26 Thread Julian Brown
Hi, This patch adds caching for the stack block allocated for offloaded OpenMP kernel launches on NVPTX. This is a performance optimisation -- we observed an average 11% or so performance improvement with this patch across a set of accelerated GPU benchmarks on one machine (results vary according

Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
Hi, this patch implements three-state optimize_for_size predicates, so with -Os and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX while in cases we decide to optimize for size based on branch prediction logic it returns OPTIMIZE_SIZE_BALLANCED. The idea is that for p

[PATCH] Re: error: ‘EVRP_MODE_DEBUG’ was not declared – was: [PUSHED] Ranger classes.

2020-10-26 Thread Andrew MacLeod via Gcc-patches
On 10/25/20 8:37 PM, Maciej W. Rozycki wrote: On Tue, 6 Oct 2020, Andrew MacLeod via Gcc-patches wrote: Build fails here now with: gimple-range.h:168:59: error: ‘EVRP_MODE_DEBUG’ was not declared in this scope And now builds – as the "Hybrid EVRP and testcases" was pushed as well, a bit more

Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch

2020-10-26 Thread Jakub Jelinek via Gcc-patches
On Mon, Oct 26, 2020 at 07:14:48AM -0700, Julian Brown wrote: > This patch adds caching for the stack block allocated for offloaded > OpenMP kernel launches on NVPTX. This is a performance optimisation -- > we observed an average 11% or so performance improvement with this patch > across a set of a

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread H.J. Lu via Gcc-patches
On Mon, Oct 26, 2020 at 7:23 AM Jan Hubicka wrote: > > Hi, > this patch implements three-state optimize_for_size predicates, so with -Os > and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX > while in cases we decide to optimize for size based on branch prediction lo

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Martin Liška
On 10/26/20 3:22 PM, Jan Hubicka wrote: Hi, this patch implements three-state optimize_for_size predicates, so with -Os and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX while in cases we decide to optimize for size based on branch prediction logic it returns OPTIMI

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> On 10/26/20 3:22 PM, Jan Hubicka wrote: > > Hi, > > this patch implements three-state optimize_for_size predicates, so with > > -Os > > and with profile feedback for never executed code it returns > > OPTIMIZE_SIZE_MAX > > while in cases we decide to optimize for size based on branch predict

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> On Mon, Oct 26, 2020 at 7:23 AM Jan Hubicka wrote: > > > > Hi, > > this patch implements three-state optimize_for_size predicates, so with > > -Os > > and with profile feedback for never executed code it returns > > OPTIMIZE_SIZE_MAX > > while in cases we decide to optimize for size based o

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
>> >> +/* Generate insns to zero all st/mm registers together. >> + Return true when zeroing instructions are generated. >> + Assume the number of st registers that are zeroed is num_of_st, >> + we will emit the following sequence to zero them together: >> + fldz; \ >

*PING* RE: [Patch] testsuite: Avoid TCL errors when rootme or ASAN/TSAN/UBSAN is not available (was: Re: [Patch] testsuite: Avoid TCL errors when ASAN/TSAN/UBSAN is not available)

2020-10-26 Thread Burnus, Tobias
-Original Message- From: Tobias Burnus [mailto:tob...@codesourcery.com] Sent: Monday, October 19, 2020 6:03 PM To: gcc-patches ; Rainer Orth ; Mike Stump Subject: [Patch] testsuite: Avoid TCL errors when rootme or ASAN/TSAN/UBSAN is not available (was: Re: [Patch] testsuite: Avoid TCL

[PATCH] Refactor SLP instance analysis

2020-10-26 Thread Richard Biener
This refactors the toplevel entry to analyze an SLP instance to expose a worker analyzing from a vector of stmts and an SLP entry kind. Bootstrap & regtest running on x86_64-unknown-linux-gnu. 2020-10-26 Richard Biener * tree-vect-slp.c (enum slp_instance_kind): New. (vect_bui

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread H.J. Lu via Gcc-patches
On Mon, Oct 26, 2020 at 7:36 AM Jan Hubicka wrote: > > > On Mon, Oct 26, 2020 at 7:23 AM Jan Hubicka wrote: > > > > > > Hi, > > > this patch implements three-state optimize_for_size predicates, so > > > with -Os > > > and with profile feedback for never executed code it returns > > > OPTIMIZ

Re: Extend builtin fnspecs

2020-10-26 Thread Richard Biener
On Mon, 19 Oct 2020, Jan Hubicka wrote: > > > + /* True if memory reached by the argument is read. > > > + Valid only if all loads are known. */ > > > + bool > > > + arg_read_p (unsigned int i) > > > + { > > > +unsigned int idx = arg_idx (i); > > > +gcc_checking_assert (arg_specif

Re: [PATCH v2] builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Raoni Fassina Firmino via Gcc-patches
On Thu, Oct 01, 2020 at 03:08:19PM -0500, Segher Boessenkool wrote: > On Thu, Oct 01, 2020 at 08:08:01AM +0200, Richard Biener wrote: > > On Wed, 30 Sep 2020, Segher Boessenkool wrote: > > > It's going to be challenging to find a reasonable spot in there. > > > Oh well. > > > > Put it next to fmin

Re: [PATCH v2] builtins: rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Raoni Fassina Firmino via Gcc-patches
On Mon, Oct 05, 2020 at 10:36:22AM -0500, Segher Boessenkool wrote: > Should this pattern not allow setting more than one exception bit at > once, btw? Turns out allowing more than one bit was no problem at all. On Mon, Oct 05, 2020 at 10:36:22AM -0500, Segher Boessenkool wrote: > On Sun, Oct 04,

[PATCH] Move SLP nodes to an alloc-pool

2020-10-26 Thread Richard Biener
This introduces a global alloc-pool for SLP nodes to reduce overhead on SLP allocation churn which will get worse and to eventually release SLP cycles which will retain a refcount of one and thus are never freed at the moment. Bootstrap / regtest pending on x86_64-unknown-linux-gnu. 2020-10-26 R

Re: [PATCH V2] aarch64: Add bfloat16 vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

2020-10-26 Thread Richard Sandiford via Gcc-patches
Andrea Corallo via Gcc-patches writes: > Hi all, > > Second version of the patch here implementing the bfloat16_t neon > related load intrinsics: vld2_lane_bf16, vld2q_lane_bf16, > vld3_lane_bf16, vld3q_lane_bf16 vld4_lane_bf16, vld4q_lane_bf16. > > This better narrows testcases so they do not cau

[PATCH 1/x] arm: Add vld1_lane_bf16 + vldq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, I'd like to submit the following patch implementing the bfloat16_t neon related load intrinsics: vld1_lane_bf16, vld1q_lane_bf16. Please refer to: ACLE ISA Regtested and bootstrapped. Oka

[PATCH 2/x] arm: add vst1_lane_bf16 + vstq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Hi all, Second patch of the series here adding vst1_lane_bf16, vst1q_lane_bf16 bfloat16 related neon intrinsics. Please refer to: ACLE ISA Regtested and bootstrapped. Okay for trunk? Andrea

Re: [PATCH v2] builtins: rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Raoni Fassina Firmino via Gcc-patches
On Mon, Sep 28, 2020 at 11:42:13AM -0500, will schmidt wrote: > > +/* Expand call EXP to either feclearexcept or feraiseexcept builtins (from > > C99 > > +fenv.h), returning the result and setting it in TARGET. Otherwise > > return > > +NULL_RTX on failure. */ > > +static rtx > > +expan

Re: [RS6000] Unsupported test options for -m32

2020-10-26 Thread David Edelsohn via Gcc-patches
FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors) Excess errors: cc1: error: '-mcmodel' not supported in this configuration * gcc.target/powerpc/swaps-p8-22.c: Disable for -m32. diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c b/gcc/testsuite/gcc.target/powerpc/swaps-p8-2

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 26, 2020 at 3:45 PM Qing Zhao wrote: > > > +/* Generate insns to zero all st/mm registers together. > + Return true when zeroing instructions are generated. > + Assume the number of st registers that are zeroed is num_of_st, > + we will emit the following sequence to zero them to

Re: [PATCH V2] aarch64: Add bfloat16 vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

2020-10-26 Thread Andrea Corallo via Gcc-patches
Richard Sandiford writes: > Andrea Corallo via Gcc-patches writes: >> Hi all, >> >> Second version of the patch here implementing the bfloat16_t neon >> related load intrinsics: vld2_lane_bf16, vld2q_lane_bf16, >> vld3_lane_bf16, vld3q_lane_bf16 vld4_lane_bf16, vld4q_lane_bf16. >> >> This better

Re: [RS6000] Unsupported test options for -m32

2020-10-26 Thread Iain Sandoe via Gcc-patches
David Edelsohn via Gcc-patches wrote: FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors) Excess errors: cc1: error: '-mcmodel' not supported in this configuration * gcc.target/powerpc/swaps-p8-22.c: Disable for -m32. diff --git a/gcc/testsuite/gcc.target/powerpc/swaps-p8-22.c b/g

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
> On Oct 26, 2020, at 11:13 AM, Uros Bizjak wrote: > > On Mon, Oct 26, 2020 at 3:45 PM Qing Zhao > wrote: >> >> >> +/* Generate insns to zero all st/mm registers together. >> + Return true when zeroing instructions are generated. >> + Assume the number of st

Re: [PATCH v2] builtins: rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 01:05:00PM -0300, Raoni Fassina Firmino wrote: > On Mon, Sep 28, 2020 at 11:42:13AM -0500, will schmidt wrote: > > > +;; FE_INEXACT, FE_DIVBYZERO, FE_UNDERFLOW and FE_OVERFLOW flags. > > > +;; It doesn't handle values out of range, and always returns 0. > > > +;; Note that F

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> > > > For example you had patch that limited "rep cmpsb" expansion for > > -minline-all-stringops. Now the conditions could be > > -minline-all-stringops || optimize_insn_for_size () == OPTIMIZE_SIZE_MAX > > since it is still useful size optimization. > > > > I am not sure if you had other chang

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread H.J. Lu via Gcc-patches
On Mon, Oct 26, 2020 at 10:14 AM Jan Hubicka wrote: > > > > > > > For example you had patch that limited "rep cmpsb" expansion for > > > -minline-all-stringops. Now the conditions could be > > > -minline-all-stringops || optimize_insn_for_size () == OPTIMIZE_SIZE_MAX > > > since it is still usefu

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
The following is the current change in i386.c, could you check whether the logic is good? thanks. Qing /* Check whether the register REGNO should be zeroed on X86. When ALL_SSE_ZEROED is true, all SSE registers have been zeroed together, no need to zero it again. When EXIT_WITH_MMX_

Re: [PATCH V2] aarch64: Add vcopy(q)__lane(q)_bf16 intrinsics

2020-10-26 Thread Richard Sandiford via Gcc-patches
Andrea Corallo via Gcc-patches writes: > diff --git > a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcopy_lane_bf16_indices_1.c > > b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcopy_lane_bf16_indices_1.c > new file mode 100644 > index 000..9cbb5ea8110 > --- /dev/null >

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 01:28:42PM +, Alex Coplan wrote: > On 26/10/2020 07:12, Segher Boessenkool wrote: > > On Thu, Oct 15, 2020 at 09:59:24AM +0100, Alex Coplan wrote: > > Can you instead replace the mult by a shift somewhere earlier in > > make_extract? That would make a lot more sense :-)

Re: [PATCH 2/2] combine: Don't turn (mult (extend x) 2^n) into extract

2020-10-26 Thread Segher Boessenkool
On Mon, Oct 26, 2020 at 01:18:54PM +, Alex Coplan wrote: > - else if (GET_CODE (inner) == ASHIFT > + else if ((GET_CODE (inner) == ASHIFT || GET_CODE (inner) == MULT) As I wrote in the other mail, write this as two cases. Write something in the comment for the mult one that this is for the

[PATCH] Handle signed 1-bit ranges in irange::invert.

2020-10-26 Thread Aldy Hernandez via Gcc-patches
The problem here is we are trying to add 1 to a -1 in a signed 1-bit field and coming up with UNDEFINED because of the overflow. Signed 1-bits are annoying because you can't really add or subtract one, because the one is unrepresentable. For invert() we have a special subtract_one() function that
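Why the value 1 is unrepresentable in a signed 1-bit field follows from the two's-complement range formula. The sketch below is a standalone illustration with hypothetical names; it is not the irange code and says nothing about how subtract_one() is implemented.

```cpp
#include <cstdint>

// Inclusive range of a two's-complement signed integer of the given width.
struct SignedRange {
    std::int64_t min;
    std::int64_t max;
};

// Valid for 1 <= bits <= 63. For bits == 1 this gives [-1, 0].
constexpr SignedRange signed_range(unsigned bits) {
    return SignedRange{ -(std::int64_t{1} << (bits - 1)),
                        (std::int64_t{1} << (bits - 1)) - 1 };
}

// True when v fits in a signed field of the given width.
constexpr bool representable(std::int64_t v, unsigned bits) {
    return signed_range(bits).min <= v && v <= signed_range(bits).max;
}
```

For a 1-bit signed field the only values are -1 and 0, so the constant 1 used in "add one" or "subtract one" cannot itself be expressed in that type.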

Re: [RS6000] VSX_MM_SUFFIX

2020-10-26 Thread Segher Boessenkool
On Sun, Oct 25, 2020 at 05:16:10AM -0500, Segher Boessenkool wrote: > On Sun, Oct 25, 2020 at 11:55:39AM +1030, Alan Modra wrote: > > > If you use a macro that doesn't exist, the compiler simply does not > > > build! > > > > My empirical evidence to the contrary says your theoretical arguments > >

Re: Make default duplicate and insert methods of summaries abort; fix fallout

2020-10-26 Thread Martin Liška
On 10/25/20 2:22 PM, Jan Hubicka wrote: Hi, the default duplicate and insert methods of summaries produce an empty summary that is not useful for anything and makes it easy to introduce bugs. This patch makes the default hooks abort, and summaries that do not need duplication/insertion disable the c

Re: Make default duplicate and insert methods of summaries abort; fix fallout

2020-10-26 Thread Jan Hubicka
> > gcc/ChangeLog: > > * symbol-summary.h (function_summary_base::unregister_hooks): > Call disable_insertion_hook and disable_duplication_hook. > (function_summary_base::symtab_insertion): New field. > (function_summary_base::symtab_removal): Likewise. > (function_s

Re: Implement three-level optimize_for_size predicates

2020-10-26 Thread Jan Hubicka
> On Mon, Oct 26, 2020 at 10:14 AM Jan Hubicka wrote: > > > > > > > > > > For example you had patch that limited "rep cmpsb" expansion for > > > > -minline-all-stringops. Now the conditions could be > > > > -minline-all-stringops || optimize_insn_for_size () == OPTIMIZE_SIZE_MAX > > > > since it

[PATCH] lto: no sub-make when --jobserver-auth= is missing

2020-10-26 Thread Martin Liška
We now correctly detect that a job server is not active for an LTO link: lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS' In that situation we should not call make -f abc.mk as it can lead to N^2 LTRANS units. Ready for master? Thanks, Mar
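The check behind that warning can be approximated as a substring test on the MAKEFLAGS value. The sketch below is an assumption about the general shape of such a check, with a hypothetical function name; it is not the lto-wrapper source and it does not validate the file descriptors that follow the option.

```cpp
#include <cstring>

// Hypothetical helper: true when a MAKEFLAGS-style string mentions
// the --jobserver-auth= option. A null pointer (variable not set)
// counts as "no jobserver".
bool jobserver_advertised(const char* makeflags) {
    return makeflags != nullptr
        && std::strstr(makeflags, "--jobserver-auth=") != nullptr;
}
```

A caller would typically pass the result of std::getenv("MAKEFLAGS") (from <cstdlib>) and fall back to a serial or self-managed parallel strategy when the function returns false.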

Re: [RS6000] Unsupported test options for -m32

2020-10-26 Thread Segher Boessenkool
On Sun, Oct 25, 2020 at 09:51:29PM +1030, Alan Modra wrote: > FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors) > Excess errors: > cc1: error: '-mcmodel' not supported in this configuration This is because your build is not biarch. We really should not allow such configurations, the

Re: [RS6000] Remove -mpcrel from tests

2020-10-26 Thread Segher Boessenkool
On Sun, Oct 25, 2020 at 09:52:40PM +1030, Alan Modra wrote: > When running with -m32 > FAIL: gcc.target/powerpc/pr94740.c (test for excess errors) > Excess errors: > cc1: error: '-mpcrel' requires '-mcmodel=medium' > > The others don't run for -m32, but remove the unnecessary -mpcrel > anyway. >

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: > > > The following is the current change in i386.c, could you check whether the > logic is good? x87 handling looks good to me. One remaining question: If the function uses MMX regs (either internally or as an argument register), but exits in x8

Re: [RS6000] biarch test fail

2020-10-26 Thread Segher Boessenkool
On Sun, Oct 25, 2020 at 09:55:32PM +1030, Alan Modra wrote: > I thought this one was worth at least commenting as to why it fails > when biarch testing. OK? > > * gcc.target/powerpc/bswap64-4.c: Comment. > > diff --git a/gcc/testsuite/gcc.target/powerpc/bswap64-4.c > b/gcc/testsuite/gcc.t

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Richard Sandiford via Gcc-patches
Qing Zhao writes: > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index c9f7299..3a884e1 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -3992,6 +3992,49 @@ performing a link with relocatable output (i.e.@: > @code{ld -r}) on them. > A declaration to which @code{we

Re: [RFC] Add support for the "retain" attribute utilizing SHF_GNU_RETAIN

2020-10-26 Thread Pedro Alves via Gcc-patches
On 10/6/20 12:10 PM, Jozef Lawrynowicz wrote: > Should "used" apply SHF_GNU_RETAIN? > === > Another talking point is whether the existing "used" attribute should > apply the SHF_GNU_RETAIN flag to the containing section. > > It seems unlikely that a user applies th

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Qing Zhao via Gcc-patches
> On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: >> >> >> The following is the current change in i386.c, could you check whether the >> logic is good? > > x87 handling looks good to me. > > One remaining question: If the function uses

Re: [RFC] Add support for the "retain" attribute utilizing SHF_GNU_RETAIN

2020-10-26 Thread Pedro Alves via Gcc-patches
On 10/6/20 12:10 PM, Jozef Lawrynowicz wrote: > The changes would also only affect targets > that support the GNU ELF OSABI, which would lead to inconsistent > behavior between non-GNU OS's. Well, a separate __attribute__((retain)) will necessarily only work on GNU ELF targets, so that just shifts

Re: [PATCH] rs6000, Power 10 testsuite fixes

2020-10-26 Thread Segher Boessenkool
Hi! On Fri, Oct 23, 2020 at 02:43:40PM -0700, Carl Love wrote: > The following patch fixes a few issues with the tests. The DEBUG is > defined in each of the files thus the #ifdef DEBUG should just be #if > DEBUG. The other issue is that some of the line lengths for the error > prints exceed 80 cha
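The difference between `#ifdef DEBUG` and `#if DEBUG` that the quoted patch relies on can be shown with a small sketch. It assumes, purely for illustration, that the test files define `DEBUG` as 0 when debugging output is off; the function names are hypothetical and do not come from the testsuite.

```c
/* Assumption for illustration: DEBUG is always defined, and 0 means
   "debug output off".  */
#define DEBUG 0

/* Returns 1 if the #ifdef-guarded branch was compiled in.  */
int ifdef_branch_enabled(void)
{
#ifdef DEBUG
  return 1;   /* taken: DEBUG is defined, regardless of its value */
#else
  return 0;
#endif
}

/* Returns 1 if the #if-guarded branch was compiled in.  */
int if_branch_enabled(void)
{
#if DEBUG
  return 1;
#else
  return 0;   /* taken: DEBUG evaluates to 0 */
#endif
}
```

With `DEBUG` defined as 0, the `#ifdef` guard still enables the debug branch while the `#if` guard does not, which is why a macro that is defined in every file should be tested with `#if`.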

Re: [PATCH] Re: error: ‘EVRP_MODE_DEBUG’ was not declared – was: [PUSHED] Ranger classes.

2020-10-26 Thread Maciej W. Rozycki
On Mon, 26 Oct 2020, Andrew MacLeod wrote: > > It is still broken at `-O0', does not build with `--enable-werror-always' > > (which IMO should be on by default except for releases, just as we do with > > binutils AFAIK, so as to make sure people do not introduce build problems > > too easily): >

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote: > > > > > On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: > > > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: > >> > >> > >> The following is the current change in i386.c, could you check whether the > >> logic is good? > > > > x87 handling

Re: [RS6000] Non-pcrel tests when power10

2020-10-26 Thread Segher Boessenkool
Hi! On Thu, Oct 22, 2020 at 05:28:17PM +1030, Alan Modra wrote: > These tests require -mno-pcrel because they are testing features > of the non-pcrel ABI. > --- a/gcc/testsuite/gcc.target/powerpc/cprophard.c > +++ b/gcc/testsuite/gcc.target/powerpc/cprophard.c > @@ -1,6 +1,6 @@ > /* { dg-do comp

Re: [RS6000] dimode_off.c test

2020-10-26 Thread Segher Boessenkool
On Thu, Oct 22, 2020 at 05:29:49PM +1030, Alan Modra wrote: > This tests behaviour near the limit of 16-bit signed offsets. If > power10 prefix instructions are enabled, no such testing occurs. > > * gcc.target/powerpc/dimode_off.c: Add -mno-prefixed to options. > > Regstrapped powerpc64le

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

2020-10-26 Thread Uros Bizjak via Gcc-patches
On Mon, Oct 26, 2020 at 9:05 PM Uros Bizjak wrote: > > On Mon, Oct 26, 2020 at 8:10 PM Qing Zhao wrote: > > > > > > > > > On Oct 26, 2020, at 1:42 PM, Uros Bizjak wrote: > > > > > > On Mon, Oct 26, 2020 at 6:30 PM Qing Zhao wrote: > > >> > > >> > > >> The following is the current change in i386

Re: [RS6000] Link power10 testcases

2020-10-26 Thread Segher Boessenkool
On Thu, Oct 22, 2020 at 05:31:15PM +1030, Alan Modra wrote: > Running the assembler and linker catches more errors. > > * gcc.target/powerpc/cfuged-1.c, > * gcc.target/powerpc/cntlzdm-1.c, There should be no star on the second and next line of one entry. Okay for trunk. Thanks! Se

Re: [RFC] Add support for the "retain" attribute utilizing SHF_GNU_RETAIN

2020-10-26 Thread Jozef Lawrynowicz
On Mon, Oct 26, 2020 at 07:08:06PM +, Pedro Alves via Gcc-patches wrote: > On 10/6/20 12:10 PM, Jozef Lawrynowicz wrote: > > > Should "used" apply SHF_GNU_RETAIN? > > === > > Another talking point is whether the existing "used" attribute should > > apply the SHF

[PATCH] libstdc++: Implement C++20 features for <sstream>

2020-10-26 Thread Thomas Rodgers
From: Thomas Rodgers New ctors and ::view() accessor for - * basic_stringbuf * basic_istringstream * basic_ostringstream * basic_stringstream New ::get_allocator() accessor for basic_stringbuf. libstdc++-v3/ChangeLog: * acinclude.m4 (glibcxx_SUBDIRS): Add src/c++20. * co

[PATCH v3] builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-10-26 Thread Raoni Fassina Firmino via Gcc-patches
Changes since v2[1]: - Added documentation for the new optabs; - Remove use of non portable __builtin_clz; - Changed feclearexcept and feraiseexcept to accept all 4 valid flags at the same time and added more test for that case; - Extended feclearexcept and feraiseexcept testcases to ma

[PATCH] PR fortran/97491 - Wrong restriction for VALUE arguments of pure procedures

2020-10-26 Thread Harald Anlauf
As found/reported by Thomas, the redefinition of dummy arguments with the VALUE attribute was erroneously rejected for pure procedures. A related purity check did not take VALUE into account and was therefore adjusted. Regtested on x86_64-pc-linux-gnu. OK for master? Thanks, Harald PR fortran

Re: [PATCH] libstdc++: Implement C++20 features for <sstream>

2020-10-26 Thread Jonathan Wakely via Gcc-patches
On 26/10/20 13:47 -0700, Thomas Rodgers wrote: From: Thomas Rodgers New ctors and ::view() accessor for - * basic_stringbuf * basic_istringstream * basic_ostringstream * basic_stringstream New ::get_allocator() accessor for basic_stringbuf. libstdc++-v3/ChangeLog: * acinclude.m4 (

Re: [PATCH] rs6000: Don't split constant operator add before reload, move to temp register for future optimization

2020-10-26 Thread Segher Boessenkool
On Wed, Oct 21, 2020 at 03:25:29AM -0500, Xionghu Luo wrote: > Don't split code from add3 for SDI to allow a later pass to split. This is very problematic. > This allows later logic to hoist out constant load in add instructions. Later logic should be able to do that any way (I do not say that w
