from:"Richard Biener"

Re: [PATCH] tree-optimization/114074 - CHREC multiplication and undefined overflow

2024-02-26 Thread Richard Biener

On Mon, 26 Feb 2024, Jakub Jelinek wrote: > On Mon, Feb 26, 2024 at 03:15:02PM +0100, Richard Biener wrote: > > When folding a multiply CHRECs are handled like {a, +, b} * c > > is {a*c, +, b*c} but that isn't generally correct when overflow > > invokes undefined behavior

[PATCH] tree-optimization/114074 - CHREC multiplication and undefined overflow

2024-02-26 Thread Richard Biener

When folding a multiply CHRECs are handled like {a, +, b} * c is {a*c, +, b*c} but that isn't generally correct when overflow invokes undefined behavior. The following uses unsigned arithmetic unless either a is zero or a and b have the same sign. I've used simple early outs for INTEGER_CSTs and
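A minimal sketch of why the distributed form can be a problem, assuming the GCC/Clang builtin `__builtin_mul_overflow` is available. The helper names (`chrec_at`, `mul_overflows`, `distributed_at_unsigned`) are hypothetical and are not taken from the patch. With a = -2, b = 3 and c = INT_MAX / 2, the original expression (a + b*i) * c fits in `int` for i = 0 and i = 1, but the folded step b*c does not, so evaluating it in signed arithmetic would introduce an overflow the source never had. Doing the distributed computation in unsigned arithmetic, as the snippet describes, sidesteps that.

```c
#include <limits.h>
#include <stdbool.h>

/* Value of the recurrence {a, +, b} at iteration i, i.e. a + b*i.
   The caller is assumed to pick inputs for which this does not overflow.  */
static int chrec_at(int a, int b, int i)
{
  return a + b * i;
}

/* True if the signed product x*y does not fit in int.
   Relies on the GCC/Clang builtin __builtin_mul_overflow.  */
static bool mul_overflows(int x, int y)
{
  int r;
  return __builtin_mul_overflow(x, y, &r);
}

/* The distributed form {a*c, +, b*c} at iteration i, evaluated in
   unsigned arithmetic where wraparound is well defined.  Converting the
   result back to int is only value-preserving when it fits in int.  */
static int distributed_at_unsigned(int a, int b, int c, int i)
{
  unsigned base = (unsigned) a * (unsigned) c;
  unsigned step = (unsigned) b * (unsigned) c;
  return (int) (base + step * (unsigned) i);
}
```

Here a and b have different signs and a is nonzero, which is the case the snippet singles out as needing unsigned arithmetic.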

RE: [PATCH]middle-end: delay updating of dominators until later during vectorization. [PR114081]

2024-02-26 Thread Richard Biener

t-loop.cc b/gcc/tree-vect-loop.cc > > > index > > 35f1f8c7d4245135ace740ff9be548919587..ab19ad6a6be516e3ee1f0fbeaae > > effeae1fb900f 100644 > > > --- a/gcc/tree-vect-loop.cc > > > +++ b/gcc/tree-vect-loop.cc > > > @@ -11987,7 +11987,12 @@ vect_tra

[PATCH 2/2] tree-optimization/114099 - virtual LC PHIs and early exit vect

2024-02-26 Thread Richard Biener

In some cases exits can lack LC PHI nodes for the virtual operand. We have to create them when the epilog loop requires them which also allows us to remove some only halfway correct fixups. This is the variant triggering for alternate exits. Bootstrap and regtest pending on

[PATCH 1/2] tree-optimization/114068 - missed virtual LC PHI after vect peeling

2024-02-26 Thread Richard Biener

When we choose the IV exit to be one leading to no virtual use we fail to have a virtual LC PHI even though we need it for the epilog entry. The following makes sure to create it so that later updating works. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. PR

Re: [PATCH]middle-end: delay updating of dominators until later during vectorization. [PR114081]

2024-02-26 Thread Richard Biener

torizer.h > +++ b/gcc/tree-vectorizer.h > @@ -961,6 +961,10 @@ public: >/* Statements whose VUSES need updating if early break vectorization is to > happen. */ >auto_vec early_break_vuses; > + > + /* Dominators that need to be recalculated that have been deferred un

Re: [PATCH]middle-end: update vuses out of loop which use a vdef that's moved [PR114068]

2024-02-26 Thread Richard Biener

latch_edge (loop)); > + FOR_EACH_IMM_USE_STMT (use_stmt, iter, last_seen_vuse) > + { > + if (flow_bb_inside_loop_p (loop, use_stmt->bb)) > + continue; > + FOR_EACH_IMM_USE_ON_STMT (use_p, iter) > + SET_USE (use_p, vuse); > + } > +} > + >/* And update the LC PHIs on exits. */ >for (edge e : get_loop_exit_edges (LOOP_VINFO_LOOP (loop_vinfo))) > if (!dominated_by_p (CDI_DOMINATORS, e->src, dest_bb)) > > > > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH v2] Do not emulate vectors containing floats.

2024-02-26 Thread Richard Biener

On Mon, 26 Feb 2024, Jakub Jelinek wrote: > On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote: > > On Mon, 26 Feb 2024, Jakub Jelinek wrote: > > > > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote: > > > > > > @@ -6756,7 +

Re: [PATCH v2] Do not emulate vectors containing floats.

2024-02-26 Thread Richard Biener

On Mon, 26 Feb 2024, Jakub Jelinek wrote: > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote: > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo, > > > > those through even when the mode isn't word_mode. For > >

Re: [PATCH] match.pd: Guard 2 simplifications on integral TYPE_OVERFLOW_UNDEFINED [PR114090]

2024-02-26 Thread Richard Biener

_attribute__((noipa)) int > +bar (int x) > +{ > + int w = (x >= 0 ? x : 0); > + int z = (x <= 0 ? -x : 0); > + return w + z; > +} > + > +__attribute__((noipa)) int > +baz (int x) > +{ > + return x <= 0 ? -x : 0; > +} > + > +int > +main () > +{

Re: [PATCH] fold-const: Avoid infinite recursion in +-*&|^minmax reassociation [PR114084]

2024-02-26 Thread Richard Biener

> return > fold_convert_loc (loc, type, associate_trees (loc, var0, con0, > code, atype)); > --- gcc/testsuite/gcc.dg/bitint-94.c.jj 2024-02-24 11:18:32.607018363 > +0100 > +++ gcc/testsuite/gcc.dg/bitint-94.c 2024-02-24 11:19:09.023500121 +0100 > @@ -0,0 +1,12 @@ > +/* PR middle-end/114084 */ > +/* { dg-do compile { target bitint } } */ > +/* { dg-options "-std=c23 -pedantic-errors" } */ > + > +typedef unsigned _BitInt(31) T; > +T a, b; > + > +void > +foo (void) > +{ > + b = (T) ((a | (-1U >> 1)) >> 1 | (a | 5) << 4); > +} > > Jakub

Re: [PATCH v2] Do not emulate vectors containing floats.

2024-02-26 Thread Richard Biener

ed branches - the effective check should be the same in GCC 13 at least, but with some added ad-hoc costing which might make this not trigger (maybe_lt (nunits_out, 4U)) - so we'd need a word_mode that can cover 4 FP elements. Possibly triggerable with HFmode? Thanks, Richard. > LGTM, but please wait until Monday evening so that Richi or Richard > have a chance to chime in. > > Jakub

[PATCH] middle-end/114070 - folding breaking VEC_COND expansion

2024-02-25 Thread Richard Biener

The following properly guards the simplifications that move operations into VEC_CONDs, in particular when that changes the type constraints on this operation. This needed a genmatch fix which was recording spurious implicit fors when tcc_comparison is used in a C expression. Bootstrapped and

Re: [PATCH v1] RTL: Bugfix ICE after allow vector type in DSE

2024-02-25 Thread Richard Biener

On Mon, Feb 26, 2024 at 4:26 AM wrote: > > From: Pan Li > > We allowed vector type for get_stored_val when read is less than or > equal to store in previous. Unfortunately, we missed to adjust the > validate_subreg part accordingly. For vector type, we don't need to > restrict the mode size is

Re: [PATCH] Use HOST_WIDE_INT_{C,UC,0,0U,1,1U} macros some more

2024-02-24 Thread Richard Biener

> On 24.02.2024 at 08:44, Jakub Jelinek wrote: > > Hi! > > I've searched for some uses of (HOST_WIDE_INT) constant or (unsigned > HOST_WIDE_INT) constant and turned them into uses of the appropriate > macros. > There are quite a few cases in non-i386 backends but I've left that out > for

Re: [PATCH] bitint: Handle VIEW_CONVERT_EXPRs between large/huge BITINT_TYPEs and VECTOR/COMPLEX_TYPE etc. [PR114073]

2024-02-24 Thread Richard Biener

> On 24.02.2024 at 08:40, Jakub Jelinek wrote: > > Hi! > > The following patch implements support for VIEW_CONVERT_EXPRs from/to > large/huge _BitInt to/from vector or complex types or anything else but > integral/pointer types which doesn't need to live in memory. > >

Re: [PATCH] vect: Tighten check for impossible SLP layouts [PR113205]

2024-02-24 Thread Richard Biener

> On 24.02.2024 at 11:06, Richard Sandiford wrote: > > During its forward pass, the SLP layout code tries to calculate > the cost of a layout change on an incoming edge. This is taken > as the minimum of two costs: one in which the source partition > keeps its current layout (chosen

Re: [PATCH] vect: Fix integer overflow calculating mask

2024-02-23 Thread Richard Biener

On Fri, 23 Feb 2024, Jakub Jelinek wrote: > On Fri, Feb 23, 2024 at 02:22:19PM +0000, Andrew Stubbs wrote: > > On 23/02/2024 13:02, Jakub Jelinek wrote: > > > On Fri, Feb 23, 2024 at 12:58:53PM +0000, Andrew Stubbs wrote: > > > > This is a follow-up to the previous patch to ensure that integer

Re: [PATCH] vect: Fix integer overflow calculating mask

2024-02-23 Thread Richard Biener

> On 23.02.2024 at 14:03, Jakub Jelinek wrote: > > On Fri, Feb 23, 2024 at 12:58:53PM +0000, Andrew Stubbs wrote: >> This is a follow-up to the previous patch to ensure that integer vector >> bit-masks do not have excess bits set. It fixes a bug, observed on >> amdgcn, in which the mask

Re: [PATCH] expr: Fix REDUCE_BIT_FIELD in multiplication expansion [PR114054]

2024-02-23 Thread Richard Biener

29.464277919 +0100 > @@ -0,0 +1,17 @@ > +/* PR rtl-optimization/114054 */ > +/* { dg-do compile { target bitint } } */ > +/* { dg-options "-Og -fwhole-program -fno-tree-ccp -fprofile-use > -fno-tree-copy-prop -w" } */ > + > +int x; > + > +void > +foo (int i, u

Re: [PATCH] bitintlower: Fix .{ADD,SUB}_OVERFLOW lowering [PR114040]

2024-02-23 Thread Richard Biener

"-flto" } { "" } } */ > + > +unsigned a; > +signed char b; > +short c; > +long d; > +__int128 e; > +int f; > + > +#if __BITINT_MAXWIDTH__ >= 511 > +__attribute__((noinline)) void > +foo (_BitInt(3) x, unsigned _BitInt(511) y, unsigned *z) > +

[PATCH][www] Document ia64*-*-* obsolescence

2024-02-23 Thread Richard Biener

The following documents obsoleting of ia64*-*-*. Pushed. * gcc-14/changes.html: Document ia64*-*-* obsoleting. --- htdocs/gcc-14/changes.html | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index

[PATCH] Add ia64*-*-* to the list of obsolete targets

2024-02-23 Thread Richard Biener

The following deprecates ia64*-*-* for GCC 14. Since we plan to force LRA for GCC 15 and the target only has slim chances of getting updated this notifies people in advance. Given both Linux and glibc have axed the target further development is also made difficult. "Tested" for ia64-elf and

[PATCH] tree-optimization/114048 - ICE in copy_reference_ops_from_ref

2024-02-22 Thread Richard Biener

The following adds another omission to the assert verifying we're not running into spurious off == -1. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/114048 * tree-ssa-sccvn.cc (copy_reference_ops_from_ref): MEM_REF can also produce -1

[PATCH] tree-optimization/114027 - conditional reduction chain

2024-02-22 Thread Richard Biener

When we classify a conditional reduction chain as CONST_COND_REDUCTION we fail to verify all involved conditionals have the same constant. That's a quite unlikely situation so the following simply disables such classification when there's more than one reduction statement. Bootstrapped and tested

Re: [PATCH] profile-count: Don't dump through a temporary buffer [PR111960]

2024-02-22 Thread Richard Biener

On Thu, Feb 22, 2024 at 10:07 AM Jakub Jelinek wrote: > > Hi! > > The profile_count::dump (char *, struct function * = NULL) const; > method has a single caller, the > profile_count::dump (FILE *f, struct function *fun) const; > method and for that going through a temporary buffer is just slower

Re: [PATCH] call-cdce: Add missing BUILT_IN_F{32,64}X handling and improve BUILT_IN_L [PR113993]

2024-02-22 Thread Richard Biener

FLT128_MANT_DIG__ > +void > +flt128 (_Float128 f1, _Float128 f2, _Float128 f3, _Float128 f4, _Float128 f5, > + _Float128 f6, _Float128 f7, _Float128 f8, _Float128 f9) > +{ > + if (!(f1 >= -1.0f128 && f1 <= 1.0f128)) __builtin_unreachable (); > + __builtin_acosf

Re: [PATCH] libcpp: Improve location for macro names [PR66290]

2024-02-22 Thread Richard Biener

On Tue, Feb 20, 2024 at 3:33 PM Lewis Hyatt wrote: > > On Mon, Feb 19, 2024 at 11:36 PM Alexandre Oliva wrote: > > > > This backport for gcc-13 is the first of two required for the > > g++.dg/pch/line-map-3.C test to stop hitting a variant of the known > > problem mentioned in that testcase: on

Re: [PATCH] bitintlower: Fix .MUL_OVERFLOW overflow checking [PR114038]

2024-02-22 Thread Richard Biener

run_expensive_tests } { "*" } { "-O0" "-O2" } } */ > +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */ > + > +#if __BITINT_MAXWIDTH__ >= 129 > +int > +foo (unsigned _BitInt(63) x, unsigned _BitInt

Re: Stabilizing flaky libgomp GCN target/offloading testing (was: libgomp GCN gfx1030/gfx1100 offloading status)

2024-02-21 Thread Richard Biener

> On 21.02.2024 at 13:34, Thomas Schwinge wrote: > > Hi! > >> On 2024-02-01T15:49:02+0100, Richard Biener wrote: >>> On Thu, 1 Feb 2024, Thomas Schwinge wrote: >>> On 2024-01-26T10:45:10+0100, Richard Biener wrote: >>>> On Fri, 26 Jan 2024,

Re: [PATCH] aarch64: Allow aarch64-linux-muscl for heap trampolines [PR113971].

2024-02-20 Thread Richard Biener

On Tue, Feb 20, 2024 at 11:27 AM Iain Sandoe wrote: > > Tested on aarch64-linux-gnu, aarch64-darwin by me and on aarch64-linux-musl > by Sam James (thanks!). OK for trunk? OK > thanks > Iain > > --- 8< --- > > > This allows the same trampoline pattern to be used on all linux variants > rather

Re: [PATCH] c-family, c++, v2: Fix up handling of types which may have padding in __atomic_{compare_}exchange

2024-02-20 Thread Richard Biener

On Tue, 20 Feb 2024, Jakub Jelinek wrote: > On Tue, Feb 20, 2024 at 09:01:10AM +0100, Richard Biener wrote: > > I'm not sure those would be really equivalent (MEM_REF vs. V_C_E > > as well as combined vs. split). It really depends how RTL expansion > > handles this (as yo

Re: GCN RDNA2+ vs. GCC SLP vectorizer

2024-02-20 Thread Richard Biener

On Tue, 20 Feb 2024, Thomas Schwinge wrote: > Hi Richard! > > On 2024-02-20T08:44:35+0100, Richard Biener wrote: > > On Mon, 19 Feb 2024, Thomas Schwinge wrote: > >> On 2024-02-19T17:31:20+0100, I wrote: > >> > On 2024-02-19T11:52:55+0100, Richard Biener

Re: [PATCH] c-family, c++: Fix up handling of types which may have padding in __atomic_{compare_}exchange

2024-02-20 Thread Richard Biener

On Tue, 20 Feb 2024, Jakub Jelinek wrote: > On Tue, Feb 20, 2024 at 12:12:11AM +0000, Jason Merrill wrote: > > On 2/19/24 02:55, Jakub Jelinek wrote: > > > On Fri, Feb 16, 2024 at 01:51:54PM +0000, Jonathan Wakely wrote: > > > > Ah, although __atomic_compare_exchange only takes pointers, the > >

Re: GCN RDNA2+ vs. GCC SLP vectorizer

2024-02-19 Thread Richard Biener

On Mon, 19 Feb 2024, Thomas Schwinge wrote: > Hi! > > On 2024-02-19T17:31:20+0100, I wrote: > > On 2024-02-19T11:52:55+0100, Richard Biener wrote: > >> On Mon, 19 Feb 2024, Thomas Schwinge wrote: > >>> On 2024-02-16T14:53:04+0100, I wrote: > >>>

Re: [PATCH] ipa: Convert lattices from pure array to vector (PR 113476)

2024-02-19 Thread Richard Biener

a-prop.h > index 9c78dc9f486..ee3c0006add 100644 > --- a/gcc/ipa-prop.h > +++ b/gcc/ipa-prop.h > @@ -627,7 +627,7 @@ public: >vec *descriptors; >/* Pointer to an array of structures describing individual formal > parameters. */ > - class ipcp_param_lattices * G

Re: veclower: improve selection of vector mode when lowering [PR 112787]

2024-02-19 Thread Richard Biener

a+sve conflicts > with -mcpu=neoverse-n2 in previous gcc versions. Yes. Thanks, Richard. > Kind Regards, > Andre > > On 20/12/2023 14:30, Richard Biener wrote: > > On Wed, 20 Dec 2023, Andre Vieira (lists) wrote: > > > >> Thanks, fully agree with all comm

Re: [PATCH] rtl-optimization/54052 - RTL SSA PHI insertion compile-time hog

2024-02-19 Thread Richard Biener

On Mon, 19 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > >> I suppose that's better than the first version when a block has a > >> large number of dominance frontiers. But I can't remember whether > >> that was the case in PR98863. I have a

Re: [PATCH] rtl-optimization/54052 - RTL SSA PHI insertion compile-time hog

2024-02-19 Thread Richard Biener

On Mon, 19 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > On Mon, 19 Feb 2024, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > The following tries to address the PHI insertion compile-time hog in > >> > RTL

Re: [PATCH] rtl-optimization/54052 - RTL SSA PHI insertion compile-time hog

2024-02-19 Thread Richard Biener

On Mon, 19 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following tries to address the PHI insertion compile-time hog in > > RTL fwprop observed with the PR54052 testcase where the loop computing > > the "unfiltered" set of variables poss

Re: GCN RDNA2+ vs. GCC SLP vectorizer

2024-02-19 Thread Richard Biener

On Mon, 19 Feb 2024, Thomas Schwinge wrote: > Hi! > > On 2024-02-16T14:53:04+0100, I wrote: > > On 2024-02-16T12:41:06+0000, Andrew Stubbs wrote: > >> On 16/02/2024 12:26, Richard Biener wrote: > >>> On Fri, 16 Feb 2024, Andrew Stubbs wrote: > >>

[PATCH] rtl-optimization/54052 - RTL SSA PHI insertion compile-time hog

2024-02-19 Thread Richard Biener

The following tries to address the PHI insertion compile-time hog in RTL fwprop observed with the PR54052 testcase where the loop computing the "unfiltered" set of variables possibly needing PHI nodes for each block exhibits quadratic compile-time and memory-use. Instead of only pruning the set

Re: [PATCH] match.pd: Fix ICE on BIT_INSERT_EXPR of BIT_FIELD_REF folding [PR113967]

2024-02-19 Thread Richard Biener

+/* PR tree-optimization/113967 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +typedef unsigned short W __attribute__((vector_size (4 * sizeof (short)))); > + > +void > +foo (W *p) > +{ > + W x = *p; > + W y = {}; > + __builtin_memc

Re: [PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-18 Thread Richard Biener

On Sat, Feb 17, 2024 at 11:30 AM wrote: > > From: Pan Li > > This patch would like to add the middle-end presentation for the > unsigned saturation add. Aka set the result of add to the max > when overflow. It will take the pattern similar as below. > > SAT_ADDU (x, y) => (x + y) |
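For readers unfamiliar with the operation, here is a minimal sketch of an unsigned saturating add in plain C. The function name `sat_addu32` is hypothetical, and the branchless expression is one common formulation; the pattern quoted in the snippet above is truncated, so this is not claimed to be the exact form the patch matches.

```c
#include <stdint.h>

/* Unsigned saturating add: return x + y, or UINT32_MAX if the sum wraps.
   When the addition wraps, sum < x holds, so -(uint32_t) (sum < x)
   is all ones and the OR forces the result to the maximum value.  */
static uint32_t sat_addu32(uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum | -(uint32_t) (sum < x);
}
```

The comparison works because unsigned addition wraps modulo 2^32, so a wrapped sum is always smaller than either operand.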

[PATCH][RFC] tree-optimization/113910 - bitmap_hash is weak, improve iterative_hash_*

2024-02-16 Thread Richard Biener

The following addresses the weak bitmap_hash function which results in points-to analysis taking a long time because of a high collision rate in one of its bitmap hash tables. Using a better hash function like in the bitmap.cc hunk below doesn't help unless one also replaces the hash function in

Re: GCN RDNA2+ vs. GCC SLP vectorizer

2024-02-16 Thread Richard Biener

On Fri, 16 Feb 2024, Andrew Stubbs wrote: > On 16/02/2024 10:17, Richard Biener wrote: > > On Fri, 16 Feb 2024, Thomas Schwinge wrote: > > > >> Hi! > >> > >> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: > >>

[PATCH] tree-optimization/113895 - consistency check fails in copy_reference_ops_from_ref

2024-02-16 Thread Richard Biener

The following addresses consistency check fails in copy_reference_ops_from_ref when we are handling out-of-bound array accesses (it's almost impossible to identically mimic the get_ref_base_and_extent behavior). It also addresses the case where an out-of-bound constant offset computes to a -1 off

Re: GCN RDNA2+ vs. GCC SLP vectorizer (was: [committed] amdgcn: add -march=gfx1030 EXPERIMENTAL)

2024-02-16 Thread Richard Biener

On Fri, 16 Feb 2024, Thomas Schwinge wrote: > Hi! > > On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: > > I've committed this patch > > ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691 > "amdgcn: add -march=gfx1030 EXPERIMENTAL", which the later RDNA3/gfx1100 > support builds on top

Re: [PATCH] c++/modules: optimize tree flag streaming

2024-02-16 Thread Richard Biener

On Thu, Feb 15, 2024 at 7:38 PM Patrick Palka wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look > OK for trunk? Btw, there's the "bitpack" streaming support in data-streamer.h also added for exactly the same reason, it's likely not easily re-usable but this kind of

Re: [PATCH] tree-optimization/113910 - huge compile time during PTA

2024-02-15 Thread Richard Biener

> On 15.02.2024 at 18:06, Richard Sandiford wrote: > > Richard Biener writes: >>> On Wed, 14 Feb 2024, Richard Biener wrote: >>> >>> For the testcase in PR113910 we spend a lot of time in PTA comparing >>> bitmaps for looking up equivalence cla

Re: [PATCH] expand: Fix handling of asm goto outputs vs. PHI argument adjustments [PR113921]

2024-02-15 Thread Richard Biener

} > } > } > --- gcc/testsuite/gcc.target/i386/pr113921.c.jj 2024-02-14 > 21:21:15.194178515 +0100 > +++ gcc/testsuite/gcc.target/i386/pr113921.c 2024-02-14 21:20:52.745476040 > +0100 > @@ -0,0 +1,20 @@ > +/* PR middle-end/113921 */ > +/* { dg-do run } */ >

[PATCH] tree-optimization/111156 - properly dissolve SLP only groups

2024-02-15 Thread Richard Biener

The following fixes the failure to look at pattern stmts when we need to dissolve SLP only groups. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/111156 * tree-vect-loop.cc (vect_dissolve_slp_only_groups): Look at the pattern

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-15 Thread Richard Biener

On Thu, 15 Feb 2024, Andrew Stubbs wrote: > On 15/02/2024 10:21, Richard Biener wrote: > [snip] > >>> I suppose if RDNA really only has 32 lane vectors (it sounds like it, > >>> even if it can "simulate" 64 lane ones?) then it might make sense to

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-15 Thread Richard Biener

On Thu, 15 Feb 2024, Andrew Stubbs wrote: > On 15/02/2024 07:49, Richard Biener wrote: > > On Wed, 14 Feb 2024, Andrew Stubbs wrote: > > > >> On 14/02/2024 13:43, Richard Biener wrote: > >>> On Wed, 14 Feb 2024, Andrew Stubbs wrote: > >>>

Re: [PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-15 Thread Richard Biener

On Thu, 15 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > On Wed, 14 Feb 2024, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > On Wed, 14 Feb 2024, Richard Sandiford wrote: > >> > > >> >> Richa

Re: [PATCH 2/2] doc: Add documentation of which operand matches the mode of the standard pattern name [PR113508]

2024-02-15 Thread Richard Biener

On Thu, Feb 15, 2024 at 12:16 AM Andrew Pinski wrote: > > In some of the standard pattern names, it is not obvious which mode is being > used in the pattern > name. Is it operand 0, 1, or 2? Is it the wider mode or the narrower mode? > This fixes that so there is no confusion by adding a

Re: [PATCH 1/2] doc: Fix some standard named pattern documentation modes

2024-02-15 Thread Richard Biener

On Thu, Feb 15, 2024 at 12:16 AM Andrew Pinski wrote: > > Currently these use `@var{m3}` but the 3 here is a literal 3 > and not part of the mode itself so it should not be inside > the var. Fixed as such. > > Built the documentation to make sure it looks correct now. OK > gcc/ChangeLog: > >

[PATCH] Do not record dependences from debug stmts in tail merging

2024-02-15 Thread Richard Biener

The following avoids recording BB dependences for debug stmt uses. Bootstrap and regtest running on x86_64-unknown-linux-gnu. It's unlikely a dependence is just because of debug stmts so actual compare-debug issues are very unlikely. Still spotted while investigating a CI regression mail (for

Re: [PATCH] lower-bitint: Ensure we don't get coalescing ICEs for (ab) SSA_NAMEs used in mul/div/mod [PR113567]

2024-02-15 Thread Richard Biener

> @@ -0,0 +1,23 @@ > +/* PR tree-optimization/113567 */ > +/* { dg-do compile { target bitint } } */ > +/* { dg-options "-O2" } */ > + > +#if __BITINT_MAXWIDTH__ >= 129 > +_BitInt(129) v; > + > +void > +foo (_BitInt(129) a, int i) > +{ > + __label__

Re: [PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-15 Thread Richard Biener

On Wed, 14 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > On Wed, 14 Feb 2024, Richard Sandiford wrote: > > > >> Richard Biener writes: > >> > The following avoids accessing out-of-bound vector elements when > >> > native

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Richard Biener

On Wed, 14 Feb 2024, Andrew Stubbs wrote: > On 14/02/2024 13:43, Richard Biener wrote: > > On Wed, 14 Feb 2024, Andrew Stubbs wrote: > > > >> On 14/02/2024 13:27, Richard Biener wrote: > >>> On Wed, 14 Feb 2024, Andrew Stubbs wrote: > >>>

Re: [PATCH] [libiberty] remove TBAA violation in iterative_hash, improve code-gen

2024-02-14 Thread Richard Biener

> On 14.02.2024 at 16:22, Jakub Jelinek wrote: > > On Wed, Feb 14, 2024 at 04:13:51PM +0100, Richard Biener wrote: >> The following removes the TBAA violation present in iterative_hash. >> As we eventually LTO that it's important to fix. This also improves >> co

Re: [PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Richard Biener

> On 14.02.2024 at 16:16, Tamar Christina wrote: > > >> >> >> I think this isn't entirely good. For simple cases for >> do {} while the condition ends up in the latch while for while () {} >> loops it ends up in the header. In your case the latch isn't empty >> so it doesn't end up

[PATCH][RFC] tree-optimization/113910 - improve bitmap_hash

2024-02-14 Thread Richard Biener

The following tries to improve the actual hash function for hashing bitmaps. We're still getting collision rates as high as 23 for the testcase in the PR. The following improves this by properly mixing in the bitmap element starting bit number. This brings down the collision rate below 1.4,
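As a rough illustration of the idea of mixing an element's starting bit number into the hash, here is a self-contained sketch. Everything in it is an assumption made for illustration: the `sketch_elt` layout (an element index plus two 64-bit words), the `mix64` step and the name `sketch_bitmap_hash` are hypothetical and are not GCC's bitmap representation or the hash function from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap element: INDX selects a 128-bit window,
   BITS holds the bits set inside that window.  */
struct sketch_elt
{
  unsigned indx;
  uint64_t bits[2];
};

/* Simple illustrative mixing step: xor in the value, multiply by an
   odd constant, then fold the high half into the low half.  */
static uint64_t mix64(uint64_t h, uint64_t v)
{
  h ^= v;
  h *= 0x9e3779b97f4a7c15ULL;
  return h ^ (h >> 32);
}

/* Hash N elements, mixing in each element's starting bit number so
   that identical bit patterns at different positions hash differently.  */
static uint64_t sketch_bitmap_hash(const struct sketch_elt *elts, size_t n)
{
  uint64_t h = 0;
  for (size_t i = 0; i < n; i++)
    {
      h = mix64(h, (uint64_t) elts[i].indx * 128); /* starting bit number */
      h = mix64(h, elts[i].bits[0]);
      h = mix64(h, elts[i].bits[1]);
    }
  return h;
}
```

Without the position term, two sparse bitmaps whose elements carry the same words at different offsets would feed identical data into the hash, which is the kind of collision the snippet is concerned with.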

[PATCH] [libiberty] remove TBAA violation in iterative_hash, improve code-gen

2024-02-14 Thread Richard Biener

The following removes the TBAA violation present in iterative_hash. As we eventually LTO that it's important to fix. This also improves code generation for the >= 12 bytes loop by using | to compose the 4 byte words as at least GCC 7 and up can recognize that pattern and perform a 4 byte load
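The byte-composition technique mentioned above can be sketched as follows. The function name `load_le32` and the little-endian byte order are assumptions made for illustration; the snippet does not show the actual code. Reading through `unsigned char` is always allowed by the aliasing rules, so this avoids the type-punned word load that TBAA objects to, and compilers that recognize the pattern can still emit a single 4-byte load.

```c
#include <stdint.h>

/* Compose a 32-bit word from four bytes with shifts and |.
   Byte order here is little-endian by assumption.  */
static uint32_t load_le32(const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}
```

Each byte is widened to `uint32_t` before shifting so that the shift by 24 cannot overflow a signed `int`.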

Re: [PATCH] tree-optimization/113910 - huge compile time during PTA

2024-02-14 Thread Richard Biener

On Wed, 14 Feb 2024, Richard Biener wrote: > For the testcase in PR113910 we spend a lot of time in PTA comparing > bitmaps for looking up equivalence class members. This points to > the very weak bitmap_hash function which effectively hashes set > and a subset of not set bits. T

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Richard Biener

On Wed, 14 Feb 2024, Andrew Stubbs wrote: > On 14/02/2024 13:27, Richard Biener wrote: > > On Wed, 14 Feb 2024, Andrew Stubbs wrote: > > > >> On 13/02/2024 08:26, Richard Biener wrote: > >>> On Mon, 12 Feb 2024, Thomas Schwinge wrote: > >>> >

Re: [PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Richard Biener

e7bc33654ffa027b493f23d278ac..a29681bffb902d2d05e3f18764ab519aacb3c5bc > 100644 > --- a/gcc/tree-cfg.cc > +++ b/gcc/tree-cfg.cc > @@ -327,6 +327,10 @@ replace_loop_annotate (void) >if (loop->latch) > replace_loop_annotate_in_block (loop->latch, loop); > >

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Richard Biener

On Wed, 14 Feb 2024, Andrew Stubbs wrote: > On 13/02/2024 08:26, Richard Biener wrote: > > On Mon, 12 Feb 2024, Thomas Schwinge wrote: > > > >> Hi! > >> > >> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: > >>

Re: [PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-14 Thread Richard Biener

On Wed, 14 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following avoids accessing out-of-bound vector elements when > > native encoding a boolean vector with sub-BITS_PER_UNIT precision > > elements. The error was basing the number o

Re: [PATCH] middle-end/113576 - zero padding of vector bools when expanding compares

2024-02-14 Thread Richard Biener

On Wed, 14 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following zeros paddings of vector bools when expanding compares > > and the mode used for the compare is an integer mode. In that case > > targets cannot distinguish between a 4 element

[PATCH] tree-optimization/113910 - huge compile time during PTA

2024-02-14 Thread Richard Biener

For the testcase in PR113910 we spend a lot of time in PTA comparing bitmaps for looking up equivalence class members. This points to the very weak bitmap_hash function which effectively hashes set and a subset of not set bits. The following improves it by mixing that weak result with the

[PATCH][GCC 12] tree-optimization/113896 - reduction of permuted external vector

2024-02-14 Thread Richard Biener

The following fixes eliding of the permutation of a BB reduction of an existing vector which breaks materialization of live lanes as we fail to permute the SLP_TREE_SCALAR_STMTS vector. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/113896 *

Re: [PATCH] vect/testsuite: Fix vect-simd-clone-1[02].c when dg-do default is compile [PR113899]

2024-02-14 Thread Richard Biener

On Tue, Feb 13, 2024 at 10:46 PM Andrew Pinski wrote: > > The vect testsuite will choose the dg-do default based on if it knows the > running target does not support running with the vector extensions enabled > (for ease of testing). The problem is when it is decided the default is > compile >

[PATCH] tree-optimization/113895 - copy_reference_ops_from_ref vs. bitfields

2024-02-13 Thread Richard Biener

The recent enhancement to discover constant array indices by range info used by get_ref_base_and_extent doesn't work when the outermost component reference is to a bitfield because we track the running offset in the reference ops as bytes. The following does as ao_ref_init_from_vn_reference and

[PATCH] tree-optimization/113896 - testcase for fixed PR

2024-02-13 Thread Richard Biener

The SLP permute optimization rewrite fixed this. Tested on x86_64-unknown-linux-gnu, pushed to trunk and 13 branch. PR tree-optimization/113896 * g++.dg/torture/pr113896.C: New testcase. --- gcc/testsuite/g++.dg/torture/pr113896.C | 35 + 1 file changed,

[PATCH] Fix comment typo in ao_ref_init_from_vn_reference

2024-02-13 Thread Richard Biener

Pushed. PR tree-optimization/113831 * tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference): Fix typo in comment. --- gcc/tree-ssa-sccvn.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc index

[PATCH] tree-optimization/113902 - fix VUSE update in move_early_exit_stmts

2024-02-13 Thread Richard Biener

The following adjusts move_early_exit_stmts to track the last seen VUSE instead of getting it from the last store which could be a PHI where gimple_vuse doesn't work. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/113902 * tree-vect-loop.cc

Re: [PATCH]middle-end: update vector loop upper bounds when early break vect [PR113734]

2024-02-13 Thread Richard Biener

= LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo); > += LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) > + || LOOP_VINFO_EARLY_BREAKS (loop_vinfo); >/* The minimum number of iterations performed by the epilogue. This > is 1 when peeling for gaps because we always need a final scalar > iteration. */

[PATCH] tree-optimization/113898 - ICE with sanity checking for VN ref adjustment

2024-02-13 Thread Richard Biener

The following fixes a missing add to the accumulated offset when adjusting an ARRAY_REF op for value-ranges applied to by get_ref_base_and_extent. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/113898 * tree-ssa-sccvn.cc

Re: [PATCH] libgcc: Fix UB in FP_FROM_BITINT

2024-02-13 Thread Richard Biener

__CHAR_BIT__ + 1 \ > + - __builtin_clzll (~msb)); \ > if (BIL_TYPE_SIZE > DI##_BITS && n > DI##_BITS) \ > { \ > iv = msb >> (n - DI##_BITS

Re: [PATCH] hwint: Fix up preprocessor conditions for GCC_PRISZ/fmt_size_t

2024-02-13 Thread Richard Biener

> -#if SIZE_MAX <= INT_MAX > +#if SIZE_MAX <= UINT_MAX > # define GCC_PRISZ "" > # define fmt_size_t unsigned int > -#elif SIZE_MAX <= LONG_MAX > +#elif SIZE_MAX <= ULONG_MAX > # define GCC_PRISZ HOST_LONG_FORMAT > # define fmt_size_t unsigned long int >

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts" (was: [committed] amdgcn: add -march=gfx1030 EXPERIMENTAL)

2024-02-13 Thread Richard Biener

On Mon, 12 Feb 2024, Thomas Schwinge wrote: > Hi! > > On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: > > I've committed this patch > > ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691 > "amdgcn: add -march=gfx1030 EXPERIMENTAL". > > The RDNA2 ISA variant doesn't support certain

Re: [PATCH] lower-bitint: Fix handle_cast when used e.g. in comparisons of precisions multiple of limb_prec [PR113849]

2024-02-12 Thread Richard Biener

> Am 12.02.2024 um 18:47 schrieb Jakub Jelinek : > > Hi! > > handle_cast handles the simple way all narrowing large/huge bitint to > large/huge bitint conversions and also such widening conversions if we can > assume that the most significant limb is processed using constant index > and both

Re: [PATCH] gengtype: Use HOST_SIZE_T_PRINT_UNSIGNED in another spot

2024-02-12 Thread Richard Biener

> Am 12.02.2024 um 18:14 schrieb Jakub Jelinek : > > Hi! > > This patch depends on the libiberty/vprintf-support.c change. > > Ok for trunk if that one is approved? Ok > 2024-02-12 Jakub Jelinek > >* gengtype.cc (adjust_field_rtx_def): Use HOST_SIZE_T_PRINT_UNSIGNED >and cast

Re: [PATCH] testsuite: Fix up gcc.dg/pr113693.c for ia32

2024-02-12 Thread Richard Biener

> Am 12.02.2024 um 18:13 schrieb Jakub Jelinek : > > Hi! > > As I wrote earlier and we've discussed on IRC, with the ia32 _BitInt > enablement patch this testcase FAILs on ia32, there is nothing vectorized in > there, even with -mavx512{vl,bw,dq}, so no dbgcnt messages are emitted. > > The

Re: [RFC] GCC Security policy

2024-02-12 Thread Richard Biener

On Mon, Feb 12, 2024 at 2:35 PM Siddhesh Poyarekar wrote: > > On 2024-02-12 08:16, Martin Jambor wrote: > >> This probably ties in somewhat with an idea David Malcolm had riffed on > >> with me earlier, of caching files for diagnostics. If we could unify > >> file accesses somehow, we could make

[PATCH] tree-optimization/113831 - wrong VN with structurally identical ref

2024-02-12 Thread Richard Biener

When we use get_ref_base_and_extent during VN and that ends up using global ranges to restrict the range of a ref we have to take care of not using the same expression in the hashtable as for a ref that could not use that global range. The following attempts to ensure this by applying similar

[PATCH] tree-optimization/113863 - elide degenerate virtual PHIs when moving ee stores

2024-02-12 Thread Richard Biener

This makes sure to elide degenerate virtual PHIs when moving stores across early exits. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. Richard. PR tree-optimization/113863 * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Record crossed

Re: [PATCH] rtl-optimization/113597 - recover base term for argument pointers

2024-02-12 Thread Richard Biener

r me, as for you, it works for x86_64-linux-gnu: > > https://gcc.gnu.org/pipermail/gcc-testresults/2024-February/807609.html > > I hope this helps. > > Kind regards, > Toon Moene. > > On 2/9/24 11:26, Richard Biener wrote: > > The following allows a base term to be d

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-02-11 Thread Richard Biener

On Sat, Feb 10, 2024 at 1:55 PM Anbazhagan, Karthiban wrote: > > [Public] > > > Hi all, > > > > PFA, the patch that enables support for the next generation AMD Zen5 CPU via > -march=znver5 with basic znver5 scheduler Model. > > We may update the scheduler model going forward. > > > > Good for

Re: [PATCH] lower-bitint: Fix up .{ADD,SUB}_OVERFLOW lowering

2024-02-10 Thread Richard Biener

> Am 10.02.2024 um 11:03 schrieb Jakub Jelinek : > > Hi! > > torture/bitint-37.c test FAILed on i686-linux e.g. on > signed _BitInt(575) + unsigned _BitInt(575) -> signed _BitInt(575) > __builtin_add_overflow. With 64-bit limbs, we use 4 .UADDC calls in > the IL, 2 in a loop (which handles

Re: [PATCH] libgcc: Fix a bug in _BitInt -> dfp conversions

2024-02-10 Thread Richard Biener

> Am 10.02.2024 um 10:56 schrieb Jakub Jelinek : > > Hi! > > The ia32 _BitInt support revealed a bug in floatbitint?d.c. > As can be even guessed from how the code is written in the loop, > the intention was to set inexact to non-zero whenever the remainder > after division wasn't zero, but
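The snippet above is cut off before it describes the actual bug, so the following is only a sketch of the stated intention: set inexact to non-zero whenever a division leaves a non-zero remainder, and keep it set for the rest of the loop. The helper drop_decimal_digits is hypothetical and works on a plain uint64_t; it is not the libgcc floatbitint code.

```c
#include <stdint.h>

/* Hypothetical helper, not the libgcc implementation: discard DIGITS
   low-order decimal digits from V.  *INEXACT becomes 1 as soon as any
   discarded digit is non-zero and stays 1 afterwards (a "sticky" flag).  */
static uint64_t
drop_decimal_digits (uint64_t v, int digits, int *inexact)
{
  *inexact = 0;
  for (int i = 0; i < digits; i++)
    {
      if (v % 10 != 0)
        *inexact = 1;   /* only ever set, never cleared in the loop */
      v /= 10;
    }
  return v;
}
```

The point of the sticky flag is that a later exact division must not hide an earlier inexact one: dropping two digits from 1005 discards a 5 first and a 0 second, and the result still has to be reported as inexact.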

Re: [PATCH] libgcc: Fix BIL_TYPE_SIZE == 32 support in _BitInt <-> dfp support

2024-02-10 Thread Richard Biener

> Am 10.02.2024 um 10:50 schrieb Jakub Jelinek : > > Hi! > > I've tried last night to enable _BitInt support for i?86-linux, and > a few spots in libgcc emitted -Wshift-count-overflow warnings and clearly > didn't do what it was supposed to do. > > Fixed thusly, bootstrapped/regtested on

Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-02-10 Thread Richard Biener

> Am 10.02.2024 um 10:41 schrieb Jakub Jelinek : > > Hi! > > In the previous patch I haven't touched the gcc diagnostic routines, > using HOST_SIZE_T_PRINT* for those is obviously undesirable because we > want the strings to be translatable. We already have %w[diox] for > HOST_WIDE_INT

Re: [PATCH] gimple-low: Fix up handling of volatile automatic vars in assume attribute [PR110754]

2024-02-10 Thread Richard Biener

> Am 10.02.2024 um 10:46 schrieb Jakub Jelinek : > > Hi! > > As the following testcases show, the gimple-low outlining of assume > magic functions handled volatile automatic vars (including > parameters/results) like non-volatile ones except it copied volatile > to the new PARM_DECL, which

Re: [PATCH] Use HOST_SIZE_T_PRINT_* and HOST_WIDE_INT_T_PRINT_* some more

2024-02-10 Thread Richard Biener

> Am 10.02.2024 um 10:39 schrieb Jakub Jelinek : > > Hi! > > I went through suspicious %l in format strings of *printf family functions > combined with casts to (long) or (unsigned long) and tried to find out the > types of the original expressions that were cast. > Quite a few had size_t

Re: [PATCH] bitint: Fix handling of VIEW_CONVERT_EXPRs to minimally supported huge INTEGER_TYPEs [PR113783]

2024-02-09 Thread Richard Biener

ree-optimization/113783 */ > +/* { dg-do compile { target bitint } } */ > +/* { dg-options "-O2" } */ > +/* { dg-additional-options "-mavx512f" { target i?86-*-* x86_64-*-* } } */ > + > +int i; > + > +#if __BITINT_MAXWIDTH__ >= 246 > +void > +

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-09 Thread Richard Biener

On Thu, Feb 1, 2024 at 4:26 PM Jakub Jelinek wrote: > > On Thu, Feb 01, 2024 at 03:55:51PM +0100, Jakub Jelinek wrote: > > No, besides the formatting being incorrect both in ChangeLog and in the > > patch, this pessimizes ILP32 hosts unnecessarily. > > So like this instead? OK. Thanks, Richard.

Re: [PATCH] [testsuite] tsvc: skip include malloc.h when unavailable

2024-02-09 Thread Richard Biener

dg/vect/tsvc/vect-tsvc-s000.c (test for excess errors) > > > Kind regards, > Torbjörn > > On 2023-05-24 11:02, Richard Biener via Gcc-patches wrote: > > On Wed, May 24, 2023 at 7:17 AM Alexandre Oliva via Gcc-patches > > wrote: > >> > >> > >
