The following addresses endless recursion in the
chrec_fold_{plus,multiply} functions when handling sign-conversions.
We only need to apply tricks when we'd fail (there's a chrec in the
converted operand) and we need to make sure to not turn the other
operand into something worse (for the
> +{
> + if (x == 25)
> +x = foo (2);
> + else if (x == 42)
> +x = foo (foo (3));
> + *y = bar (*p);
> +}
> +
> +void
> +corge (int x, int *y)
> +{
> + void *q[] = { &, &, &, & };
> + if (x == 25)
> +{
> +l1:
> + x = foo (2);
> +}
> + else if (x == 42)
> +{
> +l2:
> + x = foo (foo (3));
> +}
> +l3:
> + *y = bar (*p);
> + if (x < 4)
> +goto *q[x & 3];
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Tue, 12 Mar 2024, Jakub Jelinek wrote:
> On Tue, Mar 12, 2024 at 01:47:53PM +0100, Richard Biener wrote:
> > > Admittedly the above is the ugliest part of the patch IMHO.
> > > It isn't needed in all cases, but e.g. for the pr112709-2.c (qux) case
> > > we have
On Tue, 12 Mar 2024, Jakub Jelinek wrote:
> On Tue, Mar 12, 2024 at 11:42:03AM +0100, Richard Biener wrote:
> > > +static edge
> > > +edge_before_returns_twice_call (basic_block bb)
> > > +{
> > > + gimple_stmt_iterator gsi = gsi_start_nondebu
42;
> + return s;
> +}
> +
> +void
> +baz (struct S *p)
> +{
> + foo (1);
> + *p = bar (0);
> +}
> +
> +void
> +qux (int x, struct S *p)
> +{
> + if (x == 25)
> +x = foo (2);
> + else if (x == 42)
> +x = foo (foo (3));
> + *p = bar (x);
> +}
> +
> +void
> +corge (int x, struct S *p)
> +{
> + void *q[] = { &, &, &, & };
> + if (x == 25)
> +{
> +l1:
> + x = foo (2);
> +}
> + else if (x == 42)
> +{
> +l2:
> + x = foo (foo (3));
> +}
> +l3:
> + *p = bar (x);
> + if (x < 4)
> +goto *q[x & 3];
> +}
> --- gcc/testsuite/gcc.dg/ubsan/pr112709-2.c.jj2024-03-11
> 16:55:37.000378840 +0100
> +++ gcc/testsuite/gcc.dg/ubsan/pr112709-2.c 2024-03-11 17:13:37.517599492
> +0100
> @@ -0,0 +1,50 @@
> +/* PR sanitizer/112709 */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=undefined -O2" } */
> +
> +struct S { char c[1024]; } *p;
> +int foo (int);
> +
> +__attribute__((returns_twice, noipa)) int
> +bar (struct S x)
> +{
> + (void) x.c[0];
> + return 0;
> +}
> +
> +void
> +baz (int *y)
> +{
> + foo (1);
> + *y = bar (*p);
> +}
> +
> +void
> +qux (int x, int *y)
> +{
> + if (x == 25)
> +x = foo (2);
> + else if (x == 42)
> +x = foo (foo (3));
> + *y = bar (*p);
> +}
> +
> +void
> +corge (int x, int *y)
> +{
> + void *q[] = { &, &, &, & };
> + if (x == 25)
> +{
> +l1:
> + x = foo (2);
> +}
> + else if (x == 42)
> +{
> +l2:
> + x = foo (foo (3));
> +}
> +l3:
> + *y = bar (*p);
> + if (x < 4)
> +goto *q[x & 3];
> +}
>
> Jakub
>
>
> + if (x < 4)
> +goto *q[x & 3];
> +}
> --- gcc/testsuite/g++.dg/asan/pr69276.C.jj2020-01-14 20:02:46.691611212
> +0100
> +++ gcc/testsuite/g++.dg/asan/pr69276.C 2024-03-12 09:09:05.901446463
> +0100
> @@ -35,4 +35,5 @@ int main()
> }
>
> /* { dg-output "ERROR: AddressSanitizer: heap-buffer-overflow.*(\n|\r\n|\r)"
> } */
> -/* { dg-output "#0 0x\[0-9a-f\]+ +in A::A()" } */
> +/* { dg-output "#0 0x\[0-9a-f\]+ +in (A::A\\\(\\\)|vnull::operator
> vec\\\(\\\).*(\n|\r\n|\r)" } */
> +/* { dg-output "#1 0x\[0-9a-f\]+ +in A::A\\\(\\\))" } */
>
> Jakub
>
>
+/* PR tree-optimization/114293 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -w" } */
> +
> +int
> +foo (int x)
> +{
> + __builtin_memset (, 5, -1);
> + return __builtin_strlen ((char *) );
> +}
>
> Jakub
>
>
The following makes sure to pass in the SLP node for the live stmts
we are generating the reduction epilogue for to
vect_create_epilog_for_reduction. This follows the previous fix for
the non-SLP path.
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
Richard.
PR
relevant stmt not supported: _3 = *_2;
>
> so I think the tests need to require vect_hw_misalign. This is what
> this patch does.
>
> Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.
>
> Ok for trunk?
OK.
Thanks,
Richard.
> Rainer
>
>
--
Richard
2;
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17:
> missed: unsupported vect permute { 1 0 3 2 5 4 }
> /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr37027.c:24:17:
> missed: unsupported load permutation
> /vol/gcc/src/hg/master/local/gcc/tests
On Mon, 11 Mar 2024, Jakub Jelinek wrote:
> On Mon, Mar 11, 2024 at 11:31:51AM +0100, Richard Biener wrote:
> > On Mon, 11 Mar 2024, Jakub Jelinek wrote:
> >
> > > On Sat, Mar 09, 2024 at 12:25:42PM +0100, Richard Biener wrote:
> > > > Ideally
On Mon, 11 Mar 2024, Jakub Jelinek wrote:
> On Sat, Mar 09, 2024 at 12:25:42PM +0100, Richard Biener wrote:
> > Ideally we'd clear TREE_ADDRESSABLE but set DECL_NOT_GIMPLE_REG,
> > I think the analysis where we check the base would be a more
> > appropriate place to enfor
When internal_get_tmp_var fails to gimplify the value the temporary
SSA name is supposed to be initialized with we can leak SSA names
with a NULL SSA_NAME_DEF_STMT into the IL. That's bad, so recover
from this by instead returning a decl in that case.
Bootstrapped and tested on
BitInt(64) b = *(_BitInt(64) *) __builtin_memmove (, p, sizeof
> (_BitInt(64)));
> +}
> +
> +#if __BITINT_MAXWIDTH__ >= 128
> +void
> +bar (void *p)
> +{
> + _BitInt(128) b = *(_BitInt(128) *) __builtin_memmove (, p, sizeof
> (_BitInt(128)));
> +}
> +#endif
>
On Mon, Mar 11, 2024 at 8:46 AM Richard Biener
wrote:
>
> On Sun, Mar 10, 2024 at 10:09 PM Jeff Law wrote:
> >
> >
> >
> > On 3/10/24 3:05 PM, Andrew Pinski wrote:
> > > On Sun, Mar 10, 2024 at 2:04 PM Jeff Law wrote:
> > >>
> > >>
On Sun, Mar 10, 2024 at 10:09 PM Jeff Law wrote:
>
>
>
> On 3/10/24 3:05 PM, Andrew Pinski wrote:
> > On Sun, Mar 10, 2024 at 2:04 PM Jeff Law wrote:
> >>
> >> Here's a potential approach to fixing PR92539, a P2 -Warray-bounds false
> >> positive triggered by loop unrolling.
> >>
> >> As I
On Sat, Mar 9, 2024 at 10:10 AM Alexandre Oliva wrote:
>
>
> The earlier patch for PR112938 arranged for volatile parms to be made
> indirect in internal strub wrapped bodies.
>
> The first problem that remained, more evident, was that the indirected
> parameter remained volatile, despite the
On Fri, Mar 8, 2024 at 6:50 PM Ken Matsui wrote:
>
> On Thu, Mar 7, 2024 at 10:49 PM Richard Biener
> wrote:
> >
> > On Thu, Mar 7, 2024 at 8:29 PM Ken Matsui wrote:
> > >
> > > On Tue, Mar 5, 2024 at 7:58 AM Richard Biener
> > > wrote:
> >
> Am 10.03.2024 um 11:02 schrieb Li, Pan2 :
>
> Committed, thanks Richard.
You might want to investigate why you get a mask and not a len for a particular
stmt. Mixing the two will cause variable-length vectorization to fail.
> Pan
>
> -----Original Message-----
> From: R
> Am 10.03.2024 um 04:14 schrieb pan2...@intel.com:
>
> From: Pan Li
>
> This patch fixes an ICE in vectorizable_store when both the
> loop_masks and loop_lens are enabled. The ICE looks like below when building
> with "-march=rv64gcv -O3".
>
> during GIMPLE pass: vect
> test.c:
> Am 09.03.2024 um 09:28 schrieb Jakub Jelinek :
>
> Hi!
>
> The following testcase ICEs, because update-address-taken subpass of
> fre5 rewrites
> _BitInt(128) b;
> vector(16) unsigned char _3;
>
> [local count: 1073741824]:
> _3 = MEM [(char * {ref-all})p_2(D)];
> MEM [(char *
> Am 09.03.2024 um 09:36 schrieb Jakub Jelinek :
>
> Hi!
>
> Before the recent PR111267 r14-8319 fwprop changes, fwprop would never try
> to propagate what was not considered PROFITABLE, where the profitable part
> actually was partly about profitability, partly about very good reasons
> not
On Fri, Mar 8, 2024 at 2:59 PM Richard Biener
wrote:
>
> On Fri, Mar 8, 2024 at 1:04 AM wrote:
> >
> > From: Pan Li
> >
> > This patch would like to fix one ICE in vectorizable_store for both the
> > loop_masks and loop_lens. The ICE looks like
On Fri, Mar 8, 2024 at 1:04 AM wrote:
>
> From: Pan Li
>
> This patch would like to fix one ICE in vectorizable_store for both the
> loop_masks and loop_lens. The ICE looks like below with "-march=rv64gcv -O3".
>
> during GIMPLE pass: vect
> test.c: In function ‘d’:
> test.c:6:6: internal
The following addresses a performance regression caused by the recent
SCEV analysis fix with regard to folding multiplications and undefined
behavior on overflow. We do not handle (T) { a, +, b } * c but can
treat sign-conversions from unsigned by performing the multiplication
in the unsigned
> + return AI();
> +}
> +}
> +
> +N1::N2::N3::AB ab;
> +
> +N1::N2::N3::AB &
> +N1::N2::N3::AB::bleh()
> +{
> + return ab;
> +}
> +
> +N1::N2::N3::AC::AC(int)
> +{
> +}
> +
> +void
> +N1::N2::N3::AC::m1(R::S)
> +{
> +}
> +
> +#ifndef SHARED
> +int
> +main()
> +{
> +}
> +#endif
>
> Jakub
>
>
RN (j)));
> +&& (asm_noperands (PATTERN (j))
> +> 0));
> edge e2 = find_edge (cur_bb, e->dest);
> if (e2)
> e2->flags
The testcase only XFAILs on targets where int has an alignment
of sizeof(int). Align the respective array this way to make it
XFAIL consistently.
Tested on x86_64-unknown-linux-gnu and cris-elf. Pushed.
PR testsuite/108355
* gcc.dg/tree-ssa/ssa-fre-104.c: Align e.
---
On Thu, 7 Mar 2024, Richard Sandiford wrote:
> Sorry, still catching up on email, but:
>
> Richard Biener writes:
> > We have optimize_vectors_before_lowering_p but we shouldn't even there
> > turn supported into not supported ops and as said, what's supported or
>
On Thu, Mar 7, 2024 at 8:29 PM Ken Matsui wrote:
>
> On Tue, Mar 5, 2024 at 7:58 AM Richard Biener
> wrote:
> >
> > On Tue, Mar 5, 2024 at 1:51 PM Ken Matsui wrote:
> > >
> > > On Tue, Mar 5, 2024 at 12:38 AM Richard Biener
> > > wrote:
> >
On Thu, Mar 7, 2024 at 1:25 PM Robin Dapp wrote:
>
> Attached v2 combines the checks.
>
> Bootstrapped and regtested on x86 and power10, aarch64 still running.
> Regtested on riscv64.
LGTM.
> Regards
> Robin
>
>
> Subject: [PATCH v2] vect: Do not peel epilogue for partial vectors.
>
>
On Thu, 7 Mar 2024, Jakub Jelinek wrote:
> On Thu, Mar 07, 2024 at 11:11:35AM +0100, Uros Bizjak wrote:
> > > Since you CCed me - looking at the code I wonder why we fatally fail.
> > > The following might also fix the issue and preserve more of the
> > > rest of the flow of the function.
> > >
>
On Thu, 7 Mar 2024, Uros Bizjak wrote:
> On Thu, Mar 7, 2024 at 10:56 AM Richard Biener wrote:
> >
> > On Thu, 7 Mar 2024, Uros Bizjak wrote:
> >
> > > The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
> > >
> > > internal c
On Thu, 7 Mar 2024, Uros Bizjak wrote:
> The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
>
> internal compiler error: RTL check: expected elt 0 type 'e' or 'u',
> have 'E' (rtx unspec) in try_combine, at combine.cc:3237
>
> This is
>
> 3236 /* Just replace
cc1_plugin;
> >
> >
> > diff --git a/libcc1/libcp1plugin.cc b/libcc1/libcp1plugin.cc
> > index 0eff7c68d29..da68c5d0ac1 100644
> > --- a/libcc1/libcp1plugin.cc
> > +++ b/libcc1/libcp1plugin.cc
> > @@ -33,6 +33,7 @@
> > #undef PACKAGE_VERSION
> >
+ bar ("");
> + asm goto ("" : : : : l2);
> + asm ("");
> +l2:
> + goto l1;
> +}
> +
> +void
> +qux (void)
> +{
> + asm goto ("" : : : : l1);
> + bar ("");
> + goto l1;
> +l1:
> + baz ("");
> +}
> +
> +void
> +corge (void)
> +{
> + asm goto ("" : : : : l1);
> + baz ("");
> +l2:
> + return;
> +l1:
> + bar ("");
> + goto l2;
> +}
>
> Jakub
>
>
> +++ gcc/testsuite/gcc.dg/pr105533.c 2024-03-06 16:03:26.226084751 +0100
> @@ -0,0 +1,9 @@
> +/* PR middle-end/105533 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +long long
> +foo (long long x, long long y)
> +{
> + return ((x < 0) & (y !=
On Wed, Mar 6, 2024 at 9:21 PM Robin Dapp wrote:
>
> Hi,
>
> r14-7036-gcbf569486b2dec added an epilogue vectorization guard for early
> break but PR114196 shows that we also run into the problem without early
> break. Therefore remove early break from the conditions.
>
> gcc/ChangeLog:
>
>
21,7 @@ ao_ref_init_from_vn_reference (ao_ref *r
> if (maybe_eq (op->off, -1))
> max_size = -1;
> else
> - offset += op->off << LOG2_BITS_PER_UNIT;
> + offset += op->off * BITS_PER_UNIT;
> break;
>
>
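To illustrate the hunk quoted above, here is a minimal sketch in plain C of
why scaling a possibly negative byte offset by multiplication is the safer
spelling. It assumes the motivation is that left-shifting a negative value is
undefined in ISO C (and in C++ before C++20); the quoted diff does not state
its rationale. SKETCH_BITS_PER_UNIT and byte_to_bit_offset are hypothetical
stand-ins, not GCC's BITS_PER_UNIT or any GCC function, and the value 8 is an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's BITS_PER_UNIT; 8 is assumed here.  */
#define SKETCH_BITS_PER_UNIT 8

/* Convert a byte offset, which may be negative, into a bit offset.
   Multiplication is well defined for negative operands as long as the
   result fits, whereas byte_off << 3 is undefined for byte_off < 0.  */
static int64_t
byte_to_bit_offset (int64_t byte_off)
{
  return byte_off * SKETCH_BITS_PER_UNIT;
}

/* Small self-check of the sketch.  */
static void
check_byte_to_bit_offset (void)
{
  assert (byte_to_bit_offset (3) == 24);
  assert (byte_to_bit_offset (-4) == -32);
}
```

Calling check_byte_to_bit_offset () from a test driver exercises both the
positive and the negative case.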
> {
> sint64_t mul = a * b;
>
> return mul >= (sint64_t)INT32_MIN && mul <= (sint64_t)INT32_MAX ?
> (sint32_t)mul : INT32_MAX + ((x ^ y) < 0);
> }
>
> uint32_t sat_udiv (uint32_t a, uint32_t b)
> {
> return a / b; // never overflow
> }
>
> sint32
On Wed, Mar 6, 2024 at 10:56 PM Morten Linderud wrote:
>
> I've made an attempt at patching this issue as it produces unreproducible
> binaries for Golang. I don't know C/C++ and it's my first gcc
> patch so please bear with me :)
I think this is a very fragile area - see
On Wed, 6 Mar 2024, Jakub Jelinek wrote:
> On Wed, Mar 06, 2024 at 11:45:42AM +0100, Richard Biener wrote:
> > OK, though feel free to add ARG_UNUSED to 'captures' as well.
>
> Ok, done below.
>
> > I think the INTEGRAL_TYPE_P should be redundant - the pattern
>
On Wed, 6 Mar 2024, Andrew Stubbs wrote:
> On 06/03/2024 12:09, Thomas Schwinge wrote:
> > Hi!
> >
> > On 2024-02-21T17:32:13+0100, Richard Biener wrote:
> >> Am 21.02.2024 um 13:34 schrieb Thomas Schwinge :
> >>> [...] per my work on <https://gcc.gn
The following reworks vectorizable_live_operation to pass the
live stmt to vect_create_epilog_for_reduction also for early breaks
and a peeled main exit. This is to be able to figure the scalar
definition to replace. This reverts the PR114192 fix as it is
subsumed by this cleanup.
Bootstrapped
41.274541636 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr114009.c 2024-03-05 15:16:09.056589675
> +0100
> @@ -0,0 +1,24 @@
> +/* PR tree-optimization/114009 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-
The following makes sure to strip type conversions added by
build_fold_addr_expr before placing the result in a call argument.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/114246
* tree-ssa-dse.cc (increment_start_addr): Strip useless
When we scrap the last def of an odd lane numbered BB reduction
we can end up recording a pattern def which will later wreck
code generation. The following puts this logic where it better
belongs, avoiding this issue.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR
On Tue, Mar 5, 2024 at 1:51 PM Ken Matsui wrote:
>
> On Tue, Mar 5, 2024 at 12:38 AM Richard Biener
> wrote:
> >
> > On Mon, Mar 4, 2024 at 9:40 PM Ken Matsui wrote:
> > >
> > > (x - y) CMP 0 is equivalent to x CMP y where x and y are signed
> > >
The following makes sure to use recognized patterns when vectorizing
roots during BB SLP discovery. We need to apply those late since
during root discovery we've not yet done pattern recognition.
All parts of the vectorizer assume patterns get used, for the testcase
we mix this up when doing live
On Tue, 5 Mar 2024, Jakub Jelinek wrote:
> On Tue, Mar 05, 2024 at 09:27:22AM +0100, Richard Biener wrote:
> > On Tue, 5 Mar 2024, Jakub Jelinek wrote:
> > > The following patch adds support for BIT_FIELD_REF lowering with
> > > large/huge _BitInt lhs. BIT_FIELD_REF r
8) at
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/main.cc:39
>
> BTW, does match.pd support nested cond like below? I am debugging into
> gimple_simplify_COND_EXPR for why not hit the pattern...
> +(simplify
> + (cond
> +(lt @0 integer_zerop)
On Mon, Mar 4, 2024 at 9:40 PM Ken Matsui wrote:
>
> (x - y) CMP 0 is equivalent to x CMP y where x and y are signed
> integers and CMP is <, <=, >, or >=. Similarly, 0 CMP (x - y) is
> equivalent to y CMP x. As reported in PR middle-end/113680, this
> equivalence does not hold for types other
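As a small illustration of the point quoted above, the following sketch
shows (x - y) CMP 0 agreeing with x CMP y for signed operands when the
subtraction does not overflow, and disagreeing for unsigned operands because
the subtraction wraps. The function names are hypothetical and only serve
this example; nothing here is taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Signed case: (x - y) > 0 agrees with x > y whenever x - y does not
   overflow (overflow would be undefined behavior).  */
static bool
gt_via_sub_signed (int x, int y)
{
  return (x - y) > 0;
}

/* Unsigned case: x - y wraps modulo 2^N, so the result is nonzero
   whenever x != y and the comparison no longer tracks x > y.  */
static bool
gt_via_sub_unsigned (unsigned x, unsigned y)
{
  return (x - y) > 0u;
}

/* Small self-check of the sketch.  */
static void
check_sub_compare (void)
{
  assert (gt_via_sub_signed (1, 2) == (1 > 2));
  assert (gt_via_sub_signed (5, 2) == (5 > 2));
  /* 1u - 2u wraps to UINT_MAX, which is > 0, although 1u > 2u is false.  */
  assert (gt_via_sub_unsigned (1u, 2u) == true);
  assert ((1u > 2u) == false);
}
```

The unsigned counterexample uses x = 1 and y = 2; any pair with x < y behaves
the same way.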
ed int u;
> +V v;
> +
> +V
> +foo (unsigned __int128 h)
> +{
> + h = h << 64 | h >> 64;
> + h *= ~u;
> + return h + v;
> +}
> +
> +int
> +main ()
> +{
> + V x = foo (1);
> + if (x[0] != (unsigned __int128) 0x << 64)
> +__builtin_abort ();
> +}
>
> Jakub
>
>
e/gcc.target/i386/avx2-pr114157.c.jj 2024-03-04
> 19:12:46.001437331 +0100
> +++ gcc/testsuite/gcc.target/i386/avx2-pr114157.c 2024-03-04
> 19:12:31.639631618 +0100
> @@ -0,0 +1,5 @@
> +/* PR middle-end/114157 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -std=c23 -Wno-psabi -w -mavx2 -mno-avx512f" } */
> +
> +#include "../../gcc.dg/bitint-98.c"
> --- gcc/testsuite/gcc.target/i386/avx512f-pr114157.c.jj 2024-03-04
> 19:13:01.190231847 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr114157.c 2024-03-04
> 19:13:12.018085362 +0100
> @@ -0,0 +1,5 @@
> +/* PR middle-end/114157 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -std=c23 -Wno-psabi -w -mavx512f" } */
> +
> +#include "../../gcc.dg/bitint-98.c"
>
> Jakub
>
>
The following avoids lowering a volatile bitfield access and, in case
the if-converted and original loops end up in different outer loops
because of the simplifications this enables, scraps the result, since
that is not how the vectorizer expects the loops to be laid out.
Bootstrapped and tested on
Status
==
The GCC development branch which will become GCC 14 is still
in regression and documentation fixes only mode (Stage 4).
GCC 14.1 will be released when we reach the milestone of
zero P1 regressions.
We've been in regression-fixing mode for a good month now and at
least the pace of new
For precision less than int we apply the adjustment to make it defined
at zero after the adjustment to make it compute CLZ rather than CTZ.
That's wrong.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/114203
* tree-ssa-loop-niter.cc
The following fixes a missing replacement of the reduction value
used in the epilog, causing the scalar reduction to be kept live
across the early break exit path.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/114192
* tree-vect-loop.cc
On Fri, Mar 1, 2024 at 11:16 PM Edwin Lu wrote:
>
> When adding the new_preheader to the cfg, only the new_preheader's dominator
> information is updated. If one of the new basic block's children was part
> of the original cfg and adding new_preheader to the cfg introduces another
> path to
"sub$F$a3")
> OPTAB_NX(sub_optab, "sub$Q$a3")
> OPTAB_VL(subv_optab, "subv$I$a3", MINUS, "sub", '3', gen_intv_fp_libfunc)
> OPTAB_VX(subv_optab, "sub$F$a3")
> -OPTAB_NL(sssub_optab, "sssub$Q$a3", SS_MINUS, "sssub"
ctor_size__(16)));
> +typedef _Float128 W __attribute__((__vector_size__(16)));
> +
> +_Float128
> +foo (void *p)
> +{
> + signed char c = *(_BitInt(128) *) p;
> + _Float128 f = *(_Float128 *) p;
> + W w = *(W *) p;
> + signed char r = ((union { W a; signed char b[16]; }) w).b[
On Sun, 3 Mar 2024, Jeff Law wrote:
>
>
> On 2/9/24 03:26, Richard Biener wrote:
> > The following allows a base term to be derived from an existing
> > MEM_EXPR, notably the points-to set of a MEM_REF base. For the
> > testcase in the PR this helps RTL
> Am 03.03.2024 um 13:56 schrieb Roger Sayle :
>
>
> This patch fixes PR target/114187 a typo/missed-optimization in simplify-rtx
> that's exposed by (my) changes to x86_64's parameter passing. The context
> is that construction of double word (TImode) values now uses the idiom:
>
>
> Am 03.03.2024 um 02:51 schrieb Iain Buclaw :
>
> Hi,
>
> This patch fixes a wrong code issue in the D front-end where lowered
> struct comparisons would reinterpret fields with a different (usually
> bigger) alignment than the original. Use `build_aligned_type' to
> preserve the alignment
On Thu, 29 Feb 2024, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 11:16:54AM +0100, Richard Biener wrote:
> > That said, the quick experiment shows this isn't anything for stage4.
>
> The earlier the vector lowering is moved in the pass list, the higher
> are the possibili
The following avoids creating unsupported VEC_COND_EXPRs as part of
SIMD clone call mask argument setup during vectorization which results
in inefficient decomposing of the operation during vector lowering.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
Will push on Monday when arm CI is
Andre Vieira
>
> On 10/11/2023 13:16, Richard Biener wrote:
> > The following fixes the issue that when SLP stmts are internal defs
> > but appear invariant because they end up only using invariant defs
> > then they get scheduled outside of the loop. This nice opti
> --- gcc/function.cc.jj2024-01-12 13:47:20.834428745 +0100
> +++ gcc/function.cc 2024-02-29 21:14:35.275889093 +0100
> @@ -3650,7 +3650,8 @@ assign_parms (tree fndecl)
>assign_parms_initialize_all ();
>fnargs = assign_parms_augmented_arg_list ();
>
> - if (TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (fndecl)))
> + if (TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (fndecl))
> + && fnargs.is_empty ())
> {
>struct assign_parm_data_one data = {};
>assign_parms_setup_varargs (, , false);
>
> Jakub
>
>
x86_64-*-* } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 128
> +_BitInt(128) a, b;
> +#else
> +int a, b;
> +#endif
> +
> +void
> +foo (void)
> +{
> + int u = b;
> + __builtin_memmove (, , sizeof (a));
> +}
>
> Jakub
>
>
The following removes the over-broad rejection of patterns for SLP
reductions which is done by removing them from LOOP_VINFO_REDUCTIONS
during pattern detection. That's also insufficient in case the
pattern only appears on the reduction path. Instead this implements
the proper correctness check
On Thu, Feb 29, 2024 at 1:47 PM Jakub Jelinek wrote:
>
> On Thu, Feb 29, 2024 at 04:26:00AM -0800, H.J. Lu wrote:
> > > > Adding Hongtao and Honza into the loop as the ones who acked the
> > > > original
> > > > patch.
> > > >
> > > > The no_callee_saved_registers by default for noreturn
On Thu, 29 Feb 2024, Jakub Jelinek wrote:
> On Thu, Feb 29, 2024 at 09:21:02AM +0100, Richard Biener wrote:
> > The following switches the logic in chrec_fold_multiply to
> > get_range_pos_neg since handling POLY_INT_CST possibly mixed with
> > non-poly ranges will make th
On Thu, 29 Feb 2024, Richard Biener wrote:
> The following amends the PR114070 fix to optimistically allow
> the folding when we cannot expand the current vec_cond using
> vcond_mask and we're still before vector lowering. This leaves
> a small window between vectorization and lower
The following amends the PR114070 fix to optimistically allow
the folding when we cannot expand the current vec_cond using
vcond_mask and we're still before vector lowering. This leaves
a small window between vectorization and lowering where we could
break vec_conds that can be expanded via
The following switches the logic in chrec_fold_multiply to
get_range_pos_neg since handling POLY_INT_CST possibly mixed with
non-poly ranges will make the open-coding awkward and while not
a perfect fit it should work.
In turn the following makes get_range_pos_neg aware of POLY_INT_CSTs.
I
On Wed, Feb 28, 2024 at 4:14 PM David Malcolm wrote:
>
> On Wed, 2024-02-28 at 08:58 +0100, Richard Biener wrote:
> > On Tue, Feb 27, 2024 at 10:20 PM Robert Dubner
> > wrote:
> > >
> > > Richard,
> > >
> > > Thank you very much f
On Wed, 28 Feb 2024, Andre Vieira (lists) wrote:
>
>
> On 27/02/2024 08:47, Richard Biener wrote:
> > On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:
> >
> >>
> >>
> >> On 05/02/2024 09:56, Richard Biener wrote:
>
> Am 28.02.2024 um 16:05 schrieb Jeff Law :
>
>
>
>> On 2/28/24 03:05, Richard Biener wrote:
>>
>> Untested fix for targets that cannot handle the original IL below.
>> I'm not convinced that's the way to go here, is it? Or scrap
>> the testcase
This reverts the original fix for PR113831 which is better fixed by
the PR114121 fix. I've XFAILed instead of removing the PR108355
testcase again.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/113831
PR tree-optimization/108355
*
When VN ends up exploiting range-info specifying the ao_ref offset
and max_size we have to make sure to reflect this in the hashtable
entry for the recorded expression. The PR113831 fix handled the
case where we can encode this in the operands themselves but this
bug shows the issue is more
On Tue, 27 Feb 2024, Richard Biener wrote:
> On Tue, 27 Feb 2024, Jeff Law wrote:
>
> >
> >
> > On 2/27/24 06:53, Richard Biener wrote:
> > > On Tue, 27 Feb 2024, Jeff Law wrote:
> > >
> > >>
> > >>
> > >> On 2
On Wed, Feb 28, 2024 at 9:25 AM Jakub Jelinek wrote:
>
> On Wed, Feb 28, 2024 at 08:58:08AM +0100, Richard Biener wrote:
> > Incidentally this looks like something fit for a google summer of code
> > project.
> > Ideally it would hook into print-tree.cc providing an
*/
> +
> +unsigned a[24], b[24];
> +enum E { E0 = 0, E1 = 1, E42 = 42, E56 = 56 };
> +
> +__attribute__((noipa)) unsigned
> +foo (enum E x)
> +{
> + for (int i = 0; i < 24; ++i)
> +a[i] = i;
> + unsigned e;
> + if (x >= E42)
> +e = __builtin_clz ((un
= 256
> +void
> +foo (void *p, _BitInt(256) x)
> +{
> + __builtin_memcpy (p, , sizeof x);
> +}
> +
> +_BitInt(256)
> +bar (void *p, _BitInt(256) x)
> +{
> + _BitInt(246) y = x + 1;
> + __builtin_memcpy (p, , sizeof y);
> + return x;
> +}
> +#endif
>
From a maintenance point of view I think it's important to have "dump a tree
node" once, so when fields are added or deemed useful for presenting in a dump
you don't have to chase down more than one place. Maintenance is also
the reason to not simply accept your contribution as-is.
I do hope th
On Tue, 27 Feb 2024, Jeff Law wrote:
>
>
> On 2/27/24 06:53, Richard Biener wrote:
> > On Tue, 27 Feb 2024, Jeff Law wrote:
> >
> >>
> >>
> >> On 2/27/24 00:43, Richard Biener wrote:
> >>> On Tue, 27
On Tue, 27 Feb 2024, Jeff Law wrote:
>
>
> On 2/27/24 00:43, Richard Biener wrote:
> > On Tue, 27 Feb 2024, haochen.jiang wrote:
> >
> >> On Linux/x86_64,
> >>
> >> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> >> co
to see the bigger picture to be kept in mind
before altering the GIMPLE IL.
Adding an internal function for an already present optab is a
no-brainer. Adding a vectorizer and/or if-conversion pattern to make use
of this during vectorization is existing practice.
Adding pattern recognition to
On Tue, Feb 27, 2024 at 1:50 PM Eric Botcazou wrote:
>
> Hi,
>
> this is a regression present on the mainline, 13 and 12 branches. For the
> attached Ada case, it's a tree checking failure on the mainline at -O:
>
> +===GNAT BUG DETECTED==+
> |
When folding a multiply CHRECs are handled like {a, +, b} * c
is {a*c, +, b*c} but that isn't generally correct when overflow
invokes undefined behavior. The following uses unsigned arithmetic
unless either a is zero or a and b have the same sign.
I've used simple early outs for INTEGER_CSTs and
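
A worked example of the problem described above, as a sketch rather than
anything taken from the patch: with a = 2^29, b = -2^30 and c = 3 (a and b
of different sign), the original values (a + i*b) * c fit in 32 bits for
i = 0 and i = 1, but the folded step b*c does not, so the signed folded
form would overflow. Evaluated in unsigned arithmetic both forms agree
modulo 2^32. All function names below are hypothetical and are not GCC's
chrec_fold_multiply; uint32_t is assumed to be unsigned int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value of {a, +, b} * c at iteration i, in wrapping unsigned arithmetic.  */
static uint32_t
chrec_times_c (uint32_t a, uint32_t b, uint32_t c, uint32_t i)
{
  return (a + i * b) * c;
}

/* Value of the folded form {a*c, +, b*c} at iteration i, also unsigned.  */
static uint32_t
folded_chrec (uint32_t a, uint32_t b, uint32_t c, uint32_t i)
{
  return a * c + i * (b * c);
}

/* True if x * y does not fit in a signed 32-bit integer.  */
static bool
signed_product_overflows (int32_t x, int32_t y)
{
  int64_t wide = (int64_t) x * (int64_t) y;
  return wide < INT32_MIN || wide > INT32_MAX;
}

/* Small self-check of the sketch.  */
static void
check_chrec_fold (void)
{
  int32_t a = 1 << 29, b = -(1 << 30), c = 3;

  /* The values the original expression computes do not overflow...  */
  assert (!signed_product_overflows (a, c));
  assert (!signed_product_overflows (a + b, c));
  /* ...but the step of the folded form does.  */
  assert (signed_product_overflows (b, c));

  /* In unsigned arithmetic the two forms agree for every iteration.  */
  for (uint32_t i = 0; i < 4; i++)
    assert (chrec_times_c ((uint32_t) a, (uint32_t) b, (uint32_t) c, i)
	    == folded_chrec ((uint32_t) a, (uint32_t) b, (uint32_t) c, i));
}
```

The unsigned agreement is just distributivity in arithmetic modulo 2^32,
which is why performing the fold in the unsigned type is safe.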
On Tue, Feb 27, 2024 at 10:13 AM Jakub Jelinek wrote:
>
> On Tue, Feb 27, 2024 at 10:04:06AM +0100, Jakub Jelinek wrote:
> > > I hope we at least avoid that at -O0, possibly also with -Og?
> >
> > r14-8495 fixed at least that.
> >
> > Of course, it can break debugging experience even when the
On Mon, Feb 26, 2024 at 3:22 PM wrote:
>
> From: Pan Li
>
> We allowed vector type for get_stored_val when read is less than or
> equal to store in previous. Unfortunately, we missed to adjust the
> validate_subreg part accordingly. When the vector type's size is
> less than vector register,
On Sun, Feb 25, 2024 at 10:01 AM Tamar Christina
wrote:
>
> Hi Pan,
>
> > From: Pan Li
> >
> > Hi Richard & Tamar,
> >
> > Try the DEF_INTERNAL_INT_EXT_FN as your suggestion. By mapping
> > us_plus$a3 to the RTL representation (us_plus:m x y) in optabs.def.
> > And then expand_US_PLUS in
On Thu, Feb 22, 2024 at 5:46 PM Robert Dubner wrote:
>
> As part of an effort to learn how to create a GENERIC tree in order to
> implement a COBOL front end, I created dump_generic_nodes(), which
> accepts a function_decl at the point it is provided to the middle end.
> The routine generates
On Tue, Feb 27, 2024 at 9:42 AM Jakub Jelinek wrote:
>
> Hi!
>
> As mentioned in the PR, on x86_64 currently a lot of ICEs end up
> with crashes in the unwinder like:
> during RTL pass: expand
> pr114044-2.c: In function ‘foo’:
> pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn,
On Tue, 27 Feb 2024, Jakub Jelinek wrote:
> On Tue, Feb 27, 2024 at 09:35:43AM +0100, Richard Biener wrote:
> > I do wonder whether we can handle the missing LHS case generically
> > in the direct optab expander for fns that are PURE or CONST?
>
> Maybe the 2 operand expan
On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:
>
>
> On 05/02/2024 09:56, Richard Biener wrote:
> > On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:
> >
> >>
> >>
> >> On 01/02/2024 07:19, Richard Biener wrote:
> >>> On Wed, 31 Jan 2
024-02-26 14:19:30.079824133 +0100
> @@ -0,0 +1,45 @@
> +/* PR rtl-optimization/114044 */
> +/* { dg-do compile { target bitint575 } } */
> +/* { dg-options "-O -fno-tree-dce" } */
> +
> +void
> +foo (void)
> +{
> + unsigned _BitInt (575) a = 3;
> + __builtin_clzg (a)
On Tue, 27 Feb 2024, haochen.jiang wrote:
> On Linux/x86_64,
>
> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> commit af66ad89e8169f44db723813662917cf4cbb78fc
> Author: Richard Biener
> Date: Fri Feb 23 16:06:05 2024 +0100
>
> middle-end/1
The following implements manual update for multi-exit loop prologue
peeling during vectorization.
Bootstrap / regtest running on x86_64-unknown-linux-gnu.
I think the amount of coverage for prologue peeling with early exits
is very low, so my testing success might not mean much.
Richard.