On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 03:15:02PM +0100, Richard Biener wrote:
> > When folding a multiply CHRECs are handled like {a, +, b} * c
> > is {a*c, +, b*c} but that isn't generally correct when overflow
> > invokes undefined behavior
When folding a multiply, CHRECs are handled such that {a, +, b} * c
becomes {a*c, +, b*c}, but that isn't generally correct when overflow
invokes undefined behavior. The following uses unsigned arithmetic
unless either a is zero or a and b have the same sign.
I've used simple early outs for INTEGER_CSTs and
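To illustrate why unsigned arithmetic is the safe fallback, here is a
minimal sketch in plain C. It is not the patch itself; the helper names
(chrec_at, mul_then_eval, folded_eval, chrec_selfcheck) are made up for
this example. It evaluates {a, +, b} * c and the folded {a*c, +, b*c}
at one iteration using unsigned values, where wrap-around is well
defined and both forms agree.

```c
#include <assert.h>

/* Value of the recurrence {a, +, b} at iteration i.  Unsigned arithmetic
   wraps modulo 2^N, so no step here can invoke undefined behavior.  */
static unsigned chrec_at(unsigned a, unsigned b, unsigned i)
{
    return a + b * i;
}

/* ({a, +, b} * c) evaluated at iteration i: evaluate, then multiply.  */
unsigned mul_then_eval(unsigned a, unsigned b, unsigned c, unsigned i)
{
    return chrec_at(a, b, i) * c;
}

/* {a*c, +, b*c} evaluated at iteration i: the folded form.  */
unsigned folded_eval(unsigned a, unsigned b, unsigned c, unsigned i)
{
    return chrec_at(a * c, b * c, i);
}

/* Hypothetical check: a and b have opposite signs when read as signed
   values.  With a 32-bit int, a*c on its own would overflow in signed
   arithmetic, even though (a + b*i)*c is 0 at i == 1.  */
void chrec_selfcheck(void)
{
    unsigned a = 2000000000u, b = 0u - 2000000000u, c = 2u;
    assert(mul_then_eval(a, b, c, 1u) == 0u);
    assert(folded_eval(a, b, c, 1u) == 0u);
}
```

Calling chrec_selfcheck () from a test driver exercises the
opposite-sign case; in signed int the folded form would need the
intermediate product a*c, which is where the undefined overflow comes
from.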
t-loop.cc b/gcc/tree-vect-loop.cc
> > > index
> > 35f1f8c7d4245135ace740ff9be548919587..ab19ad6a6be516e3ee1f0fbeaae
> > effeae1fb900f 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -11987,7 +11987,12 @@ vect_tra
In some cases exits can lack LC PHI nodes for the virtual operand.
We have to create them when the epilog loop requires them which also
allows us to remove some only halfway correct fixups. This is the
variant triggering for alternate exits.
Bootstrap and regtest pending on
When we choose the IV exit to be one leading to no virtual use we
fail to have a virtual LC PHI even though we need it for the epilog
entry. The following makes sure to create it so that later updating
works.
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
PR
torizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -961,6 +961,10 @@ public:
>/* Statements whose VUSES need updating if early break vectorization is to
> happen. */
>auto_vec early_break_vuses;
> +
> + /* Dominators that need to be recalculated that have been deferred un
latch_edge (loop));
> + FOR_EACH_IMM_USE_STMT (use_stmt, iter, last_seen_vuse)
> + {
> + if (flow_bb_inside_loop_p (loop, use_stmt->bb))
> + continue;
> + FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
> + SET_USE (use_p, vuse);
> + }
> +}
> +
>/* And update the LC PHIs on exits. */
>for (edge e : get_loop_exit_edges (LOOP_VINFO_LOOP (loop_vinfo)))
> if (!dominated_by_p (CDI_DOMINATORS, e->src, dest_bb))
>
>
>
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> > On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> >
> > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > > @@ -6756,7 +
On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > those through even when the mode isn't word_mode. For
> >
_attribute__((noipa)) int
> +bar (int x)
> +{
> + int w = (x >= 0 ? x : 0);
> + int z = (x <= 0 ? -x : 0);
> + return w + z;
> +}
> +
> +__attribute__((noipa)) int
> +baz (int x)
> +{
> + return x <= 0 ? -x : 0;
> +}
> +
> +int
> +main ()
> +{
> return
> fold_convert_loc (loc, type, associate_trees (loc, var0, con0,
> code, atype));
> --- gcc/testsuite/gcc.dg/bitint-94.c.jj 2024-02-24 11:18:32.607018363
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-94.c 2024-02-24 11:19:09.023500121 +0100
> @@ -0,0 +1,12 @@
> +/* PR middle-end/114084 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -pedantic-errors" } */
> +
> +typedef unsigned _BitInt(31) T;
> +T a, b;
> +
> +void
> +foo (void)
> +{
> + b = (T) ((a | (-1U >> 1)) >> 1 | (a | 5) << 4);
> +}
>
> Jakub
>
>
ed branches - the effective check should be the
same in GCC 13 at least, but with some added ad-hoc costing which might
make this not trigger (maybe_lt (nunits_out, 4U)) - so we'd need a
word_mode that can cover 4 FP elements. Possibly triggerable with
HFmode?
Thanks,
Richard.
> LGTM, but please wait until Monday evening so that Richi or Richard
> have a chance to chime in.
>
> Jakub
>
>
The following properly guards the simplifications that move
operations into VEC_CONDs, in particular when that changes the
type constraints on this operation.
This needed a genmatch fix which was recording spurious implicit fors
when tcc_comparison is used in a C expression.
Bootstrapped and
On Mon, Feb 26, 2024 at 4:26 AM wrote:
>
> From: Pan Li
>
> We previously allowed vector types for get_stored_val when the read is
> less than or equal to the store. Unfortunately, we failed to adjust the
> validate_subreg part accordingly. For vector type, we don't need to
> restrict the mode size is
> On 24.02.2024 at 08:44, Jakub Jelinek wrote:
>
> Hi!
>
> I've searched for some uses of (HOST_WIDE_INT) constant or (unsigned
> HOST_WIDE_INT) constant and turned them into uses of the appropriate
> macros.
> There are quite a few cases in non-i386 backends but I've left that out
> for
> On 24.02.2024 at 08:40, Jakub Jelinek wrote:
>
> Hi!
>
> The following patch implements support for VIEW_CONVERT_EXPRs from/to
> large/huge _BitInt to/from vector or complex types or anything else but
> integral/pointer types which doesn't need to live in memory.
>
>
> On 24.02.2024 at 11:06, Richard Sandiford wrote:
>
> During its forward pass, the SLP layout code tries to calculate
> the cost of a layout change on an incoming edge. This is taken
> as the minimum of two costs: one in which the source partition
> keeps its current layout (chosen
On Fri, 23 Feb 2024, Jakub Jelinek wrote:
> On Fri, Feb 23, 2024 at 02:22:19PM +, Andrew Stubbs wrote:
> > On 23/02/2024 13:02, Jakub Jelinek wrote:
> > > On Fri, Feb 23, 2024 at 12:58:53PM +, Andrew Stubbs wrote:
> > > > This is a follow-up to the previous patch to ensure that integer
> On 23.02.2024 at 14:03, Jakub Jelinek wrote:
>
> On Fri, Feb 23, 2024 at 12:58:53PM +, Andrew Stubbs wrote:
>> This is a follow-up to the previous patch to ensure that integer vector
>> bit-masks do not have excess bits set. It fixes a bug, observed on
>> amdgcn, in which the mask
29.464277919 +0100
> @@ -0,0 +1,17 @@
> +/* PR rtl-optimization/114054 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-Og -fwhole-program -fno-tree-ccp -fprofile-use
> -fno-tree-copy-prop -w" } */
> +
> +int x;
> +
> +void
> +foo (int i, u
"-flto" } { "" } } */
> +
> +unsigned a;
> +signed char b;
> +short c;
> +long d;
> +__int128 e;
> +int f;
> +
> +#if __BITINT_MAXWIDTH__ >= 511
> +__attribute__((noinline)) void
> +foo (_BitInt(3) x, unsigned _BitInt(511) y, unsigned *z)
> +
The following documents obsoleting of ia64*-*-*.
Pushed.
* gcc-14/changes.html: Document ia64*-*-* obsoleting.
---
htdocs/gcc-14/changes.html | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index
The following deprecates ia64*-*-* for GCC 14. Since we plan to
force LRA for GCC 15 and the target only has slim chances of getting
updated this notifies people in advance. Given both Linux and
glibc have axed the target further development is also made difficult.
"Tested" for ia64-elf and
The following adds another omission to the assert verifying we're
not running into spurious off == -1.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/114048
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): MEM_REF
can also produce -1
When we classify a conditional reduction chain as CONST_COND_REDUCTION
we fail to verify all involved conditionals have the same constant.
That's a quite unlikely situation so the following simply disables
such classification when there's more than one reduction statement.
Bootstrapped and tested
On Thu, Feb 22, 2024 at 10:07 AM Jakub Jelinek wrote:
>
> Hi!
>
> The profile_count::dump (char *, struct function * = NULL) const;
> method has a single caller, the
> profile_count::dump (FILE *f, struct function *fun) const;
> method and for that going through a temporary buffer is just slower
FLT128_MANT_DIG__
> +void
> +flt128 (_Float128 f1, _Float128 f2, _Float128 f3, _Float128 f4, _Float128 f5,
> + _Float128 f6, _Float128 f7, _Float128 f8, _Float128 f9)
> +{
> + if (!(f1 >= -1.0f128 && f1 <= 1.0f128)) __builtin_unreachable ();
> + __builtin_acosf
On Tue, Feb 20, 2024 at 3:33 PM Lewis Hyatt wrote:
>
> On Mon, Feb 19, 2024 at 11:36 PM Alexandre Oliva wrote:
> >
> > This backport for gcc-13 is the first of two required for the
> > g++.dg/pch/line-map-3.C test to stop hitting a variant of the known
> > problem mentioned in that testcase: on
run_expensive_tests } { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 129
> +int
> +foo (unsigned _BitInt(63) x, unsigned _BitInt
> On 21.02.2024 at 13:34, Thomas Schwinge wrote:
>
> Hi!
>
>> On 2024-02-01T15:49:02+0100, Richard Biener wrote:
>>> On Thu, 1 Feb 2024, Thomas Schwinge wrote:
>>> On 2024-01-26T10:45:10+0100, Richard Biener wrote:
>>>> On Fri, 26 Jan 2024,
On Tue, Feb 20, 2024 at 11:27 AM Iain Sandoe wrote:
>
> Tested on aarch64-linux-gnu, aarch64-darwin by me and on aarch64-linux-musl
> by Sam James (thanks!). OK for trunk?
OK
> thanks
> Iain
>
> --- 8< ---
>
>
> This allows the same trampoline pattern to be used on all linux variants
> rather
On Tue, 20 Feb 2024, Jakub Jelinek wrote:
> On Tue, Feb 20, 2024 at 09:01:10AM +0100, Richard Biener wrote:
> > I'm not sure those would be really equivalent (MEM_REF vs. V_C_E
> > as well as combined vs. split). It really depends how RTL expansion
> > handles this (as yo
On Tue, 20 Feb 2024, Thomas Schwinge wrote:
> Hi Richard!
>
> On 2024-02-20T08:44:35+0100, Richard Biener wrote:
> > On Mon, 19 Feb 2024, Thomas Schwinge wrote:
> >> On 2024-02-19T17:31:20+0100, I wrote:
> >> > On 2024-02-19T11:52:55+0100, Richard Biener
On Tue, 20 Feb 2024, Jakub Jelinek wrote:
> On Tue, Feb 20, 2024 at 12:12:11AM +, Jason Merrill wrote:
> > On 2/19/24 02:55, Jakub Jelinek wrote:
> > > On Fri, Feb 16, 2024 at 01:51:54PM +, Jonathan Wakely wrote:
> > > > Ah, although __atomic_compare_exchange only takes pointers, the
> >
On Mon, 19 Feb 2024, Thomas Schwinge wrote:
> Hi!
>
> On 2024-02-19T17:31:20+0100, I wrote:
> > On 2024-02-19T11:52:55+0100, Richard Biener wrote:
> >> On Mon, 19 Feb 2024, Thomas Schwinge wrote:
> >>> On 2024-02-16T14:53:04+0100, I wrote:
> >>
a-prop.h
> index 9c78dc9f486..ee3c0006add 100644
> --- a/gcc/ipa-prop.h
> +++ b/gcc/ipa-prop.h
> @@ -627,7 +627,7 @@ public:
>vec *descriptors;
>/* Pointer to an array of structures describing individual formal
> parameters. */
> - class ipcp_param_lattices * G
a+sve conflicts
> with -mcpu=neoverse-n2 in previous gcc versions.
Yes.
Thanks,
Richard.
> Kind Regards,
> Andre
>
> On 20/12/2023 14:30, Richard Biener wrote:
> > On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:
> >
> >> Thanks, fully agree with all comm
On Mon, 19 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> >> I suppose that's better than the first version when a block has a
> >> large number of dominance frontiers. But I can't remember whether
> >> that was the case in PR98863. I have a
On Mon, 19 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > On Mon, 19 Feb 2024, Richard Sandiford wrote:
> >
> >> Richard Biener writes:
> >> > The following tries to address the PHI insertion compile-time hog in
> >> > RTL
On Mon, 19 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > The following tries to address the PHI insertion compile-time hog in
> > RTL fwprop observed with the PR54052 testcase where the loop computing
> > the "unfiltered" set of variables poss
On Mon, 19 Feb 2024, Thomas Schwinge wrote:
> Hi!
>
> On 2024-02-16T14:53:04+0100, I wrote:
> > On 2024-02-16T12:41:06+, Andrew Stubbs wrote:
> >> On 16/02/2024 12:26, Richard Biener wrote:
> >>> On Fri, 16 Feb 2024, Andrew Stubbs wrote:
> >>
The following tries to address the PHI insertion compile-time hog in
RTL fwprop observed with the PR54052 testcase where the loop computing
the "unfiltered" set of variables possibly needing PHI nodes for each
block exhibits quadratic compile-time and memory-use.
Instead of only pruning the set
+/* PR tree-optimization/113967 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +typedef unsigned short W __attribute__((vector_size (4 * sizeof (short))));
> +
> +void
> +foo (W *p)
> +{
> + W x = *p;
> + W y = {};
> + __builtin_memc
On Sat, Feb 17, 2024 at 11:30 AM wrote:
>
> From: Pan Li
>
> This patch adds the middle-end representation for the unsigned
> saturating add, i.e. it sets the result of the add to the maximum
> value on overflow. It matches a pattern similar to the one below.
>
> SAT_ADDU (x, y) => (x + y) |
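The pattern quoted above is cut off after "(x + y) |", so the exact
form used in the patch is not visible here. As a hedged illustration
only, the following C sketch shows one commonly used branchless way to
write an unsigned saturating add: OR the wrapped sum with an all-ones
mask when the addition overflowed. The function names sat_addu32 and
sat_addu32_selfcheck are made up for this example and may not match
what the patch recognizes.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned saturating add: returns x + y, or UINT32_MAX if the sum
   does not fit in 32 bits.  */
uint32_t sat_addu32(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;               /* wraps modulo 2^32 */
    uint32_t overflow = sum < x;        /* 1 if the add wrapped, else 0 */
    uint32_t mask = (uint32_t)(0u - overflow); /* all ones on overflow */
    return sum | mask;
}

/* Hypothetical usage check.  */
void sat_addu32_selfcheck(void)
{
    assert(sat_addu32(1u, 2u) == 3u);
    assert(sat_addu32(UINT32_MAX, 1u) == UINT32_MAX);
}
```

The comparison sum < x is true exactly when the unsigned addition
wrapped, which is what makes the mask trick work without a branch.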
The following addresses the weak bitmap_hash function which results
in points-to analysis taking a long time because of a high collision
rate in one of its bitmap hash tables. Using a better hash function
like in the bitmap.cc hunk below doesn't help unless one also replaces
the hash function in
On Fri, 16 Feb 2024, Andrew Stubbs wrote:
> On 16/02/2024 10:17, Richard Biener wrote:
> > On Fri, 16 Feb 2024, Thomas Schwinge wrote:
> >
> >> Hi!
> >>
> >> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote:
> >>
The following addresses consistency check fails in copy_reference_ops_from_ref
when we are handling out-of-bound array accesses (it's almost impossible
to identically mimic the get_ref_base_and_extent behavior). It also
addresses the case where an out-of-bound constant offset computes to a
-1 off
On Fri, 16 Feb 2024, Thomas Schwinge wrote:
> Hi!
>
> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote:
> > I've committed this patch
>
> ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691
> "amdgcn: add -march=gfx1030 EXPERIMENTAL", which the later RDNA3/gfx1100
> support builds on top
On Thu, Feb 15, 2024 at 7:38 PM Patrick Palka wrote:
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> OK for trunk?
Btw, there's the "bitpack" streaming support in data-streamer.h also
added for exactly the same reason, it's likely not easily re-usable
but this kind of
> On 15.02.2024 at 18:06, Richard Sandiford wrote:
>
> Richard Biener writes:
>>> On Wed, 14 Feb 2024, Richard Biener wrote:
>>>
>>> For the testcase in PR113910 we spend a lot of time in PTA comparing
>>> bitmaps for looking up equivalence cla
}
> }
> }
> --- gcc/testsuite/gcc.target/i386/pr113921.c.jj 2024-02-14
> 21:21:15.194178515 +0100
> +++ gcc/testsuite/gcc.target/i386/pr113921.c 2024-02-14 21:20:52.745476040
> +0100
> @@ -0,0 +1,20 @@
> +/* PR middle-end/113921 */
> +/* { dg-do run } */
The following fixes the failure to look at pattern
stmts when we need to dissolve SLP only groups.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/56
* tree-vect-loop.cc (vect_dissolve_slp_only_groups): Look
at the pattern
On Thu, 15 Feb 2024, Andrew Stubbs wrote:
> On 15/02/2024 10:21, Richard Biener wrote:
> [snip]
> >>> I suppose if RDNA really only has 32 lane vectors (it sounds like it,
> >>> even if it can "simulate" 64 lane ones?) then it might make sense to
On Thu, 15 Feb 2024, Andrew Stubbs wrote:
> On 15/02/2024 07:49, Richard Biener wrote:
> > On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> >
> >> On 14/02/2024 13:43, Richard Biener wrote:
> >>> On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> >>>
On Thu, 15 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > On Wed, 14 Feb 2024, Richard Sandiford wrote:
> >
> >> Richard Biener writes:
> >> > On Wed, 14 Feb 2024, Richard Sandiford wrote:
> >> >
> >> >> Richa
On Thu, Feb 15, 2024 at 12:16 AM Andrew Pinski wrote:
>
> In some of the standard pattern names, it is not obvious which mode is being
> used in the pattern
> name. Is it operand 0, 1, or 2? Is it the wider mode or the narrower mode?
> This fixes that so there is no confusion by adding a
On Thu, Feb 15, 2024 at 12:16 AM Andrew Pinski wrote:
>
> Currently these use `@var{m3}` but the 3 here is a literal 3
> and not part of the mode itself so it should not be inside
> the var. Fixed as such.
>
> Built the documentation to make sure it looks correct now.
OK
> gcc/ChangeLog:
>
>
The following avoids recording BB dependences for debug stmt uses.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
It's unlikely a dependence is just because of debug stmts so
actual compare-debug issues are very unlikely. Still spotted
while investigating a CI regression mail (for
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/113567 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 129
> +_BitInt(129) v;
> +
> +void
> +foo (_BitInt(129) a, int i)
> +{
> + __label__
On Wed, 14 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > On Wed, 14 Feb 2024, Richard Sandiford wrote:
> >
> >> Richard Biener writes:
> >> > The following avoids accessing out-of-bound vector elements when
> >> > native
On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> On 14/02/2024 13:43, Richard Biener wrote:
> > On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> >
> >> On 14/02/2024 13:27, Richard Biener wrote:
> >>> On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> >>>
> On 14.02.2024 at 16:22, Jakub Jelinek wrote:
>
> On Wed, Feb 14, 2024 at 04:13:51PM +0100, Richard Biener wrote:
>> The following removes the TBAA violation present in iterative_hash.
>> As we eventually LTO that it's important to fix. This also improves
>> co
> On 14.02.2024 at 16:16, Tamar Christina wrote:
>
>
>>
>>
>> I think this isn't entirely good. For simple cases for
>> do {} while the condition ends up in the latch while for while () {}
>> loops it ends up in the header. In your case the latch isn't empty
>> so it doesn't end up
The following tries to improve the actual hash function for hashing
bitmaps. We're still getting collision rates as high as 23 for the
testcase in the PR. The following improves this by properly mixing
in the bitmap element starting bit number. This brings down the
collision rate below 1.4,
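As a rough illustration of what "mixing in the element starting bit
number" can mean, here is a self-contained C sketch. It does not
reproduce GCC's bitmap_hash or its data structures: struct bm_elt,
weak_hash, mixed_hash and mix64 are hypothetical stand-ins, and the
mixing step is a 64-bit finalizer in the style of SplitMix64 chosen
only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sparse bitmap element: a word of bits plus the index
   of the element, which determines its starting bit number.  */
struct bm_elt {
    uint32_t index;
    uint64_t bits;
};

/* 64-bit finalizer in the style of SplitMix64; a bijection on uint64_t. */
static uint64_t mix64(uint64_t h)
{
    h ^= h >> 30; h *= 0xbf58476d1ce4e5b9ULL;
    h ^= h >> 27; h *= 0x94d049bb133111ebULL;
    h ^= h >> 31;
    return h;
}

/* Deliberately weak stand-in: ignores where each element sits.  */
uint64_t weak_hash(const struct bm_elt *e, size_t n)
{
    uint64_t h = 0;
    for (size_t i = 0; i < n; i++)
        h ^= e[i].bits;
    return h;
}

/* Mixes the element position into the hash before each scramble.  */
uint64_t mixed_hash(const struct bm_elt *e, size_t n)
{
    uint64_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = mix64(h ^ e[i].bits ^ (e[i].index * 0x9e3779b97f4a7c15ULL));
    return h;
}

/* Two bitmaps with the same word at different positions collide under
   weak_hash but not under mixed_hash.  */
void bitmap_hash_selfcheck(void)
{
    struct bm_elt a = { 0, 1 }, b = { 1, 1 };
    assert(weak_hash(&a, 1) == weak_hash(&b, 1));
    assert(mixed_hash(&a, 1) != mixed_hash(&b, 1));
}
```

The point of the sketch is only that bitmaps differing solely in where
their set bits live should not hash alike; the actual constants and
mixing used in the patch are not shown in this excerpt.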
The following removes the TBAA violation present in iterative_hash.
As we eventually LTO that it's important to fix. This also improves
code generation for the >= 12 bytes loop by using | to compose the
4 byte words as at least GCC 7 and up can recognize that pattern
and perform a 4 byte load
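The idea of composing a word from bytes with | can be sketched in plain
C as below. This is an illustration, not the libiberty code: load_le32
and load_le32_selfcheck are made-up names, and the little-endian byte
order is an assumption of the example. Reading through an unsigned char
pointer avoids accessing the buffer through an incompatible wider
type, which is the kind of aliasing problem the message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a 32-bit word from four bytes, least significant byte first,
   without casting the byte pointer to a wider pointer type.  */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical usage check.  */
void load_le32_selfcheck(void)
{
    const unsigned char buf[4] = { 0x78, 0x56, 0x34, 0x12 };
    assert(load_le32(buf) == 0x12345678u);
}
```

Each byte is widened to uint32_t before shifting, so the shift by 24
cannot overflow a signed int.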
On Wed, 14 Feb 2024, Richard Biener wrote:
> For the testcase in PR113910 we spend a lot of time in PTA comparing
> bitmaps for looking up equivalence class members. This points to
> the very weak bitmap_hash function which effectively hashes set
> and a subset of not set bits. T
On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> On 14/02/2024 13:27, Richard Biener wrote:
> > On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> >
> >> On 13/02/2024 08:26, Richard Biener wrote:
> >>> On Mon, 12 Feb 2024, Thomas Schwinge wrote:
> >>>
>
e7bc33654ffa027b493f23d278ac..a29681bffb902d2d05e3f18764ab519aacb3c5bc
> 100644
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> @@ -327,6 +327,10 @@ replace_loop_annotate (void)
>if (loop->latch)
> replace_loop_annotate_in_block (loop->latch, loop);
>
On Wed, 14 Feb 2024, Andrew Stubbs wrote:
> On 13/02/2024 08:26, Richard Biener wrote:
> > On Mon, 12 Feb 2024, Thomas Schwinge wrote:
> >
> >> Hi!
> >>
> >> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote:
> >>
On Wed, 14 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > The following avoids accessing out-of-bound vector elements when
> > native encoding a boolean vector with sub-BITS_PER_UNIT precision
> > elements. The error was basing the number o
On Wed, 14 Feb 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > The following zeros paddings of vector bools when expanding compares
> > and the mode used for the compare is an integer mode. In that case
> > targets cannot distinguish between a 4 element
For the testcase in PR113910 we spend a lot of time in PTA comparing
bitmaps for looking up equivalence class members. This points to
the very weak bitmap_hash function which effectively hashes set
and a subset of not set bits. The following improves it by mixing
that weak result with the
The following fixes eliding of the permutation of a BB reduction
of an existing vector which breaks materialization of live lanes
as we fail to permute the SLP_TREE_SCALAR_STMTS vector.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/113896
*
On Tue, Feb 13, 2024 at 10:46 PM Andrew Pinski wrote:
>
> The vect testsuite will choose the dg-do default based on whether it knows the
> running target does not support running with the vector extensions enabled
> (for ease of testing). The problem is when it is decided the default is
> compile
>
The recent enhancement to discover constant array indices by range
info used by get_ref_base_and_extent doesn't work when the outermost
component reference is to a bitfield because we track the running
offset in the reference ops as bytes. The following does as
ao_ref_init_from_vn_reference and
The SLP permute optimization rewrite fixed this.
Tested on x86_64-unknown-linux-gnu, pushed to trunk and 13 branch.
PR tree-optimization/113896
* g++.dg/torture/pr113896.C: New testcase.
---
gcc/testsuite/g++.dg/torture/pr113896.C | 35 +
1 file changed,
Pushed.
PR tree-optimization/113831
* tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference): Fix
typo in comment.
---
gcc/tree-ssa-sccvn.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index
The following adjusts move_early_exit_stmts to track the last seen
VUSE instead of getting it from the last store which could be a PHI
where gimple_vuse doesn't work.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/113902
* tree-vect-loop.cc
= LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo);
> += LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> + || LOOP_VINFO_EARLY_BREAKS (loop_vinfo);
>/* The minimum number of iterations performed by the epilogue. This
> is 1 when peeling for gaps because we always need a final scalar
> iteration. */
>
>
>
>
>
The following fixes a missing add to the accumulated offset when
adjusting an ARRAY_REF op for value-ranges applied by
get_ref_base_and_extent.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/113898
* tree-ssa-sccvn.cc
__CHAR_BIT__ + 1 \
> + - __builtin_clzll (~msb)); \
> if (BIL_TYPE_SIZE > DI##_BITS && n > DI##_BITS) \
> { \
> iv = msb >> (n - DI##_BITS
> -#if SIZE_MAX <= INT_MAX
> +#if SIZE_MAX <= UINT_MAX
> # define GCC_PRISZ ""
> # define fmt_size_t unsigned int
> -#elif SIZE_MAX <= LONG_MAX
> +#elif SIZE_MAX <= ULONG_MAX
> # define GCC_PRISZ HOST_LONG_FORMAT
> # define fmt_size_t unsigned long int
On Mon, 12 Feb 2024, Thomas Schwinge wrote:
> Hi!
>
> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote:
> > I've committed this patch
>
> ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691
> "amdgcn: add -march=gfx1030 EXPERIMENTAL".
>
> The RDNA2 ISA variant doesn't support certain
> On 12.02.2024 at 18:47, Jakub Jelinek wrote:
>
> Hi!
>
> handle_cast handles the simple way all narrowing large/huge bitint to
> large/huge bitint conversions and also such widening conversions if we can
> assume that the most significant limb is processed using constant index
> and both
> On 12.02.2024 at 18:14, Jakub Jelinek wrote:
>
> Hi!
>
> This patch depends on the libiberty/vprintf-support.c change.
>
> Ok for trunk if that one is approved?
Ok
> 2024-02-12 Jakub Jelinek
>
>* gengtype.cc (adjust_field_rtx_def): Use HOST_SIZE_T_PRINT_UNSIGNED
>and cast
> On 12.02.2024 at 18:13, Jakub Jelinek wrote:
>
> Hi!
>
> As I wrote earlier and we've discussed on IRC, with the ia32 _BitInt
> enablement patch this testcase FAILs on ia32, there is nothing vectorized in
> there, even with -mavx512{vl,bw,dq}, so no dbgcnt messages are emitted.
>
> The
On Mon, Feb 12, 2024 at 2:35 PM Siddhesh Poyarekar wrote:
>
> On 2024-02-12 08:16, Martin Jambor wrote:
> >> This probably ties in somewhat with an idea David Malcolm had riffed on
> >> with me earlier, of caching files for diagnostics. If we could unify
> >> file accesses somehow, we could make
When we use get_ref_base_and_extent during VN and that ends up using
global ranges to restrict the range of a ref we have to take care
of not using the same expression in the hashtable as for a ref that
could not use that global range. The following attempts to ensure
this by applying similar
This makes sure to elide degenerate virtual PHIs when moving stores
across early exits.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
Richard.
PR tree-optimization/113863
* tree-vect-data-refs.cc (vect_analyze_early_break_dependences):
Record crossed
r me, as for you, it works for x86_64-linux-gnu:
>
> https://gcc.gnu.org/pipermail/gcc-testresults/2024-February/807609.html
>
> I hope this helps.
>
> Kind regards,
> Toon Moene.
>
> On 2/9/24 11:26, Richard Biener wrote:
> > The following allows a base term to be d
On Sat, Feb 10, 2024 at 1:55 PM Anbazhagan, Karthiban
wrote:
>
> [Public]
>
>
> Hi all,
>
>
>
> PFA, the patch that enables support for the next generation AMD Zen5 CPU via
> -march=znver5 with basic znver5 scheduler Model.
>
> We may update the scheduler model going forward.
>
>
>
> Good for
> On 10.02.2024 at 11:03, Jakub Jelinek wrote:
>
> Hi!
>
> torture/bitint-37.c test FAILed on i686-linux e.g. on
> signed _BitInt(575) + unsigned _BitInt(575) -> signed _BitInt(575)
> __builtin_add_overflow. With 64-bit limbs, we use 4 .UADDC calls in
> the IL, 2 in a loop (which handles
> On 10.02.2024 at 10:56, Jakub Jelinek wrote:
>
> Hi!
>
> The ia32 _BitInt support revealed a bug in floatbitint?d.c.
> As can be even guessed from how the code is written in the loop,
> the intention was to set inexact to non-zero whenever the remainder
> after division wasn't zero, but
> On 10.02.2024 at 10:50, Jakub Jelinek wrote:
>
> Hi!
>
> I've tried last night to enable _BitInt support for i?86-linux, and
> a few spots in libgcc emitted -Wshift-count-overflow warnings and clearly
> didn't do what it was supposed to do.
>
> Fixed thusly, bootstrapped/regtested on
> On 10.02.2024 at 10:41, Jakub Jelinek wrote:
>
> Hi!
>
> In the previous patch I haven't touched the gcc diagnostic routines,
> using HOST_SIZE_T_PRINT* for those is obviously undesirable because we
> want the strings to be translatable. We already have %w[diox] for
> HOST_WIDE_INT
> On 10.02.2024 at 10:46, Jakub Jelinek wrote:
>
> Hi!
>
> As the following testcases show, the gimple-low outlining of assume
> magic functions handled volatile automatic vars (including
> parameters/results) like non-volatile ones except it copied volatile
> to the new PARM_DECL, which
> On 10.02.2024 at 10:39, Jakub Jelinek wrote:
>
> Hi!
>
> I went through suspicious %l in format strings of *printf family functions
> combined with casts to (long) or (unsigned long) and tried to find out the
> types of the original expressions that were cast.
> Quite a few had size_t
ree-optimization/113783 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2" } */
> +/* { dg-additional-options "-mavx512f" { target i?86-*-* x86_64-*-* } } */
> +
> +int i;
> +
> +#if __BITINT_MAXWIDTH__ >= 246
> +void
> +
On Thu, Feb 1, 2024 at 4:26 PM Jakub Jelinek wrote:
>
> On Thu, Feb 01, 2024 at 03:55:51PM +0100, Jakub Jelinek wrote:
> > No, besides the formatting being incorrect both in ChangeLog and in the
> > patch, this pessimizes ILP32 hosts unnecessarily.
>
> So like this instead?
OK.
Thanks,
Richard.
dg/vect/tsvc/vect-tsvc-s000.c (test for excess errors)
>
>
> Kind regards,
> Torbjörn
>
> On 2023-05-24 11:02, Richard Biener via Gcc-patches wrote:
> > On Wed, May 24, 2023 at 7:17 AM Alexandre Oliva via Gcc-patches
> > wrote:
> >>
> >>
> >