> On 22.05.2024 at 17:30, Uros Bizjak wrote:
>
> On Wed, May 22, 2024 at 5:15 PM Roger Sayle
> wrote:
>>
>> This single line patch fixes a strange quirk/glitch in i386's rtx_costs,
>> which considers an instruction loading a 64-bit constant to be significantly
>> cheaper than loading a 32
On Tue, 21 May 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > The following avoids splitting store dataref groups during SLP
> > discovery but instead forces (eventually single-lane) consecutive
> > lane SLP discovery for all lanes of the group, creating V
On Mon, May 20, 2024 at 1:50 PM Tamar Christina wrote:
>
> Hi Pan,
>
> > -Original Message-
> > From: pan2...@intel.com
> > Sent: Monday, May 20, 2024 12:01 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> > ; richard.guent...@gmail.
On Sun, May 19, 2024 at 8:37 AM wrote:
>
> From: Pan Li
>
> This patch would like to support the branchless form for unsigned
> SAT_ADD when leveraging __builtin_add_overflow. For example, as below:
>
> uint64_t sat_add_u(uint64_t x, uint64_t y)
> {
> uint64_t ret;
> uint64_t overflow = __built
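The quoted example is cut off, so the following is only a minimal sketch of how a branchless unsigned saturating add can be written on top of __builtin_add_overflow. The function name sat_add_u_sketch and the mask construction are assumptions for illustration, not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Saturating unsigned add without a branch: when the addition wraps,
   the mask is all ones and forces the result to UINT64_MAX.  */
uint64_t
sat_add_u_sketch (uint64_t x, uint64_t y)
{
  uint64_t ret;
  bool overflow = __builtin_add_overflow (x, y, &ret);
  /* 0 - 1 wraps to all ones in unsigned arithmetic; 0 - 0 stays zero.  */
  uint64_t mask = (uint64_t) 0 - (uint64_t) overflow;
  return mask | ret;
}
```

When no overflow occurs the mask is zero and the plain sum is returned unchanged.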
On Wed, May 22, 2024 at 3:17 AM wrote:
>
> From: Pan Li
>
> This patch would like to support the __builtin_add_overflow branch form for
> unsigned SAT_ADD. For example as below:
>
> uint64_t
> sat_add (uint64_t x, uint64_t y)
> {
> uint64_t ret;
> return __builtin_add_overflow (x, y, &ret) ?
On Mon, May 20, 2024 at 1:00 PM wrote:
>
> From: Pan Li
>
> There are several match patterns for SAT-related cases, which leads to
> duplicated code checking that dest, op_0 and op_1 have the same tree type
> (a ternary tree type match). Thus, extract one helper function to
> do this and avoid m
The gcc.dg/vect/slp-12a.c case is interesting as we currently split
the 8 store group into lanes 0-5 which we SLP with an unroll factor
of two (on x86-64 with SSE) and the remaining two lanes are using
interleaving vectorization with a final unroll factor of four. Thus
we're using hybrid SLP withi
The following avoids splitting store dataref groups during SLP
discovery but instead forces (eventually single-lane) consecutive
lane SLP discovery for all lanes of the group, creating VEC_PERM
SLP nodes merging them so the store will always cover the whole group.
With this for example
int x[1024
On Tue, 21 May 2024, Richard Biener wrote:
> The gcc.dg/vect/slp-12a.c case is interesting as we currently split
> the 8 store group into lanes 0-5 which we SLP with an unroll factor
> of two (on x86-64 with SSE) and the remaining two lanes are using
> interleaving vectorization
When sinking code closer to its uses we already try to minimize the
distance we move by inserting at the start of the basic-block. The
following makes sure to sink closest to the control dependence
check of the region we want to sink to as well as make sure to
ignore control dependences that are o
On Wed, 22 May 2024, Richard Sandiford wrote:
> Richard Sandiford writes:
> > Richard Biener writes:
> >> When change_vec_perm_layout runs into a permute combining two
> >> nodes where one is invariant and one internal the partition of
> >> one input can be
The following fixes a reported typo.
Pushed.
* doc/invoke.texi (C++ Modules): Fix typo.
---
gcc/doc/invoke.texi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 218901c0b20..0625a5ede6f 100644
--- a/gcc/doc/invoke.texi
++
On Wed, May 22, 2024 at 9:21 AM Roger Sayle wrote:
>
>
> A number of testcases currently fail on nvptx with the ICE:
>
> during RTL pass: final
> openmp-simd-2.c: In function 'foo':
> openmp-simd-2.c:28:1: internal compiler error: in get_personality_function,
> at expr.cc:14037
>28 | }
>
On Wed, May 22, 2024 at 3:58 AM liuhongt wrote:
>
> According to the IEEE standard, for conversions from floating point to
> integer, when a NaN or infinite operand cannot be represented in the
> destination format and this cannot otherwise be indicated, the invalid
> operation exception shall be sign
On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote:
>
> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote:
> > Thanks for the comments and suggestions.
> >
> > > On May 15, 2024, at 10:00, David Malcolm
> > > wrote:
> > >
> > > O
an-tree-dump-not "\.ASAN_CHECK " "asan1" } } */
> +
> +#ifdef __x86_64__
> +#define SEG __seg_gs
> +#else
> +#define SEG __seg_fs
> +#endif
> +
> +extern struct S { _Bool b; } s;
> +void bar (void);
> +
> +void
> +foo (void)
> +{
> + if (*(v
ions "-O3 -fno-tree-fre -fno-tree-dominator-opts
> -fno-tree-loop-im" } */
> +
> +int a, b, c, d;
> +signed char e[1] = { 1 };
> +
> +int
> +main ()
> +{
> + for (a = 0; a < 3; a++)
> +for (b = 0; b < 2; b++)
> + c = e[0] = e[0] ^ d;
> + if (!c)
> +__builtin_abort ();
> + return 0;
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
When change_vec_perm_layout runs into a permute combining two
nodes where one is invariant and one internal the partition of
one input can be -1 but the other might not be. The following
supports this case by simply ignoring inputs with input partition -1.
I'm not sure this is correct but it avoid
On Tue, May 21, 2024 at 6:21 PM Jeff Law wrote:
>
>
>
> On 5/21/24 8:02 AM, Paul Koning wrote:
> >
> >
> >> On May 21, 2024, at 9:57 AM, Jeff Law wrote:
> >>
> >>
> >>
> >> On 5/21/24 12:05 AM, Richard Biener via G
On Tue, May 21, 2024 at 3:35 PM Andi Kleen wrote:
>
> > I can't see how this triggers on the IL above, the loop should have
> > ignored both the return and the clobber and when recursing to
> > the predecessor stop before the above check when running into the
> > call?
>
> Yes, I tracked that down
SLP permute nodes can end up without a SLP_REPRESENTATIVE now,
the following avoids touching it in this case in vect_schedule_slp_node.
* tree-vect-slp.cc (vect_schedule_slp_node): Avoid looking
at SLP_REPRESENTATIVE for VEC_PERM nodes.
---
gcc/tree-vect-slp.cc | 28 ++
The following plugs one hole where we require a VEC_PERM node
representative unnecessarily. This is for vect_check_store_rhs
which looks at the RHS and checks whether a constant can be
native encoded. The fix is to guard that with vect_constant_def
additionally and making vect_is_simple_use forgi
The following fixes the omission of const-pool included in NONLOCAL.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/115137
* tree-ssa-structalias.cc (pt_solution_includes_const_pool): NONLOCAL
also includes constant pool entries.
On Tue, May 21, 2024 at 11:03 AM Richard Sandiford
wrote:
>
> While reviewing Andrew's fix for PR114843, it seemed like it would
> be convenient to have a HARD_REG_SET of EH_RETURN_DATA_REGNOs.
> This patch adds one and uses it to simplify a couple of use sites.
>
> Tested on aarch64-linux-gnu & x
The following fixes a bug in vop-live get_live_in which was using
NULL to indicate the first processed edge but at the same time
using it for the case the live-in virtual operand cannot be computed.
The following fixes this, avoiding sinking a load to a place where
we'd have to insert virtual PHIs
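The kind of bug described here, one sentinel value carrying two meanings, can be sketched in isolation. This is not the GCC code: merge_incoming_sketch and its arguments are hypothetical, and NULL stands in for "value cannot be computed". The point is that "no edge processed yet" is tracked with a separate flag instead of reusing NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Merge the values arriving over N incoming edges.  A NULL entry means
   the value is unknown on that edge.  Returns the common value, or
   NULL when the edges disagree or any of them is unknown.  */
const int *
merge_incoming_sketch (const int *const *incoming, size_t n)
{
  const int *live = NULL;
  bool first = true;    /* separate flag, not "live == NULL" */

  for (size_t i = 0; i < n; i++)
    {
      if (first)
        {
          live = incoming[i];
          first = false;
        }
      else if (live != incoming[i])
        live = NULL;    /* disagreement, or an unknown input */
    }
  return live;
}
```

Had the loop tested `live == NULL` to detect the first edge, an unknown first edge followed by a known one would wrongly yield the known value.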
On Mon, May 20, 2024 at 6:53 AM Andi Kleen wrote:
>
> On Tue, May 14, 2024 at 04:15:08PM +0200, Richard Biener wrote:
> > On Sun, May 5, 2024 at 8:16 PM Andi Kleen wrote:
> > >
> > > - Give error messages for all causes of non sibling call generation
> > >
On Tue, May 21, 2024 at 4:35 AM Hongtao Liu wrote:
>
> On Wed, May 15, 2024 at 5:24 PM Richard Biener
> wrote:
> >
> > On Wed, May 15, 2024 at 4:15 AM Hongtao Liu wrote:
> > >
> > > On Mon, May 13, 2024 at 3:40 PM Richard Biener
> > > wrote:
On Tue, May 21, 2024 at 12:02 AM Andrew Pinski wrote:
>
> The problem here is the pattern added in r13-1162-g9991d84d2a8435
> assumes that it is well defined to multiply zero_one_valuep by the truncated
> converted integer constant. It is well defined for all types except for
> signed 1bit types.
On Mon, May 20, 2024 at 4:45 PM Gerald Pfeifer wrote:
>
> On Wed, 5 Jul 2023, Joern Rennecke wrote:
> > I haven't worked with these targets in years and can't really do
> > sensible maintenance or reviews of patches for them. I am currently
> > working on optimizations for other ports like RISC-V.
On Mon, May 20, 2024 at 11:37 PM Andrew Pinski (QUIC)
wrote:
>
> > -Original Message-
> > From: Richard Biener
> > Sent: Sunday, May 19, 2024 11:55 AM
> > To: Andrew Pinski (QUIC)
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] PHIOPT:
> On 19.05.2024 at 01:12, Andrew Pinski wrote:
>
> The problem here is even if last_and_only_stmt returns a statement,
> the bb might still contain a phi node which defines a ssa name
> which is used in that statement so we need to add a check to make sure
> that the phi nodes are empty for
On Fri, 17 May 2024, Manolis Tsamis wrote:
> On Fri, May 17, 2024 at 12:22 PM Richard Biener wrote:
> >
> > On Fri, 17 May 2024, Manolis Tsamis wrote:
> >
> > > Hi Richard,
> > >
> > > While I was re-testing the latest version of this pat
On Fri, May 17, 2024 at 11:56 AM Tamar Christina
wrote:
>
> > -Original Message-
> > From: Richard Biener
> > Sent: Friday, May 17, 2024 10:46 AM
> > To: Tamar Christina
> > Cc: Victor Do Nascimento ; gcc-
> > patc...@gcc.gnu.org; Richard Sandi
On Fri, May 17, 2024 at 11:05 AM Tamar Christina
wrote:
>
> > -Original Message-
> > From: Richard Biener
> > Sent: Friday, May 17, 2024 6:51 AM
> > To: Victor Do Nascimento
> > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford ;
> > Richard Earnshaw
-fwrapv because then
we do _not_ perform this premature optimization. Without -fwrapv
the optimization is valid but as you note we do not perform it
consistently - otherwise we wouldn't regress.
Richard.
> Thanks,
> Manolis
>
>
>
> On Thu, May 16, 2024 at 11:15 AM Richard Bie
view_converted_memref_p was checking the reference type against the
pointer type of the offset operand rather than its pointed-to type
which leads to all refs being subject to view-convert treatment
in get_alias_set causing numerous testsuite fails but with its
new uses from r15-512-g9b7cad5884f21c
The ptr-vs-ptr compare folding using points-to info was missing a
check for const_pool being included in the escaped solution. The
following fixes that, fixing the observed execute FAIL of
experimental/functional/searchers.cc
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
On Thu, May 16, 2024 at 11:19 PM Tamar Christina
wrote:
>
> Hi,
>
> > -Original Message-
> > From: Victor Do Nascimento
> > Sent: Thursday, May 16, 2024 2:57 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford ; Richard Earnshaw
> > ; Victor Do Nascimento
> >
> > Subject: [PATCH
On Thu, May 16, 2024 at 4:40 PM Victor Do Nascimento
wrote:
>
> From: Victor Do Nascimento
>
> At present, the compiler offers the `{u|s|us}dot_prod_optab' direct
> optabs for dealing with vectorizable dot product code sequences. The
> consequence of using a direct optab for this is that backend
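For readers unfamiliar with the term, a "vectorizable dot product code sequence" typically looks like the loop below: narrow elements multiplied and accumulated into a wider type. This is a generic sketch; the function name sdot_sketch and the element types are assumptions, not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed dot product: 8-bit inputs, 32-bit accumulator.  */
int32_t
sdot_sketch (const int8_t *a, const int8_t *b, size_t n)
{
  int32_t acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}
```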
On Thu, 16 May 2024, Jeff Law wrote:
>
>
> On 5/16/24 6:03 AM, Richard Biener wrote:
> > Now that we handle pt.null conservatively we can implement the missing
> > tracking of constant pool entries (aka STRING_CST) and handle
> > ptr-ptr compares using points-to i
On Wed, 3 Apr 2024, Chung-Lin Tang wrote:
> Hi Richard, Thomas,
>
> On 2023/10/30 8:46 PM, Richard Biener wrote:
> >>
> >> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> >> 'x' decl itself!) as 'reado
On Fri, Apr 12, 2024 at 5:07 AM HAO CHEN GUI wrote:
>
> Hi,
> This patch adds an optab for __builtin_isfinite. The finite check can be
> implemented on rs6000 by a single instruction. It needs an optab to be
> expanded to the certain sequence of instructions.
>
> The subsequent patches will im
On Fri, Apr 12, 2024 at 10:10 AM HAO CHEN GUI wrote:
>
> Hi,
> This patch adds an optab for __builtin_isnormal. The normal check can be
> implemented on rs6000 by a single instruction. It needs an optab to be
> expanded to the certain sequence of instructions.
>
> The subsequent patches will i
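As a reminder of what the two builtins in this and the preceding message compute at the source level, here is a small sketch. The wrapper names are hypothetical; only __builtin_isfinite and __builtin_isnormal are GCC's.

```c
#include <float.h>

/* Nonzero when X is neither infinite nor NaN.  */
int
is_finite_sketch (double x)
{
  return __builtin_isfinite (x) != 0;
}

/* Nonzero when X is finite, nonzero and not subnormal.  */
int
is_normal_sketch (double x)
{
  return __builtin_isnormal (x) != 0;
}
```

Zero and subnormal values such as DBL_MIN / 2 are finite but not normal.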
On Thu, May 16, 2024 at 8:50 AM Tamar Christina wrote:
>
> > -Original Message-
> > From: pan2...@intel.com
> > Sent: Thursday, May 16, 2024 5:06 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> > ; richard.guent...@gmail.com; Richar
Now that we handle pt.null conservatively we can implement the missing
tracking of constant pool entries (aka STRING_CST) and handle
ptr-ptr compares using points-to info in ptrs_compare_unequal.
Bootstrapped on x86_64-unknown-linux-gnu, (re-)testing in progress.
Richard.
PR tree-optimiz
iscv specific part of course needs riscv approval.
> Pan
>
> -Original Message-
> From: Richard Biener
> Sent: Thursday, May 16, 2024 4:10 PM
> To: Li, Pan2
> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org;
> juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Liu, Hongtao
The following fixes points-to analysis which ignores the fact that
volatile qualified refs can result in any pointer.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
Btw, I noticed this working on ptr-vs-ptr compare simplification
using points-to info and running into gcc.c-torture/e
On Fri, Apr 5, 2024 at 8:14 PM Andrew Pinski wrote:
>
> On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis wrote:
> >
> > If we consider code like:
> >
> > if (bar1 == x)
> > return foo();
> > if (bar2 != y)
> > return foo();
> > return 0;
> >
> > We would like the ifcombine pa
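The quoted message is cut off before it says what ifcombine should do. Assuming the goal is to merge the two tests into one condition, the source-level equivalent would look like the sketch below. The body of foo and the function names separate_tests and combined_test are hypothetical.

```c
static int
foo (void)
{
  return 42;
}

/* The shape from the quoted message: two tests, same action.  */
int
separate_tests (int bar1, int x, int bar2, int y)
{
  if (bar1 == x)
    return foo ();
  if (bar2 != y)
    return foo ();
  return 0;
}

/* The same logic with both conditions combined into one test.  */
int
combined_test (int bar1, int x, int bar2, int y)
{
  if (bar1 == x || bar2 != y)
    return foo ();
  return 0;
}
```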
On Sun, May 12, 2024 at 3:40 PM Peter Damianov wrote:
>
> Currently, commands like:
> gcc -o file.c -lm
> will delete the user's code.
>
> This patch makes the linker write executables to a temp file, and then renames
> the temp file if successful. This fixes the case above, but has limitations.
>
On Tue, May 14, 2024 at 10:58 AM Manolis Tsamis wrote:
>
> New patch with the requested changes can be found below.
>
> I don't know how much this affects SCEV, but I do believe that we
> should incorporate this change somehow. I've seen various cases of
> suboptimal address calculation codegen th
On Wed, May 15, 2024 at 1:36 PM Li, Pan2 wrote:
>
> > LGTM but you'll need an OK from Richard,
> > Thanks for working on this!
>
> Thanks Tamar for help and coaching, let's wait for Richard for a while 😊!
OK.
Thanks for the patience,
Richard.
> Pan
>
> -Original Message-
> From: Tamar Chris
DSE currently gives up when the path we analyze forks. This leads
to multiple missed dead store elimination PRs. The following fixes
this by recursing for each path and maintaining the visited bitmap
to avoid visiting CFG re-merges multiple times. The overall cost
is still limited by the same bo
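The approach described above (recurse into every successor at a fork, keep a visited set so re-merged blocks are walked once, and stop when a budget runs out) can be sketched on a toy graph. This is not the DSE code: the CFG, the names and the budget mechanism are assumptions for illustration.

```c
#include <stdbool.h>

#define N_BLOCKS 6    /* hypothetical tiny CFG */

/* succ[b] lists up to two successors of block b; -1 means none.  */
static const int succ[N_BLOCKS][2] = {
  { 1, 2 },     /* block 0 forks */
  { 3, -1 },
  { 3, -1 },    /* blocks 1 and 2 re-merge at 3 */
  { 4, 5 },     /* block 3 forks again */
  { -1, -1 },
  { -1, -1 },
};

/* Walk all paths from BLOCK.  Each block is processed at most once and
   each processed block consumes one unit of *BUDGET.  Returns false if
   the budget ran out before the walk finished.  */
bool
walk_paths_sketch (int block, bool visited[N_BLOCKS],
                   int *budget, int *n_visited)
{
  if (visited[block])
    return true;        /* reached again through a CFG merge */
  if (*budget == 0)
    return false;       /* give up, as a bounded walk must */
  --*budget;
  visited[block] = true;
  ++*n_visited;
  for (int i = 0; i < 2; i++)
    if (succ[block][i] >= 0
        && !walk_paths_sketch (succ[block][i], visited, budget, n_visited))
      return false;
  return true;
}
```

Block 3 is reachable through both 1 and 2, but the visited array ensures it and everything below it are walked only once.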
On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote:
>
> Hi,
>
> In ix86_override_options_after_change, calls to ix86_default_align
> and ix86_recompute_optlev_based_flags will cause mismatched target
> opt_set when doing cl_optimization_restore. Move them back to
> ix86_option_override_internal to
On Wed, 15 May 2024, Jakub Jelinek wrote:
> On Wed, May 15, 2024 at 01:41:04PM +0200, Richard Biener wrote:
> > PR middle-end/111422
> > * cfgexpand.cc (add_scope_conflicts_2): Handle PHIs
> > by recursing to their arguments.
> > ---
The gcc.c-torture/execute/pr111422.c testcase after installing the
sink pass improvement reveals that we also need to handle
_65 = &g + _58; _44 = &g + _43;
# _59 = PHI <_65, _44>
*_59 = 8;
g = {v} {CLOBBER(eos)};
...
n[0] = &f;
*_59 = 8;
g = {v} {CLOBBER(eos)};
where we fail
On Wed, May 15, 2024 at 12:29 PM Tamar Christina
wrote:
>
> Hi All,
>
> Some Neoverse Software Optimization Guides (SWoG) have a clause that states
> that for predicated operations that also produce a predicate it is preferred
> that the codegen should use a different register for the destination t
and then invoke
'autoconf' from each directory.
At least that's how I do it. But my question was whether upstream
libtool has your fix or
whether this is a downstream patch against libtool.m4 which we need to carry.
Richard.
>
> From: Richard Biener
> Sent:
The following removes the profile based heuristic limiting sinking
and instead uses post-dominators to avoid sinking to places that
are executed under the same conditions as the earlier location which
the profile based heuristic should have guaranteed as well.
To avoid regressing this moves the em
time since we updated libtool, is this fixed in libtool
upstream in the
same way? You are missing a ChangeLog entry which should indicate which
files were just re-generated and which ones you edited (and what part).
Richard.
> ____
> From: Richard Biener
> Sent:
On Wed, May 15, 2024 at 4:15 AM Hongtao Liu wrote:
>
> On Mon, May 13, 2024 at 3:40 PM Richard Biener
> wrote:
> >
> > On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
> > >
> > > As testcase in the PR, O3 cunrolli may prevent vectorization for the
>
On Tue, May 14, 2024 at 5:52 PM Andrew Pinski wrote:
>
> When I was checking to make sure that all of the bugs dealing
> with the case where gimple_can_duplicate_bb_p would return false were fixed,
> I noticed that the code which was checking if a call statement was
> returns_twice was checking a
On Tue, May 14, 2024 at 10:27 PM trcrsired wrote:
>
> From: trcrsired
>
> When building native GCC for the x86_64-w64-mingw32 host, the compiler copies
> its library DLLs to the `bin` directory. However, in the case of a multilib
> configuration, both 32-bit and 64-bit libraries end up in the s
On Tue, 14 May 2024, Qing Zhao wrote:
>
>
> > On May 14, 2024, at 13:14, Richard Biener wrote:
> >
> > On Tue, 14 May 2024, Qing Zhao wrote:
> >
> >>
> >>
> >>> On May 14, 2024, at 10:29, Richard Biener wrote:
> >>>
On Tue, 14 May 2024, Kees Cook wrote:
> On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote:
> > The current major issue with the warning is: the constant index value 4
> > is not in the source code, it’s a compiler generated intermediate value
> > (even though it’s a correct value -:)). Su
On Tue, 14 May 2024, Qing Zhao wrote:
>
>
> > On May 14, 2024, at 10:29, Richard Biener wrote:
> >
[...]
> > It would of course
> > need experimenting since we can end up moving stmts and merging blocks
> > though the linear traces created by jump thre
On Tue, May 14, 2024 at 6:30 PM Andi Kleen wrote:
>
> > Looks generally OK though does this mean people can debug
> > programs using [[gnu::musttail]] only with optimized builds? It
> > seems to me we should try harder to make [[gnu::musttail]] work
> > at -O0 and generally behave the same at all
The following revisits the fix for PR99954 which was observed as
causing missed memcpy recognition and instead using memmove for
non-aliasing copies. While the original fix mitigated bogus
recognition of memcpy the root cause was not properly identified.
The root cause is dr_analyze_indices "faili
On Tue, 14 May 2024, Qing Zhao wrote:
>
>
> > On May 14, 2024, at 09:08, Richard Biener wrote:
> >
> > On Mon, 13 May 2024, Qing Zhao wrote:
> >
> >> -Warray-bounds is an important option to enable Linux kernel to keep
> >> the
On Sat, May 4, 2024 at 5:06 PM Ben Boeckel wrote:
>
> The initial P1689 patches were written in 2019, and code moving around
> over time ended up introducing a `struct` keyword to the
> implementation of `cpp_finish`. Remove it to match the rest of the file
> and its declaration in th
On Sun, May 5, 2024 at 8:16 PM Andi Kleen wrote:
>
> gcc/ChangeLog:
>
> * doc/extend.texi: Document [[musttail]]
> ---
> gcc/doc/extend.texi | 22 --
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e
On Sun, May 5, 2024 at 8:16 PM Andi Kleen wrote:
>
> - Give error messages for all causes of non sibling call generation
> - Don't override choices of other non sibling call checks with
> must tail. This causes ICEs. The must tail attribute now only
> overrides flag_optimize_sibling_calls locally.
On Wed, May 8, 2024 at 9:37 PM Iain Sandoe wrote:
>
> Hi Folks,
>
> I’d like to land a viable solution to this issue if possible, (it is a show-
> stopper for the aarch64-darwin development branch).
I was looking as to how we handle __builtin_trap (whether we have an
optab for it) - we seem to us
On Mon, May 6, 2024 at 4:49 PM wrote:
>
> From: Pan Li
>
> This patch depends on below scalar enabling patch:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html
>
> For vectorize, we leverage the existing vect pattern recog to find
> the pattern similar to scalar and let the vecto
On Mon, May 6, 2024 at 4:48 PM wrote:
>
> From: Pan Li
>
> This patch would like to add the middle-end representation for
> saturating add, i.e. set the result of the add to the max on overflow.
> It will match a pattern similar to the one below.
>
> SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) <
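The pattern in the quoted message is cut off after the `<`. The sketch below uses the usual wrap-around check, comparing the sum against one operand, which is an assumption here and not necessarily what the patch matches. The function name is hypothetical.

```c
#include <stdint.h>

/* Unsigned saturating add using only arithmetic and a comparison:
   if x + y wrapped, the sum is smaller than x, and OR-ing with an
   all-ones mask yields UINT64_MAX.  */
uint64_t
sat_add_arith_sketch (uint64_t x, uint64_t y)
{
  uint64_t sum = x + y;                       /* wraps modulo 2^64 */
  uint64_t wrapped = (uint64_t) (sum < x);    /* 1 if it wrapped */
  return sum | ((uint64_t) 0 - wrapped);      /* mask is 0 or all ones */
}
```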
+
> +void sparx5_set (int *ptr, struct nums *sg, int index)
> +{
> + int *val = &sg->vals[index]; /* { dg-bogus "is above array bounds" } */
> +
> + assign(0,ptr, index);
> + assign(*val, ptr, index);
> +}
> diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
> index fa61ba9512b7..9f338dd4d54d 100644
> --- a/gcc/tree-ssa-threadupdate.cc
> +++ b/gcc/tree-ssa-threadupdate.cc
> @@ -2371,6 +2371,17 @@ back_jt_path_registry::adjust_paths_after_duplication
> (unsigned curr_path_num)
> }
> }
>
> +/* Set all the stmts in the basic block BB as IS_SPLITTED. */
> +
> +static void
> +set_stmts_in_bb_is_splitted (basic_block bb)
> +{
> + gimple_stmt_iterator gsi;
> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +gimple_set_is_splitted (gsi_stmt (gsi), true);
> + return;
> +}
> +
> /* Duplicates a jump-thread path of N_REGION basic blocks.
> The ENTRY edge is redirected to the duplicate of the region.
>
> @@ -2418,6 +2429,10 @@ back_jt_path_registry::duplicate_thread_path (edge
> entry,
>basic_block *region_copy = XNEWVEC (basic_block, n_region);
>copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
> split_edge_bb_loc (entry), false);
> + /* Mark all the stmts in both original and copied basic blocks
> + as IS_SPLITTED. */
> + set_stmts_in_bb_is_splitted (*region);
> + set_stmts_in_bb_is_splitted (*region_copy);
>
>/* Fix up: copy_bbs redirects all edges pointing to copied blocks. The
> following code ensures that all the edges exiting the jump-thread path
> are
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Mon, May 13, 2024 at 8:28 PM Jan-Benedict Glaw wrote:
>
> On Mon, 2024-05-13 20:19:42 +0200, Jan-Benedict Glaw
> wrote:
> > On Tue, 2024-04-30 17:24:15 -0400, Andrew MacLeod
> > wrote:
> > > Bootstrapped on x86_64-pc-linux-gnu with no regressions. pushed.
> >
> > Starting with this patch (
_ref1))
> - != TYPE_MAIN_VARIANT (TREE_TYPE (end_struct_ref2)))
> + && same_type_for_tbaa (TREE_TYPE (end_struct_ref1),
> + TREE_TYPE (end_struct_ref2)) != 1)
> return flags | ACCESS_PATH;
>
>/* Now compare all handled components of the access path.
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
ctor_element_bits (ret_type);
> + unsigned int arg_elt_bits = vector_element_bits (arg_type);
> + if (ret_elt_bits < arg_elt_bits)
> +modifier = NARROW;
> + else if (ret_elt_bits > arg_elt_bits)
> +modifier = WIDEN;
> +
> + if (((code == FIX_TRUNC_EXPR &&
On Mon, May 13, 2024 at 4:14 PM Robin Dapp wrote:
>
> > What happens if we simply remove all of the force_reg here?
>
> On x86 I bootstrapped and tested the attached without fallout
> (gcc188, so it's no avx512-native machine and therefore limited
> coverage). riscv regtest is unchanged.
> For aa
When enabling single-lane SLP and not splitting groups the fix for
PR60276 is no longer effective since it for unknown reason exempted
pure SLP. The following removes this exemption, making
gcc.dg/vect/pr60276.c PASS even with --param vect-single-lane-slp=1
Bootstrapped and tested on x86_64-unkno
The following refactors a bit how we perform SLP reduction group
discovery possibly making it easier to have multiple reduction
groups later, esp. with single-lane SLP.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
* tree-vect-slp.cc (vect_analyze_slp_instance): Remove
On Mon, 13 May 2024, Tamar Christina wrote:
> > -Original Message-
> > From: Richard Biener
> > Sent: Friday, May 10, 2024 2:07 PM
> > To: Richard Biener
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] Allow patterns in SLP reductions
>
struct S) { z >> B, (T) z };
> +}
> +
> +struct S
> +bar (T x)
> +{
> + W z = (W) x + 132;
> + return (struct S) { z >> B, (T) z };
> +}
> +
> +struct S
> +baz (T x, unsigned short y)
> +{
> + W z = (W) x + y;
> + return (struct S) { z >> B, (T) z };
> +}
> +
> +struct S
> +qux (unsigned short x, T y)
> +{
> + W z = (W) x + y;
> + return (struct S) { z >> B, (T) z };
> +}
> +
> +struct S
> +corge (T x, T y)
> +{
> + T w = x + y;
> + W z = (W) x + y;
> + return (struct S) { z >> B, w };
> +}
> +
> +struct S
> +garple (T x, T y)
> +{
> + W z = (W) x + y;
> + T w = x + y;
> + return (struct S) { z >> B, w };
> +}
> +
> +/* { dg-final { scan-tree-dump-times "ADD_OVERFLOW" 6 "widening_mul" {
> target { i?86-*-* x86_64-*-* } } } } */
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Mon, May 13, 2024 at 8:18 AM Robin Dapp wrote:
>
> > How does this make a difference in the end? I'd expect say forwprop to
> > fix things?
>
> In general we try to only add the masking "boilerplate" of our
> instructions at split time so fwprop, combine et al. can do their
> work uninhibited
On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
>
> As testcase in the PR, O3 cunrolli may prevent vectorization for the
> innermost loop and increase register pressure.
> The patch removes the 1/3 reduction of unr_insn for innermost loop for UL_ALL.
> ul != UR_ALL is needed since some small loop
On Mon, May 13, 2024 at 3:39 AM Kewen.Lin wrote:
>
> Hi Joseph and Richi,
>
> Thanks for the suggestions and comments!
>
> on 2024/5/10 14:31, Richard Biener wrote:
> > On Thu, May 9, 2024 at 9:12 PM Joseph Myers wrote:
> >>
> >> On Wed, 8 May 2024, Kewe
When loop distribution releases a PHI node of the original IL it
can end up clobbering memory that's re-used when it upon releasing
its RDG resets all stmt UIDs back to -1, even those that got released.
The fix is to avoid resetting UIDs based on stmts in the RDG but
instead reset only those still
On Fri, May 10, 2024 at 3:18 PM Robin Dapp wrote:
>
> Hi,
>
> this only forces the first comparison operator into a register if it is
> not already suitable.
>
> Bootstrap and regtest is running on x86 and aarch64, successful on p10.
> Regtested on riscv.
How does this make a difference in the en
On Fri, Mar 1, 2024 at 10:21 AM Richard Biener wrote:
>
> The following removes the over-broad rejection of patterns for SLP
> reductions which is done by removing them from LOOP_VINFO_REDUCTIONS
> during pattern detection. That's also insufficient in case the
> pattern
On Fri, May 10, 2024 at 12:55 PM Di Zhao OS
wrote:
>
> This patch tries to fix pr114760 by checking for the
> variants explicitly. When recognizing bit counting idiom,
> include pattern "x * 2" for "x << 1", and "x / 2" for
> "x >> 1" (given x is unsigned).
>
> Bootstrapped and tested on x86_64-li
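To make the variants concrete: for unsigned x, a loop written with `x / 2` or `x * 2` computes the same thing as one written with shifts. The exact loop shapes from the PR are not shown in the message, so the functions below are illustrative assumptions only.

```c
#include <stdint.h>

/* Number of bits needed to represent X, written with a shift.  */
unsigned
bit_length_shift (uint32_t x)
{
  unsigned n = 0;
  while (x)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* The same loop with "x / 2" in place of "x >> 1".  */
unsigned
bit_length_div (uint32_t x)
{
  unsigned n = 0;
  while (x)
    {
      x /= 2;
      n++;
    }
  return n;
}

/* Leading zero count with "x * 2" in place of "x << 1".
   X must be nonzero, otherwise the loop does not terminate.  */
unsigned
leading_zeros_mul (uint32_t x)
{
  unsigned n = 0;
  while (!(x & UINT32_C (0x80000000)))
    {
      x *= 2;
      n++;
    }
  return n;
}
```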
On Fri, May 10, 2024 at 12:54 PM Segher Boessenkool
wrote:
>
> On Fri, May 10, 2024 at 12:19:35PM +0200, Richard Biener wrote:
> > On Fri, May 10, 2024 at 11:06 AM Segher Boessenkool
> > wrote:
> > > *All* code using a cost will have to be inspected and possibly adju
On Fri, May 10, 2024 at 11:24 AM Aldy Hernandez wrote:
>
> There are various calls into fold_range() that have the wrong type
> associated with the range temporary used to hold the result. This
> used to work, because we could store either integers or pointers in a
> Value_Range, but is no longer
On Fri, May 10, 2024 at 11:06 AM Segher Boessenkool
wrote:
>
> On Fri, May 10, 2024 at 04:50:10PM +0800, HAO CHEN GUI wrote:
> > Hi Richard,
> > Thanks for your comments.
> >
> > On 2024/5/10 15:16, Richard Biener wrote:
> > > But if targets return sth <
On Fri, May 10, 2024 at 10:54 AM John Paul Adrian Glaubitz
wrote:
>
> Hello Rainer,
>
> On Fri, 2024-05-10 at 10:20 +0200, Rainer Orth wrote:
> > > > Support for Solaris 11.3 had already been obsoleted in GCC 13. However,
> > > > since the only Solaris system in the cfarm was running 11.3, I've k
On Fri, May 10, 2024 at 4:25 AM HAO CHEN GUI wrote:
>
> Hi,
> The cost returned from set_src_cost might be zero. Zero for
> pattern_cost means unknown cost. So the regularization converts the zero
> to COSTS_N_INSNS (1).
>
>// pattern_cost
>cost = set_src_cost (SET_SRC (set), GET_MODE (SE
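The regularization described in the message can be sketched on its own. The helper name regularize_cost_sketch is hypothetical; COSTS_N_INSNS is redefined locally so the block is self-contained, using the scaling factor of 4 that GCC's rtl.h uses.

```c
/* Local copy of the scaling macro, for self-containment.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* A zero (or negative) raw cost is treated as "unknown" and replaced
   by the cost of one instruction; anything else passes through.  */
int
regularize_cost_sketch (int raw_cost)
{
  return raw_cost > 0 ? raw_cost : COSTS_N_INSNS (1);
}
```

The consequence discussed in the thread follows directly: a target returning 0 for a source expression ends up with the same cost as one returning COSTS_N_INSNS (1).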
On Thu, 9 May 2024, Jakub Jelinek wrote:
> On Thu, May 09, 2024 at 12:14:43PM +0200, Jakub Jelinek wrote:
> > On Thu, May 09, 2024 at 12:04:38PM +0200, Rainer Orth wrote:
> > > I just noticed that gcc/DATESTAMP wasn't updated yesterday and today,
> > > staying at 20240507.
> >
> > I think it is b