[PATCH 5/5] RISC-V: tree-optimization/65518 - extend fix to SLP

2024-05-24 Thread Richard Biener
This extends the PR65518 workaround to also apply for single-lane SLP. * tree-vect-stmts.cc (get_group_load_store_type): For SLP also check for the PR65518 single-element interleaving case as done in vect_grouped_load_supported. --- gcc/tree-vect-stmts.cc | 17

[PATCH 4/5] Allow optimized SLP reduction epilog with single-lane reductions

2024-05-24 Thread Richard Biener
This extends optimized reduction epilog handling to cover the trivial single-lane SLP reduction case. * tree-vect-loop.cc (vect_create_epilog_for_reduction): Allow direct opcode and shift reduction also for SLP reductions with a single lane. --- gcc/tree-vect-loop.cc | 4

[PATCH 3/5] Reduce single-lane SLP testresult noise

2024-05-24 Thread Richard Biener
The following avoids dumping 'vectorizing stmts using SLP' for single-lane instances since that causes extra testsuite fallout. * tree-vect-slp.cc (vect_schedule_slp): Gate dumping 'vectorizing stmts using SLP' on > 1 lanes. --- gcc/tree-vect-slp.cc | 3 ++- 1 file changed, 2

[PATCH 2/5] Avoid bogus SLP outer loop vectorization

2024-05-24 Thread Richard Biener
This fixes the check for multiple types, which goes wrong, I think, because of bogus pointer IV increments when there are multiple copies of vector stmts in the inner loop. * tree-vect-stmts.cc (vectorizable_load): Avoid outer loop SLP vectorization with multi-copy vector stmts in the

[PATCH 1/5] Do single-lane SLP discovery for reductions

2024-05-24 Thread Richard Biener
This is the second merge proposed from the SLP vectorizer branch. I have again managed without adding and using --param vect-single-lane-slp but instead this provides always enabled functionality. This makes us use SLP reductions (a group of reductions) for the case where the group size is one.

Re: [PATCH] vect: Fix access size alignment assumption [PR115192]

2024-05-24 Thread Richard Biener
On Fri, May 24, 2024 at 2:35 PM Richard Sandiford wrote: > > create_intersect_range_checks checks whether two access ranges > a and b are alias-free using something equivalent to: > > end_a <= start_b || end_b <= start_a > > It has two ways of doing this: a "vanilla" way that calculates > the
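
The alias-free condition quoted above can be written out as a small standalone check. This is only a sketch of the condition itself, assuming half-open ranges [start, end); the function name and the use of uintptr_t are illustrative and are not taken from create_intersect_range_checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: ranges are half-open, [start, end).
   Two ranges are alias-free when one ends at or before the
   point where the other starts.  */
bool ranges_alias_free(uintptr_t start_a, uintptr_t end_a,
                       uintptr_t start_b, uintptr_t end_b)
{
    return end_a <= start_b || end_b <= start_a;
}
```

Adjacent ranges such as [0, 8) and [8, 16) count as alias-free under this test, while [0, 9) and [8, 16) do not.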

Re: [PATCH] tree-ssa-pre.c/1071140(ICE in find_or_generate_expression, at tree-ssa-pre.c:2780): Return NULL_TREE if no equal.

2024-05-24 Thread Richard Biener
On Fri, May 24, 2024 at 1:49 PM Jiawei wrote: > > An ICE bug reported in > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071140. > https://godbolt.org/z/WE9aGYvoo > > Return NULL_TREE when TREE_CODE(op) not equal to SSA_NAME. The assert is on purpose. Can you open a GCC bug for this

Re: [RFC/PATCH] Replace {FLOAT, {, LONG_}DOUBLE}_TYPE_SIZE with new hook

2024-05-24 Thread Richard Biener
On Fri, May 24, 2024 at 12:20 PM Kewen.Lin wrote: > > Hi Joseph and Richi, > > on 2024/5/13 21:18, Joseph Myers wrote: > > On Mon, 13 May 2024, Kewen.Lin wrote: > > > >>> In fact replacing all of X_TYPE_SIZE with a single hook might be > >>> worthwhile > >>> though this removes the "convenient"

[PATCH] Fix gcc.dg/vect/vect-gather-4.c for cascadelake

2024-05-24 Thread Richard Biener
There's not really a good way to test what the testcase wants to test; the following exchanges one dump scan for another (imperfect) one. Pushed. * gcc.dg/vect/vect-gather-4.c: Scan for not vectorizing using SLP. --- gcc/testsuite/gcc.dg/vect/vect-gather-4.c | 2 +- 1 file

Re: [PATCH v2] MATCH: Look through VIEW_CONVERT when folding VEC_PERM_EXPRs.

2024-05-24 Thread Richard Biener
iltin_shufflevector (*a, *b, 0, 5, 2, 7); > + vecu r2 = __builtin_convertvector (r1, vecu); > + vecu r3 = __builtin_shufflevector (r2, r2, 2, 3, 1, 0); > + *c = __builtin_convertvector (r3, veci); > +} > + > +/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 7, 5, 0 }" "fre1" } } */ > +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "fre1" } } */ > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH][v2] tree-optimization/115144 - improve sinking destination choice

2024-05-24 Thread Richard Biener
When sinking code closer to its uses we already try to minimize the distance we move by inserting at the start of the basic-block. The following makes sure to sink closest to the control dependence check of the region we want to sink to as well as make sure to ignore control dependences that are

Re: [PATCH] MATCH: Look through VIEW_CONVERT when folding VEC_PERM_EXPRs.

2024-05-24 Thread Richard Biener
On Fri, 24 May 2024, Manolis Tsamis wrote: > On Fri, May 24, 2024 at 10:46 AM Richard Biener wrote: > > > > On Fri, 24 May 2024, Manolis Tsamis wrote: > > > > > On Fri, May 24, 2024 at 9:31 AM Richard Biener wrote: > > > > > >

Re: [PATCH] MATCH: Look through VIEW_CONVERT when folding VEC_PERM_EXPRs.

2024-05-24 Thread Richard Biener
On Fri, 24 May 2024, Manolis Tsamis wrote: > On Fri, May 24, 2024 at 9:31 AM Richard Biener wrote: > > > > On Wed, 22 May 2024, Manolis Tsamis wrote: > > > > > The match.pd patterns to merge two vector permutes into one fail when a > > > potentially no

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-24 Thread Richard Biener
On Fri, May 24, 2024 at 8:56 AM Richard Biener wrote: > > On Fri, May 24, 2024 at 8:37 AM Li, Pan2 wrote: > > > > Thanks Jeff and Richard for suggestion and reviewing. > > > > Have another try in phiopt to do the convert from PHI to stmt = cond ? a : > > b.

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-24 Thread Richard Biener
n NULL if nothing can be simplified or the resulting simplified value > with parts pushed if EARLY_P was true. Also rejects non allowed tree code > @@ -826,6 +908,9 @@ match_simplify_replacement (basic_block cond_bb, > basic_block middle_bb, > So, given the conditio

Re: [PATCH] MATCH: Look through VIEW_CONVERT when folding VEC_PERM_EXPRs.

2024-05-24 Thread Richard Biener
shufflevector (*a, *b, 0, 5, 2, 7); > + vecu r2 = __builtin_convertvector (r1, vecu); > + vecu r3 = __builtin_shufflevector (r2, r2, 2, 3, 1, 0); > + *c = __builtin_convertvector (r3, veci); > +} > + > +/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 7, 5, 0 }" "fre1" } } */ > +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "fre1" } } */

Re: [PATCH] [RFC] Target-independent store forwarding avoidance. [PR48696] Target-independent store forwarding avoidance.

2024-05-24 Thread Richard Biener
00..cd81aa248fe > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/avoid-store-forwarding-2.c > @@ -0,0 +1,39 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-rtl-avoid_store_forwarding" } */ > + > +typedef union { > +char arr_8[8]; > +int long_value; > +} DataUnion1; > + > +long no_ssll_1 (DataUnion1 *data, char x) > +{ > + data->arr_8[4] = x; > + return data->long_value; > +} > + > +long no_ssll_2 (DataUnion1 *data, char x) > +{ > + data->arr_8[5] = x; > + return data->long_value; > +} > + > +typedef union { > +char arr_8[8]; > +short long_value[4]; > +} DataUnion2; > + > +long no_ssll_3 (DataUnion2 *data, char x) > +{ > + data->arr_8[4] = x; > + return data->long_value[1]; > +} > + > +long no_ssll_4 (DataUnion2 *data, char x) > +{ > + data->arr_8[0] = x; > + return data->long_value[1]; > +} > + > +/* { dg-final { scan-tree-dump-times "Store forwarding detected" 0 } } */ > +/* { dg-final { scan-tree-dump-times "Store forwarding avoided" 0 } } */ > diff --git a/gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c > b/gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c > new file mode 100644 > index 000..3175f882c86 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c > @@ -0,0 +1,31 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-rtl-avoid_store_forwarding" } */ > + > +typedef union { > +char arr_8[8]; > +long long_value; > +} DataUnion; > + > +long ssll_multi_1 (DataUnion **data, char x) > +{ > + (*data)->arr_8[0] = x; > + (*data)->arr_8[2] = x; > + return (*data)->long_value; > +} > + > +long ssll_multi_2 (DataUnion **data, char x) > +{ > + (*data)->arr_8[0] = x; > + (*data)->arr_8[1] = 11; > + return (*data)->long_value; > +} > + > +long ssll_multi_3 (DataUnion **data, char x, short y) > +{ > + (*data)->arr_8[1] = x; > + __builtin_memcpy((*data)->arr_8 + 4, &y, sizeof(short)); > + return (*data)->long_value; > +} > + > +/* { dg-final { scan-tree-dump-times "Store forwardings detected" 3 } } */ > +/* { dg-final { scan-tree-dump-times "Store forwardings avoided" 3 } } */ > diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h > index 29267589eeb..49957ba3373 100644 > --- a/gcc/tree-pass.h > +++ b/gcc/tree-pass.h > @@ -570,6 +570,7 @@ extern rtl_opt_pass *make_pass_rtl_dse3 (gcc::context > *ctxt); > extern rtl_opt_pass *make_pass_rtl_cprop (gcc::context *ctxt); > extern rtl_opt_pass *make_pass_rtl_pre (gcc::context *ctxt); > extern rtl_opt_pass *make_pass_rtl_hoist (gcc::context *ctxt); > +extern rtl_opt_pass *make_pass_rtl_avoid_store_forwarding (gcc::context > *ctxt); > extern rtl_opt_pass *make_pass_rtl_store_motion (gcc::context *ctxt); > extern rtl_opt_pass *make_pass_cse_after_global_opts (gcc::context *ctxt); > extern rtl_opt_pass *make_pass_rtl_ifcvt (gcc::context *ctxt);

Re: [PATCH] Use simple_dce_from_worklist in phiprop

2024-05-24 Thread Richard Biener
On Thu, May 23, 2024 at 10:55 PM Andrew Pinski wrote: > > I noticed that phiprop leaves around phi nodes which > defines a ssa name which is unused. This just adds a > bitmap to mark those ssa names and then calls > simple_dce_from_worklist at the very end to remove > those phi nodes and all of

Re: [C PATCH]: allow aliasing of compatible types derived from enumeral types [PR115157]

2024-05-23 Thread Richard Biener
On Thu, 23 May 2024, Ian Lance Taylor wrote: > On Thu, May 23, 2024 at 2:48 PM Martin Uecker wrote: > > > > Am Donnerstag, dem 23.05.2024 um 14:30 -0700 schrieb Ian Lance Taylor: > > > On Thu, May 23, 2024 at 2:00 PM Joseph Myers wrote: > > > > > > > > On Tue, 21 May 2024, Martin Uecker wrote:

Re: [PATCH] RISC-V: Avoid splitting store dataref groups during SLP discovery

2024-05-23 Thread Richard Biener
On Thu, 23 May 2024, Richard Biener wrote: > The following avoids splitting store dataref groups during SLP > discovery but instead forces (eventually single-lane) consecutive > lane SLP discovery for all lanes of the group, creating VEC_PERM > SLP nodes merging them so the store

[PATCH] RISC-V: Avoid splitting store dataref groups during SLP discovery

2024-05-23 Thread Richard Biener
The following avoids splitting store dataref groups during SLP discovery but instead forces (eventually single-lane) consecutive lane SLP discovery for all lanes of the group, creating VEC_PERM SLP nodes merging them so the store will always cover the whole group. With this for example int

[PATCH] tree-optimization/115197 - fix ICE w/ constant in LC PHI and loop distribution

2024-05-23 Thread Richard Biener
Forgot a check for an SSA name before trying to replace a PHI arg with its current definition. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/115197 * tree-loop-distribution.cc (copy_loop_before): Constant PHI args remain the same.

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-23 Thread Richard Biener
> _2 = phi_cond_6 ? _1 : 255; > return _2; > > } > > -Original Message- > From: Li, Pan2 > Sent: Thursday, May 23, 2024 12:17 PM > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > tamar.christ...@arm.com; pins...@gmail.

Re: [PATCH] .gitattributes: disable crlf translation

2024-05-23 Thread Richard Biener
On Thu, May 23, 2024 at 5:50 AM Peter Damianov wrote: > > By default, git has the "autocrlf" """feature""" enabled. This causes the > files > to have CRLF line endings when checked out on windows, which in the case of > configure, causes confusing errors like: > > ./gcc/configure: line 14:

Re: [V2 PATCH] Don't reduce estimated unrolled size for innermost loop at cunrolli.

2024-05-23 Thread Richard Biener
On Wed, May 22, 2024 at 7:07 AM liuhongt wrote: > > >> Hard to find a default value satisfying all testcases. > >> some require loop unroll with 7 insns increment, some don't want loop > >> unroll w/ 5 insn increment. > >> The original 2/3 reduction happened to meet all those testcases(or the >

Re: [PATCH v4] Match: Add overloaded types_match to avoid code dup [NFC]

2024-05-23 Thread Richard Biener
On Thu, May 23, 2024 at 2:24 AM wrote: > > From: Pan Li > > There are several match patterns for SAT-related cases, so there will be > some duplicated code to check that dest, op_0 and op_1 have the same tree type > (a ternary tree type match). Thus, add an overloaded types_match func to > do this and

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-23 Thread Richard Biener
On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote: > > > > > On May 22, 2024, at 03:38, Richard Biener > > wrote: > > > > On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: > >> > >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao w

[PATCH] tree-optimization/115199 - fix PTA constraint processing for &ANYTHING LHS

2024-05-23 Thread Richard Biener
When processing a &ANYTHING = X constraint we treat it as *ANYTHING = X during constraint processing but then end up recording it as &ANYTHING = X anyway, breaking constraint graph building. This is because we only update the local copy of the LHS and not the constraint itself. Bootstrap and regtest running on

[PATCH] tree-optimization/115138 - ptr-vs-ptr and FUNCTION_DECLs

2024-05-23 Thread Richard Biener
I failed to realize we do not represent FUNCTION_DECLs or LABEL_DECLs in vars explicitly and thus have to compare pt.vars_contains_nonlocal. Bootstrapped and tested with bootstrap-O3 and D to verify the comparison fail is fixed. I'm now doing a regular bootstrap and regtest with the volatile fix

Re: [x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Richard Biener
> Am 22.05.2024 um 17:30 schrieb Uros Bizjak : > > On Wed, May 22, 2024 at 5:15 PM Roger Sayle > wrote: >> >> This single line patch fixes a strange quirk/glitch in i386's rtx_costs, >> which considers an instruction loading a 64-bit constant to be significantly >> cheaper than loading a

Re: [PATCH 3/4] Avoid splitting store dataref groups during SLP discovery

2024-05-22 Thread Richard Biener
On Tue, 21 May 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following avoids splitting store dataref groups during SLP > > discovery but instead forces (eventually single-lane) consecutive > > lane SLP discovery for all lanes of the group, creating V

Re: [PATCH v1 1/2] Match: Support branch form for unsigned SAT_ADD

2024-05-22 Thread Richard Biener
On Mon, May 20, 2024 at 1:50 PM Tamar Christina wrote: > > Hi Pan, > > > -Original Message- > > From: pan2...@intel.com > > Sent: Monday, May 20, 2024 12:01 PM > > To: gcc-patches@gcc.gnu.org > > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina > > ;

Re: [PATCH v1 1/2] Match: Support __builtin_add_overflow for branchless unsigned SAT_ADD

2024-05-22 Thread Richard Biener
On Sun, May 19, 2024 at 8:37 AM wrote: > > From: Pan Li > > This patch would like to support the branchless form for unsigned > SAT_ADD when leverage __builtin_add_overflow. For example as below: > > uint64_t sat_add_u(uint64_t x, uint64_t y) > { > uint64_t ret; > uint64_t overflow =

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-22 Thread Richard Biener
On Wed, May 22, 2024 at 3:17 AM wrote: > > From: Pan Li > > This patch would like to support the __builtin_add_overflow branch form for > unsigned SAT_ADD. For example as below: > > uint64_t > sat_add (uint64_t x, uint64_t y) > { > uint64_t ret; > return __builtin_add_overflow (x, y, &ret) ? -1
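
For readers following the SAT_ADD threads, here is a minimal sketch of the two source forms under discussion. The quoted snippets are cut off, so the function bodies below are a reconstruction under assumptions rather than the exact testcases from the patches: the names sat_add_branch and sat_add_branchless are made up here, and UINT64_MAX stands in for the -1 in the quoted branch form. __builtin_add_overflow is the GCC/Clang built-in that stores the wrapped sum and returns whether the addition overflowed.

```c
#include <stdint.h>

/* Branch form: select the saturated value when the add overflows.  */
uint64_t sat_add_branch(uint64_t x, uint64_t y)
{
    uint64_t ret;
    return __builtin_add_overflow(x, y, &ret) ? UINT64_MAX : ret;
}

/* Branchless form: turn the overflow flag (0 or 1) into a mask of
   all zeros or all ones and OR it into the wrapped sum.  */
uint64_t sat_add_branchless(uint64_t x, uint64_t y)
{
    uint64_t ret;
    uint64_t overflow = __builtin_add_overflow(x, y, &ret);
    return (uint64_t)(-overflow) | ret;
}
```

Both functions return the plain sum when it fits in 64 bits and UINT64_MAX otherwise; the threads are about getting the compiler to recognize each shape as the same saturating add.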

Re: [PATCH v2] Match: Extract integer_types_ternary_match helper to avoid code dup [NFC]

2024-05-22 Thread Richard Biener
On Mon, May 20, 2024 at 1:00 PM wrote: > > From: Pan Li > > There are several match patterns for SAT-related cases, so there will be > some duplicated code to check that dest, op_0 and op_1 have the same tree type > (a ternary tree type match). Thus, extract one helper function to > do this and avoid

[PATCH 2/2][v2] RISC-V: Testsuite updates

2024-05-22 Thread Richard Biener
The gcc.dg/vect/slp-12a.c case is interesting as we currently split the 8 store group into lanes 0-5 which we SLP with an unroll factor of two (on x86-64 with SSE) and the remaining two lanes are using interleaving vectorization with a final unroll factor of four. Thus we're using hybrid SLP

[PATCH 1/2][v2] Avoid splitting store dataref groups during SLP discovery

2024-05-22 Thread Richard Biener
The following avoids splitting store dataref groups during SLP discovery but instead forces (eventually single-lane) consecutive lane SLP discovery for all lanes of the group, creating VEC_PERM SLP nodes merging them so the store will always cover the whole group. With this for example int

Re: [PATCH 4/4] Testsuite updates

2024-05-22 Thread Richard Biener
On Tue, 21 May 2024, Richard Biener wrote: > The gcc.dg/vect/slp-12a.c case is interesting as we currently split > the 8 store group into lanes 0-5 which we SLP with an unroll factor > of two (on x86-64 with SSE) and the remaining two lanes are using > interleaving vectorization

[PATCH] tree-optimization/115144 - improve sinking destination choice

2024-05-22 Thread Richard Biener
When sinking code closer to its uses we already try to minimize the distance we move by inserting at the start of the basic-block. The following makes sure to sink closest to the control dependence check of the region we want to sink to as well as make sure to ignore control dependences that are

Re: [PATCH] Fix mixed input kind permute optimization

2024-05-22 Thread Richard Biener
On Wed, 22 May 2024, Richard Sandiford wrote: > Richard Sandiford writes: > > Richard Biener writes: > >> When change_vec_perm_layout runs into a permute combining two > >> nodes where one is invariant and one internal the partition of > >> one

[PATCH] web/115183 - fix typo in C++ docs

2024-05-22 Thread Richard Biener
The following fixes a reported typo. Pushed. * doc/invoke.texi (C++ Modules): Fix typo. --- gcc/doc/invoke.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 218901c0b20..0625a5ede6f 100644 --- a/gcc/doc/invoke.texi

Re: [PATCH] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Richard Biener
On Wed, May 22, 2024 at 9:21 AM Roger Sayle wrote: > > > A number of testcases currently fail on nvptx with the ICE: > > during RTL pass: final > openmp-simd-2.c: In function 'foo': > openmp-simd-2.c:28:1: internal compiler error: in get_personality_function, > at expr.cc:14037 >28 | } >

Re: [PATCH] Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.

2024-05-22 Thread Richard Biener
On Wed, May 22, 2024 at 3:58 AM liuhongt wrote: > > According to the IEEE standard, for conversions from floating point to > integer, when a NaN or infinite operand cannot be represented in the > destination format and this cannot otherwise be indicated, the invalid > operation exception shall be

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-22 Thread Richard Biener
On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > Thanks for the comments and suggestions. > > > > > On May 15, 2024, at 10:00, David Malcolm > > > wrote: > > > > > > O

Re: [PATCH] ubsan: Use right address space for MEM_REF created for bool/enum sanitization [PR115172]

2024-05-22 Thread Richard Biener
ee-dump-not "\.ASAN_CHECK " "asan1" } } */ > + > +#ifdef __x86_64__ > +#define SEG __seg_gs > +#else > +#define SEG __seg_fs > +#endif > + > +extern struct S { _Bool b; } s; > +void bar (void); > + > +void > +foo (void) > +{ > + if (*(vo

Re: [PATCH] strlen: Fix up !si->full_string_p handling in count_nonzero_bytes_addr [PR115152]

2024-05-22 Thread Richard Biener
ominator-opts > -fno-tree-loop-im" } */ > + > +int a, b, c, d; > +signed char e[1] = { 1 }; > + > +int > +main () > +{ > + for (a = 0; a < 3; a++) > +for (b = 0; b < 2; b++) > + c = e[0] = e[0] ^ d; > + if (!c) > +__builtin_abort (); > + return 0; > +} > > Jakub

[PATCH] Fix mixed input kind permute optimization

2024-05-21 Thread Richard Biener
When change_vec_perm_layout runs into a permute combining two nodes where one is invariant and one internal the partition of one input can be -1 but the other might not be. The following supports this case by simply ignoring inputs with input partition -1. I'm not sure this is correct but it

Re: [committed] PATCH for Re: Stepping down as maintainer for ARC and Epiphany

2024-05-21 Thread Richard Biener
On Tue, May 21, 2024 at 6:21 PM Jeff Law wrote: > > > > On 5/21/24 8:02 AM, Paul Koning wrote: > > > > > >> On May 21, 2024, at 9:57 AM, Jeff Law wrote: > >> > >> > >> > >> On 5/21/24 12:05 AM, Richard Biener via G

Re: [PATCH v5 1/5] Improve must tail in RTL backend

2024-05-21 Thread Richard Biener
On Tue, May 21, 2024 at 3:35 PM Andi Kleen wrote: > > > I can't see how this triggers on the IL above, the loop should have > > ignored both the return and the clobber and when recursing to > > the predecessor stop before the above check when running into the > > call? > > Yes, I tracked that down

[PATCH 4/4] Testsuite updates

2024-05-21 Thread Richard Biener
The gcc.dg/vect/slp-12a.c case is interesting as we currently split the 8 store group into lanes 0-5 which we SLP with an unroll factor of two (on x86-64 with SSE) and the remaining two lanes are using interleaving vectorization with a final unroll factor of four. Thus we're using hybrid SLP

[PATCH 3/4] Avoid splitting store dataref groups during SLP discovery

2024-05-21 Thread Richard Biener
The following avoids splitting store dataref groups during SLP discovery but instead forces (eventually single-lane) consecutive lane SLP discovery for all lanes of the group, creating VEC_PERM SLP nodes merging them so the store will always cover the whole group. With this for example int

[PATCH 2/4] Avoid SLP_REPRESENTATIVE access for VEC_PERM in SLP scheduling

2024-05-21 Thread Richard Biener
SLP permute nodes can end up without a SLP_REPRESENTATIVE now, the following avoids touching it in this case in vect_schedule_slp_node. * tree-vect-slp.cc (vect_schedule_slp_node): Avoid looking at SLP_REPRESENTATIVE for VEC_PERM nodes. --- gcc/tree-vect-slp.cc | 28

[PATCH 1/4] Avoid requiring VEC_PERM represenatives

2024-05-21 Thread Richard Biener
The following plugs one hole where we require a VEC_PERM node representative unnecessarily. This is for vect_check_store_rhs which looks at the RHS and checks whether a constant can be native encoded. The fix is to guard that with vect_constant_def additionally and making vect_is_simple_use

[PATCH] tree-optimization/115137 - more ptr-vs-ptr compare fixes

2024-05-21 Thread Richard Biener
The following fixes the omission of const-pool included in NONLOCAL. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/115137 * tree-ssa-structalias.cc (pt_solution_includes_const_pool): NONLOCAL also includes constant pool entries.

Re: [PATCH] Cache the set of EH_RETURN_DATA_REGNOs

2024-05-21 Thread Richard Biener
On Tue, May 21, 2024 at 11:03 AM Richard Sandiford wrote: > > While reviewing Andrew's fix for PR114843, it seemed like it would > be convenient to have a HARD_REG_SET of EH_RETURN_DATA_REGNOs. > This patch adds one and uses it to simplify a couple of use sites. > > Tested on aarch64-linux-gnu &

[PATCH] tree-optimization/115149 - VOP live and missing PHIs

2024-05-21 Thread Richard Biener
The following fixes a bug in vop-live get_live_in which was using NULL to indicate the first processed edge but at the same time using it for the case the live-in virtual operand cannot be computed. The following fixes this, avoiding sinking a load to a place where we'd have to insert virtual PHIs

Re: [PATCH v5 1/5] Improve must tail in RTL backend

2024-05-21 Thread Richard Biener
On Mon, May 20, 2024 at 6:53 AM Andi Kleen wrote: > > On Tue, May 14, 2024 at 04:15:08PM +0200, Richard Biener wrote: > > On Sun, May 5, 2024 at 8:16 PM Andi Kleen wrote: > > > > > > - Give error messages for all causes of non sibling call generation > > >

Re: [PATCH] Don't reduce estimated unrolled size for innermost loop.

2024-05-21 Thread Richard Biener
On Tue, May 21, 2024 at 4:35 AM Hongtao Liu wrote: > > On Wed, May 15, 2024 at 5:24 PM Richard Biener > wrote: > > > > On Wed, May 15, 2024 at 4:15 AM Hongtao Liu wrote: > > > > > > On Mon, May 13, 2024 at 3:40 PM Richard Biener > > > wrote:

Re: [PATCH] match: Disable `(type)zero_one_valuep*CST` for 1bit signed types [PR115154]

2024-05-21 Thread Richard Biener
On Tue, May 21, 2024 at 12:02 AM Andrew Pinski wrote: > > The problem here is the pattern added in r13-1162-g9991d84d2a8435 > assumes that it is well defined to multiply zero_one_valuep by the truncated > converted integer constant. It is well defined for all types except for > signed 1bit

Re: [committed] PATCH for Re: Stepping down as maintainer for ARC and Epiphany

2024-05-21 Thread Richard Biener
On Mon, May 20, 2024 at 4:45 PM Gerald Pfeifer wrote: > > On Wed, 5 Jul 2023, Joern Rennecke wrote: > > I haven't worked with these targets in years and can't really do > > sensible maintenance or reviews of patches for them. I am currently > > working on optimizations for other ports like

Re: [PATCH] PHIOPT: Don't transform minmax if middle bb contains a phi [PR115143]

2024-05-21 Thread Richard Biener
On Mon, May 20, 2024 at 11:37 PM Andrew Pinski (QUIC) wrote: > > > -Original Message- > > From: Richard Biener > > Sent: Sunday, May 19, 2024 11:55 AM > > To: Andrew Pinski (QUIC) > > Cc: gcc-patches@gcc.gnu.org > > Subject: Re: [PATCH] PHIOPT:

Re: [PATCH] PHIOPT: Don't transform minmax if middle bb contains a phi [PR115143]

2024-05-19 Thread Richard Biener
> Am 19.05.2024 um 01:12 schrieb Andrew Pinski : > > The problem here is even if last_and_only_stmt returns a statement, > the bb might still contain a phi node which defines a ssa name > which is used in that statement so we need to add a check to make sure > that the phi nodes are empty for

Re: [PATCH] MATCH: Maybe expand (T)(A + C1) * C2 and (T)(A + C1) * C2 + C3 [PR109393]

2024-05-17 Thread Richard Biener
On Fri, 17 May 2024, Manolis Tsamis wrote: > On Fri, May 17, 2024 at 12:22 PM Richard Biener wrote: > > > > On Fri, 17 May 2024, Manolis Tsamis wrote: > > > > > Hi Richard, > > > > > > While I was re-testing the latest version of this pat

Re: [PATCH] middle-end: Expand {u|s}dot product support in autovectorizer

2024-05-17 Thread Richard Biener
On Fri, May 17, 2024 at 11:56 AM Tamar Christina wrote: > > > -Original Message- > > From: Richard Biener > > Sent: Friday, May 17, 2024 10:46 AM > > To: Tamar Christina > > Cc: Victor Do Nascimento ; gcc- > > patc...@gcc.gnu.org; Richard Sandi

Re: [PATCH] middle-end: Expand {u|s}dot product support in autovectorizer

2024-05-17 Thread Richard Biener
On Fri, May 17, 2024 at 11:05 AM Tamar Christina wrote: > > > -Original Message- > > From: Richard Biener > > Sent: Friday, May 17, 2024 6:51 AM > > To: Victor Do Nascimento > > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford ; > > Richard Earnshaw

Re: [PATCH] MATCH: Maybe expand (T)(A + C1) * C2 and (T)(A + C1) * C2 + C3 [PR109393]

2024-05-17 Thread Richard Biener
ause then we do _not_ perform this premature optimization. Without -fwrapv the optimization is valid but as you note we do not perform it consistently - otherwise we wouldn't regress. Richard. > Thanks, > Manolis > > > > On Thu, May 16, 2024 at 11:15 AM Richard Biener > w

[PATCH] middle-end/115110 - Fix view_converted_memref_p

2024-05-17 Thread Richard Biener
view_converted_memref_p was checking the reference type against the pointer type of the offset operand rather than its pointed-to type which leads to all refs being subject to view-convert treatment in get_alias_set causing numerous testsuite fails but with its new uses from

[PATCH] Add missing check for const_pool in the escaped solutions

2024-05-17 Thread Richard Biener
The ptr-vs-ptr compare folding using points-to info was missing a check for const_pool being included in the escaped solution. The following fixes that, fixing the observed execute FAIL of experimental/functional/searchers.cc Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Re: [PATCH] middle-end: Drop __builtin_pretech calls in autovectorization [PR114061]'

2024-05-17 Thread Richard Biener
On Thu, May 16, 2024 at 11:19 PM Tamar Christina wrote: > > Hi, > > > -Original Message- > > From: Victor Do Nascimento > > Sent: Thursday, May 16, 2024 2:57 PM > > To: gcc-patches@gcc.gnu.org > > Cc: Richard Sandiford ; Richard Earnshaw > > ; Victor Do Nascimento > > > > Subject:

Re: [PATCH] middle-end: Expand {u|s}dot product support in autovectorizer

2024-05-16 Thread Richard Biener
On Thu, May 16, 2024 at 4:40 PM Victor Do Nascimento wrote: > > From: Victor Do Nascimento > > At present, the compiler offers the `{u|s|us}dot_prod_optab' direct > optabs for dealing with vectorizable dot product code sequences. The > consequence of using a direct optab for this is that

Re: [PATCH] tree-optimization/13962 - handle ptr-ptr compares in ptrs_compare_unequal

2024-05-16 Thread Richard Biener
On Thu, 16 May 2024, Jeff Law wrote: > > > On 5/16/24 6:03 AM, Richard Biener wrote: > > Now that we handle pt.null conservatively we can implement the missing > > tracking of constant pool entries (aka STRING_CST) and handle > > ptr-ptr compares using points-to i

Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2024-05-16 Thread Richard Biener
On Wed, 3 Apr 2024, Chung-Lin Tang wrote: > Hi Richard, Thomas, > > On 2023/10/30 8:46 PM, Richard Biener wrote: > >> > >> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the > >> 'x' decl itself!) as 'readonly', via a new

Re: [PATCH] Optab: add isfinite_optab for __builtin_isfinite

2024-05-16 Thread Richard Biener
On Fri, Apr 12, 2024 at 5:07 AM HAO CHEN GUI wrote: > > Hi, > This patch adds an optab for __builtin_isfinite. The finite check can be > implemented on rs6000 by a single instruction. It needs an optab to be > expanded to the certain sequence of instructions. > > The subsequent patches will

Re: [PATCH] Optab: add isnormal_optab for __builtin_isnormal

2024-05-16 Thread Richard Biener
On Fri, Apr 12, 2024 at 10:10 AM HAO CHEN GUI wrote: > > Hi, > This patch adds an optab for __builtin_isnormal. The normal check can be > implemented on rs6000 by a single instruction. It needs an optab to be > expanded to the certain sequence of instructions. > > The subsequent patches will

Re: [PATCH v2 1/3] Vect: Support loop len in vectorizable early exit

2024-05-16 Thread Richard Biener
On Thu, May 16, 2024 at 8:50 AM Tamar Christina wrote: > > > -Original Message- > > From: pan2...@intel.com > > Sent: Thursday, May 16, 2024 5:06 AM > > To: gcc-patches@gcc.gnu.org > > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina > > ; richard.guent...@gmail.com;

[PATCH] tree-optimization/13962 - handle ptr-ptr compares in ptrs_compare_unequal

2024-05-16 Thread Richard Biener
Now that we handle pt.null conservatively we can implement the missing tracking of constant pool entries (aka STRING_CST) and handle ptr-ptr compares using points-to info in ptrs_compare_unequal. Bootstrapped on x86_64-unknown-linux-gnu, (re-)testing in progress. Richard. PR

Re: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-16 Thread Richard Biener
e riscv specific part of course needs riscv approval. > Pan > > -Original Message- > From: Richard Biener > Sent: Thursday, May 16, 2024 4:10 PM > To: Li, Pan2 > Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; > juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Liu, Hongtao

[PATCH] wrong code with points-to and volatile

2024-05-16 Thread Richard Biener
The following fixes points-to analysis which ignores the fact that volatile qualified refs can result in any pointer. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. Btw, I noticed this working on ptr-vs-ptr compare simplification using points-to info and running into

Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]

2024-05-16 Thread Richard Biener
On Fri, Apr 5, 2024 at 8:14 PM Andrew Pinski wrote: > > On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis wrote: > > > > If we consider code like: > > > > if (bar1 == x) > > return foo(); > > if (bar2 != y) > > return foo(); > > return 0; > > > > We would like the ifcombine

Re: [PATCH v3] driver: Output to a temp file; rename upon success [PR80182]

2024-05-16 Thread Richard Biener
On Sun, May 12, 2024 at 3:40 PM Peter Damianov wrote: > > Currently, commands like: > gcc -o file.c -lm > will delete the user's code. > > This patch makes the linker write executables to a temp file, and then renames > the temp file if successful. This fixes the case above, but has limitations.
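The general technique named in this subject line (write to a temporary file, rename on success) can be sketched in plain POSIX C. This is only an illustration and not the code from the patch: the helper name `write_via_temp` and the `.XXXXXX` suffix are assumptions made for the example, and the GCC driver's actual handling is not shown in the preview above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): write DATA to a temporary
   file created next to PATH and rename it over PATH only if every
   step succeeded.  On failure PATH is left untouched.
   Returns 0 on success, -1 on error.  */
int write_via_temp(const char *path, const char *data)
{
    assert(path != NULL && data != NULL);

    /* mkstemp requires a template that ends in "XXXXXX".  */
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    int fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    ssize_t written = write(fd, data, len);
    if (written < 0 || (size_t)written != len) {
        close(fd);
        unlink(tmp);            /* discard the partial output */
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp);
        return -1;
    }

    /* rename replaces PATH in one step, so an existing file is only
       overwritten once the new contents are complete.  */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

Creating the temporary file in the same directory as the destination keeps the rename on one filesystem, which is what lets it replace the target in a single step.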

Re: [PATCH] MATCH: Maybe expand (T)(A + C1) * C2 and (T)(A + C1) * C2 + C3 [PR109393]

2024-05-16 Thread Richard Biener
On Tue, May 14, 2024 at 10:58 AM Manolis Tsamis wrote: > > New patch with the requested changes can be found below. > > I don't know how much this affects SCEV, but I do believe that we > should incorporate this change somehow. I've seen various cases of > suboptimal address calculation codegen

Re: [PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-16 Thread Richard Biener
On Wed, May 15, 2024 at 1:36 PM Li, Pan2 wrote: > > > LGTM but you'll need an OK from Richard, > > Thanks for working on this! > > Thanks Tamar for help and coaching, let's wait Richard for a while,! OK. Thanks for the patience, Richard. > Pan > > -Original Message- > From: Tamar

[PATCH][v2] tree-optimization/79958 - make DSE track multiple paths

2024-05-16 Thread Richard Biener
DSE currently gives up when the path we analyze forks. This leads to multiple missed dead store elimination PRs. The following fixes this by recursing for each path and maintaining the visited bitmap to avoid visiting CFG re-merges multiple times. The overall cost is still limited by the same
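The idea of recursing into each path while keeping a visited bitmap can be sketched on a toy control-flow graph. This is a minimal sketch under stated assumptions, not GCC's implementation: the `struct node` layout, the `PLAIN`/`USE`/`KILL` classification and the function `store_is_dead` are all hypothetical, and the cost limit mentioned in the message is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_SUCC = 2 };

/* Hypothetical classification of what a block does to the stored
   location: nothing, read it (USE), or fully overwrite it (KILL).  */
enum node_kind { PLAIN, USE, KILL };

struct node {
    enum node_kind kind;
    int nsucc;              /* number of valid entries in succ[] */
    int succ[MAX_SUCC];     /* indices of successor blocks */
};

/* Return true if a store made in block FROM is overwritten on every
   path before it is read.  VISITED has one entry per block and makes
   sure blocks where paths re-merge are examined only once.  Reaching
   a block without successors is conservatively treated as a use.  */
bool store_is_dead(const struct node *cfg, int from, bool *visited)
{
    if (cfg[from].nsucc == 0)
        return false;

    for (int i = 0; i < cfg[from].nsucc; i++) {
        int s = cfg[from].succ[i];
        if (visited[s])
            continue;           /* re-merge or back edge: already handled */
        visited[s] = true;

        if (cfg[s].kind == USE)
            return false;       /* the store is read on this path */
        if (cfg[s].kind == KILL)
            continue;           /* overwritten: this path is finished */
        if (!store_is_dead(cfg, s, visited))
            return false;       /* recurse into the fork */
    }
    return true;
}
```

Because the walk returns as soon as any path finds a use, every block already marked in `visited` is either still on the recursion stack or was fully explored without finding a use, so skipping it is safe in this sketch.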

Re: [PATCH] i386: Fix ix86_option override after change [PR 113719]

2024-05-16 Thread Richard Biener
On Thu, May 16, 2024 at 8:25 AM Hongyu Wang wrote: > > Hi, > > In ix86_override_options_after_change, calls to ix86_default_align > and ix86_recompute_optlev_based_flags will cause mismatched target > opt_set when doing cl_optimization_restore. Move them back to > ix86_option_override_internal to

[PATCH] tree-optimization/79958 - make DSE track multiple paths

2024-05-15 Thread Richard Biener
DSE currently gives up when the path we analyze forks. This leads to multiple missed dead store elimination PRs. The following fixes this by recursing for each path and maintaining the visited bitmap to avoid visiting CFG re-merges multiple times. The overall cost is still limited by the same

Re: [PATCH] middle-end/111422 - wrong stack var coalescing, handle PHIs

2024-05-15 Thread Richard Biener
On Wed, 15 May 2024, Jakub Jelinek wrote: > On Wed, May 15, 2024 at 01:41:04PM +0200, Richard Biener wrote: > > PR middle-end/111422 > > * cfgexpand.cc (add_scope_conflicts_2): Handle PHIs > > by recursing to their arguments. > > ---

[PATCH] middle-end/111422 - wrong stack var coalescing, handle PHIs

2024-05-15 Thread Richard Biener
The gcc.c-torture/execute/pr111422.c testcase after installing the sink pass improvement reveals that we also need to handle _65 = + _58; _44 = + _43; # _59 = PHI <_65, _44> *_59 = 8; g = {v} {CLOBBER(eos)}; ... n[0] = *_59 = 8; g = {v} {CLOBBER(eos)}; where we fail to

Re: [PATCH 0/4]AArch64: support conditional early clobbers on certain operations.

2024-05-15 Thread Richard Biener
On Wed, May 15, 2024 at 12:29 PM Tamar Christina wrote: > > Hi All, > > Some Neoverse Software Optimization Guides (SWoG) have a clause that state > that for predicated operations that also produce a predicate it is preferred > that the codegen should use a different register for the destination

Re: [PATCH] [PATCH] Correct DLL Installation Path for x86_64-w64-mingw32 Multilib [PR115094]

2024-05-15 Thread Richard Biener
and then invoke 'autoconf' from each directory. At least that's how I do it. But my question was whether upstream libtool has your fix or whether this is a downstream patch against libtool.m4 which we need to carry. Richard. > > From: Richard Biener > Sent: Wednesday, May

[PATCH] tree-optimization/114589 - remove profile based sink heuristics

2024-05-15 Thread Richard Biener
The following removes the profile based heuristic limiting sinking and instead uses post-dominators to avoid sinking to places that are executed under the same conditions as the earlier location which the profile based heuristic should have guaranteed as well. To avoid regressing this moves the

Re: [PATCH] [PATCH] Correct DLL Installation Path for x86_64-w64-mingw32 Multilib [PR115094]

2024-05-15 Thread Richard Biener
since we updated libtool, is this fixed in libtool upstream in the same way? You are missing a ChangeLog entry which should indicate which files were just re-generated and which ones you edited (and what part). Richard. > ____ > From: Richard Biener > Sent: Wedne

Re: [PATCH] Don't reduce estimated unrolled size for innermost loop.

2024-05-15 Thread Richard Biener
On Wed, May 15, 2024 at 4:15 AM Hongtao Liu wrote: > > On Mon, May 13, 2024 at 3:40 PM Richard Biener > wrote: > > > > On Mon, May 13, 2024 at 4:29 AM liuhongt wrote: > > > > > > As testcase in the PR, O3 cunrolli may prevent vectorization for the >

Re: [PATCH] tree-cfg: Move the returns_twice check to be last statement only [PR114301]

2024-05-15 Thread Richard Biener
On Tue, May 14, 2024 at 5:52 PM Andrew Pinski wrote: > > When I was checking to making sure that all of the bugs dealing > with the case where gimple_can_duplicate_bb_p would return false was fixed, > I noticed that the code which was checking if a call statement was > returns_twice was checking

Re: [PATCH] [PATCH] Correct DLL Installation Path for x86_64-w64-mingw32 Multilib [PR115094]

2024-05-15 Thread Richard Biener
On Tue, May 14, 2024 at 10:27 PM trcrsired wrote: > > From: trcrsired > > When building native GCC for the x86_64-w64-mingw32 host, the compiler copies > its library DLLs to the `bin` directory. However, in the case of a multilib > configuration, both 32-bit and 64-bit libraries end up in the

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-15 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 13:14, Richard Biener wrote: > > > > On Tue, 14 May 2024, Qing Zhao wrote: > > > >> > >> > >>> On May 14, 2024, at 10:29, Richard Biener wrote: > >>>

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Kees Cook wrote: > On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote: > > The current major issue with the warning is: the constant index value 4 > > is not in the source code, it’s a compiler generated intermediate value > > (even though it’s a correct value -:)).

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 10:29, Richard Biener wrote: > > [...] > > It would of course > > need experimenting since we can end up moving stmts and merging blocks > > though the linear traces created by jump

Re: [PATCH v5 5/5] Add documentation for musttail attribute

2024-05-14 Thread Richard Biener
On Tue, May 14, 2024 at 6:30 PM Andi Kleen wrote: > > > Looks generally OK though does this mean people can debug > > programs using [[gnu::musttail]] only with optimized builds? It > > seems to me we should try harder to make [[gnu::musttail]] work > > at -O0 and generally behave the same at

[PATCH][v2] tree-optimization/99954 - redo loop distribution memcpy recognition fix

2024-05-14 Thread Richard Biener
The following revisits the fix for PR99954 which was observed as causing missed memcpy recognition and instead using memmove for non-aliasing copies. While the original fix mitigated bogus recognition of memcpy the root cause was not properly identified. The root cause is dr_analyze_indices

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 09:08, Richard Biener wrote: > > > > On Mon, 13 May 2024, Qing Zhao wrote: > > > >> -Warray-bounds is an important option to enable linux kernel to keep > >> the
