On Thu, Jan 11, 2024 at 10:52 AM Robin Dapp wrote:
>
> On 1/11/24 10:46, juzhe.zh...@rivai.ai wrote:
> > Oh, I see. I think I got this wrong here.
> >
> > I should adjust cost for VEC_EXTRACT not VEC_SET.
> >
> > But it's odd: I didn't see the loop vectorizer scanning the scalar_to_vec
> > cost in
On Thu, Jan 11, 2024 at 10:46 AM Richard Biener
wrote:
>
> On Fri, Dec 29, 2023 at 11:29 AM Feng Xue OS
> wrote:
> >
> > This patch is meant to fix over-estimation of the SLP vector-to-scalar cost for
> > STMT_VINFO_LIVE_P statement. When pattern recogni
On Fri, Dec 29, 2023 at 11:29 AM Feng Xue OS
wrote:
>
> This patch is meant to fix over-estimation of the SLP vector-to-scalar cost for
> STMT_VINFO_LIVE_P statements. When pattern recognition is involved, a
> statement whose definition is consumed in some pattern may not be
> included in the
On Thu, Jan 11, 2024 at 9:50 AM wrote:
>
> From: Pan Li
>
> The insert_var_expansion_initialization depends on
> HONOR_SIGNED_ZEROS to initialize the unrolling variables
> to +0.0f instead of -0.0f when signed zeros are not honored.
> Unfortunately, we should always keep the -0.0f here because:
>
> * The
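As a minimal sketch of one IEEE 754 property that bears on the choice of start value (not necessarily the reason the patch gives): under the default rounding mode, -0.0f is an identity for addition while +0.0f is not, because (+0.0f) + (-0.0f) is +0.0f. The helper names below are hypothetical and are not GCC functions, and the sketch assumes it is built without -ffast-math or -fno-signed-zeros.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: fold X into an accumulator that starts at INIT,
   the way an extra accumulator of an unrolled loop would be seeded.  */
float
accumulate (float init, float x)
{
  return init + x;
}

/* Return nonzero if X is a zero with the sign bit set.  */
int
is_negative_zero (float x)
{
  return x == 0.0f && signbit (x) != 0;
}

/* Seeding with -0.0f preserves a -0.0f input; seeding with +0.0f
   loses the sign.  Nonzero inputs are unaffected either way.  */
void
check_seed_values (void)
{
  assert (is_negative_zero (accumulate (-0.0f, -0.0f)));
  assert (!is_negative_zero (accumulate (0.0f, -0.0f)));
  assert (accumulate (-0.0f, 1.5f) == 1.5f);
  assert (accumulate (0.0f, 1.5f) == 1.5f);
}
```

Calling check_seed_values () from a test driver exercises all four cases.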
On Thu, Jan 11, 2024 at 9:30 AM HAO CHEN GUI wrote:
>
> Hi,
> This patch eliminates unnecessary byte swaps for block clear on P8
> LE. For block clear, all the bytes are set to zero, so the byte order
> doesn't matter. The alignment of the destination can therefore be set to
> the store mode size in
On Thu, Jan 11, 2024 at 9:24 AM Juzhe-Zhong wrote:
>
> This patch fixes the following inefficient vectorized code:
>
> vsetvli a5,zero,e8,mf2,ta,ma
> li a2,17
> vid.v v1
> li a4,-32768
> vsetvli zero,zero,e16,m1,ta,ma
> addiw
def = gimple_get_lhs (vec_stmts[j]);
> - SET_PHI_ARG_DEF (phi, loop_exit->dest_idx, def);
> + if (LOOP_VINFO_IV_EXIT (loop_vinfo) == loop_exit)
> + SET_PHI_ARG_DEF (phi, loop_exit->dest_idx, def);
> + else
> + {
> + for (unsigned k = 0; k < gimple_phi_num_args (phi); k++)
> + SET_PHI_ARG_DEF (phi, k, def);
> + }
> new_def = gimple_convert (, vectype, new_def);
> reduc_inputs.quick_push (new_def);
> }
>
>
>
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
ORD == 4
> #define BIL_TYPE_SIZE (4 * __CHAR_BIT__)
> #define BILtype SItype
> -#define UBILtype USItype
> +typedef USItype __attribute__ ((__may_alias__)) UBILtype;
> #elif BIL_UNITS_PER_WORD == 2
> #define BIL_TYPE_SIZE (2 * __CHAR_BIT__)
> #define BI
On Thu, Jan 11, 2024 at 2:39 AM wrote:
>
> From: Pan Li
>
> The insert_var_expansion_initialization depends on
> HONOR_SIGNED_ZEROS to initialize the unrolling variables
> to +0.0f instead of -0.0f when signed zeros are not honored.
> Unfortunately, we should always keep the -0.0f here because:
>
> * The
On Thu, Jan 11, 2024 at 2:16 AM Liu, Hongtao wrote:
>
>
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Wednesday, January 10, 2024 5:44 PM
> > To: Liu, Hongtao
> > Cc: Jiang, Haochen ; gcc-patches@gcc.gnu.org;
> > ubiz...@gmail.
Testcase for fixed PR.
Pushed.
PR tree-optimization/111003
gcc/testsuite/
* gcc.dg/tree-ssa/pr111003.c: New testcase.
---
gcc/testsuite/gcc.dg/tree-ssa/pr111003.c | 34
1 file changed, 34 insertions(+)
create mode 100644
The optimization to expand uniform boolean vectors by sign-extension
works only for dense masks but it failed to check that.
Bootstrap and regtest running on x86_64-unknown-linux-gnu, I've
checked aarch64 RTL expansion for the testcase. Will push tomorrow.
Richard.
PR middle-end/112740
+ we are generating a `forall` or an `exist` condition. */
>auto new_code = NE_EXPR;
>auto reduc_optab = ior_optab;
>auto reduc_op = BIT_IOR_EXPR;
>tree cst = build_zero_cst (vectype);
> + edge exit_true_edge = EDGE_SUCC (gimple_bb (cond_stmt), 0);
> + if (exit_true_edge->flag
When if-conversion was changed to use .COND_ADD/SUB for conditional
reduction it was forgotten to update reduction path handling to
canonicalize .COND_SUB to .COND_ADD for vectorizable_reduction
similar to what we do for MINUS_EXPR. The following adds this
and testcases exercising this at runtime
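A scalar C analogy of that canonicalization, as a sketch only: .COND_ADD and .COND_SUB are internal functions on GIMPLE and cannot be written in C, so the conditional expressions below merely mimic them, and all function names are hypothetical. Unsigned arithmetic is used so that the negation wraps instead of overflowing.

```c
#include <assert.h>
#include <stddef.h>

/* Conditional reduction in subtract form:
   acc = mask ? acc - v : acc.  */
unsigned
reduce_cond_sub (const unsigned *v, const int *mask, size_t n)
{
  unsigned acc = 0;
  for (size_t i = 0; i < n; i++)
    acc = mask[i] ? acc - v[i] : acc;
  return acc;
}

/* The same reduction in add form with a negated operand:
   acc = mask ? acc + (-v) : acc.  */
unsigned
reduce_cond_add_neg (const unsigned *v, const int *mask, size_t n)
{
  unsigned acc = 0;
  for (size_t i = 0; i < n; i++)
    acc = mask[i] ? acc + (0u - v[i]) : acc;
  return acc;
}

/* Both forms subtract 5, 11 and 13 (the masked-in elements).  */
void
check_cond_reduction (void)
{
  const unsigned v[] = { 5, 7, 11, 13 };
  const int mask[] = { 1, 0, 1, 1 };
  assert (reduce_cond_sub (v, mask, 4) == reduce_cond_add_neg (v, mask, 4));
  assert (reduce_cond_sub (v, mask, 4) == 0u - 29u);
}
```

Once both forms are expressed as an addition, reduction handling only has to reason about one operation, which is the same idea as treating MINUS_EXPR as a PLUS of a negated operand.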
On Wed, 10 Jan 2024, Richard Sandiford wrote:
> Just a note that, following discussion on IRC, I'll pull this for
> GCC 14 and resubmit for GCC 15.
>
> There was also pushback on IRC about making the pass opt-in.
> Enabling it for x86_64 would mean fixing RPAD to use a representation
> that is
.f);
> }
> -/* { dg-final { scan-tree-dump-times "ABS" 4 "gimple"} } */
> -/* { dg-final { scan-tree-dump-times "\.COPYSIGN" 4 "gimple"} } */
> +
> +/* { dg-final { scan-tree-dump-times "ABS" 8 "gimple" } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp
> b/gcc/testsuite/lib/target-supports.exp
> index
> 7f13ff0ca565efdf19065811f3301db897329073..f0765a14fb78f2267f54f5ae79a86f4ab644152b
> 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7830,6 +7830,30 @@ proc check_effective_target_xorsign { } {
>|| [istarget aarch64*-*-*] || [istarget arm*-*-*] }}]
> }
>
> +# Return 1 if the target plus current options supports folding of
> +# copysign into IFN_COPYSIGN.
> +#
> +# This won't change for different subtargets so cache the result.
> +
> +proc check_effective_target_ifn_copysign { } {
> +return [check_cached_effective_target_indexed ifn_copysign {
> + expr {
> + (([istarget i?86-*-*] || [istarget x86_64-*-*])
> +&& [is-effective-target sse])
> + || ([istarget loongarch*-*-*]
> + && [check_effective_target_hard_float])
> + || ([istarget powerpc*-*-*]
> + && ![istarget powerpc-*-linux*paired*])
> + || [istarget alpha*-*-*]
> + || [istarget aarch64*-*-*]
> + || [is-effective-target arm_neon]
> + || ([istarget s390*-*-*]
> + && [check_effective_target_s390_vx])
> + || ([istarget riscv*-*-*]
> + && [check_effective_target_hard_float])
> + }}]
> +}
> +
> # Return 1 if the target plus current options supports a vector
> # widening summation of *short* args into *int* result, 0 otherwise.
> #
>
--
Richard Biener
On Tue, Jan 9, 2024 at 11:48 AM liuhongt wrote:
>
> > I wonder if you can amend the existing patterns instead by iterating
> > over cond/vec_cond. There are quite some (look for uses of
> > minmax_from_comparison) that could be adapted to vectors.
> >
> > The ones matching the simple form you
On Sat, Dec 23, 2023 at 7:35 PM Andrew Pinski wrote:
>
> Like r14-2293-g11350734240dba and r14-2289-gb083203f053f16,
> reassociation can combine across a few BBs, and one of the uses
> can be an uninitialized variable; going from a conditional
> use to an unconditional use can cause
On Wed, Jan 10, 2024 at 12:53 PM Eric Botcazou wrote:
>
> > Can you elaborate on the DIE order constraint and why it was chosen? That
> > is,
> >
> > +  /* The DIE with DW_AT_endianity is placed right after the naked DIE.  */
> > +  if (reverse)
> > + {
> > + gcc_assert
> +/* { dg-options "-std=c23 -O -fno-tree-fre --param=large-stack-frame=1024
> -fstack-check=generic" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 513
> +typedef _BitInt(513) B;
> +#else
> +typedef int B;
> +#endif
> +
> +static inline __attribute__((__always_inline__)) void
> +bar (B x)
> +{
> + B y = x;
> + if (y)
> +__builtin_abort ();
> +}
> +
> +void
> +foo (void)
> +{
> + bar (0);
> +}
>
> Jakub
>
>
--
Richard Biener
On Wed, Jan 10, 2024 at 9:01 AM Liu, Hongtao wrote:
>
>
>
> > -----Original Message-----
> > From: Jiang, Haochen
> > Sent: Wednesday, January 10, 2024 3:35 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Liu, Hongtao ; ubiz...@gmail.com; burnus@net-
> > b.de; san...@codesourcery.com
> > Subject:
On Wed, Jan 10, 2024 at 3:35 AM Haochen Jiang wrote:
>
> Hi Richard,
>
> It seems that I sent out an outdated patch. This patch should be what
> I wanted to send.
OK
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
> * doc/invoke.texi: Add -mevex512.
> ---
> gcc/doc/invoke.texi | 7 ++-
> 1 file
On Tue, Jan 9, 2024 at 9:18 PM Eric Botcazou wrote:
>
> Hi,
>
> this is not really a regression but the patch was written last week and is
> quite straightforward, so hopefully can nevertheless be OK. It implements the
> support of DW_AT_endianity for enumeration types because they are scalar
> Am 09.01.2024 um 16:13 schrieb Siddhesh Poyarekar :
>
> On 2023-12-18 09:35, Siddhesh Poyarekar wrote:
>> The "exploitable vulnerability" may lead to a misunderstanding that missed
>> hardening issues are considered vulnerabilities, just that they're not
>> exploitable. This is not true,
On Tue, 9 Jan 2024, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Tuesday, January 9, 2024 1:51 PM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> > Subject: RE: [PATCH]mid
On Tue, 9 Jan 2024, Richard Biener wrote:
> On Tue, 9 Jan 2024, Tamar Christina wrote:
>
> >
> >
> > > -----Original Message-----
> > > From: Richard Biener
> > > Sent: Tuesday, January 9, 2024 12:26 PM
> > > To: Tamar Christina
On Tue, 9 Jan 2024, Tamar Christina wrote:
>
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Tuesday, January 9, 2024 12:26 PM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> > Subject: RE:
ct me if any
> misunderstanding.
>
> Pan
>
> -----Original Message-----
> From: Li, Pan2
> Sent: Tuesday, January 9, 2024 9:22 AM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang
> ; kito.ch...@gmail.com; jeffreya...@gmail.com
> Sub
On Tue, Jan 9, 2024 at 8:04 AM Haochen Jiang wrote:
>
> Hi all,
>
> In invoke.texi, -mevex512 is missing. This patch adds that.
>
> Ok for trunk?
You add it to the "index" but not document its semantics?
> Thx,
> Haochen
>
> gcc/ChangeLog:
>
> * doc/invoke.texi: Add -mevex512.
> ---
>
On Tue, 9 Jan 2024, haochen.jiang wrote:
> On Linux/x86_64,
>
> b3cc5a1efead520bc977b4ba51f1328d01b3e516 is the first bad commit
> commit b3cc5a1efead520bc977b4ba51f1328d01b3e516
> Author: Richard Biener
> Date: Fri Dec 15 10:32:29 2023 +0100
>
> tree-optimizati
The late amendment with a limit based on VF was redundant and wrong
for peeled early exits. The following moves the adjustment done
when we don't have a skip edge down to the place where the already
existing VF based max iter check is done and removes the amendment.
Bootstrapped and tested on
On Tue, 9 Jan 2024, Tamar Christina wrote:
> > This makes it quadratic in the number of vectorized early exit loops
> > in a function. The vectorizer CFG manipulation operates in a local
> > enough bubble that programmatic updating of dominators should be
> > possible (after all we manage to
/* Then permute the vector contents. */
> - tree perm_elem = perm_mask_for_reverse (vectype);
> - perm_dest = copy_ssa_name (vec_lhs_phi);
> - perm_stmt
> - = gimple_build_assign (perm_dest, VEC_PERM_EXPR, vec_lhs_phi,
> -vec_lhs_phi, perm_elem);
> - vect_finish_stmt_generation (loop_vinfo, stmt_info, perm_stmt,
> -);
> - vec_lhs_phi = perm_dest;
> - }
> -
>gimple_seq_add_seq (, tem);
>
>scalar_res = gimple_build (, CFN_EXTRACT_LAST, scalar_type,
>
--
Richard Biener
On Tue, 9 Jan 2024, Uros Bizjak wrote:
> On Tue, Jan 9, 2024 at 10:44 AM Richard Biener wrote:
> >
> > On Tue, 9 Jan 2024, Uros Bizjak wrote:
> >
> > > On Tue, Jan 9, 2024 at 9:58 AM Richard Biener wrote:
> > > >
> > > > On Mon, 8 Jan 20
ariable thread through the original value otherwise
> + skip it and directly use the final value. */
> + if (tree *res = new_phi_args.get (tmp_arg))
> + new_arg = *res;
> + else
> new_arg = tmp_arg;
> }
>
>
>
>
>
>
--
Richard Biener
On Tue, 9 Jan 2024, Uros Bizjak wrote:
> On Tue, Jan 9, 2024 at 9:58 AM Richard Biener wrote:
> >
> > On Mon, 8 Jan 2024, Uros Bizjak wrote:
> >
> > > On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski wrote:
> > > >
> > >
r_of_iterations = niter;
> }
>
> --- gcc/testsuite/gcc.c-torture/compile/pr113210.c.jj 2024-01-08
> 16:17:16.672620793 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr113210.c2024-01-08
> 16:17:16.671620807 +0100
> @@ -0,0 +1,13 @@
> +/* PR tree-optimization/1132
On Mon, 8 Jan 2024, Uros Bizjak wrote:
> On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski wrote:
> >
> > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak wrote:
> > >
> > > Instead of converting XOR or PLUS of two values, ANDed with two constants
> > > that
> > > have no bits in common, to IOR
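The equivalence this discussion relies on can be sketched in plain C: when the two masks share no set bits, XOR, PLUS and IOR of the masked values all give the same result, because no bit position receives two ones and so no carry can occur. The masks and function names below are hypothetical choices for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Two constants with no bits in common (illustrative values).  */
#define HIGH_MASK 0xffff0000u
#define LOW_MASK  0x0000ffffu

uint32_t
merge_xor (uint32_t a, uint32_t b)
{
  return (a & HIGH_MASK) ^ (b & LOW_MASK);
}

uint32_t
merge_plus (uint32_t a, uint32_t b)
{
  return (a & HIGH_MASK) + (b & LOW_MASK);
}

uint32_t
merge_ior (uint32_t a, uint32_t b)
{
  return (a & HIGH_MASK) | (b & LOW_MASK);
}

/* All three forms pick the high half of A and the low half of B.  */
void
check_merge_forms (void)
{
  uint32_t a = 0x12345678u, b = 0x9abcdef0u;
  assert (merge_xor (a, b) == 0x1234def0u);
  assert (merge_plus (a, b) == 0x1234def0u);
  assert (merge_ior (a, b) == 0x1234def0u);
}
```

Which of the three forms is the better canonical one is exactly what the thread is debating; the sketch only shows that they are interchangeable under the disjoint-mask condition.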
On Mon, 8 Jan 2024, Jeff Law wrote:
>
>
> On 1/8/24 09:57, Andrew Pinski wrote:
> > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak wrote:
> >>
> >> Instead of converting XOR or PLUS of two values, ANDed with two constants
> >> that
> >> have no bits in common, to IOR expression, convert IOR or XOR
+#endif
This looks close to the handling of user-specified alignment
(factor something out?), but I wonder if for -falign-all-functions
we should only allow a hard alignment (no max_skip) and also
not allow (but diagnose?) conflicts with limit-function-alignment?
The interaction with the other flags also doesn't seem to be
well documented? The main docs suggest it should be
-fmin-function-alignment= which to me then suggests
-flimit-function-alignment should not have an effect on it
and even very small functions should be aligned.
Richard.
> +}
> +
>/* Handle a user-specified function alignment.
> Note that we still need to align to DECL_ALIGN, as above,
> because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */
>
--
Richard Biener
(void)
> +{
> + foo (1, 4);
> +}
> +#endif
> +
> +#if __BITINT_MAXWIDTH__ >= 6928
> +_BitInt(6928)
> +baz (int x, _BitInt(6928) y)
> +{
> + if (x)
> +return y;
> + else
> +return 0;
> +}
> +#endif
>
> Jakub
>
>
--
Richard Biener
23-12-23 10:46:17.808658852
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-61.c 2023-12-23 10:46:02.482874865 +0100
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/113119 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +_BitInt(8)
G_MISSED_OPTIMIZATION, vect_location,
> + "can't operate on partial vectors "
> + "because the target doesn't support extract "
> + "first reduction.\n");
> + LOOP_VIN
37f1be1101ffae779214056a0886411e0683e887
> 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -976,7 +976,10 @@ vec_init_loop_exit_info (class loop *loop)
>if (number_of_iterations_exit_assumptions (loop, exit, _desc,
> NULL)
> && !chrec_contains_undetermined (niter_desc.niter))
> {
> - if (!niter_desc.may_be_zero || !candidate)
> + tree may_be_zero = niter_desc.may_be_zero;
> + if ((may_be_zero && integer_zerop (may_be_zero))
niter_desc.may_be_zero is never NULL
The rest looks OK. Is it possible to split out the parts that do not
require the vect_analyze_early_break_dependences changes?
Richard.
> + && (!candidate
> + || dominated_by_p (CDI_DOMINATORS, exit->src, candidate->src)))
> candidate = exit;
> }
> }
> @@ -1913,15 +1916,14 @@ vect_create_loop_vinfo (class loop *loop,
> vec_info_shared *shared,
>STMT_VINFO_DEF_TYPE (loop_cond_info) = vect_condition_def;
> }
>
> - for (unsigned i = 1; i < info->conds.length (); i ++)
> -LOOP_VINFO_LOOP_CONDS (loop_vinfo).safe_push (info->conds[i]);
> + LOOP_VINFO_LOOP_CONDS (loop_vinfo).safe_splice (info->conds);
>LOOP_VINFO_LOOP_IV_COND (loop_vinfo) = info->conds[0];
>
>LOOP_VINFO_IV_EXIT (loop_vinfo) = info->loop_exit;
>
>/* Check to see if we're vectorizing multiple exits. */
>LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> -= !LOOP_VINFO_LOOP_CONDS (loop_vinfo).is_empty ();
> += LOOP_VINFO_LOOP_CONDS (loop_vinfo).length () > 1;
>
>if (info->inner_loop_cond)
> {
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index
> 785fc99ca27a4caf26b1fca887e6262108f515b2..d0ee95146b2907132bf68962e3f16d43c36afded
> 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -893,7 +893,7 @@ public:
> the loop for the break finding loop. */
>bool early_breaks;
>
> - /* List of loop additional IV conditionals found in the loop. */
> + /* List of loop all IV conditionals found in the loop. */
>auto_vec conds;
>
>/* Main loop IV cond. */
>
>
>
>
>
--
Richard Biener
> + while (!workset.is_empty ())
> + {
> + auto bb = workset.pop ()->dest;
> + if (visited.add (bb))
> + continue;
> + doms.safe_push (bb);
> + FOR_EACH_EDGE (ev, ei, bb->succs)
> + workset.safe_push (ev);
> + }
> + visited.empty ();
> doms.safe_push (exit_dest);
>
> /* Likely a fall-through edge, so update if needed. */
>
>
>
>
>
--
Richard Biener
P (loop_vinfo)
> + && induction_type != vect_step_op_neg)
> +{
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "Peeling for epilogue is not supported"
> + " f
On Mon, Jan 8, 2024 at 10:58 AM Nathaniel Shead
wrote:
>
> On Thu, Jan 04, 2024 at 03:39:15PM -0500, Patrick Palka wrote:
> > On Sun, 3 Dec 2023, Nathaniel Shead wrote:
> >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > >
> > > -- >8 --
> > >
> > > The
On Mon, Jan 8, 2024 at 3:35 AM Kewen.Lin wrote:
>
> Hi,
>
> As PR113100 shows, the unbiasing introduced by r14-6737 can
> cause the scrubbing to overrun and clobber critical data
> on the stack, such as the saved TOC base, and consequently cause a
> segfault on Power.
>
> By checking PR112917, IMHO we should
> 13:27:42.876330301 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr113228.c2024-01-05
> 13:27:22.503609458 +0100
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/113228 */
> +
> +int a, b, c, d, i;
> +
> +void
> +foo (void)
> +{
> + int k[3] = {};
> + int *l
The following avoids creating a niter peeling epilog more consistently,
matching what peeling later uses for the skip_vector condition, in
particular when versioning is required which then also ensures the
vector loop is entered unless the epilog is vectorized. This should
ideally match
On Tue, Jan 2, 2024 at 2:37 PM wrote:
>
> From: Pan Li
>
> According to the semantics of the no-signed-zeros option, a backend
> like RISC-V should treat the minus zero -0.0f as plus zero 0.0f.
>
> Consider below example with option -fno-signed-zeros.
>
> void
> test (float *a)
> {
> *a = -0.0;
>
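A small sketch of why the two zeros can be conflated at all: they compare equal, so only code that inspects the sign bit can tell them apart. The function names are hypothetical, and the sketch assumes default floating-point flags (no -fno-signed-zeros or -ffast-math).

```c
#include <assert.h>
#include <math.h>

/* The two zeros are equal as far as == is concerned.  */
int
zeros_compare_equal (void)
{
  float minus_zero = -0.0f;
  float plus_zero = 0.0f;
  return minus_zero == plus_zero;
}

/* Only the sign bit distinguishes them.  */
int
zeros_differ_in_sign (void)
{
  float minus_zero = -0.0f;
  float plus_zero = 0.0f;
  return (signbit (minus_zero) != 0) != (signbit (plus_zero) != 0);
}

void
check_zeros (void)
{
  assert (zeros_compare_equal ());
  assert (zeros_differ_in_sign ());
}
```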
It was noticed that -mmovbe doesn't use movbe for __builtin_bswap{32,64}
when not optimizing. The following adjusts the documentation to
say it will be used for optimizing and applies to all byte swaps,
not just those carried out via builtin function calls.
OK?
Thanks,
Richard.
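To make "all byte swaps, not just those carried out via builtin function calls" concrete, here is a sketch of the two source forms in question. The function names are hypothetical. Whether a given compiler recognizes the open-coded form as a byte swap, and whether movbe is then emitted, depends on the optimization level and target flags; the sketch only shows that the two forms compute the same value.

```c
#include <assert.h>
#include <stdint.h>

/* Byte swap through the GCC builtin.  */
uint32_t
swap_builtin (uint32_t x)
{
  return __builtin_bswap32 (x);
}

/* The same byte swap written out with shifts and masks.  */
uint32_t
swap_open_coded (uint32_t x)
{
  return (x >> 24)
	 | ((x >> 8) & 0x0000ff00u)
	 | ((x << 8) & 0x00ff0000u)
	 | (x << 24);
}

void
check_swaps (void)
{
  assert (swap_builtin (0x11223344u) == 0x44332211u);
  assert (swap_open_coded (0x11223344u) == 0x44332211u);
}
```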
*
> Am 31.12.2023 um 11:20 schrieb Jørgen Kvalsvik :
>
> On 31/12/2023 10:40, Jan Hubicka wrote:
This seems good. Profile-arcs is rarely used by itself; most of the time it
is implied by -fprofile-generate and -ftest-coverage, and since
condition coverage is more associated with the
>> Sent: Sunday, December 17, 2023 8:31 PM
>> To: Thomas Schwinge ; gcc-patches@gcc.gnu.org
>> Cc: Richard Biener
>> Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in
>> get_reassociation_width
>>
>> Hello Thomas,
>>
>>>
> Am 22.12.2023 um 09:26 schrieb Jakub Jelinek :
>
> Hi!
>
> Large/huge _BitInt types are returned in memory and the bitint lowering
> pass right now relies on that.
> The gimplification etc. use aggregate_value_p to see if it should be
> returned in memory or not and use
> = _123;
>
> Am 22.12.2023 um 09:17 schrieb Jakub Jelinek :
>
> Hi!
>
> On the following testcase earlier passes leave around an unreleased
> SSA_NAME - non-GIMPLE_NOP SSA_NAME_DEF_STMT which isn't in any bb.
> The following patch makes bitint lowering resistant against those,
> the first hunk is where
> Am 22.12.2023 um 09:12 schrieb Jakub Jelinek :
>
> Hi!
>
> My recent change to use m_data[save_data_cnt] instead of
> m_data[save_data_cnt + 1] when inside of a loop (m_bb is non-NULL)
> broke the following testcase. When we create a PHI node on the loop
> using prepare_data_in_out, both
> Am 21.12.2023 um 21:11 schrieb Andrew Pinski :
>
> This adds the documentation for the cond_copysign and cond_len_copysign optabs.
> It also reorders optabs.def to follow an order similar to the one used for
> the internal functions.
Ok
> gcc/ChangeLog:
>
>PR middle-end/112951
>*
\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*shift exponent \[0-9a-fx]* is too large for
> \[0-9]*-bit type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*shift exponent \[0-9a-fx-]* is
> negative\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*shift exponent \[0-9a-fx-]* is negative" } */
>
> Jakub
>
>
--
Richard Biener
ite/gcc.dg/bitint-57.c.jj 2023-12-20 12:42:12.691772991
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-57.c 2023-12-20 12:42:49.900250015 +0100
> @@ -0,0 +1,21 @@
> +/* PR tree-optimization/112941 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O1 -fno-tree-forwprop" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 6384
> +unsigned _BitInt(2049)
> +foo (unsigned _BitInt(6384) x, _BitInt(8) y)
> +{
> + unsigned _BitInt(6384) z = y;
> + return x * z;
> +}
> +
> +_BitInt(2049)
> +bar (unsigned _BitInt(6384) x, _BitInt(1023) y)
> +{
> + unsigned _BitInt(6384) z = y;
> + return x * z;
> +}
> +#else
> +int i;
> +#endif
>
> Jakub
>
>
--
Richard Biener
On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:
> This patch fixes an issue introduced by:
> commit ea4a3d08f11a59319df7b750a955ac613a3f438a
> Author: Andre Vieira
> Date: Wed Nov 1 17:02:41 2023 +
>
> omp: Reorder call for TARGET_SIMD_CLONE_ADJUST
>
> The problem was that after
> Am 20.12.2023 um 19:30 schrieb Richard Sandiford :
>
> If cse sees:
>
> (set (reg R) (const_vector [A B ...]))
>
> it creates fake sets of the form:
>
> (set R[0] A)
> (set R[1] B)
> ...
>
> (with R[n] replaced by appropriate rtl) and then adds them to the tables
> in the same way
ctor mode has at most
>the same number of elements.
> (get_compute_type): Pass original vector type rather than the element
> type to type_for_widest_vector_mode and remove now obsolete check
> for the number of elements.
OK.
Richard.
> On 07
On Wed, 20 Dec 2023, Richard Biener wrote:
> On Wed, 20 Dec 2023, Thomas Schwinge wrote:
>
> > Hi!
> >
> > On 2023-12-19T13:30:58+0100, Richard Biener wrote:
> > > The PR112736 testcase fails on RISC-V because the aligned exception
> > > uses the wron
gcc.dg/vect/bb-slp-pr78205.c is reported to have regressed with
the PR113073 change and in the end it's due to the DCE performed
by vect_transform_slp_perm_load_1 being imperfect. The following
enhances it to also cover the CTOR and VIEW_CONVERT operations that
might be involved.
Bootstrapped
ctorizer.h
> @@ -66,6 +66,7 @@ enum vect_def_type {
>vect_double_reduction_def,
>vect_nested_cycle,
>vect_first_order_recurrence,
> + vect_condition_def,
>vect_unknown_def_type
> };
>
> @@ -888,6 +889,10 @@ public:
> we need to peel off iterations at the end to form an epilogue loop. */
>bool peeling_for_niter;
>
> + /* When the loop has early breaks that we can vectorize we need to peel
> + the loop for the break finding loop. */
> + bool early_breaks;
> +
>/* List of loop additional IV conditionals found in the loop. */
>auto_vec conds;
>
> @@ -942,6 +947,20 @@ public:
>/* The controlling loop IV for the scalar loop being vectorized. This IV
> controls the natural exits of the loop. */
>edge scalar_loop_iv_exit;
> +
> + /* Used to store the list of stores needing to be moved if doing early
> + break vectorization as they would violate the scalar loop semantics if
> + vectorized in their current location. These are stored in order that
> they
> + need to be moved. */
> + auto_vec early_break_stores;
> +
> + /* The final basic block where to move statements to. In the case of
> + multiple exits this could be pretty far away. */
> + basic_block early_break_dest_bb;
> +
> + /* Statements whose VUSES need updating if early break vectorization is to
> + happen. */
> + auto_vec early_break_vuses;
> } *loop_vec_info;
>
> /* Access Functions. */
> @@ -996,6 +1015,10 @@ public:
> #define LOOP_VINFO_REDUCTION_CHAINS(L) (L)->reduction_chains
> #define LOOP_VINFO_PEELING_FOR_GAPS(L) (L)->peeling_for_gaps
> #define LOOP_VINFO_PEELING_FOR_NITER(L)(L)->peeling_for_niter
> +#define LOOP_VINFO_EARLY_BREAKS(L) (L)->early_breaks
> +#define LOOP_VINFO_EARLY_BRK_STORES(L) (L)->early_break_stores
> +#define LOOP_VINFO_EARLY_BRK_DEST_BB(L)(L)->early_break_dest_bb
> +#define LOOP_VINFO_EARLY_BRK_VUSES(L) (L)->early_break_vuses
> #define LOOP_VINFO_LOOP_CONDS(L) (L)->conds
> #define LOOP_VINFO_LOOP_IV_COND(L) (L)->loop_iv_cond
> #define LOOP_VINFO_NO_DATA_DEPENDENCIES(L) (L)->no_data_dependencies
> @@ -2298,6 +2321,9 @@ extern opt_result vect_get_vector_types_for_stmt
> (vec_info *,
> tree *, unsigned int = 0);
> extern opt_tree vect_get_mask_type_for_stmt (stmt_vec_info, unsigned int =
> 0);
>
> +/* In tree-if-conv.cc. */
> +extern bool ref_within_array_bound (gimple *, tree);
> +
> /* In tree-vect-data-refs.cc. */
> extern bool vect_can_force_dr_alignment_p (const_tree, poly_uint64);
> extern enum dr_alignment_support vect_supportable_dr_alignment
>
--
Richard Biener
+
> +#if __SIZEOF_INT128__
> +void
> +f9 (_BitInt(4096) *p, __int128 r)
> +{
> + p[0] += (unsigned _BitInt(2048)) r;
> +}
> +
> +void
> +f10 (_BitInt(4094) *p, __int128 r)
> +{
> + p[0] -= (unsigned _BitInt(2048)) r;
> +}
> +
> +void
> +f11 (_BitInt(4096)
On Wed, 20 Dec 2023, Thomas Schwinge wrote:
> Hi!
>
> On 2023-12-19T13:30:58+0100, Richard Biener wrote:
> > The PR112736 testcase fails on RISC-V because the aligned exception
> > uses the wrong check. The alignment support scheme can be
> > dr_aligned even wh
On Wed, 20 Dec 2023, Richard Sandiford wrote:
> Richard Biener writes:
> > On Tue, 19 Dec 2023, Andrew Pinski wrote:
> >
> >> On Tue, Dec 19, 2023 at 2:40 AM Richard Sandiford
> >> wrote:
> >> >
> >> > Richard Biener writes:
On Wed, Dec 20, 2023 at 3:54 AM Alexandre Oliva wrote:
>
>
> Builtin expanders for memset and memcpy may involve conditionals and
> loops, but their sequences may end up emitted in edges. Alas,
> commit_one_edge_insertion rejects sequences that end with a jump, a
> requirement that makes
On Wed, Dec 20, 2023 at 12:51 AM Alexandre Oliva wrote:
>
> On Dec 15, 2023, Richard Biener wrote:
>
> > It might be worth amending the documentation in case this
> > is unexpected to users?
>
> Oh, yes indeed, thanks!
>
> Here's a patch that brings relevant pa
On Thu, Dec 14, 2023 at 2:23 AM wrote:
>
> From: Vladimir Mezentsev
>
> This is fixes for releases/gcc-13 for 31109 gprofng not built and installed
> in a combined binutils+gcc build
> I only cherry-picked 24552056fd5fc677c0d032f54a5cad1c4303d312 and tested my
> build.
I don't think a
On Tue, Dec 19, 2023 at 6:41 PM Jason Merrill wrote:
>
> On 12/11/23 22:00, Jason Merrill wrote:
> > OK for trunk?
>
> Ping. CCing Alex because this could plausibly be considered build
> machinery, and he's had useful feedback on my sh code before.
OK in case Alex doesn't have any comments.
On Tue, 19 Dec 2023, Andrew Pinski wrote:
> On Tue, Dec 19, 2023 at 2:40 AM Richard Sandiford
> wrote:
> >
> > Richard Biener writes:
> > > On Tue, 19 Dec 2023, juzhe.zh...@rivai.ai wrote:
> > >
> > >> Hi, Richard.
> > >>
> > +/* { dg-final { scan-tree-dump-not "vector operands from scalars" "slp2" {
> > target {
> > { vect_int && vect_bool_cmp } && { vect_unpack && vect_hw_misalign } } } }
> > } */
> > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-sub
-- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -359,8 +359,8 @@ vect_stmt_relevant_p (stmt_vec_info stmt_info,
> loop_vec_info loop_vinfo,
>*live_p = false;
>
>/* cond stmt other than loop exit cond. */
> - if (is_ctrl_stmt (stmt_info->stmt)
> - && STMT_VINFO_TYPE (stmt_info) != loop_exit_ctrl_vec_info_type)
> + gimple *stmt = STMT_VINFO_STMT (stmt_info);
> + if (dyn_cast (stmt))
> *relevant = vect_used_in_scope;
>
>/* changing memory. */
> @@ -13530,6 +13530,9 @@ vect_is_simple_use (tree operand, vec_info *vinfo,
> enum vect_def_type *dt,
> case vect_first_order_recurrence:
> dump_printf (MSG_NOTE, "first order recurrence\n");
> break;
> + case vect_condition_def:
> + dump_printf (MSG_NOTE, "control flow\n");
> + break;
> case vect_unknown_def_type:
> dump_printf (MSG_NOTE, "unknown\n");
> break;
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index
> e4d7ab4567cef3c018b958f98eeff045d3477725..3c9478a3dc8750c71e0bf2a36a5b0815afc3fd94
> 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -66,6 +66,7 @@ enum vect_def_type {
>vect_double_reduction_def,
>vect_nested_cycle,
>vect_first_order_recurrence,
> + vect_condition_def,
>vect_unknown_def_type
> };
>
> @@ -888,6 +889,10 @@ public:
> we need to peel off iterations at the end to form an epilogue loop. */
>bool peeling_for_niter;
>
> + /* When the loop has early breaks that we can vectorize we need to peel
> + the loop for the break finding loop. */
> + bool early_breaks;
> +
>/* List of loop additional IV conditionals found in the loop. */
>auto_vec conds;
>
> @@ -942,6 +947,20 @@ public:
>/* The controlling loop IV for the scalar loop being vectorized. This IV
> controls the natural exits of the loop. */
>edge scalar_loop_iv_exit;
> +
> + /* Used to store the list of statements needing to be moved if doing early
> + break vectorization as they would violate the scalar loop semantics if
> + vectorized in their current location. These are stored in order that
> they need
> + to be moved. */
> + auto_vec early_break_conflict;
> +
> + /* The final basic block where to move statements to. In the case of
> + multiple exits this could be pretty far away. */
> + basic_block early_break_dest_bb;
> +
> + /* Statements whose VUSES need updating if early break vectorization is to
> + happen. */
> + auto_vec early_break_vuses;
> } *loop_vec_info;
>
> /* Access Functions. */
> @@ -996,6 +1015,10 @@ public:
> #define LOOP_VINFO_REDUCTION_CHAINS(L) (L)->reduction_chains
> #define LOOP_VINFO_PEELING_FOR_GAPS(L) (L)->peeling_for_gaps
> #define LOOP_VINFO_PEELING_FOR_NITER(L)(L)->peeling_for_niter
> +#define LOOP_VINFO_EARLY_BREAKS(L) (L)->early_breaks
> +#define LOOP_VINFO_EARLY_BRK_CONFLICT_STMTS(L) (L)->early_break_conflict
> +#define LOOP_VINFO_EARLY_BRK_DEST_BB(L)(L)->early_break_dest_bb
> +#define LOOP_VINFO_EARLY_BRK_VUSES(L) (L)->early_break_vuses
> #define LOOP_VINFO_LOOP_CONDS(L) (L)->conds
> #define LOOP_VINFO_LOOP_IV_COND(L) (L)->loop_iv_cond
> #define LOOP_VINFO_NO_DATA_DEPENDENCIES(L) (L)->no_data_dependencies
>
--
Richard Biener
On Tue, Dec 19, 2023 at 6:39 AM liuhongt wrote:
>
> Similar for A < B ? B : A to MAX_EXPR.
> There is code in the frontend to optimize such patterns, but it fails to
> handle the testcase in the PR, since the pattern is only exposed at the gimple
> level when folding backend builtins.
>
> pr95906 now can be optimized to
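As a small illustration of the scalar identity behind "A < B ? B : A to MAX_EXPR", the sketch below compares the conditional form with a plain maximum for integers. The function names are hypothetical. The sketch deliberately sticks to integer operands and does not model the vector builtins from the PR.

```c
#include <stdint.h>

/* Hypothetical helper: the conditional form named in the message.  */
int32_t
select_form (int32_t a, int32_t b)
{
  return a < b ? b : a;
}

/* Hypothetical helper: a reference maximum to compare against.  */
int32_t
max_form (int32_t a, int32_t b)
{
  return a > b ? a : b;
}
```

For any pair of integers both functions return the same value, which is what makes the rewrite to a single maximum operation valid in the integer case.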
928,8 +10935,7 @@ vectorizable_live_operation (vec_info *vinfo,
> stmt_vec_info stmt_info,
> if (restart_loop
> && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def)
> {
> - tmp_vec_stmt = STMT_VINFO_VEC_STMTS (stmt_info)[0];
> - tmp_vec_lhs = gimple_get_lhs (tmp_vec_stmt);
> + tmp_vec_lhs = vec_lhs0;
> tmp_bitstart = build_zero_cst (TREE_TYPE (bitstart));
> }
>
When performing final value replacement we guard against exponential
(temporary) code growth due to unsharing of trees (SCEV heavily
relies on tree sharing). The following relaxes this a tiny bit
to cover some more optimizations and puts in comments as to what
the real fix would be.
Bootstrapped
The PR112736 testcase fails on RISC-V because the aligned exception
uses the wrong check. The alignment support scheme can be
dr_aligned even when the access isn't aligned to the vector size
but some targets are happy with element alignment. The following
fixes that.
Bootstrapped and tested on
; 2 "vect" { target {
> amdgcn-*-* riscv*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { vect_element_align && { ! { amdgcn-*-* } } } } } } */
> +/* { dg-final { scan-tree-dump-times "loop vect
}
> }
>
> I wonder whether we can simplify the code as follows?
> if (integer_zerop (arg1) || integer_zerop (arg2))
> step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
> || code == BIT_XOR_EXPR);
Possibly. I'll let Richard S. commen
> && integer_zerop (VECTOR_CST_ELT (arg2, 0)))
> step_ok_p = (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
> || code == BIT_XOR_EXPR);
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Richard Biener
> Date: 2023-12-19 16:15
> To: ???
> CC: rd
On Tue, 19 Dec 2023, Alexandre Oliva wrote:
> On Dec 15, 2023, Richard Biener wrote:
>
> > You have to be generally careful when working within IPA
> > with function bodies without push/pop_cfun around that, several APIs
> > have variants with struct function sepcif
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
> new file mode 100644
> index 000..816ebd3c493
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112971.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv
On Mon, 18 Dec 2023, Jakub Jelinek wrote:
> On Mon, Dec 18, 2023 at 03:08:49PM +0100, Richard Biener wrote:
> > The following improves the manual work needed to make a -gimple dump
> > valid input to the GIMPLE FE. First of all it recognizes the 'sizetype'
> > tree and dum
The following improves the manual work needed to make a -gimple dump
valid input to the GIMPLE FE. First of all it recognizes the 'sizetype'
tree and dumps it as __SIZETYPE__, then it changes dumping vector types
without name from 'vector(n) T' to 'T [[gnu::vector_size(n')]]' which
we can parse
The following adds dumping of TARGET_MEM_REF in -gimple form and
adds parsing of it to the GIMPLE FE.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR c/111975
gcc/c/
* gimple-parser.cc (c_parser_gimple_postfix_expression):
Parse TARGET_MEM_REF extended
On Mon, Dec 18, 2023 at 9:35 AM Jakub Jelinek wrote:
>
> Hi!
>
> The following testcase ICEs because we aren't careful enough with
> alloc_size attribute. We do check that such an argument exists
> (although wouldn't handle correctly functions with more than INT_MAX
> arguments), but didn't
> Am 17.12.2023 um 04:29 schrieb Jeff Law :
>
>
> So mcore-elf is the slowest target to test with a simulator. Not because
> its simulator is particularly bad, but because some tests timeout as they've
> gotten into infinite loops. This causes the mcore-elf port to take about 2X
>
> Am 16.12.2023 um 16:56 schrieb H.J. Lu :
>
> Linux CET kernel places a restore token on shadow stack followed by
> optional additional information for signal handler to enhance security.
> The restore token is the previous shadow stack pointer with bit 63 set.
> It is usually transparent to
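The token layout described above (previous shadow stack pointer with bit 63 set) can be sketched in a few lines of C. All names here are hypothetical. This only illustrates the bit manipulation stated in the message and is not taken from the kernel or from the patch.

```c
#include <stdint.h>

/* Hypothetical name for the marker bit described in the message.  */
#define RESTORE_TOKEN_BIT (UINT64_C (1) << 63)

/* Build a token from the previous shadow stack pointer.  */
uint64_t
make_restore_token (uint64_t prev_ssp)
{
  return prev_ssp | RESTORE_TOKEN_BIT;
}

/* Nonzero if VALUE has the marker bit set.  */
int
is_restore_token (uint64_t value)
{
  return (value & RESTORE_TOKEN_BIT) != 0;
}

/* Recover the previous shadow stack pointer from a token.  */
uint64_t
token_to_ssp (uint64_t token)
{
  return token & ~RESTORE_TOKEN_BIT;
}
```

Code that walks the shadow stack could use a check like is_restore_token to recognize the token, assuming the layout is exactly as the message states.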
The following avoids creating a niter peeling epilog more consistently,
matching what peeling later uses for the skip_vector condition, in
particular when versioning is required which then also ensures the
vector loop is entered unless the epilog is vectorized. This should
ideally match
typefn {Target Hook} bool TARGET_C_BITINT_TYPE_INFO (int @var{n}, struct
> bitint_info *@var{info})
> This target hook returns true if @code{_BitInt(@var{N})} is supported and
> provides details on it. @code{_BitInt(@var{N})} is to be represented as
> -series of @code{info->limb_mode}
>
On Fri, Dec 15, 2023 at 2:25 AM haochen.jiang
wrote:
>
> On Linux/x86_64,
>
> 8afdbcdd7abe1e3c7a81e07f34c256e7f2dbc652 is the first bad commit
> commit 8afdbcdd7abe1e3c7a81e07f34c256e7f2dbc652
> Author: Di Zhao
> Date: Fri Dec 15 03:22:32 2023 +0800
>
> Consider fully pipelined FMA in
ned char x) { long long y = x; return y; }
> +unsigned int f9 (signed char x) { return (unsigned long long) x; }
> +unsigned int f10 (unsigned char x) { return (unsigned long long) x; }
> +unsigned int f11 (signed char x) { return (long long) x; }
> +unsigned int f12 (unsigned char x
:46.683512224
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-54.c 2023-12-14 13:47:20.191879500 +0100
> @@ -0,0 +1,21 @@
> +/* PR tree-optimization/113003 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 131
> +int
> +foo (_BitInt(7) x)
> +{
> + return __builtin_mul_overflow_p (x, 1046555807606105294475452482332716433408wb, 0);
> +}
> +
> +#ifdef __SIZEOF_INT128__
> +int
> +bar (unsigned __int128 x)
> +{
> + return __builtin_sub_overflow_p (340282366920938463463374607431768211457uwb, x, 0);
> +}
> +#endif
> +#else
> +int i;
> +#endif
>
> Jakub
>
>
On Thu, 14 Dec 2023, Richard Sandiford wrote:
> Richard Biener writes:
> > The following changes the unsigned group_size argument to a poly_uint64
> > one to avoid too much special-casing in callers for VLA vectors when
> > passing down the effective maximum desirable
On Thu, Dec 14, 2023 at 6:42 PM Andrew Pinski (QUIC)
wrote:
>
>
> > -Original Message-
> > From: Richard Biener
> > Sent: Thursday, December 14, 2023 5:23 AM
> > To: Andrew Pinski (QUIC)
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [P
On Thu, Dec 14, 2023 at 10:29 PM Alexandre Oliva wrote:
>
>
> The stack pointer is biased by 2047 bytes on sparc64, so the range it
> delimits is way off. Unbias the addresses returned by
> __builtin_stack_address (), so that the strub builtins, inlined or
> not, can function correctly. I've
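To make the arithmetic concrete: with a bias of 2047 bytes, the actual address is the biased register value plus 2047. The sketch below assumes that convention. The macro and function names are hypothetical and are not the ones used in the patch.

```c
#include <stdint.h>

/* Bias value stated in the message; the macro name is hypothetical.  */
#define SPARC64_STACK_BIAS 2047

/* Turn a biased stack pointer value into the address it denotes.  */
uintptr_t
unbias_stack_address (uintptr_t biased_sp)
{
  return biased_sp + SPARC64_STACK_BIAS;
}
```

Without this adjustment, a range computed from two raw stack pointer values would be shifted by 2047 bytes from the memory it is meant to cover.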
On Thu, Dec 14, 2023 at 9:55 PM Di Zhao OS
wrote:
>
>
> > -Original Message-
> > From: Richard Biener
> > Sent: Wednesday, December 13, 2023 5:01 PM
> > To: Di Zhao OS
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH v