> On 31.05.2024 at 17:58, David Faust wrote:
>
>
>
>> On 5/31/24 00:07, Richard Biener wrote:
>>> On Thu, May 30, 2024 at 11:34 PM David Faust wrote:
>>>
>>> This patch adds a new option, -fprune-btf, to control BTF debug info
>>>
On Thu, May 30, 2024 at 4:55 PM Feng Xue OS wrote:
>
> For lane-reducing operations (dot-prod/widen-sum/sad) in a loop reduction, the
> current vectorizer can only handle the pattern if the reduction chain does not
> contain any other operation, whether normal or lane-reducing.
>
> Actually,
On Thu, May 30, 2024 at 4:53 PM Feng Xue OS wrote:
>
> The input vectype is an attribute of lane-reducing operation, instead of
> reduction PHI that it is associated to, since there might be more than one
> lane-reducing operations with different type in a loop reduction chain. So
> bind each
The following avoids dumping 'vectorizing stmts using SLP' for
single-lane instances since that causes extra testsuite fallout.
* tree-vect-slp.cc (vect_schedule_slp): Gate dumping
'vectorizing stmts using SLP' on > 1 lanes.
---
gcc/tree-vect-slp.cc | 3 ++-
1 file changed, 2
When vectorizing an early break loop with LENs (do we miss some
check here to disallow this?) we can end up deciding to insert
stmts after a GIMPLE_COND when doing SLP scheduling and trying
to be conservative with placing of stmts only dependent on
the implicit loop mask/len. The following avoids
The following performs single-lane SLP discovery for reductions.
It requires a fixup for outer loop vectorization where a check
for multiple types needs adjustments as otherwise bogus pointer
IV increments happen when there are multiple copies of vector stmts
in the inner loop.
For the reduction
The following adjusts dump scanning for something followed by
successful vector analysis to more specifically look for
'Analysis succeeded' and not 'Analysis failed' because the
previous look for just 'succeeded' or 'failed' is easily confused
by SLP discovery dumping those words.
*
There's another case where we can refer to neutral_op before
eventually converting it from pointer to integer so simply
do that unconditionally.
* tree-vect-loop.cc (get_initial_defs_for_reduction):
Always convert neutral_op.
---
gcc/tree-vect-loop.cc | 15 +++
1 file
On Fri, May 31, 2024 at 12:24 PM Arthur Cohen wrote:
>
> Hi Richard,
>
> On 4/30/24 09:55, Richard Biener wrote:
> > On Fri, Apr 19, 2024 at 11:49 AM Arthur Cohen
> > wrote:
> >>
> >> Hi everyone,
> >>
> >> This patch
" } } */
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index 6564b002dc1a..01572fe70f72 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -13405,6 +13405,28 @@ component_ref_size (tree ref, special_array_member
> *sam /* = NULL */)
> ? NULL_TREE : size_zero_node
On Fri, 31 May 2024, Hu, Lin1 wrote:
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Wednesday, May 29, 2024 5:41 PM
> > To: Hu, Lin1
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ;
> > ubiz...@gmail.com
> > Subject: Re: [PATCH 1/3] ve
The following adds the missing guard for volatile stores to the
embedded DSE in the loop if-conversion pass.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/115278
* tree-if-conv.cc (ifcvt_local_dce): Do not DSE volatile stores.
*
On Thu, May 30, 2024 at 5:48 AM Andrew Pinski wrote:
>
> While looking at the index I noticed that some options had
> `-` in the front for the index which is wrong. And then
> I noticed there was no index for `mcmodel=` for targets or had
> used `-mcmodel` incorrectly.
>
> This fixes both of
On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill wrote:
>
> From: Greg McGary
Still a NACK. If remain ends up zero then
/* Try to use a single smaller load when we are about
to load excess elements compared to the unrolled
On Thu, May 30, 2024 at 4:48 PM Feng Xue OS wrote:
>
> This is a patch that is split out from
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652626.html.
>
> Partial vectorization checking for vectorizable_reduction is a piece of
> relatively isolated code, which may be reused by other
On Thu, May 30, 2024 at 4:45 PM Feng Xue OS wrote:
>
> This is a patch that is split out from
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652626.html.
>
> Checking whether an operation is lane-reducing requires comparing its code
> against three kinds (DOT_PROD_EXPR/WIDEN_SUM_EXPR/SAD_EXPR).
On Thu, May 30, 2024 at 3:28 PM Feng Xue OS wrote:
>
> >> Hi,
> >>
> >> The patch was updated with the newest trunk, and also contained some minor
> >> changes.
> >>
> >> I am working on another new feature which is meant to support pattern
> >> recognition
> >> of lane-reducing operations in
On Thu, May 30, 2024 at 11:34 PM David Faust wrote:
>
> This patch adds a new option, -fprune-btf, to control BTF debug info
> generation.
Can you name it -gprune-btf instead?
> As the name implies, this option enables a kind of "pruning" of the BTF
> information before it is emitted. When
> On 30.05.2024 at 13:46, Eric Botcazou wrote:
>
>
>>
>> Do function pointers inter-operate TBAA wise for this case and would this
>> possibly be an issue?
>
> Do you mean in LTO mode? I must say I'm not sure of the way LTO performs TBAA
> for function pointers: does it require (strict)
> On 30.05.2024 at 00:31, Jeff Law wrote:
>
>
>
>> On 5/28/24 1:01 AM, Richard Biener wrote:
>>> On Fri, May 24, 2024 at 10:46 AM Mariam Arutunian
>>> wrote:
>>>
>>> This patch adds a new compiler pass aimed at identif
> On 29.05.2024 at 15:30, Eric Botcazou wrote:
>
> Hi,
>
> Ada doesn't have an equivalent to transparent union types in GNU C so, when it
> needs to interface a C function that takes a parameter of a transparent union
> type, GNAT uses the type of the first member of the union on the Ada
On Mon, May 27, 2024 at 8:29 AM wrote:
>
> From: Pan Li
>
> After we support one gassign form of the unsigned .SAT_ADD, we
> would like to support more forms including both the branch and
> branchless. There are 5 other forms of .SAT_ADD, listed below:
>
> Form 1:
> #define SAT_ADD_U_1(T)
On Mon, May 27, 2024 at 2:48 AM Andrew Pinski wrote:
>
> While looking into something else, I noticed that `a ^ CST` needed to be
> special casing to bitwise_inverted_equal_p as it would simplify to `a ^ ~CST`
> for the bitwise not.
>
> Bootstrapped and tested on x86_64-linux-gnu with no
On Mon, May 27, 2024 at 2:47 AM Andrew Pinski wrote:
>
> While working on adding matching of negative expressions of `a - b`,
> I noticed that we started to have "duplicated" patterns due to not having
> a way to match maybe negative expressions. So I went back to what I did for
> bit_not and
On Fri, May 24, 2024 at 9:29 AM liuhongt wrote:
>
> Update in V3:
> > Since this was about vectorization can you instead add a testcase to
> > gcc.dg/vect/ and check for
> > vectorization to happen?
> Move to vect/pr112325.c.
> >
> > I believe the if (unr_insn <= 0) check can go as well.
>
On Wed, 29 May 2024, Richard Biener wrote:
> On Wed, 29 May 2024, Richard Sandiford wrote:
>
> > Richard Biener writes:
> > > Code generation for contiguous load vectorization can already deal
> > > with generalized avoidance of loading from a gap. The fol
The following arranges for the pre-SLP vectorization scalar cleanup
to be run when predictive commoning was applied to a loop in the
function. This is similar to the complete unroll situation and
facilitates SLP vectorization. Avoiding the SSA copies in predictive
commoning itself isn't easy
On Wed, 29 May 2024, Richard Sandiford wrote:
> Richard Biener writes:
> > Code generation for contiguous load vectorization can already deal
> > with generalized avoidance of loading from a gap. The following
> > extends detection of peeling for gaps requirement wi
ret_type = TREE_TYPE (lhs);
> + tree arg_type = TREE_TYPE (arg);
> + tree new_rhs;
> +
> + unsigned int ret_elt_bits = vector_element_bits (ret_type);
> + unsigned int arg_elt_bits = vector_element_bits (arg_type);
> + if (ret_elt_bits <= arg_elt_bits || code != FLOAT_EXPR)
> +return false;
> +
> + unsigned short target_size;
> + scalar_mode tmp_cvt_mode;
> + scalar_mode lhs_mode = GET_MODE_INNER (TYPE_MODE (ret_type));
> + scalar_mode rhs_mode = GET_MODE_INNER (TYPE_MODE (arg_type));
> + tree cvt_type = NULL_TREE;
> + target_size = GET_MODE_SIZE (lhs_mode);
> + int rhs_size = GET_MODE_BITSIZE (rhs_mode);
> + if (!int_mode_for_size (rhs_size, 0).exists (&tmp_cvt_mode))
> +return false;
> +
> + opt_scalar_mode mode_iter;
> + enum tree_code tc1, tc2;
> + unsigned HOST_WIDE_INT nelts
> += constant_lower_bound (TYPE_VECTOR_SUBPARTS (arg_type));
> +
> + FOR_EACH_2XWIDER_MODE (mode_iter, tmp_cvt_mode)
> +{
> + tmp_cvt_mode = mode_iter.require ();
> +
> + if (GET_MODE_SIZE (tmp_cvt_mode) > target_size)
> + break;
> +
> + scalar_mode cvt_mode;
> + int tmp_cvt_size = GET_MODE_BITSIZE (tmp_cvt_mode);
> + if (!int_mode_for_size (tmp_cvt_size, 0).exists (&cvt_mode))
> + break;
> +
> + int cvt_size = GET_MODE_BITSIZE (cvt_mode);
> + bool isUnsigned = TYPE_UNSIGNED (ret_type) || TYPE_UNSIGNED (arg_type);
> + cvt_type = build_nonstandard_integer_type (cvt_size, isUnsigned);
> +
> + cvt_type = build_vector_type (cvt_type, nelts);
> + if (cvt_type == NULL_TREE
> + || !supportable_convert_operation ((tree_code) code,
> + ret_type,
> + cvt_type, &tc1)
> + || !supportable_convert_operation ((tree_code) NOP_EXPR,
> + cvt_type,
> + arg_type, &tc2))
> + continue;
> +
> + new_rhs = make_ssa_name (cvt_type);
> + g = vect_gimple_build (new_rhs, tc2, arg);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + g = gimple_build_assign (lhs, tc1, new_rhs);
> + gsi_replace (gsi, g, false);
> + return true;
> +}
> + return false;
> +}
> +
So the above improves the situation where the target can handle
the two-step conversion. It doesn't really allow this to work
for too large vectors AFAICS (nor does it try pack/unpack for
any of the conversions). It also still duplicates code
that's in the vectorizer. I think you should be able to use
supportable_narrowing_operation and possibly even
supportable_widening_operation (though that needs refactoring to
avoid the vectorizer internal stmt_vec_info type - possibly
simply by gating the respective code on a non-NULL vinfo). Both
support multi-step conversions.
> /* Expand VEC_CONVERT ifn call. */
>
> static void
> @@ -1871,14 +2009,21 @@ expand_vector_conversion (gimple_stmt_iterator *gsi)
>else if (ret_elt_bits > arg_elt_bits)
> modifier = WIDEN;
>
> + if (supportable_convert_operation (code, ret_type, arg_type, &code1))
> +{
> + g = gimple_build_assign (lhs, code1, arg);
> + gsi_replace (gsi, g, false);
> + return;
> +}
> +
> + if (supportable_indirect_narrowing_operation(gsi, code, lhs, arg))
> +return;
> +
> + if (supportable_indirect_widening_operation(gsi, code, lhs, arg))
> +return;
> +
>if (modifier == NONE && (code == FIX_TRUNC_EXPR || code == FLOAT_EXPR))
> {
> - if (supportable_convert_operation (code, ret_type, arg_type, &code1))
> - {
> - g = gimple_build_assign (lhs, code1, arg);
> - gsi_replace (gsi, g, false);
> - return;
> - }
>/* Can't use get_compute_type here, as supportable_convert_operation
>doesn't necessarily use an optab and needs two arguments. */
>tree vec_compute_type
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Wed, May 29, 2024 at 10:39 AM Feng Xue OS
wrote:
>
> Ok. Then I will add a TODO comment on "bbs" field to describe it.
Fine with me.
Thanks,
Richard.
> Thanks,
> Feng
>
>
> ________
> From: Richard Biener
> Sent: Wedne
On Wed, May 29, 2024 at 1:39 AM Patrick O'Neill wrote:
>
> From: Greg McGary
>
> gcc/ChangeLog:
> * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent
> divide-by-zero.
> * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove xfail.
> ---
>
On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote:
>
> The original patch causing the PR made ranger's cache re-entrant to
> enable SCEV to use the current range_query when called from within ranger.
>
> SCEV uses the currently active range query (via get_range_query()) for
> picking up
vec_info, bool = true);
> @@ -679,9 +685,6 @@ public:
> /* The loop to which this info struct refers to. */
>class loop *loop;
>
> - /* The loop basic blocks. */
> - basic_block *bbs;
> -
>/* Number of latch executions. */
>tree num_itersm1;
>/* Number of itera
On Tue, May 28, 2024 at 11:10 PM Qing Zhao wrote:
>
>
>
> > On May 28, 2024, at 03:43, Richard Biener
> > wrote:
> >
> > On Fri, Apr 12, 2024 at 3:55 PM Qing Zhao wrote:
> >>
> >> to carry the TYPE of the flexible array.
> >>
On Tue, May 28, 2024 at 11:09 PM Qing Zhao wrote:
>
> Thank you for the comments. See my answers below:
>
> Joseph, please see the last question, I need your help on it. Thanks a lot
> for the help.
>
> Qing
>
> > On May 28, 2024, at 03:38, Richard Biener
>
On Tue, May 28, 2024 at 9:46 PM Harald Anlauf wrote:
>
> Hi Andre,
>
> On 5/28/24 14:10, Andre Vehreschild wrote:
> > Hi all,
> >
> > the attached patch fixes a memory leak with unlimited polymorphic return
> > types.
> > The leak occurred, because an expression with side-effects was evaluated
The dump scanning is supposed to check that we do not merge two
slightly different gathers into one SLP node but since we now
SLP the store scanning for "ectorizing stmts using SLP" is no
longer good. Instead the following makes us look for
"stmt 1 .* = .MASK" which would be how the second lane of
The stored-to ANYTHING handling has more holes, uncovered by treating
volatile accesses as ANYTHING. We fail to properly build the
pred and succ graphs, in particular we may not elide direct nodes
from receiving from STOREDANYTHING.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
Code generation for contiguous load vectorization can already deal
with generalized avoidance of loading from a gap. The following
extends detection of peeling for gaps requirement with that,
gets rid of the old special casing of a half load and makes sure
when we do access the gap we have
We process asm memory input/outputs with constraints to ESCAPED
but for this temporarily build an ADDR_EXPR. The issue is that
the used build_fold_addr_expr ends up wrapping the ADDR_EXPR in
a conversion which ends up producing constraints which
is quite bad. The following uses
On Tue, May 28, 2024 at 9:09 AM Kewen.Lin wrote:
>
> Hi,
>
> on 2024/5/27 20:54, Richard Biener wrote:
> > On Mon, May 27, 2024 at 11:37 AM HAO CHEN GUI wrote:
> >>
> >> Hi,
> >> This patch adds an optab for __builtin_isfinite. The finite check
On Tue, May 28, 2024 at 1:38 PM Alexander Monakov wrote:
>
>
> On Tue, 28 May 2024, Richard Biener wrote:
>
> > > I am a bit confused what you mean by "cheaper". Could it be that we are
> > > not
> > > on the same page regarding the machine code
On Tue, May 28, 2024 at 11:46 AM Alexander Monakov wrote:
>
>
> On Tue, 28 May 2024, Richard Biener wrote:
>
> > On Wed, May 15, 2024 at 12:59 PM Alexander Monakov
> > wrote:
> > >
> > >
> > > Hello,
> > >
> > > I'd like to
The following avoids accounting single-lane SLP to the discovery
limit. As the two testcases show, this makes discovery fail,
unfortunately not even consistently across targets. The following
should fix two FAILs for GCN as a side-effect.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
On Sat, May 25, 2024 at 4:54 PM Feng Xue OS wrote:
>
> Both derived classes ( loop_vec_info/bb_vec_info) have their own "bbs"
> field, which have exactly same purpose of recording all basic blocks
> inside the corresponding vect region, while the fields are composed by
> different data type, one
On Sat, May 25, 2024 at 4:45 PM Feng Xue OS wrote:
>
> Some utility functions (such as vect_look_through_possible_promotion) that are
> to find out certain kind of direct or indirect definition SSA for a value, may
> return the original one of the SSA, not its pattern representative SSA, even
>
On Fri, May 24, 2024 at 11:27 AM Feng Xue OS
wrote:
>
> Hi,
>
> The patch was updated with the newest trunk, and also contained some minor
> changes.
>
> I am working on another new feature which is meant to support pattern
> recognition
> of lane-reducing operations in affine closure
On Mon, May 13, 2024 at 5:25 PM Andrew Pinski wrote:
>
> This is an expansion of the optimize `a == CST & a`
> to handle more than just casts. It adds optimization
> for unary.
> The patch for binary operators will come later.
>
> Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
When the neutral op is the initial value we might need to convert
it from pointer to integer.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
This shows with the SLP single-lane reduction discovery.
* tree-vect-loop.cc (get_initial_defs_for_reduction): Convert
On Fri, Apr 12, 2024 at 3:55 PM Qing Zhao wrote:
>
> to carry the TYPE of the flexible array.
>
> Such information is needed during tree-object-size.cc.
>
> We cannot use the result type or the type of the 1st argument
> of the routine .ACCESS_WITH_SIZE to decide the element type
> of the
On Fri, Apr 12, 2024 at 3:54 PM Qing Zhao wrote:
>
I have no comments here, if Siddesh is OK with this I approve.
> gcc/ChangeLog:
>
> * tree-object-size.cc (access_with_size_object_size): New function.
> (call_object_size): Call the new function.
>
> gcc/testsuite/ChangeLog:
>
On Fri, Apr 12, 2024 at 3:54 PM Qing Zhao wrote:
>
> Including the following changes:
> * The definition of the new internal function .ACCESS_WITH_SIZE
> in internal-fn.def.
> * C FE converts every reference to a FAM with a "counted_by" attribute
> to a call to the internal function
On Wed, May 15, 2024 at 12:59 PM Alexander Monakov wrote:
>
>
> Hello,
>
> I'd like to ask if anyone has any new thoughts on this patch.
>
> Let me also point out that valgrind/memcheck.h is permissively
> licensed (BSD-style, rest of Valgrind is GPLv2), with the intention
> to allow importing
On Mon, May 27, 2024 at 9:48 AM Jiawei wrote:
>
> Return NULL_TREE when genop3 equals EXACT_DIV_EXPR.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652641.html
>
> version log v3: remove additional POLY_INT_CST check.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652795.html
OK.
On Tue, May 28, 2024 at 8:37 AM HAO CHEN GUI wrote:
>
> Hi,
> This patch adds an optab for __builtin_isnormal. The normal check can be
> implemented on rs6000 by a single instruction. It needs an optab to be
> expanded to the certain sequence of instructions.
>
> The subsequent patches will
On Tue, May 28, 2024 at 8:36 AM HAO CHEN GUI wrote:
>
> Hi,
> This patch adds an optab for __builtin_isfinite. The finite check can be
> implemented on rs6000 by a single instruction. It needs an optab to be
> expanded to the certain sequence of instructions.
>
> The subsequent patches will
On Fri, May 24, 2024 at 10:46 AM Mariam Arutunian
wrote:
>
> This patch adds a new compiler pass aimed at identifying naive CRC
> implementations,
> characterized by the presence of a loop calculating a CRC (polynomial long
> division).
> Upon detection of a potential CRC, the pass prints an
On Tue, May 28, 2024 at 1:24 AM Andrew MacLeod wrote:
>
> The strlen pass currently has a local ranger instance, but when it
> invokes SCEV or any other shared component, SCEV will not be able to
> access this ranger as it uses get_range_query(). They will be stuck
> with global ranges.
>
>
On Mon, May 27, 2024 at 5:16 PM Jeff Law wrote:
>
>
>
> On 5/27/24 12:38 AM, Richard Biener wrote:
> > On Fri, May 24, 2024 at 10:44 AM Mariam Arutunian
> > wrote:
> >>
> >> This patch introduces new built-in functions to GCC for computi
When points-to analysis finds SCCs it marks the wrong node as being
part of a found cycle. It only wants to mark the node it collapses
to but marked the entry node found rather than the one it collapses
to. This causes fallout in the patch for PR115236 but generally
weakens the points-to
On Mon, May 27, 2024 at 11:37 AM HAO CHEN GUI wrote:
>
> Hi,
> This patch adds an optab for __builtin_isfinite. The finite check can be
> implemented on rs6000 by a single instruction. It needs an optab to be
> expanded to the certain sequence of instructions.
>
> The subsequent patches will
On Fri, 24 May 2024, Richard Biener wrote:
> This is the second merge proposed from the SLP vectorizer branch.
> I have again managed without adding and using --param vect-single-lane-slp
> but instead this provides always enabled functionality.
>
> This makes us use SLP redu
The following makes sure the virtual operand updating when sinking
stores works for the case we ignore paths to kills. The final
sink location might not post-dominate the original stmt location
which would require inserting of a virtual PHI which we do not support.
Bootstrapped and tested on
For the following testcase we fail to demangle
_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnernwEm and
_ZZN5OuterIvE6methodIvEEvvQ3cstITL0__EEN5InnerdlEPv and in turn end
up building NULL references. The following puts in a safeguard for
failed demangling into -Waccess.
Bootstrapped and tested on
On Fri, May 24, 2024 at 5:33 PM Jiawei wrote:
>
> Return NULL_TREE when matching the POLY_INT case.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652641.html
>
> gcc/ChangeLog:
>
> * tree-ssa-pre.cc (create_component_ref_by_pieces_1): New
> * conditions.
>
>
On Sat, May 25, 2024 at 8:34 PM Jeff Law wrote:
>
>
>
> On 5/24/24 2:42 AM, Mariam Arutunian wrote:
> >gcc/testsuite/gcc.c-torture/compile/
> >
> > * crc-11.c: New test.
> > * crc-15.c: Likewise.
> > * crc-16.c: Likewise.
> > * crc-19.c: Likewise.
> > * crc-2.c:
On Fri, May 24, 2024 at 10:44 AM Mariam Arutunian
wrote:
>
> This patch introduces new built-in functions to GCC for computing bit-forward
> and bit-reversed CRCs.
> These builtins aim to provide efficient CRC calculation capabilities.
> When the target architecture supports CRC operations (as
On Mon, May 27, 2024 at 6:14 AM Andrew Pinski wrote:
>
> I noticed while working on the `a ^ CST` patch, that bitwise_inverted_equal_p
> would check INTEGER_CST directly and not handle vector csts that are uniform.
> This moves over to using uniform_integer_cst_p instead of checking INTEGER_CST
>
On Mon, May 27, 2024 at 4:10 AM HAO CHEN GUI wrote:
>
> Hi,
> Gently ping it.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/652170.html
>
> Thanks
> Gui Haochen
>
> On 2024/5/20 16:15, HAO CHEN GUI wrote:
> > Hi,
> > This patch adds an optab for __builtin_isfinite. The finite check can be
This extends the PR65518 workaround to also apply for single-lane SLP.
* tree-vect-stmts.cc (get_group_load_store_type): For SLP also
check for the PR65518 single-element interleaving case as done in
vect_grouped_load_supported.
---
gcc/tree-vect-stmts.cc | 17
This extends optimized reduction epilog handling to cover the
trivial single-lane SLP reduction case.
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Allow
direct opcode and shift reduction also for SLP reductions
with a single lane.
---
gcc/tree-vect-loop.cc | 4
This fixes the check for multiple types which goes wrong, I think,
because of bogus pointer IV increments when there are multiple
copies of vector stmts in the inner loop.
* tree-vect-stmts.cc (vectorizable_load): Avoid outer loop
SLP vectorization with multi-copy vector stmts in the
This is the second merge proposed from the SLP vectorizer branch.
I have again managed without adding and using --param vect-single-lane-slp
but instead this provides always enabled functionality.
This makes us use SLP reductions (a group of reductions) for the
case where the group size is one.
On Fri, May 24, 2024 at 2:35 PM Richard Sandiford
wrote:
>
> create_intersect_range_checks checks whether two access ranges
> a and b are alias-free using something equivalent to:
>
> end_a <= start_b || end_b <= start_a
>
> It has two ways of doing this: a "vanilla" way that calculates
> the
On Fri, May 24, 2024 at 1:49 PM Jiawei wrote:
>
> An ICE bug reported in
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071140.
> https://godbolt.org/z/WE9aGYvoo
>
> Return NULL_TREE when TREE_CODE(op) not equal to SSA_NAME.
The assert is on purpose. Can you open a GCC bug for this
On Fri, May 24, 2024 at 12:20 PM Kewen.Lin wrote:
>
> Hi Joseph and Richi,
>
> on 2024/5/13 21:18, Joseph Myers wrote:
> > On Mon, 13 May 2024, Kewen.Lin wrote:
> >
> >>> In fact replacing all of X_TYPE_SIZE with a single hook might be
> >>> worthwhile
> >>> though this removes the "convenient"
There's not really a good way to test what the testcase wants to
test, the following exchanges one dump scan for another (imperfect)
one.
Pushed.
* gcc.dg/vect/vect-gather-4.c: Scan for not vectorizing using
SLP.
---
gcc/testsuite/gcc.dg/vect/vect-gather-4.c | 2 +-
1 file
iltin_shufflevector (*a, *b, 0, 5, 2, 7);
> + vecu r2 = __builtin_convertvector (r1, vecu);
> + vecu r3 = __builtin_shufflevector (r2, r2, 2, 3, 1, 0);
> + *c = __builtin_convertvector (r3, veci);
> +}
> +
> +/* { dg-final { scan-tree-dump "VEC_PERM_EXPR.*{ 2, 7, 5, 0 }" "fre1" } } */
> +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "fre1" } } */
>
When sinking code closer to its uses we already try to minimize the
distance we move by inserting at the start of the basic-block. The
following makes sure to sink closest to the control dependence
check of the region we want to sink to as well as make sure to
ignore control dependences that are
On Fri, 24 May 2024, Manolis Tsamis wrote:
> On Fri, May 24, 2024 at 10:46 AM Richard Biener wrote:
> >
> > On Fri, 24 May 2024, Manolis Tsamis wrote:
> >
> > > On Fri, May 24, 2024 at 9:31 AM Richard Biener wrote:
> > > >
> >
On Fri, 24 May 2024, Manolis Tsamis wrote:
> On Fri, May 24, 2024 at 9:31 AM Richard Biener wrote:
> >
> > On Wed, 22 May 2024, Manolis Tsamis wrote:
> >
> > > The match.pd patterns to merge two vector permutes into one fail when a
> > > potentially no
On Fri, May 24, 2024 at 8:56 AM Richard Biener
wrote:
>
> On Fri, May 24, 2024 at 8:37 AM Li, Pan2 wrote:
> >
> > Thanks Jeff and Richard for suggestion and reviewing.
> >
> > Have another try in phiopt to do the convert from PHI to stmt = cond ? a :
> > b.
n NULL if nothing can be simplified or the resulting simplified value
> with parts pushed if EARLY_P was true. Also rejects non allowed tree code
> @@ -826,6 +908,9 @@ match_simplify_replacement (basic_block cond_bb,
> basic_block middle_bb,
> So, given the conditio
00..cd81aa248fe
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/avoid-store-forwarding-2.c
> @@ -0,0 +1,39 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-avoid_store_forwarding" } */
> +
> +typedef union {
> +char arr_8[8];
> +int long_value;
> +} DataUnion1;
> +
> +long no_ssll_1 (DataUnion1 *data, char x)
> +{
> + data->arr_8[4] = x;
> + return data->long_value;
> +}
> +
> +long no_ssll_2 (DataUnion1 *data, char x)
> +{
> + data->arr_8[5] = x;
> + return data->long_value;
> +}
> +
> +typedef union {
> +char arr_8[8];
> +short long_value[4];
> +} DataUnion2;
> +
> +long no_ssll_3 (DataUnion2 *data, char x)
> +{
> + data->arr_8[4] = x;
> + return data->long_value[1];
> +}
> +
> +long no_ssll_4 (DataUnion2 *data, char x)
> +{
> + data->arr_8[0] = x;
> + return data->long_value[1];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Store forwarding detected" 0 } } */
> +/* { dg-final { scan-tree-dump-times "Store forwarding avoided" 0 } } */
> diff --git a/gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c
> b/gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c
> new file mode 100644
> index 000..3175f882c86
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/avoid-store-forwarding-3.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-avoid_store_forwarding" } */
> +
> +typedef union {
> +char arr_8[8];
> +long long_value;
> +} DataUnion;
> +
> +long ssll_multi_1 (DataUnion **data, char x)
> +{
> + (*data)->arr_8[0] = x;
> + (*data)->arr_8[2] = x;
> + return (*data)->long_value;
> +}
> +
> +long ssll_multi_2 (DataUnion **data, char x)
> +{
> + (*data)->arr_8[0] = x;
> + (*data)->arr_8[1] = 11;
> + return (*data)->long_value;
> +}
> +
> +long ssll_multi_3 (DataUnion **data, char x, short y)
> +{
> + (*data)->arr_8[1] = x;
> + __builtin_memcpy((*data)->arr_8 + 4, &y, sizeof(short));
> + return (*data)->long_value;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Store forwardings detected" 3 } } */
> +/* { dg-final { scan-tree-dump-times "Store forwardings avoided" 3 } } */
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 29267589eeb..49957ba3373 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -570,6 +570,7 @@ extern rtl_opt_pass *make_pass_rtl_dse3 (gcc::context
> *ctxt);
> extern rtl_opt_pass *make_pass_rtl_cprop (gcc::context *ctxt);
> extern rtl_opt_pass *make_pass_rtl_pre (gcc::context *ctxt);
> extern rtl_opt_pass *make_pass_rtl_hoist (gcc::context *ctxt);
> +extern rtl_opt_pass *make_pass_rtl_avoid_store_forwarding (gcc::context
> *ctxt);
> extern rtl_opt_pass *make_pass_rtl_store_motion (gcc::context *ctxt);
> extern rtl_opt_pass *make_pass_cse_after_global_opts (gcc::context *ctxt);
> extern rtl_opt_pass *make_pass_rtl_ifcvt (gcc::context *ctxt);
>
On Thu, May 23, 2024 at 10:55 PM Andrew Pinski wrote:
>
> I noticed that phiprop leaves around phi nodes which
> defines a ssa name which is unused. This just adds a
> bitmap to mark those ssa names and then calls
> simple_dce_from_worklist at the very end to remove
> those phi nodes and all of
On Thu, 23 May 2024, Ian Lance Taylor wrote:
> On Thu, May 23, 2024 at 2:48 PM Martin Uecker wrote:
> >
> > On Thursday, 23.05.2024 at 14:30 -0700, Ian Lance Taylor wrote:
> > > On Thu, May 23, 2024 at 2:00 PM Joseph Myers wrote:
> > > >
> > > > On Tue, 21 May 2024, Martin Uecker wrote:
On Thu, 23 May 2024, Richard Biener wrote:
> The following avoids splitting store dataref groups during SLP
> discovery but instead forces (eventually single-lane) consecutive
> lane SLP discovery for all lanes of the group, creating VEC_PERM
> SLP nodes merging them so the store
The following avoids splitting store dataref groups during SLP
discovery but instead forces (eventually single-lane) consecutive
lane SLP discovery for all lanes of the group, creating VEC_PERM
SLP nodes merging them so the store will always cover the whole group.
With this for example
int
Forgot a check for an SSA name before trying to replace a PHI arg with
its current definition.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/115197
* tree-loop-distribution.cc (copy_loop_before): Constant PHI
args remain the same.
> _2 = phi_cond_6 ? _1 : 255;
> return _2;
>
> }
>
> -----Original Message-----
> From: Li, Pan2
> Sent: Thursday, May 23, 2024 12:17 PM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com;
> tamar.christ...@arm.com; pins...@gmail.
On Thu, May 23, 2024 at 5:50 AM Peter Damianov wrote:
>
> By default, git has the "autocrlf" """feature""" enabled. This causes the
> files
> to have CRLF line endings when checked out on windows, which in the case of
> configure, causes confusing errors like:
>
> ./gcc/configure: line 14:
On Wed, May 22, 2024 at 7:07 AM liuhongt wrote:
>
> >> Hard to find a default value satisfying all testcases.
> >> some require loop unroll with 7 insns increment, some don't want loop
> >> unroll w/ 5 insn increment.
> >> The original 2/3 reduction happened to meet all those testcases(or the
>
On Thu, May 23, 2024 at 2:24 AM wrote:
>
> From: Pan Li
>
> There are sorts of match pattern for SAT related cases, there will be
> some duplicated code to check the dest, op_0, op_1 are same tree types.
> Aka ternary tree type matches. Thus, add overloaded types_match func
> do this and
On Wed, May 22, 2024 at 8:53 PM Qing Zhao wrote:
>
>
>
> > On May 22, 2024, at 03:38, Richard Biener
> > wrote:
> >
> > On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote:
> >>
> >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao w
When processing a = X constraint we treat it as *ANYTHING = X
during constraint processing but then end up recording it as
= X anyway, breaking constraint graph building. This is
because we only update the local copy of the LHS and not the constraint
itself.
Bootstrap and regtest running on