[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling

2024-02-26 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325

--- Comment #13 from rguenther at suse dot de  ---
On Tue, 27 Feb 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
> 
> --- Comment #11 from Hongtao Liu  ---
> 
> >Loop body is likely going to simplify further, this is difficult
> >to guess, we just decrease the result by 1/3.  */
> > 
> 
> This is introduced by r0-68074-g91a01f21abfe19
> 
> /* Estimate number of insns of completely unrolled loop.  We assume
> +   that the size of the unrolled loop is decreased in the
> +   following way (the numbers of insns are based on what
> +   estimate_num_insns returns for appropriate statements):
> +
> +   1) exit condition gets removed (2 insns)
> +   2) increment of the control variable gets removed (2 insns)
> +   3) All remaining statements are likely to get simplified
> +  due to constant propagation.  Hard to estimate; just
> +  as a heuristics we decrease the rest by 1/3.
> +
> +   NINSNS is the number of insns in the loop before unrolling.
> +   NUNROLL is the number of times the loop is unrolled.  */
> +
> +static unsigned HOST_WIDE_INT
> +estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns,
> +unsigned HOST_WIDE_INT nunroll)
> +{
> +  HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3;
> +  if (unr_insns <= 0)
> +unr_insns = 1;
> +  unr_insns *= (nunroll + 1);
> +
> +  return unr_insns;
> +}
> 
> And r0-93444-g08f1af2ed022e0 tries to do it more accurately by marking
> likely_eliminated stmts and subtracting them from the total insns, but the
> 2 / 3 factor is still kept.
> 
> +/* Estimate number of insns of completely unrolled loop.
> +   It is (NUNROLL + 1) * size of loop body with taking into account
> +   the fact that in last copy everything after exit conditional
> +   is dead and that some instructions will be eliminated after
> +   peeling.
> 
> -   NINSNS is the number of insns in the loop before unrolling.
> -   NUNROLL is the number of times the loop is unrolled.  */
> +   Loop body is likely going to simplify futher, this is difficult
> +   to guess, we just decrease the result by 1/3.  */
> 
>  static unsigned HOST_WIDE_INT
> -estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns,
> +estimated_unrolled_size (struct loop_size *size,
>  unsigned HOST_WIDE_INT nunroll)
>  {
> -  HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3;
> +  HOST_WIDE_INT unr_insns = ((nunroll)
> +                             * (HOST_WIDE_INT) (size->overall
> +                                                - size->eliminated_by_peeling));
> +  if (!nunroll)
> +    unr_insns = 0;
> +  unr_insns += size->last_iteration - size->last_iteration_eliminated_by_peeling;
> +
> +  unr_insns = unr_insns * 2 / 3;
>    if (unr_insns <= 0)
>      unr_insns = 1;
> -  unr_insns *= (nunroll + 1);
> 
> It looks to me 1 / 3 overestimates the instructions that can be optimised away,
> especially if we've subtracted eliminated_by_peeling

Yes, that 1/3 reduction is a bit odd - you could have the same effect
by increasing the instruction limit by 1/3, but that means it doesn't
really matter, does it?  It would be interesting to see if increasing
the limit by 1/3 and removing the above is neutral on SPEC?

Note this kind of "simplification guessing" is most important for
the 2nd stage, unrolling an outer loop with an unrolled inner loop,
as there are 2nd-level recurrences to be optimized that the "eliminated by
peeling" heuristics do not catch (but value-numbering would).  So another
thing to do would be to not do the 1/3 reduction for innermost loops
but only for loops further up.
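
To make the "same effect via the limit" point concrete, here is a minimal
sketch, not GCC code: both function names are made up for illustration and
the limit is passed in rather than read from a param.  With the integer
division and the clamp to 1 that estimated_unrolled_size uses, scaling the
estimate by 2/3 accepts exactly the raw sizes up to (3 * limit + 2) / 2, so
the matching raised limit is roughly half as much again.

```c
#include <stdbool.h>

/* Decision as made with the heuristic: scale the raw estimate by 2/3,
   clamp to at least 1, then compare against the limit.  */
bool
fits_with_scaled_estimate (long raw_insns, long limit)
{
  long unr_insns = raw_insns * 2 / 3;
  if (unr_insns <= 0)
    unr_insns = 1;
  return unr_insns <= limit;
}

/* The same decision taken on the raw estimate against a raised limit.
   Assumes raw_insns >= 0 and limit >= 1.  */
bool
fits_with_raised_limit (long raw_insns, long limit)
{
  return raw_insns <= (3 * limit + 2) / 2;
}
```

For a limit of 200 both variants accept raw sizes up to 301 and reject 302,
which is why dropping the scaling and raising the limit should make the same
decisions for a single loop; whether that is neutral on SPEC is a separate
question.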

[Bug target/114125] Support vcond_mask_qiqi and friends.

2024-02-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114125

Hongtao Liu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-02-27
 Target||x86_64-*-* i?86-*-*

[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling

2024-02-26 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325

--- Comment #12 from rguenther at suse dot de  ---
On Tue, 27 Feb 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
> 
> --- Comment #10 from Hongtao Liu  ---
> (In reply to Hongtao Liu from comment #9)
> > The original case is a little different from the one in PR.
> But the issue is similar, after cunrolli, GCC failed to vectorize the outer
> loop.
> 
> The interesting thing is that in estimated_unrolled_size the original unr_insns
> is 288, which is bigger than param_max_completely_peeled_insns (200), but
> unr_insns is decreased by 1/3 due to
> 
>Loop body is likely going to simplify further, this is difficult
>to guess, we just decrease the result by 1/3.  */
> 
> In practice, this loop body is not simplified by 1/3 of its instructions.
> 
> Considering that the unroll factor is 16 and unr_insns is large (192), I was
> wondering if we could add some heuristic to avoid complete loop unrolling,
> because usually for such a big loop both the loop and BB vectorizers may
> not perform well.

There were several attempts at making the unroller guess less (that 1/3
reduction) and instead work out what will actually be simplified, to be
able to shrink those numbers.

My favorite (but never implemented) idea was to code-generate
optimistically while running value-numbering on-the-fly on the
code and cost the so simplified unrolled code, stopping when we
reach a limit (and scrapping the code accumulated so far).  While
reasonably "easy" for unrolled code that ends up without branches,
it gets complicated for branches.

My most recent attempt at improving this only tracked what
unrolling estimates as ending up constant.

I think what might be the least controversial thing to do is to split
the instruction limit between the early cunrolli and the late cunroll
passes and lower the one for cunrolli a lot.

[Bug target/114125] New: Support vcond_mask_qiqi and friends.

2024-02-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114125

Bug ID: 114125
   Summary: Support vcond_mask_qiqi and friends.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

Quote from https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646587.html

> On Linux/x86_64,
>
> af66ad89e8169f44db723813662917cf4cbb78fc is the first bad commit
> commit af66ad89e8169f44db723813662917cf4cbb78fc
> Author: Richard Biener 
> Date:   Fri Feb 23 16:06:05 2024 +0100
>
> middle-end/114070 - folding breaking VEC_COND expansion
>
> caused
>
> FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr"

This shows that the x86 backend is missing vcond_mask_qiqi and friends
(for AVX512 mask modes).  Either that or both expand_vec_cond_expr_p
and all the machinery behind it (ISEL pass, lowering) should handle
pure integer mode VEC_COND_EXPR via bit operations.  I think quite some
targets now implement patterns for these variants, whatever their
boolean vector modes are.

One complication with the change, which was

  (simplify
   (op @3 (vec_cond:s @0 @1 @2))
-  (vec_cond @0 (op! @3 @1) (op! @3 @2
+  (if (TREE_CODE_CLASS (op) != tcc_comparison
+   || types_match (type, TREE_TYPE (@1))
+   || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+   (vec_cond @0 (op! @3 @1) (op! @3 @2)

is that expand_vec_cond_expr_p can also handle comparison defined
masks, but whether or not we have this isn't visible here so we
can only check whether vcond_mask expansion would work.

We have optimize_vectors_before_lowering_p but we shouldn't even there
turn supported into not supported ops and as said, what's supported or
not cannot be finally decided (if it's only vcond and not vcond_mask
that is supported).  Also optimize_vectors_before_lowering_p is set
for a short time between vectorization and vector lowering and we
definitely do not want to turn supported vectorizer emitted stmts
into ones that we need to lower.  For GCC 15 we should see to move
vector lowering before vectorization (before loop optimization I'd
say) to close this particular hole (and also reliably ICE when the
vectorizer creates unsupported IL).  We also definitely want to
retire vcond expanders (no target I know of supports single-instruction
compare-and-select).

So short term we either live with this regression (the testcase
verifies we perform constant folding to { 0, 0 }), implement
the four missing patterns (qi, hi, si and di missing value mode
vcond_mask patterns) or see to implement generic code for this.

Given precedent I'd tend towards adding the x86 patterns.

Hongtao, can you handle that?
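
For illustration only, the "via bit operations" alternative for a pure
integer mode VEC_COND_EXPR can be modelled in plain C as below.  The helper
name is hypothetical and this is a model of the semantics for an 8-bit (QI)
mask and 8-bit values, not the ISEL/lowering code or an x86 pattern.

```c
#include <stdint.h>

/* Model of VEC_COND_EXPR <mask, a, b> where MASK, A and B are all 8-bit
   boolean vectors: bit i of the result is bit i of A where MASK has
   bit i set, and bit i of B otherwise.  */
uint8_t
vcond_mask_qi_via_bitops (uint8_t mask, uint8_t a, uint8_t b)
{
  return (uint8_t) ((mask & a) | (~mask & b));
}
```

The same AND/ANDN/OR shape would apply unchanged to the hi, si and di
variants with wider integer types.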

[Bug tree-optimization/88492] SLP optimization generates ugly code

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492

--- Comment #9 from Andrew Pinski  ---
I noticed once I add V4QI and V2HI support to the aarch64 backend, this code
gets even worse.

[Bug tree-optimization/86530] Vectorization failure for a simple loop

2024-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530

--- Comment #8 from Tamar Christina  ---
(In reply to Andrew Pinski from comment #6)
> With my patch for V4QI, we still don't get the best code:
>   vect_perm_even_271 = VEC_PERM_EXPR  4, 6 }>;
>   vect_perm_even_273 = VEC_PERM_EXPR  4, 6 }>;
>   vect_perm_even_275 = VEC_PERM_EXPR  vect_perm_even_273, { 0, 2, 4, 6 }>;
> 
> _275={_264[0], _264[2], _268[0], _268[2]} or
> VEC_PERM<_264, _268, {0, 2, 4, 6}>
> 
> but for some reason we don't reduce it to that perm
> 
> And there are still a lot more PERMs than there should be.

Because this loop is not something that can be fixed by using V4QI (we tried
before).

This loop requires improvements to SCEV and SLP. It's loading 16 sequential
bytes as there's no gap between the p1 and p2 values across iterations.

So this loop should be vectorized with V16QI and widening additions, and I
don't think this is related to the other example.

So I'll take it back, as it requires actual vectorizer work and is part of the
things we're trying to address in GCC 15.

[Bug tree-optimization/86530] Vectorization failure for a simple loop

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530

--- Comment #7 from Andrew Pinski  ---
The whole PERM<0,2,1,3> shows up a few times in many other places too.

[Bug tree-optimization/86530] Vectorization failure for a simple loop

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530

--- Comment #6 from Andrew Pinski  ---
With my patch for V4QI, we still don't get the best code:
  vect_perm_even_271 = VEC_PERM_EXPR ;
  vect_perm_even_273 = VEC_PERM_EXPR ;
  vect_perm_even_275 = VEC_PERM_EXPR ;

_275={_264[0], _264[2], _268[0], _268[2]} or
VEC_PERM<_264, _268, {0, 2, 4, 6}>

but for some reason we don't reduce it to that perm

And there are still a lot more PERMs than there should be.
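
As a reminder of what the single perm would compute, here is a small
scalar model of a two-input VEC_PERM_EXPR on 4-element vectors.  The
function name and the fixed element count are assumptions made for the
sketch; it only models the selector semantics, not the vectorizer.

```c
#include <stdint.h>

enum { NELTS = 4 };

/* Model of VEC_PERM_EXPR <a, b, sel> on 4-element vectors: selector
   values are taken modulo 8; 0..3 pick from A, 4..7 pick from B.  */
void
vec_perm4 (const uint8_t a[NELTS], const uint8_t b[NELTS],
           const unsigned sel[NELTS], uint8_t out[NELTS])
{
  for (int i = 0; i < NELTS; i++)
    {
      unsigned idx = sel[i] % (2 * NELTS);
      out[i] = idx < NELTS ? a[idx] : b[idx - NELTS];
    }
}
```

With the selector { 0, 2, 4, 6 } the result is { a[0], a[2], b[0], b[2] },
which is the form given above for _275.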

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #30 from Jakub Jelinek  ---
Either tree parmdef = ssa_default_def (cfun, parm) is NULL, or has_zero_uses
(parmdef).
Not sure if has_zero_uses will work properly after some bbs are converted from
GIMPLE to RTL, but maybe it will, I think the expansion generally doesn't
gsi_remove statements it expands nor calls update_stmt on them.  One could
always also just compute in generic code at the start of expansion the number
of unused DECL_HIDDEN_STRING_LENGTH PARM_DECLs at the end of the argument list,
save that as a flag in struct function or where and let the backends use it
from there.

[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling

2024-02-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325

--- Comment #11 from Hongtao Liu  ---

>Loop body is likely going to simplify further, this is difficult
>to guess, we just decrease the result by 1/3.  */
> 

This is introduced by r0-68074-g91a01f21abfe19

/* Estimate number of insns of completely unrolled loop.  We assume
+   that the size of the unrolled loop is decreased in the
+   following way (the numbers of insns are based on what
+   estimate_num_insns returns for appropriate statements):
+
+   1) exit condition gets removed (2 insns)
+   2) increment of the control variable gets removed (2 insns)
+   3) All remaining statements are likely to get simplified
+  due to constant propagation.  Hard to estimate; just
+  as a heuristics we decrease the rest by 1/3.
+
+   NINSNS is the number of insns in the loop before unrolling.
+   NUNROLL is the number of times the loop is unrolled.  */
+
+static unsigned HOST_WIDE_INT
+estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns,
+unsigned HOST_WIDE_INT nunroll)
+{
+  HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3;
+  if (unr_insns <= 0)
+unr_insns = 1;
+  unr_insns *= (nunroll + 1);
+
+  return unr_insns;
+}

And r0-93444-g08f1af2ed022e0 tries to do it more accurately by marking
likely_eliminated stmts and subtracting them from the total insns, but the
2 / 3 factor is still kept.

+/* Estimate number of insns of completely unrolled loop.
+   It is (NUNROLL + 1) * size of loop body with taking into account
+   the fact that in last copy everything after exit conditional
+   is dead and that some instructions will be eliminated after
+   peeling.

-   NINSNS is the number of insns in the loop before unrolling.
-   NUNROLL is the number of times the loop is unrolled.  */
+   Loop body is likely going to simplify futher, this is difficult
+   to guess, we just decrease the result by 1/3.  */

 static unsigned HOST_WIDE_INT
-estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns,
+estimated_unrolled_size (struct loop_size *size,
 unsigned HOST_WIDE_INT nunroll)
 {
-  HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3;
+  HOST_WIDE_INT unr_insns = ((nunroll)
+                             * (HOST_WIDE_INT) (size->overall
+                                                - size->eliminated_by_peeling));
+  if (!nunroll)
+    unr_insns = 0;
+  unr_insns += size->last_iteration - size->last_iteration_eliminated_by_peeling;
+
+  unr_insns = unr_insns * 2 / 3;
   if (unr_insns <= 0)
     unr_insns = 1;
-  unr_insns *= (nunroll + 1);

It looks to me like 1 / 3 overestimates the instructions that can be optimised
away, especially if we've already subtracted eliminated_by_peeling.
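
For anyone who wants to play with the numbers, below is a standalone
transcription of the two versions quoted above.  It is a sketch under
assumptions: HOST_WIDE_INT is replaced by long long, the _old/_new suffixes
are mine, and the field types of struct loop_size are guessed (only the
field names come from the diff).

```c
/* Only the field names are taken from the diff; the types are assumed.  */
struct loop_size
{
  int overall;
  int eliminated_by_peeling;
  int last_iteration;
  int last_iteration_eliminated_by_peeling;
};

/* First version: drop 4 insns (exit test, IV increment) from one body,
   scale by 2/3, then multiply by the number of copies.  */
unsigned long long
estimated_unrolled_size_old (unsigned long long ninsns,
                             unsigned long long nunroll)
{
  long long unr_insns = 2 * ((long long) ninsns - 4) / 3;
  if (unr_insns <= 0)
    unr_insns = 1;
  unr_insns *= (nunroll + 1);
  return unr_insns;
}

/* Second version: subtract what peeling is expected to eliminate,
   add the last iteration, then still scale the total by 2/3.  */
unsigned long long
estimated_unrolled_size_new (const struct loop_size *size,
                             unsigned long long nunroll)
{
  long long unr_insns = ((long long) nunroll
                         * (size->overall - size->eliminated_by_peeling));
  if (!nunroll)
    unr_insns = 0;
  unr_insns += (size->last_iteration
                - size->last_iteration_eliminated_by_peeling);

  unr_insns = unr_insns * 2 / 3;
  if (unr_insns <= 0)
    unr_insns = 1;
  return unr_insns;
}
```

Feeding the second version a body of 20 insns with 2 eliminated by peeling
and nunroll of 16 gives 288 before and 192 after the 2/3 scaling, so the
scaling is applied on top of whatever eliminated_by_peeling already removed.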

[Bug tree-optimization/86530] Vectorization failure for a simple loop

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458
   Assignee|tnfchris at gcc dot gnu.org|pinskia at gcc dot gnu.org

--- Comment #5 from Andrew Pinski  ---
Actually I have a patch for this (PR 113458 also) which I will be submitting
for GCC 15.

[Bug ipa/70582] [11/12/13/14 regression] gcc.dg/attr-weakref-1.c FAILs

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70582

--- Comment #21 from GCC Commits  ---
The master branch has been updated by Rainer Orth :

https://gcc.gnu.org/g:8e8eac3dea017eae739eb79d540887bb2cf1dc9f

commit r14-9190-g8e8eac3dea017eae739eb79d540887bb2cf1dc9f
Author: Rainer Orth 
Date:   Tue Feb 27 08:20:25 2024 +0100

testsuite: Fix gcc.dg/attr-weakref-1.c on Solaris/x86 with as [PR70582]

gcc.dg/attr-weakref-1.c FAILs on 32 and 64-bit Solaris/x86 with the
native assembler:

FAIL: gcc.dg/attr-weakref-1.c (test for excess errors)
UNRESOLVED: gcc.dg/attr-weakref-1.c compilation failed to produce
executable

Excess errors:
Assembler: attr-weakref-1.c
"/var/tmp//ccUSaysF.s", line 171 : Multiply defined symbol: "Wv3a"

This is a bug in the native as, which isn't seeing fixes recently.

Since only a single subtest is affected, this patch omits that one.

Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu.

2024-02-24  Rainer Orth  

gcc/testsuite:
PR ipa/70582
* gcc.dg/attr-weakref-1.c (dg-additional-options): Define
SOLARIS_X86_AS as appropriate.
(lv3, Wv3a, pv3a): Wrap in !SOLARIS_X86_AS.
(main): Likewise for chk (pv3a).

[Bug target/110411] ICE on simple memcpy test case when allowing generation of vector pair load/store insns

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110411

--- Comment #6 from GCC Commits  ---
The releases/gcc-12 branch has been updated by jeevitha :

https://gcc.gnu.org/g:e8c1c2b6c220bc3518c11e11af5a8c6ca1cdf7e8

commit r12-10181-ge8c1c2b6c220bc3518c11e11af5a8c6ca1cdf7e8
Author: Jeevitha 
Date:   Thu Aug 31 05:40:18 2023 -0500

rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]

There are no instructions that do traditional AltiVec addresses (i.e.
with the low four bits of the address masked off) for OOmode and XOmode
objects. The solution is to modify the constraints used in the movoo and
movxo pattern to disallow these types of addresses, which assists LRA in
resolving this issue. Furthermore, the mode size 16 check has been
removed in vsx_quad_dform_memory_operand to allow OOmode and XOmode, and
quad_address_p already handles less than size 16.

2023-08-31  Jeevitha Palanisamy  

gcc/
PR target/110411
* config/rs6000/mma.md (define_insn_and_split movoo): Disallow
AltiVec address operands.
(define_insn_and_split movxo): Likewise.
* config/rs6000/predicates.md (vsx_quad_dform_memory_operand):
Remove
redundant mode size check.

gcc/testsuite/
PR target/110411
* gcc.target/powerpc/pr110411-1.c: New testcase.
* gcc.target/powerpc/pr110411-2.c: New testcase.

(cherry picked from commit 9ea1248604d7b65009af32103814332f35bd33e2)

[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling

2024-02-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325

--- Comment #10 from Hongtao Liu  ---
(In reply to Hongtao Liu from comment #9)
> The original case is a little different from the one in PR.
But the issue is similar, after cunrolli, GCC failed to vectorize the outer
loop.

The interesting thing is that in estimated_unrolled_size the original unr_insns
is 288, which is bigger than param_max_completely_peeled_insns (200), but
unr_insns is decreased by 1/3 due to

   Loop body is likely going to simplify further, this is difficult
   to guess, we just decrease the result by 1/3.  */

In practice, this loop body is not simplified by 1/3 of its instructions.

Considering that the unroll factor is 16 and unr_insns is large (192), I was
wondering if we could add some heuristic to avoid complete loop unrolling,
because usually for such a big loop both the loop and BB vectorizers may
not perform well.
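
The arithmetic can be checked with a tiny standalone sketch.  The names
below are illustrative and not GCC's; the constant 200 is just the
param_max_completely_peeled_insns value quoted above.

```c
#include <stdbool.h>

/* Illustrative stand-in for param_max_completely_peeled_insns.  */
enum { MAX_COMPLETELY_PEELED_INSNS = 200 };

/* Apply the "decrease the result by 1/3" step to a raw estimate.  */
long
scaled_unrolled_size (long raw_insns)
{
  long unr_insns = raw_insns * 2 / 3;
  return unr_insns <= 0 ? 1 : unr_insns;
}

/* True if an estimate is within the complete-peeling limit.  */
bool
fits_peel_limit (long insns)
{
  return insns <= MAX_COMPLETELY_PEELED_INSNS;
}
```

The raw estimate of 288 does not fit the limit, while the scaled value
288 * 2 / 3 = 192 does, so the loop is completely unrolled only because of
the 1/3 reduction.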

[Bug sanitizer/113728] libasan uses incorrect prctl prototype

2024-02-26 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113728

Florian Weimer  changed:

   What|Removed |Added

 Resolution|--- |MOVED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Florian Weimer  ---
Then let's close it. We'll get the fix from LLVM if it ever gets implemented.

[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling

2024-02-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325

--- Comment #9 from Hongtao Liu  ---
The original case is a little different from the one in PR.
It comes from ggml

#include <stdint.h>
#include <string.h>

typedef uint16_t ggml_fp16_t;
static float table_f32_f16[1 << 16];

inline static float ggml_lookup_fp16_to_fp32(ggml_fp16_t f) {
    uint16_t s;
    memcpy(&s, &f, sizeof(uint16_t));
    return table_f32_f16[s];
}

typedef struct {
    ggml_fp16_t d;
    ggml_fp16_t m;
    uint8_t qh[4];
    uint8_t qs[32 / 2];
} block_q5_1;

typedef struct {
    float d;
    float s;
    int8_t qs[32];
} block_q8_1;

void ggml_vec_dot_q5_1_q8_1(const int n, float * restrict s, const void *
restrict vx, const void * restrict vy) {
    const int qk = 32;
    const int nb = n / qk;

    const block_q5_1 * restrict x = vx;
    const block_q8_1 * restrict y = vy;

    float sumf = 0.0;

    for (int i = 0; i < nb; i++) {
        uint32_t qh;
        memcpy(&qh, x[i].qh, sizeof(qh));

        int sumi = 0;

        for (int j = 0; j < qk/2; ++j) {
            const uint8_t xh_0 = ((qh >> (j + 0)) << 4) & 0x10;
            const uint8_t xh_1 = ((qh >> (j + 12)) ) & 0x10;

            const int32_t x0 = (x[i].qs[j] & 0xF) | xh_0;
            const int32_t x1 = (x[i].qs[j] >> 4) | xh_1;

            sumi += (x0 * y[i].qs[j]) + (x1 * y[i].qs[j + qk/2]);
        }

        sumf += (ggml_lookup_fp16_to_fp32(x[i].d)*y[i].d)*sumi +
                ggml_lookup_fp16_to_fp32(x[i].m)*y[i].s;
    }

    *s = sumf;
}

[Bug c++/104255] parsing function signature fails when it uses a function parameter outside of an unevaluated context

2024-02-26 Thread barry.revzin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255

Barry Revzin  changed:

   What|Removed |Added

 CC||barry.revzin at gmail dot com

--- Comment #8 from Barry Revzin  ---
(In reply to Patrick Palka from comment #2)
> The error message is obscure, but it seems what GCC has issue with here is
> the use of the function parameter seq2 in the trailing return type occurring
> outside of an unevaluated context.
> 
> I'm not totally sure if the testcase is valid
> (https://eel.is/c++draft/basic.scope.param#note-1 might suggest it's not?),

But we're not using the parameter for its "value" here (which I think means in
the sense of lvalue-to-rvalue conversion... as in reading a parameter of type
int), so I don't think this would be a reason to reject?

[Bug c++/114123] list-initialization with a single element

2024-02-26 Thread yx_liu at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114123

--- Comment #3 from Yaxun Liu  ---
So, since vector has a ctor that accepts initializer list, that ctor is
favored over its copy ctor. With the initializer-list ctor, a is converted to
A(a) first, then {A(a)} is passed to that ctor.

[Bug c++/114124] Rejected use of function parameter as non-type template parameter in trailing-return-type

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114124

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255

--- Comment #2 from Andrew Pinski  ---
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104255#c2 also .

[Bug c++/114124] Rejected use of function parameter as non-type template parameter in trailing-return-type

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114124

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Keywords||rejects-valid
 Ever confirmed|0   |1
   Last reconfirmed||2024-02-27

--- Comment #1 from Andrew Pinski  ---
Reduced to something that is C++11 to show this never worked in GCC:
```
struct Constant {
constexpr operator int() const noexcept { return 3; }
};

template <int N>
struct n { };

constexpr 
auto function(Constant s) -> n<s> {
return {};
}
```

[Bug c++/114124] New: Rejected use of function parameter as non-type template parameter in trailing-return-type

2024-02-26 Thread barry.revzin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114124

Bug ID: 114124
   Summary: Rejected use of function parameter as non-type
template parameter in trailing-return-type
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: barry.revzin at gmail dot com
  Target Milestone: ---

Reduced from StackOverflow:

template <int V>
struct Constant {
    constexpr operator int() const noexcept { return V; }
};


template <typename T, int N>
struct Array { };

auto function(auto s) -> Array<int, s + s> {
    return {};
}

auto const a = function(Constant<3>{});


gcc trunk rejects this example with:

<source>:10:42: error: template argument 2 is invalid
   10 | auto function(auto s) -> Array<int, s + s> {
      |                                          ^
<source>:10:42: error: template argument 2 is invalid
<source>:10:42: error: template argument 2 is invalid
<source>:10:42: error: template argument 2 is invalid
<source>:10:26: error: invalid template-id
   10 | auto function(auto s) -> Array<int, s + s> {
      |                          ^
<source>:10:39: error: use of parameter outside function body before '+' token
   10 | auto function(auto s) -> Array<int, s + s> {
      |                                       ^
<source>:10:42: error: use of parameter outside function body before '>' token
   10 | auto function(auto s) -> Array<int, s + s> {
      |                                          ^
<source>:10:1: error: deduced class type 'Array' in function return type
   10 | auto function(auto s) -> Array<int, s + s> {
      | ^~~~
<source>:8:8: note: 'template<class T, int N> struct Array' declared here
    8 | struct Array { };
      |        ^
<source>:14:16: error: 'function' was not declared in this scope; did you mean 'union'?
   14 | auto const a = function(Constant<3>{});
      |                ^~~~
      |                union

But this exact equivalent formulation of function is accepted:

auto function(auto s) {
    return Array<int, s + s>{};
}

In this case, we're not actually reading the value of s to form the non-type
template argument, so this should be valid.

[Bug c++/114123] list-initialization with a single element

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114123

--- Comment #2 from Andrew Pinski  ---
Also see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83264#c14 where Jason is
pinging the CWG about the interactions here.

[Bug c++/114123] list-initialization with a single element

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114123

--- Comment #1 from Andrew Pinski  ---
I think GCC's behavior here is correct, see PR 83264 .
Specifically https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83264#c11 .

And https://cplusplus.github.io/CWG/issues/1467.html .

[Bug sanitizer/113284] [14 regression] many failures in asan after r14-6946-ge66dc37b299cac

2024-02-26 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113284

--- Comment #9 from Peter Bergner  ---
(In reply to GCC Commits from comment #8)
> The master branch has been updated by Ilya Leoshkevich :

Bill, can you double check our testsuite results and close this if it's now
fixed?

[Bug c++/114123] New: list-initialization with a single element

2024-02-26 Thread yx_liu at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114123

Bug ID: 114123
   Summary: list-initialization with a single element
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yx_liu at hotmail dot com
  Target Milestone: ---

https://godbolt.org/z/fPd4q7zMd

The issue happens with gcc 13.2 and trunk. The following is the code:

```
#include <cstdio>
#include <vector>
using namespace std;

struct A {
    int x;
    A(int x_) : x(x_) {printf("%p : A(int %d)\n", this, x);}
    A(const A& a) { x = a.x; printf("%p : A(const A& %p)\n", this, &a);}
    A(const vector<A>& a) { printf ("%p : vector& %p\n", this, &a);}
};

int main() {
    vector<A> a{1,2};
    vector<A> b{a};

    printf("%ld\n", b.size());
}
```

Based on my understanding of https://cplusplus.github.io/CWG/issues/1467.html,

If T is a class type and the initializer list has a single element of type cv
U, where U is T or a class derived from T, the object is initialized from that
element (by copy-initialization for copy-list-initialization, or by
direct-initialization for direct-list-initialization).

b should be direct-initialized by a, i.e. equivalent to 

vector<A> b(a);

I would expect the copy ctor of vector to be called, and the elements of a
will be copied to b, and b.size() will be 2.

However, A(const vector<A>& a) is called to construct b and b.size() is 1.

Currently clang trunk has the expected behavior and b.size() is 2.

https://godbolt.org/z/dh7d5x81T

A few days ago, when clang tried to implement CWG2137
(https://github.com/llvm/llvm-project/pull/77768), it showed the same behavior
as gcc. However, that PR was reverted and clang went back to the expected
behavior.

[Bug sanitizer/113728] libasan uses incorrect prctl prototype

2024-02-26 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113728

--- Comment #3 from Peter Bergner  ---
(In reply to Florian Weimer from comment #2)
> This has been worked around in glibc. Should we close this issue?

As the bug reporter and given glibc now has a workaround, I think you're fine
to close this if you think there's nothing to be done in GCC.

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #5 from GCC Commits  ---
The releases/gcc-12 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95

commit r12-10180-g23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95
Author: H.J. Lu 
Date:   Sun Feb 25 10:21:04 2024 -0800

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg   %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
 (const_int -64 [0xffc0])) [1
MEM[(const void * *)_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)

[Bug target/114098] _tile_loadconfig doesn't work

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098

--- Comment #4 from GCC Commits  ---
The releases/gcc-13 branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:2b3ecdf4fb13471b69d80583e10c5baedfe84d7c

commit r13-8365-g2b3ecdf4fb13471b69d80583e10c5baedfe84d7c
Author: H.J. Lu 
Date:   Sun Feb 25 10:21:04 2024 -0800

x86: Properly implement AMX-TILE load/store intrinsics

ldtilecfg and sttilecfg take a 512-byte memory block.  With
_tile_loadconfig implemented as

extern __inline void
__attribute__((__gnu_inline__, __always_inline__, __artificial__))
_tile_loadconfig (const void *__config)
{
  __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
}

GCC sees:

(parallel [
  (asm_operands/v ("ldtilecfg   %X0") ("") 0
   [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
 (const_int -64 [0xffc0])) [1
MEM[(const void * *)_data]+0 S8 A128])]
   [(asm_input:DI ("m"))]
   (clobber (reg:CC 17 flags))])

and the memory operand size is 1 byte.  As the result, the rest of 511
bytes is ignored by GCC.  Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.

gcc/ChangeLog:

PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.

gcc/testsuite/ChangeLog:

PR target/114098
* gcc.target/i386/amxtile-4.c: New test.

(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
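
The operand-size point in the commit message can be checked without AMX
hardware.  The following is only a sketch: struct tilecfg_block and its
64-byte size (512 bits, the width of XImode) are assumptions made for
illustration and are not taken from amxtileintrin.h.  An "m" operand covers
exactly the object named by the lvalue, so the pointee type decides how much
memory the compiler considers accessed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the configuration block.  The 64-byte size
   (512 bits, the width of XImode) is an assumption of this sketch.  */
struct tilecfg_block { unsigned char bytes[64]; };

/* Size of the object named by the operand in the old implementation:
   *(const void **) config is a single pointer-sized object.  */
static size_t
old_operand_size (const void *config)
{
  return sizeof (*(const void **) config);
}

/* Size of the object when the operand names the whole block.  */
static size_t
whole_block_operand_size (const void *config)
{
  return sizeof (*(const struct tilecfg_block *) config);
}
```

With the pointer-typed lvalue only sizeof (void *) bytes are described to the
compiler, which matches the S8 memory reference in the RTL dump above.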

[Bug c++/99426] [modules] failed to read compiled module cluster 1186: Bad file data

2024-02-26 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99426

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   Last reconfirmed|2021-03-30 00:00:00 |2024-2-21

--- Comment #6 from Patrick Palka  ---
I've observed a similar error on trunk when compiling our xtreme-header
testcase without -fno-module-lazy:

$ g++ -fmodules-ts -std=c++20
gcc/testsuite/g++.dg/modules/xtreme-header_{a.H,b.C}
In module imported at gcc/testsuite/g++.dg/modules/xtreme-header_b.C:4:1:
./gcc/testsuite/g++.dg/modules/xtreme-header_a.H: At global scope:
./gcc/testsuite/g++.dg/modules/xtreme-header_a.H: error: failed to read
compiled module cluster 5541: Bad file data
./gcc/testsuite/g++.dg/modules/xtreme-header_a.H: note: compiled module file is
‘gcm.cache/,/gcc/testsuite/g++.dg/modules/xtreme-header_a.H.gcm’
In file included from
/scratchpad/gcc-build-prefix/include/c++/14.0.1/string:54,
 from
/scratchpad/gcc-build-prefix/include/c++/14.0.1/bitset:52,
 from gcc/testsuite/g++.dg/modules/xtreme-header.h:9:
/scratchpad/gcc-build-prefix/include/c++/14.0.1/bits/basic_string.h:4249:33:
fatal error: failed to load pendings for ‘std::__cxx11::basic_string’

Although this manifests as a serialization error, it's really a GC issue: a
streamed-in local class from entity_ary (which is not a GC root) gets
prematurely GC'd (since it's only reachable from entity_ary), and later when
resolving a reference to this GC'd local class the tt_entity case of
trees_in::tree_node fails.

To get a small reproducer, setting --param=ggc-min-heapsize/expand=0 does not
seem to be enough; we need to add more collection points:

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2803824d11e..44c205b2529 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27367,6 +27367,7 @@ instantiate_pending_templates (int retries)
{
  tree instantiation = reopen_tinst_level ((*t)->tinst);
  bool complete = false;
+ ggc_collect (GGC_COLLECT_FORCE);

  if (TYPE_P (instantiation))
{

With this patch to force GC after every instantiation, the following small
testcase can reproduce the issue:

$ cat xtreme-header.ii
struct string {
  template <class T> void g(T) {
    struct _Local { };
  }
};

template <class _Tp> void f(_Tp) { }

inline void foo() {
  string s;
  f(s);
  s.g(0);
}

$ cat xtreme-header_a.H
#include "xtreme-header.ii"

$ cat xtreme-header_b.C
#include "xtreme-header.ii"
import "xtreme-header_a.H";

$ g++ -fmodules-ts xtreme-header_{a.H,b.C}
./xtreme-header_a.H: error: failed to read compiled module cluster 4: Bad file
data
./xtreme-header_a.H: note: compiled module file is
‘gcm.cache/,/xtreme-header_a.H.gcm’
In file included from xtreme-header_b.C:1:
xtreme-header.ii:12:6: fatal error: failed to load pendings for ‘::string’

Another way to observe this GC issue is to force a GC right before we clear
entity_ary and check whether any trees within entity_ary have been freed:

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 106af7bdb3e..a6edc9b033a 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -20460,6 +20460,12 @@ fini_modules (cpp_reader *reader, void *cookie, bool has_inits)
   /* No need to lookup modules anymore.  */
   modules_hash = NULL;

+  ggc_collect (GGC_COLLECT_FORCE);
+  if (entity_ary)
+for (binding_slot& t : *entity_ary)
+  if (!t.is_lazy () && (tree)t && TREE_CODE ((tree)t) == 0xa5a5)
+   printf ("XXX\n");
+
   /* Or entity array.  We still need the entity map to find import numbers.  */
   vec_free (entity_ary);
   entity_ary = NULL;

With that the following minimal testcase demonstrates the GC issue:

$ cat xtreme-header.ii
template <class> void _M_construct() { struct A { }; }

$ g++ -fmodules-ts xtreme-header_{a.H,b.C}
XXX
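
The failure mode described above (an object reachable only from a table that
is not a GC root gets freed) can be modelled with a toy mark-and-sweep
collector.  This is a sketch under stated assumptions, not GCC's ggc: the
names pool, toy_collect and side_table are hypothetical, and side_table only
plays the role of entity_ary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { POOL_SIZE = 4 };

struct obj {
  bool live;        /* false once the toy collector has freed the slot */
  bool marked;      /* set during the mark phase */
  struct obj *ref;  /* single outgoing reference, may be NULL */
};

static struct obj pool[POOL_SIZE];

/* Mark everything reachable from ROOTS, then free whatever is unmarked.  */
static void
toy_collect (struct obj **roots, size_t nroots)
{
  for (size_t i = 0; i < POOL_SIZE; i++)
    pool[i].marked = false;
  for (size_t i = 0; i < nroots; i++)
    for (struct obj *o = roots[i]; o && !o->marked; o = o->ref)
      o->marked = true;
  for (size_t i = 0; i < POOL_SIZE; i++)
    if (!pool[i].marked)
      pool[i].live = false;
}

/* Returns true if the object held only by the side table survives a
   collection.  SIDE_TABLE_IS_ROOT selects whether the table is scanned.  */
static bool
side_table_entry_survives (bool side_table_is_root)
{
  for (size_t i = 0; i < POOL_SIZE; i++)
    pool[i] = (struct obj) { .live = true };

  struct obj *side_table[1] = { &pool[0] };  /* plays the role of entity_ary */
  struct obj *roots[2] = { &pool[1], NULL };
  size_t nroots = 1;
  if (side_table_is_root)
    roots[nroots++] = side_table[0];

  toy_collect (roots, nroots);
  return side_table[0]->live;
}
```

In this model the entry is freed whenever the table is not scanned, and a
later lookup through the table would see a dead object, which is the analogue
of the tt_entity failure.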

[Bug middle-end/114087] RISC-V optimization on checking certain bits set ((x & mask) == val)

2024-02-26 Thread andrew at sifive dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114087

Andrew Waterman  changed:

   What|Removed |Added

 CC||andrew at sifive dot com

--- Comment #1 from Andrew Waterman  ---
Note that, in some of these cases, there is a tradeoff.  If this code were
executed in a loop, such that the constant loads could be hoisted, (2a) would
have a shorter critical path than (2c); likewise (3a) vs. (3c).

[Bug debug/111409] Invalid .debug_macro.dwo macro information for split DWARF

2024-02-26 Thread osandov at osandov dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409

Omar Sandoval  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Omar Sandoval  ---
Thanks for the test fix. I believe this is now resolved.

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-26 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #29 from Peter Bergner  ---
(In reply to Jakub Jelinek from comment #28)
> Yes, so it is the backend that told function.cc that there is a parameter
> save area and it should be adding REG_EQUIV notes.  So, the idea would be
> that for the case we talk about (<= 8 normal arguments, then only unused
> DECL_HIDDEN_STRING_LENGTH ones) that the backend would also say that there
> is no parameter save area, basically pretend there are <= 8 arguments.

How can we know there are no uses of the hidden arg(s)?  That backend function
is being called at expand time, so we haven't yet run any RTL dataflow
information to tell us.  Is there some tree attribute for the arg that can tell
us whether it's used or not?  ...or is there some SSA data for that arg that
can show it has no use?  ...and if so, would that still work for -O0 compiles?

[Bug target/114028] [14] RISC-V rv64gcv_zvl256b: miscompile at -O3

2024-02-26 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114028

Patrick O'Neill  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Patrick O'Neill  ---
Resolved.

[Bug target/114122] New: RISC-V: poor code generation in calling convention with vlen > 4096

2024-02-26 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114122

Bug ID: 114122
   Summary: RISC-V: poor code generation in calling convention
with vlen > 4096
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ewlu at rivosinc dot com
  Target Milestone: ---

godbolt: https://godbolt.org/z/9bKPWYn65

For vector sizes with vlen > 4096, we generate more than twice as many insns
as necessary for the function.

ex: (from godbolt link)
v256si_RET1_ARG3:
li  a5,128
vsetvli zero,a5,e32,m1,ta,ma
vle32.v v3,0(a2)
vle32.v v1,0(a1)
vle32.v v2,0(a3)
addi    a1,a1,512 <-- all these addis are unnecessary
addi    a2,a2,512
addi    a3,a3,512
addi    a4,a0,512
vadd.vv v1,v1,v3
vadd.vv v1,v1,v2
vse32.v v1,0(a0) <-- Return value set here, can return now
vle32.v v1,0(a1)
vle32.v v3,0(a2)
vle32.v v2,0(a3)
vadd.vv v1,v1,v3
vadd.vv v1,v1,v2
vse32.v v1,0(a4)
ret

Printing the gimple in the veclower2 (tree) pass shows that optab_handler
cannot find an op which supports vlen 8192, which in turn generates vector
constructors that do not get completely optimized out.

 
  sizes-gimplified asm_written BLK
  size 
  unit-size 
  align:128 warn_if_not_align:0 symtab:1226528480 alias-set 1
canonical-type 0x7efc491b7a80 nunits:256
  pointer_to_this >
  length:2
  val 

  def_stmt _10 = _8 + _9;
  version:10>
  val 

  def_stmt _13 = _11 + _12;
  version:13>>
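
The godbolt source is not reproduced in this report.  A function of roughly
the following shape would match the label in the listing; this is a guess
written with the GNU C vector extensions, and the typedef, the lower-case
function name and the lane_sum helper are all assumptions:

```c
#include <assert.h>

/* 256 ints, i.e. 8192 bits.  The typedef name mirrors the label in the
   assembly listing; the actual testcase source is an assumption.  */
typedef int v256si __attribute__ ((vector_size (256 * sizeof (int))));

/* One vector return value and three vector arguments, summed element-wise.  */
v256si
v256si_ret1_arg3 (v256si a, v256si b, v256si c)
{
  return a + b + c;
}

/* Helper: build three vectors and return one lane of their sum.  */
static int
lane_sum (int x, int y, int z, int lane)
{
  v256si a = { 0 }, b = { 0 }, c = { 0 };
  for (int i = 0; i < 256; i++)
    {
      a[i] = x + i;
      b[i] = y;
      c[i] = z;
    }
  v256si r = v256si_ret1_arg3 (a, b, c);
  return r[lane];
}
```

On a target without a matching vector mode the addition is split into pieces
by vector lowering, which is where the leftover constructors come from.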

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #8 from H.J. Lu  ---
A patch is posted at

https://patchwork.sourceware.org/project/gcc/list/?series=31343

[Bug analyzer/111305] [13/14 Regression] GCC Static Analyzer -Wanalyzer-out-of-bounds FP and ICE problem

2024-02-26 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111305

David Malcolm  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-26
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from David Malcolm  ---
ICE happens with GCC 14
False +ve happens with GCC 13 and 14

[Bug middle-end/113988] during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470

2024-02-26 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988

--- Comment #26 from Zdenek Sojka  ---
Created attachment 57548
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57548&action=edit
testcase failing with -O -m32

This testcase does not need -mavx512f, but -m32 instead:
$ x86_64-pc-linux-gnu-gcc -O -m32 testcase32.c
during GIMPLE pass: bitintlower
testcase32.c: In function 'foo':
testcase32.c:6:1: internal compiler error: in handle_cast, at
gimple-lower-bitint.cc:1559
6 | foo(void)
  | ^~~
0xd784d7 handle_cast
/repo/gcc-trunk/gcc/gimple-lower-bitint.cc:1559
0x2697a04 lower_mergeable_stmt
/repo/gcc-trunk/gcc/gimple-lower-bitint.cc:2525
0x269bdec lower_stmt
/repo/gcc-trunk/gcc/gimple-lower-bitint.cc:5459
0x269ed49 gimple_lower_bitint
/repo/gcc-trunk/gcc/gimple-lower-bitint.cc:6759
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

Jakub Jelinek  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
pass_pre_slp_scalar_cleanup invokes another copy of FRE and I think this goes
wrong in there.
The .USUBC calls emitted by bitintlower1 are:
  _50 = .USUBC (0, _47, _48);
  _61 = .USUBC (0, _60, _51);
  _65 = .USUBC (0, 0, _49);
  _36 = .USUBC (_32, _33, _34);
  _40 = .USUBC (0, 0, _37);
where the first two process even and odd limbs of y from the first
.SUB_OVERFLOW,
the two second operands are initialized with
  _47 = VIEW_CONVERT_EXPR(y)[_45];
and
  _59 = _45 + 1;
  _60 = VIEW_CONVERT_EXPR(y)[_59];
where _45 is an IV going from 0 to 6 in steps of 2 and y has just y[7]
non-zero, all lower limbs zero.  The third .USUBC is the final processing of
the first .SUB_OVERFLOW
and the remaining two are from the second .SUB_OVERFLOW, we can ignore that
now.
Now, in cunroll we can see some jump threading from earlier passes:
  _50 = .USUBC (0, _47, _48);
  _16 = .USUBC (0, _17, _51);
  _74 = .USUBC (0, _73, _51);
  _61 = .USUBC (0, _60, _51);
  _65 = .USUBC (0, 0, _125);
but the _48 vs. _51 last operands clearly identify where it is coming from.
Now the pre-slp fre4 seems to have replaced the second arguments of the 3 calls
with 0s:
  _50 = .USUBC (0, _47, _48);
  _16 = .USUBC (0, 0, _51);
  _74 = .USUBC (0, 0, _51);
  _61 = .USUBC (0, 0, _51);
  _65 = .USUBC (0, 0, _125);
While it would be correct to replace _47 with 0, because _45 iterates over
the 0, 2, 4 and 6 indexes into the array and the array is known to be 0 there
due to
  __builtin_memset (, 0, 56);
that is not the case for VIEW_CONVERT_EXPR(y)[7].
  _16 = .USUBC (0, _17, _51);
is guarded on _45 <= 3 (aka 0 or 2) and so _17 -> 0 replacement is ok.
  _74 = .USUBC (0, _73, _51);
is guarded on _45 == 4 and VIEW_CONVERT_EXPR(y)[5] is also
known to be 0, so _73 -> 0 is ok as well.
But in
  _59 = _45 + 1;
  _60 = VIEW_CONVERT_EXPR(y)[_59];
  _61 = .USUBC (0, _60, _51);
either we don't know anything, in that case we need to load, or we know that
_45 is 6
and _59 is 7 and VIEW_CONVERT_EXPR(y)[7] is _84, not 0.
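
To make the consequence concrete, here is a small C model of the limb-wise
subtraction.  It is a sketch only: usubc models the (a, b, carry-in) operand
order seen in the dumps above, the helper names are hypothetical, and the
loop processes one limb per step rather than the even/odd pairs that
bitintlower emits.

```c
#include <assert.h>
#include <stdint.h>

/* Model of .USUBC (a, b, carry_in): returns a - b - carry_in and stores
   the borrow (0 or 1) in *carry_out.  */
static uint64_t
usubc (uint64_t a, uint64_t b, uint64_t carry_in, uint64_t *carry_out)
{
  uint64_t d = a - b;
  uint64_t r = d - carry_in;
  *carry_out = (uint64_t) ((a < b) | (d < carry_in));
  return r;
}

/* Borrow out of 0 - y, computed limb by limb, least significant first.  */
static uint64_t
borrow_of_zero_minus (const uint64_t y[8])
{
  uint64_t carry = 0;
  for (int i = 0; i < 8; i++)
    (void) usubc (0, y[i], carry, &carry);
  return carry;
}

/* Same computation for a value whose only possibly non-zero limb is y[7].  */
static uint64_t
borrow_with_top_limb (uint64_t top)
{
  uint64_t y[8] = { 0 };
  y[7] = top;
  return borrow_of_zero_minus (y);
}
```

If the load of limb 7 is replaced by 0, the final borrow changes from 1 to 0,
so the overflow predicate flips.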

[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

--- Comment #5 from Jakub Jelinek  ---
And it works even with -O2 -ftree-loop-vectorize -fno-tree-slp-vectorize and
doesn't work with -O2 -fno-tree-loop-vectorize -ftree-slp-vectorize so will
need to look at what SLP vectorization does here.

[Bug c++/114114] [11/12/13/14 Regression] Internal compiler error on function-local conditional noexcept

2024-02-26 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Patrick Palka  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=91378
 CC||ppalka at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #2 from Patrick Palka  ---
Started with r10-2274-gd40e36310722e6

[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

--- Comment #4 from Jakub Jelinek  ---
(In reply to Andrew Pinski from comment #2)
> As an aside, at -O3 we have:
>   _87 = .USUBC (_30, 3, 0);
>   _93 = IMAGPART_EXPR <_87>;
>   _88 = .USUBC (0, 0, _93);
>   _29 = IMAGPART_EXPR <_88>;
>   _187 = .USUBC (0, 0, _29);
>   _217 = IMAGPART_EXPR <_187>;
>   _218 = .USUBC (0, 0, _217);
>   _214 = IMAGPART_EXPR <_218>;
>   _213 = .USUBC (0, 0, _214);
>   _212 = IMAGPART_EXPR <_213>;
>   _200 = .USUBC (0, 0, _212);
>   _34 = IMAGPART_EXPR <_200>;
>   _35 = .USUBC (0, 0, _34);
>   _36 = IMAGPART_EXPR <_35>;
>   _39 = .USUBC (0, 0, _36);
>   _40 = REALPART_EXPR <_39>;
>   _2 = (signed long) _40;
>   _77 = _2 < 0;
> 
> isn't `.USUBC (0, 0, _93);` just Complex<-_93, _93!=0> ? Or did I
> misunderstand USUBC ?

It is, but would that result in better generated code in case there is .USUBC
(_400, _401, _29); later on?
It would be nice to optimize such long series of .USUBC (0, 0, prev) into just
copying of the result of the first one such call.  For similar long series of
.UADDC (0, 0, prev) that would be just the first one and all the rest is 0+0i
because it never carries over anymore.
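
Both identities can be checked with a small C model.  This is a sketch: the
struct limb_result and the function names are hypothetical, with value
standing for the REALPART and carry for the IMAGPART of the result.

```c
#include <assert.h>
#include <stdint.h>

/* value plays the role of REALPART_EXPR, carry of IMAGPART_EXPR.  */
struct limb_result { uint64_t value; uint64_t carry; };

/* Model of .USUBC (a, b, carry_in).  */
static struct limb_result
usubc (uint64_t a, uint64_t b, uint64_t carry_in)
{
  uint64_t d = a - b;
  struct limb_result r = { d - carry_in,
                           (uint64_t) ((a < b) | (d < carry_in)) };
  return r;
}

/* Model of .UADDC (a, b, carry_in).  */
static struct limb_result
uaddc (uint64_t a, uint64_t b, uint64_t carry_in)
{
  uint64_t s = a + b;
  uint64_t v = s + carry_in;
  struct limb_result r = { v, (uint64_t) ((s < a) | (v < s)) };
  return r;
}

/* Carry left after N chained calls of the form usubc (0, 0, previous).  */
static uint64_t
chained_usubc_carry (uint64_t first_carry, int n)
{
  uint64_t carry = first_carry;
  for (int i = 0; i < n; i++)
    carry = usubc (0, 0, carry).carry;
  return carry;
}
```

For a carry of 0 or 1, usubc (0, 0, c) yields (-c, c != 0) and the carry is
passed through unchanged, while uaddc (0, 0, c) yields (c, 0), so every
later link of a .UADDC chain sees a zero carry.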

[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

--- Comment #3 from Jakub Jelinek  ---
Indeed.  And -O2 -fno-tree-vectorize works.

I've changed it to
unsigned a, b, c, d, e;
unsigned _BitInt(256) f;

__attribute__((noipa)) unsigned short
bswap16 (int t)
{
  return __builtin_bswap16 (t);
}

void
foo (unsigned z, unsigned _BitInt(512) y, unsigned *r)
{
  unsigned t = __builtin_sub_overflow_p (0, y << 509, f);
  z *= bswap16 (t);
  d = __builtin_sub_overflow_p (c, 3, (unsigned _BitInt(512)) 0);
  unsigned q = z + c + b;
  unsigned short n = q >> (8 + a);
  *r = b + e + n;
}

int
main ()
{
  unsigned x;
  foo (8, 2, &x);
  if (x != 8)
    __builtin_abort ();
}
and bswap16 is called with 1 with -O2 -fno-tree-vectorize and 0 with -O2, so
the problem is either during the computation of y << 509 (but that is fairly
simple thing out of bitintlower, set highest limb to the lowest limb << 61 and
clear all others),
or during the sub overflow.
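
The shift step can be written out by hand, since 509 = 7 * 64 + 61.  The
following is a sketch that assumes 64-bit limbs stored least significant
first; shl509 and top_limb_after_shl509 are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* y << 509 for a 512-bit value held in eight 64-bit limbs, least
   significant limb first.  Only the low 3 bits of in[0] survive, and they
   land in the top 3 bits of out[7].  */
static void
shl509 (const uint64_t in[8], uint64_t out[8])
{
  for (int i = 0; i < 7; i++)
    out[i] = 0;
  out[7] = in[0] << 61;
}

/* Top limb of (y << 509) for a value whose only non-zero limb is LOW.  */
static uint64_t
top_limb_after_shl509 (uint64_t low)
{
  uint64_t in[8] = { low, 0, 0, 0, 0, 0, 0, 0 };
  uint64_t out[8];
  shl509 (in, out);
  return out[7];
}
```

For y = 2, as passed by main above, the top limb becomes 1 << 62 and all
other limbs are zero.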

[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

--- Comment #2 from Andrew Pinski  ---
As an aside, at -O3 we have:
  _87 = .USUBC (_30, 3, 0);
  _93 = IMAGPART_EXPR <_87>;
  _88 = .USUBC (0, 0, _93);
  _29 = IMAGPART_EXPR <_88>;
  _187 = .USUBC (0, 0, _29);
  _217 = IMAGPART_EXPR <_187>;
  _218 = .USUBC (0, 0, _217);
  _214 = IMAGPART_EXPR <_218>;
  _213 = .USUBC (0, 0, _214);
  _212 = IMAGPART_EXPR <_213>;
  _200 = .USUBC (0, 0, _212);
  _34 = IMAGPART_EXPR <_200>;
  _35 = .USUBC (0, 0, _34);
  _36 = IMAGPART_EXPR <_35>;
  _39 = .USUBC (0, 0, _36);
  _40 = REALPART_EXPR <_39>;
  _2 = (signed long) _40;
  _77 = _2 < 0;

isn't `.USUBC (0, 0, _93);` just Complex<-_93, _93!=0> ? Or did I misunderstand
USUBC ?

[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26

--- Comment #1 from Andrew Pinski  ---
Confirmed. Though I wonder if this could be an issue even without using _BitInt
since bitintlower produces the same IR for -O2 and -O3.

[Bug libstdc++/114118] std::is_floating_point<_Float32> and __is_floating<_Float32> are false in C++20 and older

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114118

--- Comment #2 from Jonathan Wakely  ---
I don't think we want to enable all the  functions etc. because those
aren't expected to be present before C++23. But the types like _Float32 are
already present, and that seems fine. I just think we should be able to detect
them in the library traits.

At the very least we should make __numeric_traits work for them, as that
already works for __int128 and __float128 when those are defined, even when
they're not considered to be an integer or floating-point type respectively.

[Bug tree-optimization/114121] New: wrong code with _BitInt() arithmetics at -O2

2024-02-26 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121

Bug ID: 114121
   Summary: wrong code with _BitInt() arithmetics at -O2
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57547
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57547&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O0 testcase.c && ./a.out
$ x86_64-pc-linux-gnu-gcc -O1 testcase.c && ./a.out
$ x86_64-pc-linux-gnu-gcc -O2 testcase.c && ./a.out
Aborted
$ x86_64-pc-linux-gnu-gcc -O3 testcase.c && ./a.out

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-9167-20240225110837-gd1b241b9506-checking-yes-rtl-df-extra-nobootstrap-pr113988-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-9167-20240225110837-gd1b241b9506-checking-yes-rtl-df-extra-nobootstrap-pr113988-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240225 (experimental) (GCC)

[Bug tree-optimization/114120] New: add reduction with promotion and then truncation poorly vectorized

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114120

Bug ID: 114120
   Summary: add reduction with promotion and then truncation
poorly vectorized
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
Blocks: 53947
  Target Milestone: ---
Target: x86_64

Take:
```
unsigned char f(unsigned char *src)
{
unsigned  sum = 0;
for(int y = 0; y < 8; y++)
{
sum += src[y];
}
return sum;
}
```

On x86_64 we should vectorize to the same as what is done for:
```
unsigned char f0(unsigned char *src)
{
unsigned char sum = 0;
for(int y = 0; y < 8; y++)
{
sum += src[y];
}
return sum;
}
```

But GCC does not as GCC keeps sum in unsigned and the reduction is done in
`unsigned int`.

Note LLVM is able to vectorize this decently.
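
The two forms are interchangeable because truncation to unsigned char
commutes with addition modulo 256.  A quick check, as a sketch: f_wide and
f_narrow are the two functions above under different names, and same_result
is a hypothetical helper.

```c
#include <assert.h>

/* Reduction carried out in unsigned, truncated on return.  */
static unsigned char
f_wide (const unsigned char *src)
{
  unsigned sum = 0;
  for (int y = 0; y < 8; y++)
    sum += src[y];
  return sum;
}

/* Reduction carried out in unsigned char, wrapping at every step.  */
static unsigned char
f_narrow (const unsigned char *src)
{
  unsigned char sum = 0;
  for (int y = 0; y < 8; y++)
    sum += src[y];
  return sum;
}

/* Returns 1 if both forms agree on eight elements derived from FILL.  */
static int
same_result (unsigned char fill)
{
  unsigned char src[8];
  for (int y = 0; y < 8; y++)
    src[y] = (unsigned char) (fill + 31 * y);
  return f_wide (src) == f_narrow (src);
}
```

Because the results agree for every input, the reduction can be narrowed to
unsigned char before vectorizing.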


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/114119] New: add reduction promotion from unsigned char to unsigned not vectorized

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114119

Bug ID: 114119
   Summary: add reduction promotion from unsigned char to unsigned
not vectorized
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

Take:
```
unsigned  f(unsigned char *src)
{
unsigned sum = 0;
for(int y = 0; y < 8; y++)
{
sum += src[y];
}
return sum;
}
```

This is not vectorized for aarch64 but it is for x86_64.

[Bug analyzer/105898] RFE: -fanalyzer should complain about overlapping args to mempcpy, wmemcpy, and wmempcpy

2024-02-26 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105898

Eric Gallager  changed:

   What|Removed |Added

Summary|RFE: -fanalyzer should  |RFE: -fanalyzer should
   |complain about overlapping  |complain about overlapping
   |args to memcpy and mempcpy  |args to mempcpy, wmemcpy,
   ||and wmempcpy

--- Comment #5 from Eric Gallager  ---
(In reply to David Malcolm from comment #4)
> I implemented this a different way, for memcpy, in r14-3556-g034d99e81484fb
> (by special-casing it).
> 
> We don't yet check mempcpy, wmemcpy, or wmempcpy; keeping bug open to handle
> those.

Retitling.

[Bug libstdc++/114118] std::is_floating_point<_Float32> and __is_floating<_Float32> are false in C++20 and older

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114118

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
The reason was mainly that without -std=c++23, most of the library support just
isn't there.  The f16/f32/f64/f128 etc. literal suffixes will result in
pedwarns, __STDCPP_FLOAT*_T__ isn't defined, <stdfloat> is a C++23 header, etc.
Most of the library changes were guarded with __STDCPP_FLOAT*_T__ macros.
If you think it is worth enabling it for C++20 or older as well and such
changes wouldn't cause problems for valid C++20 or 17 etc. code not using the
types, then all that (perhaps except for bfloat16_t stuff?) would need to start
using __FLT*_MANT_DIG__ and similar macros instead.  But then we also run into
a problem that I think clang++ predefines those even when it doesn't implement
the C++23 paper.

[Bug fortran/114012] overloaded unary operator called twice

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114012

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:2f71e801ad0bb1f620334aadbd7c99cc4efe6309

commit r14-9186-g2f71e801ad0bb1f620334aadbd7c99cc4efe6309
Author: Harald Anlauf 
Date:   Sun Feb 25 21:18:23 2024 +0100

Fortran: do not evaluate polymorphic functions twice in assignment
[PR114012]

PR fortran/114012

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Evaluate non-trivial
arguments just once before assigning to an unlimited polymorphic
dummy variable.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr114012.f90: New test.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

H.J. Lu  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|hjl.tools at gmail dot com |unassigned at gcc dot gnu.org

--- Comment #6 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #5)
> Yeah.  Not to mention, one can call backtrace even if -g0; you just don't
> get nice names for the addresses.  Without the patch you get crashes in the
> unwinder when doing backtrace.

Should we generate REG_CFA_UNDEFINED for unsaved callee-saved registers to
help the unwinder?

https://patchwork.sourceware.org/project/gcc/list/?series=30327

[Bug libstdc++/114118] New: std::is_floating_point<_Float32> and __is_floating<_Float32> are false in C++20 and older

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114118

Bug ID: 114118
   Summary: std::is_floating_point<_Float32> and
__is_floating<_Float32> are false in C++20 and older
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

Since GCC 13 we define _Float32 etc. as distinct types, but the library only
considers them to be floating-point types for C++23 and later, when <stdfloat>
declares the aliases std::float32_t etc.

This means that the proposed solution for PR 114018 only works in C++23:

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
  // 3790. P1467 accidentally changed nexttoward's signature
  template<typename _Tp>
typename __gnu_cxx::__enable_if<__is_floating<_Tp>::__value, _Tp>::__type
nexttoward(_Tp, long double) = delete; // not defined for extended FP types

For C++20 std::nexttoward(_Float32(0), 0.0L) compiles and selects the float
overload.

To consistently delete them we would need to do:

#if __FLT32_DIG__
  void nexttoward(_Float32, long double) = delete;
#endif

We should probably just make __is_floating<_Float32> true for all -std modes.

And also define __gnu_cxx::__numeric_traits<_Float32>.

[Bug rtl-optimization/113617] [14 Regression] Symbol ... referenced in section `.data.rel.ro.local' of ...: defined in discarded section ... since r14-4944

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113617

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Jakub Jelinek  ---
Fixed.

[Bug c/114112] Error message is translatable but inserts untranslated substring

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114112

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.
An enum should be used instead and then N_( should be used around it.
Like what is done for format_specifier_kind  in c-family/c-format.cc:
```
/* Enum describing the kind of specifiers present in the format and
   requiring an argument.  */
enum format_specifier_kind {
  CF_KIND_FORMAT,
  CF_KIND_FIELD_WIDTH,
  CF_KIND_FIELD_PRECISION
};

static const char *kind_descriptions[] = {
  N_("format"),
  N_("field width specifier"),
  N_("field precision specifier")
};
```

[Bug rtl-optimization/113617] [14 Regression] Symbol ... referenced in section `.data.rel.ro.local' of ...: defined in discarded section ... since r14-4944

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113617

--- Comment #17 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:1931c40364bb9fb0a7c4b650917e3ac0e88bf6f4

commit r14-9185-g1931c40364bb9fb0a7c4b650917e3ac0e88bf6f4
Author: Jakub Jelinek 
Date:   Mon Feb 26 17:55:07 2024 +0100

varasm: Handle private COMDAT function symbol reference in readonly data
section [PR113617]

If default_elf_select_rtx_section is called to put a reference to some
local symbol defined in a comdat section into memory, which happens more
often since the r14-4944 RA change, linking might fail.
default_elf_select_rtx_section puts such constants into .data.rel.ro.local
etc. sections and if linker chooses comdat sections from some other TU
and discards the one to which a relocation in .data.rel.ro.local remains,
linker diagnoses error.  References to private comdat symbols can only
appear from functions or data objects in the same comdat group, so the
following patch arranges using .data.rel.ro.local.pool. and similar
sections.

2024-02-26  Jakub Jelinek  
H.J. Lu  

PR rtl-optimization/113617
* varasm.cc (default_elf_select_rtx_section): For
references to private symbols in comdat sections
use .data.relro.local.pool., .data.relro.pool.
or .rodata. comdat sections.

* g++.dg/other/pr113617.C: New test.
* g++.dg/other/pr113617.h: New test.
* g++.dg/other/pr113617-aux.cc: New test.

[Bug c++/114114] [11/12/13/14 Regression] Internal compiler error on function-local conditional noexcept

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed, reduced further:
```
template <bool yes_or_no>
constexpr void test() {
constexpr bool is_yes = yes_or_no;
struct S
{
constexpr S() noexcept(is_yes){}
};
S s;
}
int main()
{
test();
}
```

[Bug c++/114114] [11/12/13/14 Regression] Internal compiler error on function-local conditional noexcept

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.5
Summary|Internal compiler error on  |[11/12/13/14 Regression]
   |function-local conditional  |Internal compiler error on
   |noexcept|function-local conditional
   ||noexcept
  Known to fail||10.1.0, 9.3.0, 9.5.0
  Known to work||9.1.0, 9.2.0

[Bug middle-end/114111] [avr] Expensive code instead of conditional branch.

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111

--- Comment #2 from Andrew Pinski  ---
Maybe this is something that could be done during isel to undo what was done in
phiopt ...

[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257

--- Comment #6 from Sam James  ---
Created attachment 57546
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57546&action=edit
gcc 14 test results

$ gcc-13 --version
gcc-13 (Gentoo 13.2.1_p20240210 p13) 13.2.1 20240210
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc-13 -v -E -x c /dev/null -o /dev/null -march=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-13 -v -E -x c /dev/null -o /dev/null -mcpu=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-13 -v -E -x c /dev/null -o /dev/null -march=native -mcpu=native 2>&1 |
grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64
-march=armv8-a+crc+lse+rcpc+rdma+dotprod+aes+sha3+fp16fml+sb+ssbs+i8mm+bf16+flagm+pauth
-dumpbase null

$ gcc-14 --version
gcc-14 (Gentoo 14.0.1_pre20240211-r1 p22) 14.0.1 20240211 (experimental)
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc-14 -v -E -x c /dev/null -o /dev/null -march=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-14 -v -E -x c /dev/null -o /dev/null -mcpu=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-14 -v -E -x c /dev/null -o /dev/null -march=native -mcpu=native 2>&1 |
grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64
-march=armv8-a+flagm+dotprod+rdma+lse+crc+aes+sha3+fp16fml+rcpc+i8mm+bf16+sb+ssbs+pauth
-dumpbase null

Still hosed :(

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #5 from Jakub Jelinek  ---
Yeah.  Not to mention, one can call backtrace even if -g0; you just don't get
nice names for the addresses.  Without the patch you get crashes in the
unwinder when doing backtrace.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #4 from Andrew Pinski  ---
(In reply to H.J. Lu from comment #3)
> (In reply to Jakub Jelinek from comment #2)
> > Created attachment 57545 [details]
> > gcc14-pr114116.patch
> > 
> > This seems to fix it, so far tested just on the small testcase, back to the
> > expected backtrace there.
> 
> Should we check -g? Without -g, I don't think we need to save FP.

NO, the code generated with -g should be the same as without ...

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #3 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #2)
> Created attachment 57545 [details]
> gcc14-pr114116.patch
> 
> This seems to fix it, so far tested just on the small testcase, back to the
> expected backtrace there.

Should we check -g? Without -g, I don't think we need to save FP.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #2 from Jakub Jelinek  ---
Created attachment 57545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57545&action=edit
gcc14-pr114116.patch

This seems to fix it, so far tested just on the small testcase, back to the
expected backtrace there.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #7 from H.J. Lu  ---
Created attachment 57544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57544&action=edit
A patch

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2024-02-26
   Target Milestone|--- |14.0
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

[Bug other/28322] GCC new warnings and compatibility

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28322

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |4.4.0

[Bug c/114117] -Wno-foo handling

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 63499 ***

[Bug c/63499] gcc treats unknown -Wno-xxx options differently than -Wxxx

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63499

Andrew Pinski  changed:

   What|Removed |Added

 CC||pto at linuxbog dot dk

--- Comment #6 from Andrew Pinski  ---
*** Bug 114117 has been marked as a duplicate of this bug. ***

[Bug c/114117] -Wno-foo handling

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
(In reply to Peter Toft from comment #0)
> Can gcc adopt the clang-style of giving a warning if -Wno- is used
> for cases where -W does not exist?

No, the current behavior is 100% intentional and documented.
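
The documented behavior can be seen with a small file. This is only a sketch: the file name `wno.c` and the function `f` are made up for illustration, and the exact wording of GCC's message differs between releases. Per the GCC manual, an unrecognized `-Wno-` option is reported only when some other diagnostic is being produced.

```c
/* wno.c (hypothetical file name)
 *
 *   gcc -c wno.c -Wno-petertoft
 *       silent: no other diagnostic is produced, so the unknown
 *       -Wno- option is not mentioned
 *
 *   gcc -c wno.c -Wall -Wextra -Wno-petertoft
 *       -Wunused-parameter fires for 'unused_param', and GCC then
 *       also points out the unrecognized -Wno-petertoft option
 */
int f(int unused_param)
{
    int x = 1;
    return x;
}
```

The intent, as the manual describes it, is that new `-Wno-` options can be passed to older compilers without breaking the build, while a typo still surfaces as soon as the compiler has something else to report.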

[Bug c/114117] -Wno-foo handling

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #1 from Sam James  ---
See -Wunknown-warning and the part under -Wfatal-errors at
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wfatal-errors.

[Bug c/114117] New: -Wno-foo handling

2024-02-26 Thread pto at linuxbog dot dk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Bug ID: 114117
   Summary: -Wno-foo handling
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pto at linuxbog dot dk
  Target Milestone: ---

I have worked a lot with clang and gcc compilers for many years, with focus on
C and C++.

If we take something really simple

int f()
{
int x = 1;
return x;
}

and compile with Gcc 13.2 - all fine -> see https://godbolt.org/z/Wxqxzzj1G

However if I then add a "-Wno-" pattern e.g. -Wno-comment I still have a clean
compilation -> https://godbolt.org/z/j5Yf5ozqo

Let me then try to ignore an unknown option "-Wno-petertoft" for the same code
then surprisingly gcc is happy - see https://godbolt.org/z/efxGzhcM1

If I try the same with clang 17 then clang returns the expected
warning: unknown warning option '-Wno-petertoft'; did you mean '-Wno-selector'?
[-Wunknown-warning-option]
See https://godbolt.org/z/TvbzWPaPP

When working with large code-bases of different origins, it is quite
challenging to have the silent gcc behaviour that -Wno-say-hello-to-rms-from-me
is silently dropped. The clang behaviour is much more consistent if you ask me.

Can gcc adopt the clang-style of giving a warning if -Wno- is used for
cases where -W does not exist?

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

[Bug c/114042] diagnostics about __builtin_stdc_bit_ceil() mentions __builtin_clzg()

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114042

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jakub Jelinek  ---
Fixed.

[Bug c/114042] diagnostics about __builtin_stdc_bit_ceil() mentions __builtin_clzg()

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114042

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:77576915cfd26e603aba5295dfdac54a5545f5f2

commit r14-9184-g77576915cfd26e603aba5295dfdac54a5545f5f2
Author: Jakub Jelinek 
Date:   Mon Feb 26 16:30:16 2024 +0100

c: Improve some diagnostics for __builtin_stdc_bit_* [PR114042]

The PR complains that for the __builtin_stdc_bit_* "builtins" the
diagnostics doesn't mention the name of the builtin the user used, but
instead __builtin_{clz,ctz,popcount}g instead (which is what the FE
immediately lowers it to).

The following patch repeats the checks from
check_builtin_function_arguments
which are there done on BUILT_IN_{CLZ,CTZ,POPCOUNT}G, such that they
diagnose it with the name of the "builtin" user actually used before it
is gone.

2024-02-26  Jakub Jelinek  

PR c/114042
* c-parser.cc (c_parser_postfix_expression): Diagnose
__builtin_stdc_bit_* argument with ENUMERAL_TYPE or BOOLEAN_TYPE
type or if signed here rather than on the replacement builtins
in check_builtin_function_arguments.

* gcc.dg/builtin-stdc-bit-2.c: Adjust testcase for actual builtin
names rather than names of builtin replacements.

[Bug rtl-optimization/114044] ICE: in expand_fn_using_insn, at internal-fn.cc:208 with _BitInt() and -O -fno-tree-dce

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114044

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Jakub Jelinek  ---
Created attachment 57543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57543&action=edit
gcc14-pr114044.patch

Untested fix on the ifn expansion side.

[Bug rtl-optimization/10837] noreturn attribute causes no sibling calling optimization

2024-02-26 Thread lukas.graetz--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10837

--- Comment #20 from Lukas Grätz  ---
(In reply to Petr Skocik from comment #19)
> IMO(In reply to Xi Ruoyao from comment #16)
>  
> > In practice most _Noreturn functions are abort, exit, ..., i.e. they are
> > only executed one time so optimizing against a cold path does not help much.
> > I don't think it's a good idea to encourage people to construct some fancy
> > code by a recursive _Noreturn function (why not just use a loop?!)  And if
> > you must write such fancy code anyway IMO musttail attribute (PR83324) will
> > be a better solution.
> 
> There's also longjmp, which may not be all that super cold and may be
> executed multiple times. And while yeah, nobody will notice a single call vs
> jmp time save against a process spawn/exit, for a longjmp wrapper, it'll
> make it a few % faster (as would utilizing _Noreturn attributes for better
> register allocation: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097,
> which would also save a bit of codesize too). Taillcalls can also save a bit
> of codesize if the target is near.


Just to emphasize, tail call optimization is not just for speed. It is
essential to avoid waste of stack space. Especially, to avoid potential stack
overflows, it should _not_ be necessary to replace all recursions with loops,
as Xi Ruoyao suggests. Ah, and I also think that recursion in C is not fancy
(anymore), since everyone expects the compiler to do sibcall or similar
optimizations. Noreturn functions are the exception to that. So it would indeed
be consistent to do sibcall optimization for noreturn functions, too!

Personally, I would be satisfied with the new musttail attribute to enforce
tail calls whenever necessary (given that this will be available for C, not only
C++). But speed-wise, musttail might not have the desired effect. It is meant
for preserving stack space.
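
To make the stack-space argument concrete, here is a minimal sketch of a call in tail position next to the loop it is equivalent to. The function names are invented for this illustration, and it shows an ordinary (returning) function rather than the noreturn case this bug is about.

```c
/* Sum 1..n with an accumulator.  The recursive call is the last thing the
   function does, so it is a candidate for sibling-call optimization: the
   compiler may replace the call with a jump and reuse the current frame. */
unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);   /* call in tail position */
}

/* The hand-written loop that the optimized recursion is equivalent to. */
unsigned long sum_to_loop(unsigned long n)
{
    unsigned long acc = 0;
    while (n != 0) {
        acc += n;
        n--;
    }
    return acc;
}
```

Without the optimization, `sum_to` needs one stack frame per level of recursion, so stack use grows with `n`. When the call is turned into a jump, stack use stays constant, just as in `sum_to_loop`.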

---

Following Petr Skocik, I quick-tested on my computer:

= longjmp_wrapper.c =
#include <setjmp.h>

__attribute__((noreturn))
void longjmp_wrapper(jmp_buf env, int val) {
longjmp(env, val);
}

= longjmp_main.c 
#include <limits.h>
#include <setjmp.h>

__attribute__((noreturn))
void longjmp_wrapper(jmp_buf env, int val);

int main(void) {
jmp_buf env;
for (int i = 0; i < INT_MAX; i++) {
if (setjmp(env) == 0) {
longjmp_wrapper(env, 1);
}
}
}
=

After compiling with

$ gcc -O3 -m32 -c -S longjmp_wrapper.c -o longjmp_wrapper.S

I copied and manually modified the generated longjmp_wrapper.S as follows:

9,15c9
<   subl    $20, %esp
<   .cfi_def_cfa_offset 24
<   pushl   28(%esp)
<   .cfi_def_cfa_offset 28
<   pushl   28(%esp)
<   .cfi_def_cfa_offset 32
<   call    longjmp
---
>   jmp longjmp


Then I compiled both versions with longjmp_main.c, again with -m32. Measured
with "time", the sibcall and unmodified version took around 23.5 sec and 24.5
sec on my computer. So around 4 % improvement for 32 bit x86. For 64 bit x86,
both took around 18 secs without noticeable speed difference (perhaps because
both arguments are passed in registers instead of stack by 64 bit calling
conventions).

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #1 from Jakub Jelinek  ---
Maybe introduce TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP or something similar?

[Bug target/66874] RFE: x86_64_fallback_frame_state more robust

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66874

--- Comment #6 from Sam James  ---
Pretty sure my issue is indeed PR114116.

[Bug middle-end/114109] x264 satd vectorization vs LLVM

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

Richard Biener  changed:

   What|Removed |Added

 Blocks||53947
 CC||rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
There's at least one other bug about this (or a similar) pattern.  Note using
-fno-vect-cost-model isn't really recommended.

Might want to relate the various x264 missed-opt bugs.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug middle-end/114111] [avr] Expensive code instead of conditional branch.

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26
 Target||avr

--- Comment #1 from Richard Biener  ---
I think RTL expansion only (if even) considers BRANCH_COST.  I also think
that while we have if () to non-branchy code conversion we don't have the
reverse on GIMPLE so RTL expansion sees code like

  _7 = c_3(D) & 64;
  _1 = _7 != 0;
  _2 = (int) _1;
  _5 = _2 + x_4(D);
  return _5;

and when setcc is available it doesn't consider test & branch.  It would
only effectively do

  if (_7 != 0)
_1 = 1;
  else
_1 = 0;
  _2 = (int) _1;
  _5 = _2 + x_4(D);
  return _5;

so it would probably not help much in practice unless we move the computation
below back into the branch during RTL optimization.

This possibly asks for a better GIMPLE representation, at least for the
purpose of getting good code for AVR.  RTL expansion probably isn't the
best place to fix this.
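
For readers following along, the two shapes discussed above can be written as C source. This is a sketch under assumptions: the original testcase is not quoted in this message, so the function names and the `unsigned char` parameter type are guesses chosen to match the GIMPLE.

```c
/* Branch-free shape, mirroring the GIMPLE seen by RTL expansion:
   compute the comparison result as 0 or 1, then add it. */
int add_flag_setcc(unsigned char c, int x)
{
    int flag = (c & 64) != 0;   /* _1 = _7 != 0;  _2 = (int) _1; */
    return flag + x;            /* _5 = _2 + x;                  */
}

/* Test-and-branch shape, which may be cheaper on a target such as AVR
   where materializing the flag in a register is expensive. */
int add_flag_branch(unsigned char c, int x)
{
    if (c & 64)
        x = x + 1;
    return x;
}
```

Both functions return the same value for every input. The question in this bug is only which of the two shapes the compiler ends up emitting for a given target.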

[Bug tree-optimization/114086] Boolean switches could have a lot better codegen, possibly utilizing bit-vectors

2024-02-26 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086

--- Comment #9 from Andrew Macleod  ---
(In reply to Jakub Jelinek from comment #8)
> Unfortunately doing the ((682 >> x) & 1) to x & 1 optimization in match.pd
> isn't possible, we can only use global ranges there and we need path
> specific range here.
> Can it be done in VRP pass?  Though, I'm afraid I'm quite lost where it
> actually has
> the statement optimizations (rather than mere computing of ranges),
> Aldy/Andrew, any hints?  I mean like what old tree-vrp.c was doing in
> simplify_stmt_using_ranges.

I don't think much has changed there... We still call into all the code in
vr-values.cc to do simplifications.  I think Aldy changed it all to be
contained in class 'simplify_using_ranges'.. but those routines are all still
in vr-values.cc.  tree-vrp calls into the top-level simplify() routine.

  bool fold_stmt (gimple_stmt_iterator *gsi) override
  {
bool ret = m_simplifier.simplify (gsi);
if (!ret)
  ret = ::fold_stmt (gsi, follow_single_use_edges);
return ret;
  }


If that fails, then ranger's fold_stmt() is invoked.  That is merely a
contextual wrapper around a call to gimple-fold::fold_stmt to see if normal
folding can find anything.  Under the covers I believe that invokes match.pd
which, if it was using the current range_query, would get contextual info.



> Guess we could duplicate that in match.pd for the case which can use global
> range or
> doesn't need any range at all.
> I mean
> unsigned int
> foo (int x)
> {
>   return (0xU >> x) & 1;
> }
> 
> unsigned int
> bar (int x)
> {
>   return (0xU >> x) & 1;
> }
> 
> unsigned int
> baz (int x)
> {
>   if (x >= 22) __builtin_unreachable ();
>   return (0x5aU >> x) & 1;
> }
> can be optimized even with global ranges (or the first one with no ranges).
> foo and baz equivalent is x & 1, while bar is (~x) & 1 or (x & 1) ^ 1, dunno
> what is more canonical.
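
The `(682 >> x) & 1` to `x & 1` rewrite mentioned at the top of this comment only holds on a limited range of `x`, which is why range information is needed. The sketch below checks that by brute force; the helper names are made up for this illustration.

```c
/* 682 is binary 1010101010: exactly the odd bit positions 1..9 are set. */
unsigned shifted_bit(unsigned x)
{
    return (682u >> x) & 1u;   /* caller keeps x below the width of unsigned */
}

unsigned low_bit(unsigned x)
{
    return x & 1u;
}

/* Returns 1 when both forms agree for every x in [0, limit), else 0.
   limit must not exceed the bit width of unsigned. */
int forms_agree_below(unsigned limit)
{
    for (unsigned x = 0; x < limit; x++)
        if (shifted_bit(x) != low_bit(x))
            return 0;
    return 1;
}
```

The two forms agree for `x` from 0 through 10 and first differ at `x = 11`, where the shift yields 0 but the low bit of `x` is 1. A simplification pass therefore has to know that `x` stays within that range on the path in question.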

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

Jakub Jelinek  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com
   Priority|P3  |P1
   Target Milestone|--- |14.0

[Bug target/114116] New: [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

Bug ID: 114116
   Summary: [14 Regression] Broken backtraces in bootstrapped
x86_64 gcc
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

The expected ICE on
void
foo (void)
{
  unsigned _BitInt (575) a = 3;
  __builtin_clzg (a);
}
with -fno-tree-dce -O1 (might go away soon when PR114044 is fixed) is from
stage1-gcc/cc1
~/src/gcc/obj88/stage1-gcc/cc1 -quiet pr114044-2.c -fno-tree-dce -O1
during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at
internal-fn.cc:208
5 |   __builtin_clzg (a);
  |   ^~
0x12e77f6 expand_fn_using_insn
../../gcc/internal-fn.cc:208
0x12f6321 expand_direct_optab_fn
../../gcc/internal-fn.cc:3817
0x12fce16 expand_CLZ
../../gcc/internal-fn.def:444
0x12fdb14 expand_internal_call(internal_fn, gcall*)
../../gcc/internal-fn.cc:4913
0x12fdb3f expand_internal_call(gcall*)
../../gcc/internal-fn.cc:4921
0xf39343 expand_call_stmt
../../gcc/cfgexpand.cc:2771
0xf3e2db expand_gimple_stmt_1
../../gcc/cfgexpand.cc:3932
0xf3e99b expand_gimple_stmt
../../gcc/cfgexpand.cc:4077
0xf48362 expand_gimple_basic_block
../../gcc/cfgexpand.cc:6133
0xf4a9d2 execute
../../gcc/cfgexpand.cc:6872
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

but from gcc/cc1

during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at
internal-fn.cc:208
5 |   __builtin_clzg (a);
  |   ^~
0x7d9246 expand_fn_using_insn
../../gcc/internal-fn.cc:208

pr114044-2.c:5:3: internal compiler error: Segmentation fault
0x1554262 crash_signal
../../gcc/toplev.cc:319
0x2b20320 x86_64_fallback_frame_state
./md-unwind-support.h:63
0x2b20320 uw_frame_state_for
../../../libgcc/unwind-dw2.c:1013
0x2b2165d _Unwind_Backtrace
../../../libgcc/unwind.inc:303
0x2acbd69 backtrace_full
../../libbacktrace/backtrace.c:127
0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
../../gcc/diagnostic.cc:781
0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
../../gcc/diagnostic.h:1002
0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
../../gcc/diagnostic.cc:1633
0x2a33543 diagnostic_impl
../../gcc/diagnostic.cc:1767
0x2a33c26 internal_error(char const*, ...)
../../gcc/diagnostic.cc:2225
0xe232c8 fancy_abort(char const*, int, char const*)
../../gcc/diagnostic.cc:2336
0x7d9246 expand_fn_using_insn
../../gcc/internal-fn.cc:208
Segmentation fault (core dumped)

I believe this is caused by the r14-8470 change.
The problem can be also seen when running the cc1 in the debugger.
When a breakpoint is added on fancy_abort (.gdbinit normally does that), the
backtrace still looks sane:
#0  fancy_abort (file=file@entry=0x2bd70fb "../../gcc/internal-fn.cc",
line=line@entry=208, function=function@entry=0x2bd76cf "expand_fn_using_insn")
at ../../gcc/diagnostic.cc:2313
#1  0x007d9247 in expand_fn_using_insn (stmt=,
icode=CODE_FOR_nothing, ninputs=1, noutputs=1) at ../../gcc/internal-fn.cc:208
#2  0x00fcd1d0 in expand_call_stmt (stmt=0x7fffea307000) at
../../gcc/cfgexpand.cc:2771
#3  expand_gimple_stmt_1 (stmt=) at
../../gcc/cfgexpand.cc:3932
#4  expand_gimple_stmt (stmt=) at
../../gcc/cfgexpand.cc:4077
#5  0x00fcdf18 in expand_gimple_basic_block (bb=,
disable_tail_calls=false) at ../../gcc/cfgexpand.cc:6133
#6  0x00fd059f in (anonymous namespace)::pass_expand::execute
(this=, fun=) at ../../gcc/cfgexpand.cc:6872
#7  0x0140bff8 in execute_one_pass (pass=) at ../../gcc/passes.cc:2646
#8  0x0140c890 in execute_pass_list_1 (pass=) at ../../gcc/passes.cc:2755
#9  0x0140c8c9 in execute_pass_list (fn=0x7fffea302000, pass=) at ../../gcc/passes.cc:2766
#10 0x01011a26 in cgraph_node::expand (this=) at ../../gcc/context.h:48
#11 cgraph_node::expand (this=) at
../../gcc/cgraphunit.cc:1798
#12 0x010137fb in expand_all_functions () at
../../gcc/cgraphunit.cc:2028
#13 symbol_table::compile (this=0x7fffea13) at ../../gcc/cgraphunit.cc:2402
#14 0x01015e18 in symbol_table::compile (this=0x7fffea13) at
../../gcc/cgraphunit.cc:2315
#15 symbol_table::finalize_compilation_unit (this=0x7fffea13) at
../../gcc/cgraphunit.cc:2587
#16 0x01554742 in compile_file () at ../../gcc/toplev.cc:476
#17 0x00e281cc in do_compile () at 

[Bug c/114113] bogus -Walloc-zero warning

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114113

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #1 from Richard Biener  ---
I bet we thread the p[n] == 0 case because of the later loop i < n condition.

Consider when p[0] == 0, the code would then call malloc (0).

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #6 from Richard Biener  ---
Maybe we can automatically consider that when handling the ifunc attribute?

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #5 from Andrew Pinski  ---
The obvious workaround is to mark the ifunc_resolver with
no_profile_instrument_function attribute since it is only ever called once and
really does not need to be PGO'ed anyways.
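
A minimal sketch of that workaround, assuming an ELF target with GNU ifunc support (for example Linux with glibc). The names `answer`, `answer_generic` and `resolve_answer` are invented for this illustration. This is not the patch attached to the bug and not xz's code.

```c
/* The implementation that the resolver selects. */
static int answer_generic(void)
{
    return 42;
}

/* ifunc resolver.  It runs while the dynamic loader is still processing
   relocations, so it is kept free of profiling instrumentation. */
__attribute__((no_profile_instrument_function))
static int (*resolve_answer(void))(void)
{
    return answer_generic;
}

/* Calls to answer() are bound to whatever resolve_answer() returned. */
int answer(void) __attribute__((ifunc("resolve_answer")));
```

With the attribute in place, building with `-fprofile-generate` should leave the resolver uninstrumented while the rest of the translation unit is still profiled.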

[Bug tree-optimization/114107] poor vectorization at -O3 when dealing with arrays of different multiplicity, good with -O2

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #14 from Richard Biener  ---
Mine.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #4 from Andrew Pinski  ---
It is the use of TLS inside an ifunc resolver which seems to be causing issues
...

[Bug c++/114104] nodiscard not diagnosed on synthesized operator!=

2024-02-26 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114104

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #4 from Patrick Palka  ---
(In reply to Harald van Dijk from comment #2)
> For similar useless operations, such as f() ^ true;, GCC emits a similar
> warning "warning: value computed is not used [-Wunused-value]". Presumably,
> if that warning were implemented in GCC for ! as well, it should also fire
> for your original x != 0 test?
That sounds plausible.  The relevant code is

gcc/cp/cvt.cc
@@ -1647,11 +1647,6 @@ convert_to_void (tree expr, impl_conv_void implicit,
tsubst_flags_t complain)
  enum tree_code code = TREE_CODE (e);
  enum tree_code_class tclass = TREE_CODE_CLASS (code);
  if (tclass == tcc_comparison
  || tclass == tcc_unary
  || tclass == tcc_binary
  || code == VEC_PERM_EXPR
  || code == VEC_COND_EXPR)
warn_if_unused_value (e, loc);

which doesn't consider boolean operations (TRUTH_NOT_EXPR / TRUTH_AND_EXPR /
TRUTH_OR_EXPR) because their class is tcc_expression.  This is probably just an
oversight (even the C front end warns for !f()).

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Richard Biener  ---
Fixed.

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #16 from Richard Biener  ---
Should be fixed.

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:fb68e2cac1283f731a3a979cb714621afb1ddfcc

commit r14-9182-gfb68e2cac1283f731a3a979cb714621afb1ddfcc
Author: Richard Biener 
Date:   Mon Feb 26 12:27:42 2024 +0100

tree-optimization/114099 - virtual LC PHIs and early exit vect

In some cases exits can lack LC PHI nodes for the virtual operand.
We have to create them when the epilog loop requires them which also
allows us to remove some only halfway correct fixups.  This is the
variant triggering for alternate exits.

PR tree-optimization/114099
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Create and fill in a needed virtual LC PHI for the alternate
exits.  Remove code dealing with that missing.

* gcc.dg/vect/vect-early-break_120-pr114099.c: New testcase.

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068

--- Comment #15 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:8293df8019adfffae3384cb6fb9cb6f496fe8608

commit r14-9181-g8293df8019adfffae3384cb6fb9cb6f496fe8608
Author: Richard Biener 
Date:   Mon Feb 26 11:25:50 2024 +0100

tree-optimization/114068 - missed virtual LC PHI after vect peeling

When we choose the IV exit to be one leading to no virtual use we
fail to have a virtual LC PHI even though we need it for the epilog
entry.  The following makes sure to create it so that later updating
works.

PR tree-optimization/114068
* tree-vect-loop-manip.cc (get_live_virtual_operand_on_edge):
New function.
(slpeel_tree_duplicate_loop_to_edge_cfg): Add a virtual LC PHI
on the main exit if needed.  Remove band-aid for the case
it was missing.

* gcc.dg/vect/vect-early-break_118-pr114068.c: New testcase.
* gcc.dg/vect/vect-early-break_119-pr114068.c: Likewise.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #3 from Sam James  ---
(In reply to Sam James from comment #1)
> One of the xz developers, Jia Tan, has kindly minimised it to not need
> BIND_NOW. I've adapted it a bit to clean up flags and warnings.

(oops, sorry, this one does need it - we were discussing whether we could elide
it but didn't get there yet.)

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #2 from Sam James  ---
The reproducer succeeds for me with Clang 17.0.6, but fails for me with GCC
10..14.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #1 from Sam James  ---
One of the xz developers, Jia Tan, has kindly minimised it to not need
BIND_NOW. I've adapted it a bit to clean up flags and warnings.

I can reproduce it with the following, at least:
```
#!/bin/sh
gcc-14 -O2 -march=znver2 -fvisibility=hidden -fPIC -fprofile-update=atomic
-fprofile-dir=$(pwd) -fprofile-generate=$(pwd) -c test.c -o test.o -Wall
-Wextra
gcc-14 -o libapp.so test.o -shared -Wl,-z,now -fPIC -lgcov
gcc-14 -o app main.c -lgcov -L. -lapp
LD_LIBRARY_PATH=. ./app
```

main.c:
```
#include <stdio.h>

extern int func();

int main(void)
{
printf( "Hello world %p\n", func);

return 0;
}
```

test.c:
```
__attribute__((visibility("default")))
void *foo_ifunc2() __attribute__((ifunc("foo_resolver")));


__attribute__((visibility("default")))
void bar(void)
{
}

static int f3()
{
return 5;
}


__attribute__((visibility("default")))
void (*foo_resolver(void))(void)
{
f3();
return bar;
}


__attribute__((optimize("O0")))
__attribute__((visibility("default")))
int func()
{
foo_ifunc2();
return 0;
}
```

[Bug gcov-profile/114115] New: xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

Bug ID: 114115
   Summary: xz-utils segfaults when built with -fprofile-generate
(bad interaction between IFUNC and binding?)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

This was first reported downstream in Gentoo at https://bugs.gentoo.org/925415.

xz-utils-5.6.0 (it started to use IFUNC recently for crc32) started to
segfault, but only when built with -march=x86-64-v3 & -fprofile-generate.

For convenience, a broken builddir is available at
http://dev.gentoo.org/~sam/bugs/xz/pgo/xz-5.6.0-abi_x86_64.amd64.tar.xz.

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x41b6 in ?? ()
(gdb) bt
#0  0x41b6 in ?? ()
#1  0x7f861b2fcc75 in crc32_resolve () at
/var/tmp/portage/app-arch/xz-utils-5.6.0/work/xz-5.6.0/src/liblzma/check/crc32_fast.c:140
#2  0x7f861b3541e4 in elf_machine_rela (map=,
scope=, reloc=0x7f861b2e05c8, sym=0x7f861b2ddfd8,
version=,
reloc_addr_arg=0x7f861b32ab10 , skip_ifunc=) at ../sysdeps/x86_64/dl-machine.h:314
#3  elf_dynamic_do_Rela (map=0x7f861b343160, scope=,
reladdr=, relsize=, nrelative=,
lazy=,
skip_ifunc=) at
/var/tmp/portage/sys-libs/glibc-2.39-r1/work/glibc-2.39/elf/do-rel.h:147
#4  _dl_relocate_object (l=l@entry=0x7f861b343160, scope=,
reloc_mode=, consider_profiling=,
consider_profiling@entry=0) at dl-reloc.c:301
#5  0x7f861b363d61 in dl_main (phdr=, phnum=,
user_entry=, auxv=) at rtld.c:2311
#6  0x7f861b36059f in _dl_sysdep_start
(start_argptr=start_argptr@entry=0x7ffdeae5bd20,
dl_main=dl_main@entry=0x7f861b362060 )
at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140
#7  0x7f861b361da2 in _dl_start_final (arg=0x7ffdeae5bd20) at rtld.c:494
#8  _dl_start (arg=0x7ffdeae5bd20) at rtld.c:581
#9  0x7f861b360b88 in _start () from /lib64/ld-linux-x86-64.so.2
#10 0x0006 in ?? ()
#11 0x7ffdeae5cfc9 in ?? ()
#12 0x7ffdeae5d021 in ?? ()
#13 0x7ffdeae5d026 in ?? ()
#14 0x7ffdeae5d034 in ?? ()
#15 0x7ffdeae5d03a in ?? ()
#16 0x7ffdeae5d04b in ?? ()
#17 0x in ?? ()
(gdb)
```

```
(gdb) frame 1
#1  0x7f861b2fcc75 in crc32_resolve () at
/var/tmp/portage/app-arch/xz-utils-5.6.0/work/xz-5.6.0/src/liblzma/check/crc32_fast.c:140
140 {
(gdb) list
135 // This resolver is shared between all three dispatch methods. It serves as
136 // the ifunc resolver if ifunc is supported, otherwise it is called as a
137 // regular function by the constructor or first call resolution methods.
138 static crc32_func_type
139 crc32_resolve(void)
140 {
141 return is_arch_extension_supported()
142 ? _arch_optimized : _generic;
143 }
144
(gdb)
```
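
The source comment in the listing mentions a "first call resolution" fallback for targets without ifunc. A minimal sketch of that pattern in portable C follows. It is not xz's code: every name in it (`checksum_func`, `checksum_generic`, `checksum_fast`, `have_fast_path`, `checksum_resolve`, `checksum_dispatch`, `checksum`) is a hypothetical stand-in, and a plain byte sum replaces the real CRC32 routines.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*checksum_func)(const uint8_t *buf, size_t len, uint32_t acc);

/* Hypothetical generic implementation: a plain byte sum, not a real CRC32. */
static uint32_t checksum_generic(const uint8_t *buf, size_t len, uint32_t acc)
{
    for (size_t i = 0; i < len; i++)
        acc += buf[i];
    return acc;
}

/* Hypothetical "optimized" variant; it must produce the same result. */
static uint32_t checksum_fast(const uint8_t *buf, size_t len, uint32_t acc)
{
    while (len-- > 0)
        acc += *buf++;
    return acc;
}

/* Assumption: stands in for a CPU feature probe. It always picks the
   generic path here so the sketch behaves the same on every machine. */
static int have_fast_path(void)
{
    return 0;
}

/* Shared resolver: chooses one implementation and returns its address. */
static checksum_func checksum_resolve(void)
{
    return have_fast_path() ? &checksum_fast : &checksum_generic;
}

static uint32_t checksum_dispatch(const uint8_t *buf, size_t len, uint32_t acc);

/* The pointer starts at the dispatch stub and is overwritten on first use. */
static checksum_func checksum_ptr = &checksum_dispatch;

/* First-call resolution: run the resolver once, then forward the call. */
static uint32_t checksum_dispatch(const uint8_t *buf, size_t len, uint32_t acc)
{
    checksum_ptr = checksum_resolve();
    return checksum_ptr(buf, len, acc);
}

/* Public entry point: always calls through the pointer. */
uint32_t checksum(const uint8_t *buf, size_t len, uint32_t acc)
{
    return checksum_ptr(buf, len, acc);
}
```

Only the first call to `checksum()` goes through `checksum_dispatch`; later calls jump straight to the resolved function. With ifunc the resolver instead runs while the dynamic loader processes relocations, which is where frame #1 of the backtrace above sits.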

[Bug tree-optimization/107855] gcc.dg/vect/vect-ifcvt-18.c FAILs

2024-02-26 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107855

--- Comment #8 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #6 from Xi Ruoyao  ---
> Hmm, the test contains
>
> "/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */"
>
> So it passes on AVX capable native builds, but fails otherwise.

I can reproduce things in a VM now: when it doesn't have avx support,
the test is compiled with -msse2 only and FAILs both for the dump and
execution:

FAIL: gcc.dg/vect/vect-ifcvt-18.c -flto -ffat-lto-objects  scan-tree-dump vect
"vectorized 3 loops"
FAIL: gcc.dg/vect/vect-ifcvt-18.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/vect-ifcvt-18.c execution test
FAIL: gcc.dg/vect/vect-ifcvt-18.c scan-tree-dump vect "vectorized 3 loops"

The test aborts here:

Thread 2 received signal SIGABRT, Aborted.

#0  0xfe26e385 in __lwp_sigqueue () from /lib/libc.so.1
#1  0xfe2660ef in thr_kill () from /lib/libc.so.1
#2  0xfe19db82 in raise () from /lib/libc.so.1
#3  0xfe16b1f4 in abort () from /lib/libc.so.1
#4  0x08050d58 in main ()
at
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:34

and the dump shows

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
note:  === analyze_loop_nest ===
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
note:   === vect_analyze_loop_form ===
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
note:   using as main loop exit: 13 -> 14 [AUX: 0]
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
missed:   not vectorized: unsupported control flow in loop.
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
missed:  bad loop form.
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
missed: couldn't vectorize loop

When I add avx support to the VM, the test PASSes.

It seems the test is missing some requirement here.
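
For checking by hand whether a machine or VM exposes AVX at run time, a small probe along these lines may help. It is only a sketch: it relies on GCC's `__builtin_cpu_supports` builtin on x86 and is not how the testsuite's `avx_runtime` check is implemented. The names `cpu_has_avx` and `report_avx` are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the running CPU reports AVX, else 0.
   On non-x86 targets it conservatively returns 0. */
int cpu_has_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* __builtin_cpu_supports yields a nonzero value when the feature is present. */
    return __builtin_cpu_supports("avx") != 0;
#else
    return 0;
#endif
}

/* Prints whether this machine falls into the AVX or the non-AVX case. */
void report_avx(void)
{
    printf("AVX at run time: %s\n", cpu_has_avx() ? "yes" : "no");
}
```

Calling `report_avx()` from a small `main` inside the VM, before and after enabling AVX, would show which of the two cases a given test run falls into.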

[Bug c++/114114] New: Internal compiler error on function-local conditional noexcept

2024-02-26 Thread yves.bailly at hexagon dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Bug ID: 114114
   Summary: Internal compiler error on function-local conditional
noexcept
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yves.bailly at hexagon dot com
  Target Milestone: ---

Created attachment 57542
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57542&action=edit
Preprocessed file from -save-temps

Tested on:
- Ubuntu 22.04 with distribution's GCC 11.4.0
- Ubuntu 22.04 with "home-build" GCC 13.2.0
- Ubuntu 23.10 with distribution's GCC 13.2.0
- Godbolt's compiler explorer with GCC x86-64 "trunk"

The following code causes an internal compiler error on (1):

--8<-8<-8<-8<-8<-8<-8<-8<---
#include <iostream>

enum class YesNo: bool { Yes, No };
template <typename E>
[[nodiscard]] constexpr bool isYes(const E e) noexcept {
   return e == E::Yes;
}

template<YesNo yes_or_no>
constexpr void test() {
    [[maybe_unused]] constexpr bool is_yes = isYes(yes_or_no); // (1)

    struct S
    {
#if true // (2)
        constexpr S() noexcept(is_yes)
        { std::cout << "boo\n"; }

// The following compiles fine:
#else
        constexpr S() noexcept(yes_or_no == YesNo::Yes)
        { std::cout << "boo ok\n"; }
#endif
    };

    S s;
}

int main()
{
    test<YesNo::Yes>();
}

--8<-8<-8<-8<-8<-8<-8<-8<---

Changing the "true" to "false" on (2) makes the code compile, link and run
fine.

Note: this code is accepted by Clang and MSVC.


Output of gcc -v:
Using built-in specs.
COLLECT_GCC=/home/ybailly/gcc13/bin/g++
COLLECT_LTO_WRAPPER=/home/ybailly/gcc13/libexec/gcc/x86_64-pc-linux-gnu/13.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-13.2.0/configure --prefix=/home/ybailly/gcc13
--enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (GCC) 


Output of "~/gcc13/bin/g++ -o test_gcc.x -std=c++20 test_gcc.cpp":
test_gcc.cpp: In instantiation of ‘constexpr test()::S::S() [with YesNo
yes_or_no = YesNo::Yes]’:
test_gcc.cpp:16:19:   required from ‘constexpr test()::S::S() [with YesNo
yes_or_no = YesNo::Yes]’
test_gcc.cpp:24:5:   required from ‘constexpr void test() [with YesNo yes_or_no
= YesNo::Yes]’
test_gcc.cpp:31:21:   required from here
test_gcc.cpp:11:37: internal compiler error: Segmentation fault
   11 | [[maybe_unused]] constexpr bool is_yes = isYes(yes_or_no); // (1)
  | ^~
0xe013bf crash_signal
../../gcc-13.2.0/gcc/toplev.cc:314
0x7f673d7e851f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x88089e hash_table, tree_node*>
>::hash_entry, false, xcallocator>::find_slot_with_hash(tree_node* const&,
unsigned int, insert_option)
../../gcc-13.2.0/gcc/hash-table.h:1039
0x88089e hash_map, tree_node*>
>::put(tree_node* const&, tree_node* const&)
../../gcc-13.2.0/gcc/hash-map.h:170
0x88089e register_local_specialization(tree_node*, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:1970
0x896009 tsubst_decl
../../gcc-13.2.0/gcc/cp/pt.cc:15446
0x885904 tsubst_copy
../../gcc-13.2.0/gcc/cp/pt.cc:17417
0x886588 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:21676
0x889bed maybe_instantiate_noexcept(tree_node*, int)
../../gcc-13.2.0/gcc/cp/pt.cc:26754
0x88ddb2 regenerate_decl_from_template
../../gcc-13.2.0/gcc/cp/pt.cc:26553
0x88ddb2 instantiate_body
../../gcc-13.2.0/gcc/cp/pt.cc:26865
0x88e678 instantiate_decl(tree_node*, bool, bool)
../../gcc-13.2.0/gcc/cp/pt.cc:27217
0x897f22 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:19397
0x898431 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:18844
0x898431 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:18858
0x898265 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:18844
0x898265 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:19238
0x88de06 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:26930
0x88de06 instantiate_body
../../gcc-13.2.0/gcc/cp/pt.cc:26930
0x88e678 instantiate_decl(tree_node*, bool, bool)
../../gcc-13.2.0/gcc/cp/pt.cc:27217


Preprocessed file attached, greatly reduced by removing <iostream> - the same
error appears without it.

Regards,
