[Bug testsuite/113861] c-c++-common/gomp/pr63328.c and gcc.dg/gomp/pr87895-2.c now XPASS

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113861

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-02-10

[Bug testsuite/113861] New: c-c++-common/gomp/pr63328.c and gcc.dg/gomp/pr87895-2.c now XPASS

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113861

Bug ID: 113861
   Summary: c-c++-common/gomp/pr63328.c and
gcc.dg/gomp/pr87895-2.c now XPASS
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: testsuite-fail
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: pinskia at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

These started to XPASS after r14-6416-gf5fc001a84a7db, which was the fix that
made them pass.

Will submit a patch to un-xfail them tomorrow or the day after.

[Bug c++/103524] [meta-bug] modules issue

2024-02-09 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 103339, which changed state.

Bug 103339 Summary: [modules] ICE in exporting module on use of outside 
specialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103339

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug c++/103339] [modules] ICE in exporting module on use of outside specialization

2024-02-09 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103339

Nathaniel Shead  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||nshead at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #6 from Nathaniel Shead  ---
So fixed.

[Bug c++/103524] [meta-bug] modules issue

2024-02-09 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 103499, which changed state.

Bug 103499 Summary: C++20 modules error: invalid use of non-static member 
function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103499

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c++/103499] C++20 modules error: invalid use of non-static member function

2024-02-09 Thread nshead at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103499

Nathaniel Shead  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
   Target Milestone|--- |14.0
   Assignee|unassigned at gcc dot gnu.org  |nshead at gcc dot gnu.org
 CC||nshead at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #8 from Nathaniel Shead  ---
Fixed in GCC 14.

[Bug c++/113545] ICE in label_matches with constexpr function with switch-statement and converted (nonconstant, cast address) input

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113545

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Hans-Peter Nilsson :

https://gcc.gnu.org/g:48207a5f00d6ae7cb11038e7c17f6858de4a884e

commit r14-8907-g48207a5f00d6ae7cb11038e7c17f6858de4a884e
Author: Hans-Peter Nilsson 
Date:   Mon Jan 22 01:09:03 2024 +0100

c++: testcases for PR113545 (constexpr with switch and passing
non-constexpr parameter)

Test-cases, with constexpr-reinterpret3.C dg-ice:ing the PR c++/113545 bug.

Regarding the request in the comment, a dg-do run when there's an ICE
will cause some CIs to signal an error for the run being "UNRESOLVED"
(compilation failed to produce executable).  Note that dejagnu (1.6.3)
itself doesn't consider this an error.

gcc/testsuite:
PR c++/113545
* g++.dg/cpp1y/constexpr-reinterpret3.C,
g++.dg/cpp1y/constexpr-reinterpret4.C: New tests.

[Bug target/113859] popcount HI can be vectorized for non-SVE

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113859

--- Comment #1 from Andrew Pinski  ---
SI (and DI) can be optimized too.

LLVM produces for int:
ldr d0, [x0]
cnt v0.8b, v0.8b
uaddlp  v0.4h, v0.8b
uaddlp  v0.2s, v0.4h
str d0, [x1]
ret

And for long:
```
ldr q0, [x0]
cnt v0.16b, v0.16b
uaddlp  v0.8h, v0.16b
uaddlp  v0.4s, v0.8h
uaddlp  v0.2d, v0.4s
str q0, [x1]
ret
```

That is for the SLP version:
```
void f(unsigned long *  __restrict b, unsigned long * __restrict d)
{
d[0]  = __builtin_popcountll(b[0]);
d[1]  = __builtin_popcountll(b[1]);
}
```
s/long/int/ in the first case.

Note using SVE is better than the above if it is available and that is part of
PR 113860 though.
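
The `cnt` plus `uaddlp` chain quoted above can be modelled in scalar code, which may help when checking the expected result of one lane. This is only a sketch: the function names are hypothetical, and it models a single 64-bit lane rather than the real vector instructions.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of `cnt` on one byte (hypothetical helper name).
inline unsigned cnt_byte(std::uint8_t x) {
    unsigned n = 0;
    for (; x != 0; x = static_cast<std::uint8_t>(x & (x - 1))) ++n;
    return n;
}

// Models one 64-bit lane: cnt on 8 bytes, then three pairwise widening adds
// (8 bytes -> 4 halfwords -> 2 words -> 1 doubleword), as the uaddlp chain does.
inline std::uint64_t popcount_di_lane(std::uint64_t x) {
    unsigned b[8], h[4], s[2];
    for (int i = 0; i < 8; ++i) b[i] = cnt_byte(static_cast<std::uint8_t>(x >> (8 * i)));
    for (int i = 0; i < 4; ++i) h[i] = b[2 * i] + b[2 * i + 1];
    for (int i = 0; i < 2; ++i) s[i] = h[2 * i] + h[2 * i + 1];
    return static_cast<std::uint64_t>(s[0]) + s[1];
}

// Small self-check of the model.
inline void check_popcount_di_lane() {
    assert(popcount_di_lane(0) == 0);
    assert(popcount_di_lane(0xF0F0F0F0F0F0F0F0ull) == 32);
}
```

For the int case only the first two pairwise stages are needed, which matches the shorter sequence shown for int.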

[Bug target/113860] SVE popcount can be used for 16bit, 32bit and 64bit

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113860

--- Comment #1 from Andrew Pinski  ---
SVE instructions can also be used for V4HI/V8HI/V2SI/V4SI so the SLP vectorizer
can use them.

[Bug target/113860] New: SVE popcount can be used for 16bit, 32bit and 64bit

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113860

Bug ID: 113860
   Summary: SVE popcount can be used for 16bit, 32bit and 64bit
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

Take:
```
void f(unsigned long *  __restrict b, unsigned long * __restrict d)
{
d[0]  = __builtin_popcountll(b[0]);
}

```

Currently with `-march=armv9-a`, GCC produces:
```
ldr d31, [x0]
cnt v31.8b, v31.8b
addv b31, v31.8b
str d31, [x1]
```

But I think we could do:
```
ptrue   p6.b, all
ldr d31, [x0]
cnt z31.d, p6/m, z31.d
str d31, [x1]
```

Instead, especially if this is inside a (non-vectorized) loop, as the p6.b
assignment could be hoisted out of it. Or something similar to that.

Likewise for short (.h) and int (.s).

[Bug modula2/113848] modula2 doesn't build with clang

2024-02-09 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848

Gaius Mulley  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Gaius Mulley  ---
Closing now the patch has been applied.

[Bug modula2/113848] modula2 doesn't build with clang

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:863202684dff775ae4a3e576f77044474384d41f

commit r14-8906-g863202684dff775ae4a3e576f77044474384d41f
Author: Gaius Mulley 
Date:   Sat Feb 10 02:18:54 2024 +

PR modula2/113848 modula2 does not build with clang

Re-write address arithmetic in gm2-libs/SArgs.mod:GetArg
to avoid (void *) computation.  mc treats ADDRESS as (char *)
but does not cast user type (PtrToChar) to (char *) when
performing address arithmetic.

gcc/m2/ChangeLog:

PR modula2/113848
* gm2-libs/SArgs.mod (GetArg): Re-write address arithmetic
to avoid (void *) computation.

Signed-off-by: Gaius Mulley 
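
The idea behind the change above, avoiding arithmetic on `(void *)`, can be sketched as follows. This is not the code from SArgs.mod or the output of mc; `advance_bytes` is a hypothetical helper used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte-wise pointer arithmetic is done on char *,
// since arithmetic directly on void * is a GNU extension that other
// compilers may reject.
inline void* advance_bytes(void* base, std::size_t n) {
    assert(base != nullptr);
    return static_cast<char*>(base) + n;
}
```

Casting to `char *` first keeps the offset in units of bytes while staying within standard pointer arithmetic.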

[Bug target/113859] New: popcount HI can be vectorized for non-SVE

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113859

Bug ID: 113859
   Summary: popcount HI can be vectorized for non-SVE
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

Take:
```
void f(unsigned short *  __restrict b, unsigned short * __restrict d)
{
  for(int i = 0; i < 1024; i++)
d[i]  = __builtin_popcount(b[i]);
}

```

This can be vectorized to:
```
ldr q0, [x9]
cnt v0.16b, v0.16b
uaddlp  v0.8h, v0.16b
str q0, [x9]
```
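
As a rough check of why `cnt` followed by a single `uaddlp` is enough for 16-bit elements, the lane computation can be written in scalar form. This is a sketch with hypothetical names, not what GCC or LLVM generate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One 16-bit lane: `cnt` counts bits per byte, `uaddlp` adds the two byte counts.
inline std::uint16_t popcount_hi_lane(std::uint16_t x) {
    unsigned lo = __builtin_popcount(x & 0xffu);
    unsigned hi = __builtin_popcount(x >> 8);
    return static_cast<std::uint16_t>(lo + hi);
}

// Reference loop with the same shape as f() in the report.
inline void popcount_hi(const std::uint16_t* b, std::uint16_t* d, std::size_t n) {
    assert(b != nullptr && d != nullptr);
    for (std::size_t i = 0; i < n; ++i) d[i] = popcount_hi_lane(b[i]);
}
```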

[Bug modula2/113848] modula2 doesn't build with clang

2024-02-09 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848

Gaius Mulley  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-02-10
 Ever confirmed|0   |1

--- Comment #3 from Gaius Mulley  ---
Confirmed.

[Bug modula2/113848] modula2 doesn't build with clang

2024-02-09 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848

--- Comment #2 from Gaius Mulley  ---
Created attachment 57375
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57375&action=edit
Proposed fix

Many thanks for the bug report and hint.  Here is the proposed patch (currently
being bootstrap tested).

[Bug target/113858] New: `ldr s0/h0/b0` should be used to zero extend into the full register

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113858

Bug ID: 113858
   Summary: `ldr s0/h0/b0` should be used to zero extend into the
full register
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64-*-*

Take a couple more cases:
```
#define vect128 __attribute__((vector_size(16) ))
#define vect64 __attribute__((vector_size(8) ))


vect64  int i64l( int *a)
{
  return (vect64 int){*a, 0};
}
vect64  short s64l( short  *a)
{
  return (vect64 short ){*a, 0};
}

vect128  int i128l( int *a)
{
  return (vect128 int){*a, 0};
}
vect64  unsigned char c64l( unsigned char *a)
{
  return (vect64 unsigned char){*a, 0};
}

```

Currently only i64l is implemented using `ldr s0` but the rest could be
implemented similarly.

Note LLVM is able to optimize this, which is why I filed this separately from PR
113857.
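
The property that makes a zero-extending scalar load sufficient is that every lane above lane 0 is zero. A minimal sketch of that property, assuming the GCC/Clang vector extensions used in the report (the `*_model` and `upper_lanes_zero` names are hypothetical):

```cpp
#include <cassert>

// GCC/Clang vector extension types, spelled as typedefs.
typedef short v4hi __attribute__((vector_size(8)));
typedef int v4si __attribute__((vector_size(16)));

// Same shape as s64l in the report: lane 0 comes from memory, the rest are zero.
inline v4hi s64l_model(const short* a) {
    v4hi r = {*a, 0};
    return r;
}

// Same shape as i128l in the report.
inline v4si i128l_model(const int* a) {
    v4si r = {*a, 0};
    return r;
}

// True when every lane above lane 0 is zero, i.e. when a zero-extending
// scalar load into the vector register would produce the same value.
inline bool upper_lanes_zero(v4hi v) {
    return v[1] == 0 && v[2] == 0 && v[3] == 0;
}

inline void check_models() {
    short s = 7;
    assert(upper_lanes_zero(s64l_model(&s)));
}
```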

[Bug fortran/113845] ice in gfc_get_array_ss

2024-02-09 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113845

--- Comment #5 from Steve Kargl  ---
On Fri, Feb 09, 2024 at 10:06:47PM +, anlauf at gcc dot gnu.org wrote:
> 
> --- Comment #4 from anlauf at gcc dot gnu.org ---
> Created attachment 57374
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57374&action=edit
> Proof-of-concept patch
> 
> The attached - hackish - patch tries to avoid the infinite recursion by
> fixing up the character length especially for the intrinsics ADJUST[LR].
> 
> I'm not entirely happy with this, but could not yet find a better place.
> And in gfc_resolve_adjustl the backend_decl is not yet set.
> 

You're much quicker than I!  I only just identified the
infinite recursion and where it was occurring in the
scalarizer.  I got sidetracked on a whole different issue.

I'm wondering if we need to worry about other actual
arguments.  I note

subroutine test_adjustl(x)
  character(*) :: x(100)
   x = adjustl(x)
  call bar(x)
end subroutine

does not cause problems.

[Bug target/113857] fmov should be used to zero the upper bits of the vector register

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113857

--- Comment #1 from Andrew Pinski  ---
Without f16 extension, s64 could be done as:
```
and x0, x0, 0x
fmov s0, x0
```

Similarly if we are doing a `c64`.

[Bug target/113856] `(vect64 float){1.0f, 0}` code generation could just be `fmov sN, 1.0f`

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113856

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug target/113857] New: fmov should be used to zero the upper bits of the vector register

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113857

Bug ID: 113857
   Summary: fmov should be used to zero the upper bits of the
vector register
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

Take:
```
#define vect128 __attribute__((vector_size(16) ))
#define vect64 __attribute__((vector_size(8) ))

vect64  float f64( float a)
{
  return (vect64 float){a, 0};
}

vect128  float f128( float a)
{
  return (vect128 float){a, 0, 0, 0};
}
```

GCC produces fmov for f64 but does not for f128.


Note we could do the same for:
```
vect64  short s64( short a)
{
  return (vect64 short){a, 0};
}

vect128  int i128( int a)
{
  return (vect128 int){a, 0};
}
```

Well s64 can only be done if f16 extensions are enabled.

[Bug target/113856] New: `(vect64 float){1.0f, 0}` code generation could just be `fmov sN, 1.0f`

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113856

Bug ID: 113856
   Summary: `(vect64 float){1.0f, 0}` code generation could just
be `fmov sN, 1.0f`
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
#define vect64 __attribute__((vector_size(8) ))

vect64  float f1( float a)
{
  return (vect64 float){1.0f, 0};
}
vect64  float f2( float a)
{
  return (vect64 float){1.0f, 1.0f};
}
```

Currently GCC produces:
```
f1:
adrp x0, .LC0
ldr d0, [x0, #:lo12:.LC0]
ret
f2:
fmov v0.2s, 1.0e+0
ret
```


But f1 could be implemented using fmov also.
Like:
```
f1:
fmov s0, 1.0e+0
ret
```

[Bug target/113855] [14 Regression] __gcc_nested_func_ptr_{created,deleted} exports from 32-bit libgcc_s.so.1

2024-02-09 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855

--- Comment #3 from Iain Sandoe  ---
(In reply to Jakub Jelinek from comment #2)
> Guess an ia32 implementation would be even better, shouldn't be that hard.


I have a draft on one of my machines, I'll post it here tomorrow.

[Bug target/113855] [14 Regression] __gcc_nested_func_ptr_{created,deleted} exports from 32-bit libgcc_s.so.1

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855

--- Comment #2 from Jakub Jelinek  ---
Guess an ia32 implementation would be even better, shouldn't be that hard.

[Bug target/113855] [14 Regression] __gcc_nested_func_ptr_{created,deleted} exports from 32-bit libgcc_s.so.1

2024-02-09 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855

--- Comment #1 from Iain Sandoe  ---
I was looking at making an i686 impl.

but we could also do as you suggest for gcc-14.

[Bug target/113855] [14 Regression] __gcc_nested_func_ptr_{created,deleted} exports from 32-bit libgcc_s.so.1

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Priority|P3  |P1
   Keywords||ABI
 CC||iains at gcc dot gnu.org

[Bug target/113855] New: [14 Regression] __gcc_nested_func_ptr_{created,deleted} exports from 32-bit libgcc_s.so.1

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855

Bug ID: 113855
   Summary: [14 Regression]
__gcc_nested_func_ptr_{created,deleted} exports from
32-bit libgcc_s.so.1
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

Seems when I build i686-linux trunk gcc,
__gcc_nested_func_ptr_{created,deleted}@@GCC_14.0.0 is not exported from
libgcc_s.so.1,
while when I build x86_64-linux trunk gcc,
__gcc_nested_func_ptr_{created,deleted}@@GCC_14.0.0
is exported from both 64-bit and 32-bit libgcc.
That looks wrong.
I guess given the trampoline_insns content it can't really work on ia32,
wonder if the whole file shouldn't be wrapped with #ifndef __x86_64__ or
similar for now.

[Bug fortran/113845] ice in gfc_get_array_ss

2024-02-09 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113845

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org

--- Comment #4 from anlauf at gcc dot gnu.org ---
Created attachment 57374
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57374&action=edit
Proof-of-concept patch

The attached - hackish - patch tries to avoid the infinite recursion by
fixing up the character length especially for the intrinsics ADJUST[LR].

I'm not entirely happy with this, but could not yet find a better place.
And in gfc_resolve_adjustl the backend_decl is not yet set.

Suggestions welcome!

[Bug target/113764] [X86] Generates lzcnt when bsr is sufficient

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113764

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
It is far more complicated than this.
When TARGET_LZCNT is on, CLZ_DEFINED_VALUE_AT_ZERO is 2 and already in GIMPLE
opts can use the fact that it has particular behavior on zero argument.
Before my _BitInt changes for clz/ctz etc., there was no way to differentiate
it in GIMPLE except for builtin (which had UB at zero) vs. ifn (which had it
depending on C?Z_DEFINED_VALUE_AT_ZERO).  Now even ifn can be UB at zero
(single argument) or well defined (two).  But still on RTL we have just one
thing, CLZ or CTZ rtxes which honor
C?Z_DEFINED_VALUE_AT_ZERO for the particular mode.
So, I think having at least in one function some lzcnt and some bsr insns
wouldn't be possible.
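
The distinction drawn above, between a count-leading-zeros operation that is undefined at zero and one with a defined value there, can be sketched at source level. `clz32_or` is a hypothetical wrapper, and the sketch assumes `unsigned int` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper: count leading zeros with an explicit result for 0.
// __builtin_clz itself is undefined for a zero argument, so zero is handled
// before the builtin is reached.
inline int clz32_or(std::uint32_t x, int value_at_zero) {
    return x != 0 ? __builtin_clz(x) : value_at_zero;
}

inline void check_clz32_or() {
    assert(clz32_or(0, 32) == 32);  // the result an lzcnt-style definition gives
    assert(clz32_or(1, 32) == 31);
}
```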

[Bug c++/98388] Throwing move-only parameter results in hard error in SFINAE context

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98388

Marek Polacek  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113853

--- Comment #2 from Marek Polacek  ---
Mostly fixed, but bug 113853 remains.

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

Marek Polacek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #10 from Marek Polacek  ---
Fixed for GCC 14.

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:f29f7f86935e29786bf9f976ec99d7639b381b14

commit r14-8904-gf29f7f86935e29786bf9f976ec99d7639b381b14
Author: Marek Polacek 
Date:   Fri Feb 9 12:03:50 2024 -0500

c++: fix ICE with __type_pack_element [PR113834]

Here we crash on this invalid code because we seem to infinitely recurse
and end up with __type_pack_element with an index that doesn't satisfy
tree_fits_shwi_p, which then crashes on tree_to_shwi.

Thanks to Jakub for suggesting a nicer fix than my original one.

PR c++/113834

gcc/cp/ChangeLog:

* semantics.cc (finish_type_pack_element): Perform range checking
before tree_to_shwi.

gcc/testsuite/ChangeLog:

* g++.dg/ext/type_pack_element4.C: New test.

[Bug c++/98388] Throwing move-only parameter results in hard error in SFINAE context

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98388

--- Comment #1 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:3a3e0f1b46a3ad71ebeedc419393e3a36f1ce6db

commit r14-8903-g3a3e0f1b46a3ad71ebeedc419393e3a36f1ce6db
Author: Marek Polacek 
Date:   Tue Feb 6 15:35:16 2024 -0500

c++: make build_throw SFINAE-friendly [PR98388]

Here the problem is that we give hard errors while substituting
template parameters during overload resolution of is_throwable
which has an invalid throw in decltype.

The backtrace shows that fn_type_unification -> instantiate_template
-> tsubst* passes complain=0 as expected, but build_throw doesn't
have a complain parameter.  So let's add one.  Also remove a redundant
local variable which I should have removed in my P2266 patch.
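
A minimal sketch of the kind of check involved, written as a class-template trait rather than the overloaded function used in the actual test. The trait and `NoCopy` are hypothetical and only illustrate a throw-expression inside decltype taking part in SFINAE.

```cpp
#include <type_traits>
#include <utility>

// Primary template: chosen when `throw std::declval<T>()` is not valid.
template <typename T, typename = void>
struct is_throwable : std::false_type {};

// `throw expr` has type void, so this specialization matches the default
// argument whenever the throw-expression is well-formed for T.
template <typename T>
struct is_throwable<T, decltype(throw std::declval<T>())> : std::true_type {};

// Hypothetical type that cannot be copied or moved into an exception object.
struct NoCopy {
    NoCopy() = default;
    NoCopy(const NoCopy&) = delete;
};

static_assert(is_throwable<int>::value, "int can be thrown");
```

With a SFINAE-friendly build_throw, `is_throwable<NoCopy>::value` would be expected to fall back to the primary template instead of producing a hard error. That case is deliberately not asserted here, since the outcome depends on the compiler version.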

There's still one problem for which I opened
.
We need to patch up treat_lvalue_as_rvalue_p and remove the dg-bogus.

Thanks to Patrick for notifying me of this PR.  This doesn't fully fix
113789; there I think I'll have to figure out why a candidate wasn't
discarded from the overload set.

PR c++/98388

gcc/cp/ChangeLog:

* coroutines.cc (coro_rewrite_function_body): Pass
tf_warning_or_error
to build_throw.
(morph_fn_to_coro): Likewise.
* cp-tree.h (build_throw): Adjust.
* except.cc (expand_end_catch_block): Pass tf_warning_or_error to
build_throw.
(build_throw): Add a tsubst_flags_t parameter.  Use it.  Remove
redundant variable.  Guard an inform call.
* parser.cc (cp_parser_throw_expression): Pass tf_warning_or_error
to build_throw.
* pt.cc (tsubst_expr) : Pass complain to
build_throw.

libcc1/ChangeLog:

* libcp1plugin.cc (plugin_build_unary_expr): Pass tf_error to
build_throw.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/sfinae69.C: New test.

[Bug c++/113854] New: the expression 'is_invocable_v ... evaluated to 'false'

2024-02-09 Thread f.heckenbach--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113854

Bug ID: 113854
   Summary: the expression 'is_invocable_v ... evaluated to
'false'
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: f.heckenb...@fh-soft.de
  Target Milestone: ---

The following error output is less than helpful. It contains a long litany of
mostly library internals. The ultimate error is "the expression
'is_invocable_v<_Fn, _Args ...> [with _Fn = main::._anon_116&; _Args =
{std::unique_ptr >&}]' evaluated to 'false'"
which, while technically correct, is still not very helpful.

The actual error is trying to pass a move-only type by value rather than by
reference, but I can't infer this from the messages. So again, unfortunately, I
have to read gcc's message as "something's wrong, but I won't tell you what"
...
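
For reference, a sketch of the by-reference form that avoids the problem. The element type `std::unique_ptr<int>` is an assumption, since the template arguments in the original listing are not recoverable, and `has_non_null` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Assumed element type: std::unique_ptr<int>.
inline bool has_non_null(const std::vector<std::unique_ptr<int>>& v) {
    // Taking each element by const reference avoids copying a move-only type.
    auto it = std::find_if(v.begin(), v.end(),
                           [](const auto& p) { return static_cast<bool>(p); });
    return it != v.end();
}

inline void check_has_non_null() {
    std::vector<std::unique_ptr<int>> v;
    assert(!has_non_null(v));
}
```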

One of the promises of concepts was better error messages. That's obviously not
the case here, compared to the version with plain old std::find_if (below).

% cat test.cpp
#include 
#include 
#include 

int main ()
{
  std::vector > v;
  std::ranges::find_if (v, [] (auto i) { return !!i; });
}
% g++ -std=c++20 test.cpp
test.cpp: In function 'int main()':
test.cpp:8:24: error: no match for call to '(const std::ranges::__find_if_fn)
(std::vector >&, main()::)'
8 |   std::ranges::find_if (v, [] (auto i) { return !!i; });
  |   ~^~~~
In file included from /usr/include/c++/12/ranges:47,
 from test.cpp:3:
/usr/include/c++/12/bits/ranges_util.h:474:7: note: candidate: 'template  requires (input_iterator<_Iter>)
&& (sentinel_for<_Sent, _Iter>) && (indirect_unary_predicate<_Pred,
std::projected<_Iter, _Proj> >) constexpr _Iter
std::ranges::__find_if_fn::operator()(_Iter, _Sent, _Pred, _Proj) const'
  474 |   operator()(_Iter __first, _Sent __last,
  |   ^~~~
/usr/include/c++/12/bits/ranges_util.h:474:7: note:   template argument
deduction/substitution failed:
test.cpp:8:24: note:   candidate expects 4 arguments, 2 provided
8 |   std::ranges::find_if (v, [] (auto i) { return !!i; });
  |   ~^~~~
/usr/include/c++/12/bits/ranges_util.h:487:7: note: candidate: 'template  requires (input_range<_Range>) &&
(indirect_unary_predicate<_Pred,
std::projected)())),
_Proj> >) constexpr std::ranges::borrowed_iterator_t<_Range>
std::ranges::__find_if_fn::operator()(_Range&&, _Pred, _Proj) const'
  487 |   operator()(_Range&& __r, _Pred __pred, _Proj __proj = {}) const
  |   ^~~~
/usr/include/c++/12/bits/ranges_util.h:487:7: note:   template argument
deduction/substitution failed:
/usr/include/c++/12/bits/ranges_util.h:487:7: note: constraints not satisfied
In file included from /usr/include/c++/12/compare:39,
 from /usr/include/c++/12/bits/stl_pair.h:65,
 from /usr/include/c++/12/bits/stl_algobase.h:64,
 from /usr/include/c++/12/vector:60,
 from test.cpp:1:
/usr/include/c++/12/concepts: In substitution of 'template  requires (input_range<_Range>) &&
(indirect_unary_predicate<_Pred,
std::projected)())),
_Proj> >) constexpr std::ranges::borrowed_iterator_t<_Range>
std::ranges::__find_if_fn::operator()(_Range&&, _Pred, _Proj) const [with
_Range = std::vector >&; _Proj = std::identity; _Pred =
main()::]':
test.cpp:8:24:   required from here
/usr/include/c++/12/concepts:336:13:   required for the satisfaction of
'invocable<_Fn, _Args ...>' [with _Fn = main::._anon_116&; _Args =
{std::unique_ptr >&}]
/usr/include/c++/12/concepts:340:13:   required for the satisfaction of
'regular_invocable<_Fn, _Args ...>' [with _Fn = main::._anon_116&; _Args =
{std::unique_ptr >&}]
/usr/include/c++/12/concepts:344:13:   required for the satisfaction of
'predicate<_Fn&, typename std::__detail::__iter_traits_impl::type, std::indirectly_readable_traits::type> >::type::value_type&>' [with _Fn =
main::._anon_116; _Iter =
std::projected<__gnu_cxx::__normal_iterator >*, std::vector >, std::allocator > > > >, std::identity>]
/usr/include/c++/12/bits/iterator_concepts.h:710:13:   required for the
satisfaction of 'indirect_unary_predicate<_Pred, std::projected())), _Proj> >' [with
_Pred = main::._anon_116; _Range = std::vector >, std::allocator > > >&; _Proj = std::identity]
/usr/include/c++/12/concepts:336:25: note: the expression 'is_invocable_v<_Fn,
_Args ...> [with _Fn = main::._anon_116&; _Args = {std::unique_ptr >&}]' evaluated to 'false'
  336 | concept invocable = is_invocable_v<_Fn, _Args...>;
  | ^

% cat test2.cpp
#include 
#include 
#include 

int main ()
{
  std::vector > v;
  std::find_if (v.begin (), v.end (), [] (auto i) { return !!i; });
}
% g++ -std=c++20 test2.cpp
In file 

[Bug libgomp/113698] GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |MOVED

--- Comment #5 from Andrew Pinski  ---
Moved and looks to be fixed

[Bug libgomp/113698] GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance

2024-02-09 Thread kugan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698

--- Comment #4 from kugan at gcc dot gnu.org ---
Thanks for looking into this. The main reason we were seeing the performance
issue turned out to be the glibc malloc issue in
https://sourceware.org/bugzilla/show_bug.cgi?id=30945

[Bug fortran/104908] [11/12/13/14 Regression] incorrect Fortran out-of-bound runtime error.

2024-02-09 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104908

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from anlauf at gcc dot gnu.org ---
Fixed on mainline for gcc-14, and backported to all open branches.

Thanks for the report, and sorry that it took so long.

[Bug fortran/104908] [11/12/13/14 Regression] incorrect Fortran out-of-bound runtime error.

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104908

--- Comment #12 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:07b575d5860dbc8791dbb9d10af9f918f34d7ff0

commit r11-11231-g07b575d5860dbc8791dbb9d10af9f918f34d7ff0
Author: Harald Anlauf 
Date:   Sat Jan 27 17:41:43 2024 +0100

Fortran: fix bounds-checking errors for CLASS array dummies [PR104908]

Commit r11-1235 addressed issues with bounds of unlimited polymorphic array
dummies.  However, using the descriptor from sym->backend_decl does break
the case of CLASS array dummies.  The obvious solution is to restrict the
fix to the unlimited polymorphic case, thus keeping the original descriptor
in the ordinary case.

gcc/fortran/ChangeLog:

PR fortran/104908
* trans-array.c (gfc_conv_array_ref): Restrict use of transformed
descriptor (sym->backend_decl) to the unlimited polymorphic case.

gcc/testsuite/ChangeLog:

PR fortran/104908
* gfortran.dg/pr104908.f90: New test.

(cherry picked from commit ce61de1b8a1bb3a22118e900376f380768f2ba59)

[Bug fortran/104908] [11/12/13/14 Regression] incorrect Fortran out-of-bound runtime error.

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104908

--- Comment #11 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Harald Anlauf:

https://gcc.gnu.org/g:f7b3a82be2f4e9f43524185226c0df686c7b0154

commit r12-10148-gf7b3a82be2f4e9f43524185226c0df686c7b0154
Author: Harald Anlauf 
Date:   Sat Jan 27 17:41:43 2024 +0100

Fortran: fix bounds-checking errors for CLASS array dummies [PR104908]

Commit r11-1235 addressed issues with bounds of unlimited polymorphic array
dummies.  However, using the descriptor from sym->backend_decl does break
the case of CLASS array dummies.  The obvious solution is to restrict the
fix to the unlimited polymorphic case, thus keeping the original descriptor
in the ordinary case.

gcc/fortran/ChangeLog:

PR fortran/104908
* trans-array.cc (gfc_conv_array_ref): Restrict use of transformed
descriptor (sym->backend_decl) to the unlimited polymorphic case.

gcc/testsuite/ChangeLog:

PR fortran/104908
* gfortran.dg/pr104908.f90: New test.

(cherry picked from commit ce61de1b8a1bb3a22118e900376f380768f2ba59)

[Bug libgcc/113850] condition variables timed wait does a lot of spurious wakeups on Win32 threading implementation

2024-02-09 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113850

Eric Botcazou  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-09
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Target Milestone|--- |13.3

--- Comment #2 from Eric Botcazou  ---
Ouch.  Thanks for catching this (possibly auto-completion induced) typo!

[Bug tree-optimization/110422] asm goto vs SRA

2024-02-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.5
  Known to work||11.4.1, 12.3.1, 13.2.1,
   ||14.0

[Bug target/113764] [X86] Generates lzcnt when bsr is sufficient

2024-02-09 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113764

--- Comment #2 from Roger Sayle  ---
Investigating further, the thinking behind GCC's current behaviour can be found
in Agner Fog's instruction tables; on many architectures BSR is much slower
than LZCNT.

Legacy AMD:       BSR=4 cycles,   LZCNT=2 cycles
AMD BOBCAT:       BSR=6 cycles,   LZCNT=5 cycles
AMD JAGUAR:       BSR=4 cycles,   LZCNT=1 cycle
AMD ZEN[1-3]:     BSR=4 cycles,   LZCNT=1 cycle
AMD ZEN4:         BSR=1 cycle,    LZCNT=1 cycle
INTEL:            BSR=3 cycles,   LZCNT=3 cycles
KNIGHTS LANDING:  BSR=11 cycles,  LZCNT=3 cycles

Hence using bsr is only "better" in some (but not all) contexts, and a
reasonable default (for generic tuning) is to ignore BSR when LZCNT is
available, as it's only one extra cycle of latency to perform the XOR.

The correct solution is to add a tuning parameter to the x86 backend, to
control whether it's beneficial to use BSR when LZCNT is available, for example
when optimizing for size with -Os or -Oz.  This is more reasonable now that
current Intel and AMD architectures have the same latency for BSR and LZCNT,
than when LZCNT first appeared (explaining !TARGET_LZCNT in i386.md).
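
To make the "one extra XOR" remark concrete: for a non-zero 32-bit input, the bit index produced by BSR and the count produced by LZCNT are related by lzcnt = 31 ^ bsr. The following is a minimal portable sketch of that relation only. The helper names `bsr32`, `lzcnt32` and `xor_relation_holds` are hypothetical; they model the instructions' results with plain loops and are not what the x86 backend emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of BSR: index of the highest set bit.
// The real instruction leaves its destination undefined for a zero input,
// so this model simply requires a non-zero argument.
inline unsigned bsr32(std::uint32_t x) {
    assert(x != 0);
    unsigned index = 0;
    while (x >>= 1) ++index;
    return index;
}

// Hypothetical model of LZCNT: number of leading zero bits (32 for zero).
inline unsigned lzcnt32(std::uint32_t x) {
    unsigned count = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++count;
    return count;
}

// For non-zero input, converting one result into the other is a single XOR.
inline bool xor_relation_holds(std::uint32_t x) {
    return lzcnt32(x) == (31u ^ bsr32(x));
}
```

Whether the extra XOR matters depends on the surrounding code and on the tuning target, which is the point of the proposed tuning parameter.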

[Bug fortran/113799] gfc_replace_expr: double free detected ?

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113799

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:b3d622d70ba209b63471fc1b0970870046e55745

commit r14-8902-gb3d622d70ba209b63471fc1b0970870046e55745
Author: Harald Anlauf 
Date:   Thu Feb 8 21:51:38 2024 +0100

Fortran: error recovery on arithmetic overflow on unary operations
[PR113799]

PR fortran/113799

gcc/fortran/ChangeLog:

* arith.cc (reduce_unary): Remember any overflow encountered during
reduction of unary arithmetic operations on array constructors and
continue, and return error status, but terminate on serious errors.

gcc/testsuite/ChangeLog:

* gfortran.dg/arithmetic_overflow_2.f90: New test.

[Bug fortran/113846] ice in fold_convert_loc, at fold-const.cc:2757

2024-02-09 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113846

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105371
   Last reconfirmed||2024-02-09

--- Comment #1 from anlauf at gcc dot gnu.org ---
MERGE and CLASS are not yet handled well.

See also pr105371 for scalar versions.

[Bug c++/113853] implicit move in throw in trailing return type

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113853

Marek Polacek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot gnu.org
   Last reconfirmed||2024-02-09
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Keywords||rejects-valid

[Bug c++/113853] New: implicit move in throw in trailing return type

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113853

Bug ID: 113853
   Summary: implicit move in throw in trailing return type
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mpolacek at gcc dot gnu.org
  Target Milestone: ---

Found while working on bug 98388.  This should compile fine in all dialects:

```
// { dg-do compile { target c++11 } }

struct moveonly {
moveonly() = default;
moveonly(moveonly&&) = default;
};

template<typename T>
constexpr auto is_throwable(T t) -> decltype(throw t, true) {
  return true;
}
template<typename T>
constexpr bool is_throwable(...) { return false; }

constexpr bool b = is_throwable<moveonly>(moveonly{});
#if __cplusplus >= 202002L
static_assert (b, "move from the function parameter");
#else
static_assert (!b, "no move from the function parameter");
#endif
```

but it doesn't; see
.

[Bug fortran/113845] ice in gfc_get_array_ss

2024-02-09 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113845

--- Comment #3 from kargl at gcc dot gnu.org ---
(In reply to kargl from comment #2)
> (In reply to kargl from comment #1)
> > Thanks.  Reduced test case.
> > 
> > subroutine test_adjustl(x)
> >   character(*) :: x(100)
> >   call bar(adjustl(x))
> > end subroutine
> 
> Forcing gfc_simplify_adjustl to return NULL fixes this issue.
> Likely, simplification trashes the stack because ubound is
> unknown.

Whoops, after compiling I ran the wrong test code through
gfortran.  This appears to be a red herring. :(

[Bug tree-optimization/110422] asm goto vs SRA

2024-02-09 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

Martin Jambor  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Martin Jambor  ---
Fixed on all open release branches too.

[Bug tree-optimization/110422] asm goto vs SRA

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

--- Comment #8 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Martin Jambor
:

https://gcc.gnu.org/g:010e4a04c4daaf1b0dacf6aa30fcbdaa73eda33c

commit r11-11230-g010e4a04c4daaf1b0dacf6aa30fcbdaa73eda33c
Author: Martin Jambor 
Date:   Fri Feb 9 18:58:43 2024 +0100

sra: Disqualify bases of operands of asm gotos

PR 110422 shows that SRA can ICE assuming there is a single edge
outgoing from a block terminated with an asm goto.  We need that for
BB-terminating statements so that any adjustments they make to the
aggregates can be copied over to their replacements.  Because we can't
have that after ASM gotos, we need to punt.

gcc/ChangeLog:

2024-01-17  Martin Jambor  

PR tree-optimization/110422
* tree-sra.c (scan_function): Disqualify bases of operands of asm
gotos.

gcc/testsuite/ChangeLog:

2024-01-17  Martin Jambor  

PR tree-optimization/110422
* gcc.dg/torture/pr110422.c: New test.

(cherry picked from commit 2b7204c52392c1c0da9c91a5feae0c44018a6f37)

[Bug c++/103524] [meta-bug] modules issue

2024-02-09 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 112580, which changed state.

Bug 112580 Summary: [14 Regression]: g++.dg/modules/xtreme-header-4_b.C et al; ICE tree check: expected class 'type', have 'declaration'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112580

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug c++/112580] [14 Regression]: g++.dg/modules/xtreme-header-4_b.C et al; ICE tree check: expected class 'type', have 'declaration'

2024-02-09 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112580

Patrick Palka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Patrick Palka  ---
The "invalid use of non-static data member" errors when compiling the
xtreme-header tests should be fixed, which should be the last of the
xtreme-header 14 regressions.

[Bug fortran/113845] ice in gfc_get_array_ss

2024-02-09 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113845

--- Comment #2 from kargl at gcc dot gnu.org ---
(In reply to kargl from comment #1)
> Thanks.  Reduced test case.
> 
> subroutine test_adjustl(x)
>   character(*) :: x(100)
>   call bar(adjustl(x))
> end subroutine

Forcing gfc_simplify_adjustl to return NULL fixes this issue.
Likely, simplification trashes the stack because ubound is
unknown.

[Bug c++/112580] [14 Regression]: g++.dg/modules/xtreme-header-4_b.C et al; ICE tree check: expected class 'type', have 'declaration'

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112580

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:f931bd7725f5cea948dd55ac370b5b9fd9a00198

commit r14-8900-gf931bd7725f5cea948dd55ac370b5b9fd9a00198
Author: Patrick Palka 
Date:   Fri Feb 9 12:40:28 2024 -0500

c++/modules: anon union member of as-base class [PR112580]

Here when streaming in the fields of the as-base version of
_Formatting_scanner we end up overwriting ANON_AGGR_TYPE_FIELD
of the anonymous union type, since it turns out this type is shared
between the original FIELD_DECL and the as-base FIELD_DECL copy (copied
during layout_class_type).  ANON_AGGR_TYPE_FIELD first gets properly set
to the original FIELD_DECL when streaming in the canonical definition of
_Formatting_scanner, and then gets overwritten to the as-base
FIELD_DECL when streaming in the as-base definition.  This leads to
lookup_anon_field later giving the wrong answer when resolving the
_M_values use at instantiation time.

This patch makes us avoid overwriting ANON_AGGR_TYPE_FIELD when streaming
in an as-base class definition; it should already be properly set at that
point.

PR c++/112580

gcc/cp/ChangeLog:

* module.cc (trees_in::read_class_def): When streaming in
an anonymous union field of an as-base class, don't overwrite
ANON_AGGR_TYPE_FIELD.

gcc/testsuite/ChangeLog:

* g++.dg/modules/anon-3_a.H: New test.
* g++.dg/modules/anon-3_b.C: New test.

Reviewed-by: Jason Merrill 

[Bug fortran/113845] ice in gfc_get_array_ss

2024-02-09 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113845

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-09
 Ever confirmed|0   |1
 CC||kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
Thanks.  Reduced test case.

subroutine test_adjustl(x)
  character(*) :: x(100)
  call bar(adjustl(x))
end subroutine

subroutine test_adjustr(x)
  character(*) :: x(100)
  call bar(adjustr(x))
end subroutine

On FreeBSD either subroutine causes

% gfcx -c rt.f90
pid 50109 comm f951 has trashed its stack, killing
gfortran: internal compiler error: Illegal instruction signal terminated
program f951
Please submit a full bug report, with preprocessed source (by using
-freport-bug

Note, I have 
% limits | grep stack
  stacksize  524288 kB

[Bug libstdc++/113851] boyer_moore_searcher and boyer_moore_horspool_searcher fail to accept ADL-incompatible element types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113851

--- Comment #2 from Jonathan Wakely  ---
We fail much simpler cases too:

#include <algorithm>

template <class T>
struct holder {
T t;
};

struct incomplete;

int main() {
using validator = holder<incomplete>*;
validator varr[1]{};
(void) std::find(varr, varr + 1, nullptr);
}

[Bug other/109668] 'python' vs. 'python3'

2024-02-09 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109668

palmer at gcc dot gnu.org changed:

   What|Removed |Added

 CC||palmer at gcc dot gnu.org

--- Comment #3 from palmer at gcc dot gnu.org ---
Jan-Benedict Glaw is reporting (via a crosstool-ng bug
) that we've got a
few python2 scripts in the RISC-V port that can just be converted over.  I just
sent along a patch to clean that up.

[Bug libstdc++/113851] boyer_moore_searcher and boyer_moore_horspool_searcher fail to accept ADL-incompatible element types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113851

--- Comment #1 from Jonathan Wakely  ---
Ugh.

The problem is:

 1272 | if (__iter == _M_bad_char.end())

which does ADL for operator==

In general there's no way to fix this, we *must* do ADL for iterator equality
comparisons (and other operators like operator* and operator++) because they
might be named member functions, or they might be built-in operators if the
iterator is a pointer.

In this case we *must* do ADL for operator== because it's a hidden friend so
there's no other way to find it. And if you can't test iterators for equality,
you can't really do anything useful with them (except for output iterators).
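
To illustrate the hidden-friend case, here is a minimal sketch. The `lib::cursor` type and the `at_end` helper are hypothetical and are not taken from libstdc++; they only show why generic code has to rely on ADL for the comparison.

```cpp
#include <cassert>

namespace lib {
// Hypothetical iterator-like type. operator== is a hidden friend: it is
// declared only inside the class, so ordinary (non-ADL) lookup cannot see it.
class cursor {
    int pos_;
public:
    explicit cursor(int pos) : pos_(pos) {}
    friend bool operator==(const cursor& a, const cursor& b) {
        return a.pos_ == b.pos_;
    }
};
}  // namespace lib

// Generic code outside namespace lib: `it == end` only compiles because
// argument-dependent lookup searches the class and namespace of the operands.
template <class Iter>
bool at_end(Iter it, Iter end) {
    return it == end;
}
```

There is no qualified spelling such as `lib::operator==(a, b)` that would find the hidden friend, so the library cannot sidestep ADL here.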

[Bug libstdc++/113851] boyer_moore_searcher and boyer_moore_horspool_searcher fail to accept ADL-incompatible element types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113851

Jonathan Wakely  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-02-09
 Status|UNCONFIRMED |NEW

[Bug c++/113852] -Wsign-compare doesn't warn on unsigned result types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852

--- Comment #5 from Jonathan Wakely  ---
The behaviour changed with r247495

common.opt (fstrict-overflow): Alias negative to fwrapv.

2017-05-02  Richard Biener 

* common.opt (fstrict-overflow): Alias negative to fwrapv.
* doc/invoke.texi (fstrict-overflow): Remove all traces of
-fstrict-overflow documentation.
* tree.h (TYPE_OVERFLOW_UNDEFINED): Do not test flag_strict_overflow.
(POINTER_TYPE_OVERFLOW_UNDEFINED): Test !flag_wrapv instead of
flag_strict_overflow.
* ipa-inline.c (can_inline_edge_p): Do not test flag_strict_overflow.
* lto-opts.c (lto_write_options): Do not stream it.
* lto-wrapper.c (merge_and_complain): Do not handle it.
* opts.c (default_options_table): Do not set -fstrict-overflow.
(finish_options): Likewise do not clear it when sanitizing.
* simplify-rtx.c (simplify_const_relational_operation): Do not
test flag_strict_overflow.

[Bug c++/113852] -Wsign-compare doesn't warn on unsigned result types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852

--- Comment #4 from Jonathan Wakely  ---
Reduced:

int main()
{
unsigned short a1 = (1u << 16) - 1;
unsigned short a2 = a1;

/* a1 * a2 should be 4294836225 in math terms */
unsigned long long a3 = 4294836225;

/*
 * The result of (a1 * a2) is of type int and the result is negative.
 * (a1 * a2) ends up as some bogus high number because the common
 * type here ends up as uint64_t and sign-extension occurs.
 */
if ((a1 * a2) > a3) {
__builtin_abort();
}
}

[Bug c++/113852] -Wsign-compare doesn't warn on unsigned result types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852

--- Comment #3 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #2)
> I assume what's happening is that GCC assumes integer promotion from
> uint16_t to int is value preserving and so we get two positive values, and
> therefore comparison with an unsigned value is fine - there are no negative
> values involved and so it doesn't matter that we're comparing int with
> unsigned long. But of course that's not true here. We have two positive ints
> but their product overflows to produce a negative int. I guess we're also
> assuming no overflow happens, because that would be undefined and we assume
> no UB.

Ah yes, if you add -fwrapv then you get the -Wsign-compare warning, because now
a negative product can occur without undefined overflow.

[Bug c++/113852] -Wsign-compare doesn't warn on unsigned result types

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||diagnostic
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-09

--- Comment #2 from Jonathan Wakely  ---
If a1 and a2 are const then GCC also notices the undefined overflow:

: In function 'int main()':
:18:13: warning: integer overflow in expression of type 'int' results
in '-131071' [-Woverflow]
   18 | if ((a1 * a2) > a3) {
  |  ~~~^~~~
:18:19: warning: comparison of integer expressions of different
signedness: 'int' and 'uint64_t' {aka 'long unsigned int'} [-Wsign-compare]
   18 | if ((a1 * a2) > a3) {
  | ~~^~~~

I don't know why the -Wsign-compare warning is only given when we've detected
the overflow. The types are always the same, whether the values are known to
overflow or not.

I assume what's happening is that GCC assumes integer promotion from uint16_t
to int is value preserving and so we get two positive values, and therefore
comparison with an unsigned value is fine - there are no negative values
involved and so it doesn't matter that we're comparing int with unsigned long.
But of course that's not true here. We have two positive ints but their product
overflows to produce a negative int. I guess we're also assuming no overflow
happens, because that would be undefined and we assume no UB.

When the values are constant we can tell the overflow happens, and no longer
assume it doesn't happen.
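
A small sketch of the promotion involved, assuming a target where int is 32 bits wide. The `product_u16` helper is hypothetical and only shows one way to avoid the signed overflow by widening before the multiplication.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Integer promotion: where int is wider than 16 bits, both uint16_t operands
// are converted to (signed) int before the multiplication takes place.
static_assert(std::is_same<decltype(std::uint16_t{} * std::uint16_t{}), int>::value,
              "uint16_t * uint16_t has type int on this target");

// Hypothetical helper: widen to an unsigned 64-bit type first, so the
// product is computed in unsigned arithmetic and cannot overflow.
inline std::uint64_t product_u16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint64_t>(a) * b;
}
```

With the widened product, comparing against the 64-bit value 4294836225 involves two unsigned operands, so the signedness question does not arise.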

[Bug libstdc++/113841] Can't swap two std::hash

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

--- Comment #8 from Jonathan Wakely  ---
Calling swap unqualified performs ADL, which has to find all the associated
namespaces and associated classes. To do that it has to complete all the types
involved, which means it tries to complete:
std::hash*>
std::pair
MyArrVec
MyVec
MyAllocator

Trying to complete MyVec hits an error outside the immediate context, because
its default constructor cannot be instantiated, because MyAllocator is not
default constructible.

That's why comment 5 fixes it.

[Bug c++/113852] -Wsign-compare doesn't warn on unsigned result types

2024-02-09 Thread admin at computerquip dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852

--- Comment #1 from Zachary L  ---
Sorry, that should say "If *both* a1 and a2 are constexpr, the warning will
occur."

[Bug c++/113852] New: -Wsign-compare doesn't warn on unsigned result types

2024-02-09 Thread admin at computerquip dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113852

Bug ID: 113852
   Summary: -Wsign-compare doesn't warn on unsigned result types
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: admin at computerquip dot com
  Target Milestone: ---

I haven't quite figured out the pattern here so the title may not be great.
Some code may help explain better: https://godbolt.org/z/d8cqd1WqP

```
#include <cstdint>
#include <iostream>
#include <limits>

int main()
{
uint16_t a1 = std::numeric_limits<uint16_t>::max();
uint16_t a2 = std::numeric_limits<uint16_t>::max();

/* a1 * a2 should be 4294836225 in math terms */
uint64_t a3 = 4294836225;

/*
 * The result of (a1 * a2) is of type int and the result is negative.
 * (a1 * a2) ends up as some bogus high number because the common
 * type here ends up as uint64_t and sign-extension occurs.
 */
if ((a1 * a2) > a3) {
std::cout << "this will print\n";
}
}
```

Some observations I've noticed:
* If either a1 or a2 are constexpr, the warning will occur.
* This used to warn up until 8.1.
* Clang also doesn't warn here but MSVC will with /W3 or higher.
* This seems like a slight variation of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101645

[Bug libstdc++/113841] Can't swap two std::hash

2024-02-09 Thread ostash at ostash dot kiev.ua via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

--- Comment #7 from Viktor Ostashevskyi  ---
I'm still wondering why for std::hash<T*>, the T type is checked for anything.
It shouldn't matter what T is, as we're hashing T*...

[Bug tree-optimization/113783] ICE: in lower_stmt, at gimple-lower-bitint.cc:5455 with -O -mavx512f and _BitInt() in memcpy()

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113783

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Jakub Jelinek  ---
Fixed.

[Bug libstdc++/113851] New: boyer_moore_searcher and boyer_moore_horspool_searcher fail to accept ADL-incompatible element types

2024-02-09 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113851

Bug ID: 113851
   Summary: boyer_moore_searcher and boyer_moore_horspool_searcher
fail to accept ADL-incompatible element types
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: de34 at live dot cn
  Target Milestone: ---

The following program doesn't compile with libc++ due to ADL which attempts to
complete a bad type.

```
#include <algorithm>
#include <functional>

template <class T>
struct holder {
T t;
};

struct incomplete;

int main() {
using validator = holder<incomplete>*;
validator varr[1]{};
(void) std::search(varr, varr + 1, std::boyer_moore_searcher{varr, varr + 1});
(void) std::search(varr, varr + 1, std::boyer_moore_horspool_searcher{varr, varr + 1});
}
```

It seems that non-ADL-proof iterator operations are problematic, and all
standard library implementations suffer from similar problems
(https://godbolt.org/z/Ta6PafcnK).
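
For contrast, a minimal sketch of what "ADL-proof" code can look like for this element type. The `lib::is_null`, `lib::check` and `demo` names are hypothetical, and this is not how the searchers are implemented; it only shows that qualified calls and built-in pointer comparisons do not require completing `holder<incomplete>`.

```cpp
#include <cassert>

template <class T>
struct holder {
    T t;
};

struct incomplete;  // never defined, so holder<incomplete> cannot be completed

namespace lib {
// Comparing a pointer with nullptr uses the built-in operator; no operand has
// class or enumeration type, so no overload resolution or ADL is involved.
template <class P>
bool is_null(P p) {
    return p == nullptr;
}

// The call is qualified (lib::is_null), which suppresses ADL, so the compiler
// is never asked to complete holder<incomplete>. An unqualified is_null(p)
// would be expected to trigger that instantiation and fail to compile.
template <class P>
bool check(P p) {
    return lib::is_null(p);
}
}  // namespace lib

inline bool demo() {
    holder<incomplete>* p = nullptr;
    return lib::check(p);
}
```

This only works for operations that can be spelled with a qualified name or resolved to a built-in operator, which is why iterator operators in general cannot be made ADL-proof this way.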

[Bug tree-optimization/113783] ICE: in lower_stmt, at gimple-lower-bitint.cc:5455 with -O -mavx512f and _BitInt() in memcpy()

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113783

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:c9bdcb0c3433ce09f5bb713a51a14130858578a2

commit r14-8899-gc9bdcb0c3433ce09f5bb713a51a14130858578a2
Author: Jakub Jelinek 
Date:   Fri Feb 9 16:17:08 2024 +0100

lower-bitint: Fix handling of VIEW_CONVERT_EXPRs to minimally supported
huge INTEGER_TYPEs [PR113783]

On the following testcases memcpy lowering folds the calls to
reading and writing of MEM_REFs with huge INTEGER_TYPEs - uint256_t
with OImode or uint512_t with XImode.  Further optimizations turn
the load from MEM_REF from the large/huge _BitInt var into
VIEW_CONVERT_EXPR
from it to the uint256_t/uint512_t.  The backend doesn't really
support those except for "movoi"/"movxi" insns, so it isn't possible
to handle it like casts to supportable INTEGER_TYPEs where we can
construct those from individual limbs - there are no OImode/XImode shifts
and the like we can use.
So, the following patch makes sure for such VCEs that the SSA_NAME operand
of the VCE lives in memory and then turns it into a VIEW_CONVERT_EXPR so
that we actually load the OImode/XImode integer from memory (i.e. a mov).
We need to make sure those aren't merged with other
operations in the gimple_lower_bitint hunks.
For SSA_NAMEs which have underlying VAR_DECLs that is all we need, those
VAR_DECL have ARRAY_TYPEs.
For SSA_NAMEs which have underlying PARM_DECLs or RESULT_DECLs those have
BITINT_TYPE and I had to tweak expand_expr_real_1 for that so that it
doesn't try convert_modes on those when one of the modes is BLKmode - we
want to fall through into the adjust_address on the MEM.

2024-02-09  Jakub Jelinek  

PR tree-optimization/113783
* gimple-lower-bitint.cc (bitint_large_huge::lower_stmt): Look
through VIEW_CONVERT_EXPR for final cast checks.  Handle
VIEW_CONVERT_EXPRs from large/huge _BitInt to > MAX_FIXED_MODE_SIZE
INTEGER_TYPEs.
(gimple_lower_bitint): Don't merge mergeable operations or other
casts with VIEW_CONVERT_EXPRs to > MAX_FIXED_MODE_SIZE
INTEGER_TYPEs.
* expr.cc (expand_expr_real_1): Don't use convert_modes if either
mode is BLKmode.

* gcc.dg/bitint-88.c: New test.

[Bug middle-end/110754] assume create spurious load for volatile variable

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110754

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #8 from Jakub Jelinek  ---
Created attachment 57373
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57373&action=edit
gcc14-pr110754.patch

Untested fix.

[Bug libstdc++/113841] Can't swap two std::hash

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

--- Comment #6 from Jonathan Wakely  ---
IIRC the reason we don't just default that constructor is because we need to be
sure to value-initialize the allocator, not just default-initialize it.

[Bug libstdc++/113841] Can't swap two std::hash

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

Jonathan Wakely  changed:

   What|Removed |Added

  Component|c++ |libstdc++

--- Comment #5 from Jonathan Wakely  ---
Actually, maybe we need this change in the library, so that it works in GCC 12
and also works with Clang:

--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -135,6 +135,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
_GLIBCXX20_CONSTEXPR
_Vector_impl() _GLIBCXX_NOEXCEPT_IF(
is_nothrow_default_constructible<_Tp_alloc_type>::value)
+#if __cpp_lib_concepts
+   requires is_default_constructible_v<_Tp_alloc_type>
+#endif
: _Tp_alloc_type()
{ }

[Bug libgcc/113850] condition variables timed wait does a lot of spurious wakeups on Win32 threading implementation

2024-02-09 Thread matteo at mitalia dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113850

--- Comment #1 from Matteo Italia  ---
Created attachment 57372
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57372&action=edit
Proposed patch

[Bug libgcc/113850] New: condition variables timed wait does a lot of spurious wakeups on Win32 threading implementation

2024-02-09 Thread matteo at mitalia dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113850

Bug ID: 113850
   Summary: condition variables timed wait does a lot of spurious
wakeups on Win32 threading implementation
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: matteo at mitalia dot net
  Target Milestone: ---

Created attachment 57371
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57371&action=edit
Test program to reproduce the bug

Sample program:
```
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

int main() {
std::condition_variable cv;
std::mutex mx;
bool pass = false;

auto thread_fn = [&](bool timed) {
int wakeups = 0;
using sc = std::chrono::system_clock;
auto before = sc::now();
std::unique_lock ml(mx);
if (timed) {
cv.wait_for(ml, std::chrono::seconds(2), [&]{
++wakeups;
return pass;
});
} else {
cv.wait(ml, [&]{
++wakeups;
return pass;
});
}
printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
int((sc::now() - before) / std::chrono::milliseconds(1)));
pass = false;
};

{
// timed wait, let expire
std::thread t(thread_fn, true);
t.join();
}

{
// timed wait, wake up explicitly after 1 second
std::thread t(thread_fn, true);
std::this_thread::sleep_for(std::chrono::seconds(1));
{
std::unique_lock ml(mx);
pass = true;
}
cv.notify_all();
t.join();
}

{
// non-timed wait, wake up explicitly after 1 second
std::thread t(thread_fn, false);
std::this_thread::sleep_for(std::chrono::seconds(1));
{
std::unique_lock ml(mx);
pass = true;
}
cv.notify_all();
t.join();
}
return 0;
}
```

On Linux or on Win32 with the MCF threading model the output looks like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```

but with the Win32 "regular" threading model
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```

As can be seen here, there's a huge number of spurious wakeups, but only for
the timed waits; ultimately, I traced the problem to
`__gthr_win32_abs_to_rel_time`, that is used to convert from the absolute
`timespec` value (used internally through the gthread abstraction layer) to the
relative milliseconds to be passed to `SleepConditionVariableCS`:

```
/* Convert absolute thread time to relative time in millisecond.  */

DWORD
__gthr_win32_abs_to_rel_time (const __gthread_time_t *abs_time)
{
  union {
ULONGLONG nsec100;
FILETIME ft;
  } now;
  ULONGLONG abs_time_nsec100;

  /* The Windows epoch is 1/1/1601 while the Unix epoch is 1/1/1970.  */
  GetSystemTimeAsFileTime (&now.ft);
  now.nsec100 -= FILETIME_1970;

  abs_time_nsec100
= (ULONGLONG) abs_time->tv_sec * NSEC100_PER_SEC
+ CEIL_DIV (abs_time->tv_nsec, 100);

  if (abs_time_nsec100 < now.nsec100)
return 0;

  return (DWORD) CEIL_DIV (abs_time_nsec100 - now.nsec100, NSEC100_PER_SEC);
}
```

the final line has a typo: as the return value is in milliseconds, it should
divide by the constant NSEC100_PER_MSEC; as it is, it returns the value in
seconds, so it's going to be 1000 times smaller than it should, which results
in ~1000× spurious wakeups. I suppose that was a typo or some
rebase/refactoring mishap, as the correct constant was already defined above
(and wasn't used for anything else). The patch of course is trivial, and
applying it makes the test program return to its "normal" wakeups count.
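
The unit mix-up can be shown with plain arithmetic. The sketch below is self-contained and does not use the Windows API; the constant values are assumptions that follow from counting time in 100 ns ticks, and `ceil_div`, `rel_time_buggy` and `rel_time_fixed` are hypothetical stand-ins for the macro and for the two variants of the final line.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values: time is counted in 100 ns ticks, as in FILETIME.
constexpr std::uint64_t NSEC100_PER_SEC = 10000000;  // ticks per second
constexpr std::uint64_t NSEC100_PER_MSEC = 10000;    // ticks per millisecond

// Round-up integer division, standing in for the CEIL_DIV macro.
constexpr std::uint64_t ceil_div(std::uint64_t a, std::uint64_t b) {
    return (a + b - 1) / b;
}

// Dividing by ticks-per-second yields seconds, not milliseconds.
constexpr std::uint64_t rel_time_buggy(std::uint64_t ticks) {
    return ceil_div(ticks, NSEC100_PER_SEC);
}

// Dividing by ticks-per-millisecond yields the intended milliseconds.
constexpr std::uint64_t rel_time_fixed(std::uint64_t ticks) {
    return ceil_div(ticks, NSEC100_PER_MSEC);
}

// A 2-second timeout: the buggy variant requests a 2 ms sleep instead.
static_assert(rel_time_buggy(2 * NSEC100_PER_SEC) == 2, "seconds, not ms");
static_assert(rel_time_fixed(2 * NSEC100_PER_SEC) == 2000, "milliseconds");
```

Each early return from the too-short sleep re-evaluates the predicate, which matches the inflated wakeup counts seen only for the timed waits.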

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek  ---
(In reply to Marek Polacek from comment #7)
> > But then the diagnostics is confusing.
> 
> There's a lot of places where we do exactly that:
> handle_assume_aligned_attribute, ...

Sure, I know.  In some cases it is from assumption that one uses reasonable
types and values for say alignments or sizes etc., in other cases laziness,
diagnose something and if it in the corner cases diagnoses something slightly
different, user can figure it out.

[Bug middle-end/108410] x264 averaging loop not optimized well for avx512

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410

--- Comment #10 from Richard Biener  ---
So this is now fixed if you use --param vect-partial-vector-usage=2, there is
at the moment no way to get masking/not masking costed against each other.  In
theory vect_analyze_loop_costing and vect_estimate_min_profitable_iters
could do both and we could delay vect_determine_partial_vectors_and_peeling.

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

--- Comment #7 from Marek Polacek  ---
(In reply to Jakub Jelinek from comment #6)
> (In reply to Marek Polacek from comment #5)
> > To fix the ICE we could do:
> > 
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -4644,7 +4644,9 @@ static tree
> >  finish_type_pack_element (tree idx, tree types, tsubst_flags_t complain)
> >  {
> >idx = maybe_constant_value (idx);
> > -  if (TREE_CODE (idx) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (idx)))
> > +  if (TREE_CODE (idx) != INTEGER_CST
> > +  || !INTEGRAL_TYPE_P (TREE_TYPE (idx))
> > +  || !tree_fits_shwi_p (idx))
> >  {
> >if (complain & tf_error)
> > error ("%<__type_pack_element%> index is not an integral constant");
> 
> But then the diagnostics is confusing.

There's a lot of places where we do exactly that:
handle_assume_aligned_attribute, ...

> Perhaps use tree_int_cst_sgn (idx) < 0 instead of tree_to_shwi + val < 0,
> wi::to_widest (idx) >= TREE_VEC_LENGTH (types) for out of range and
> only use tree_to_shwi after those checks?

That sounds good too though, I can do that instead.

[Bug middle-end/108376] TSVC s1279 runs 40% faster with aocc than gcc at zen4

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108376

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |WONTFIX
 Status|NEW |RESOLVED

--- Comment #4 from Richard Biener  ---
So I'd say INVALID or WONTFIX.

[Bug libstdc++/113841] Can't swap two std::hash

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841

Jonathan Wakely  changed:

   What|Removed |Added

  Known to fail||12.3.1
  Known to work||13.1.0
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-02-09
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109030

--- Comment #4 from Jonathan Wakely  ---
This is a compiler bug not a library bug. The preprocessed source from GCC 12
compiles fine with GCC 13.

GCC started to accept it with r13-6716

c++: maybe_constant_init and unevaluated operands [PR109030]

This testcase in this PR (already fixed by r13-6526-ge4692319fd5fc7)
demonstrates that maybe_constant_init can be called on an unevaluated
operand (e.g. from massage_init_elt) so this entry point should also
limit constant evaluation in that case, like maybe_constant_value does.

PR c++/109030

gcc/cp/ChangeLog:

* constexpr.cc (maybe_constant_init_1): For an unevaluated
non-manifestly-constant operand, don't constant evaluate
and instead call fold_to_constant as in maybe_constant_value.

[Bug libstdc++/113835] [13/14 Regression] compiling std::vector with const size in C++20 is slow

2024-02-09 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113835

--- Comment #2 from Jonathan Wakely  ---
(In reply to Richard Biener from comment #1)
> GCC 12 was fast (possibly std::vector wasn't constexpr there?)

It was constexpr since 12.1.0

So this might be related to Jason's implicit constexpr changes instead.

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

--- Comment #6 from Jakub Jelinek  ---
(In reply to Marek Polacek from comment #5)
> To fix the ICE we could do:
> 
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -4644,7 +4644,9 @@ static tree
>  finish_type_pack_element (tree idx, tree types, tsubst_flags_t complain)
>  {
>idx = maybe_constant_value (idx);
> -  if (TREE_CODE (idx) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (idx)))
> +  if (TREE_CODE (idx) != INTEGER_CST
> +  || !INTEGRAL_TYPE_P (TREE_TYPE (idx))
> +  || !tree_fits_shwi_p (idx))
>  {
>if (complain & tf_error)
> error ("%<__type_pack_element%> index is not an integral constant");

But then the diagnostic is confusing.

Perhaps use tree_int_cst_sgn (idx) < 0 instead of tree_to_shwi + val < 0,
wi::to_widest (idx) >= TREE_VEC_LENGTH (types) for out of range and
only use tree_to_shwi after those checks?
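
The ordering suggested above (test the sign, then compare against the
length at full width, and only then narrow) can be sketched outside of GCC.
This is a standalone analogue, not GCC code: WideConst, IndexStatus,
check_index and to_index are hypothetical names made up for the
illustration, and the 128-bit magnitude merely stands in for a constant
that may not fit a host integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a constant that may be wider than a host
// integer: a sign flag plus a 128-bit magnitude.  Not a GCC type.
struct WideConst {
  bool negative;
  std::uint64_t hi;  // upper 64 bits of the magnitude
  std::uint64_t lo;  // lower 64 bits of the magnitude
};

enum class IndexStatus { Ok, Negative, OutOfRange };

// Classify the index without ever narrowing a value that might not fit.
IndexStatus check_index (const WideConst &idx, std::size_t length)
{
  if (idx.negative && (idx.hi != 0 || idx.lo != 0))
    return IndexStatus::Negative;      // sign test comes first
  if (idx.hi != 0 || idx.lo >= length)
    return IndexStatus::OutOfRange;    // comparison done at full width
  return IndexStatus::Ok;
}

// Narrowing is only valid once check_index has returned Ok.
std::size_t to_index (const WideConst &idx, std::size_t length)
{
  assert (check_index (idx, length) == IndexStatus::Ok);
  return static_cast<std::size_t> (idx.lo);
}
```

With this split, a negative index and a too-large index can each get their
own diagnostic, and the narrowing conversion is never reached for a value
that does not fit.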

[Bug rust/113499] crab1 fails to link when configuring with --disable-plugin

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113499

--- Comment #3 from Richard Biener  ---
(In reply to Richard Biener from comment #2)
> Re-confirmed.  Can be reproduced both on a glibc 2.31 and glibc 2.38 system
> with

It does work with glibc 2.38, so only glibc 2.31 fails this (and possibly other
OS).

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

Marek Polacek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |mpolacek at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug rust/113499] crab1 fails to link when configuring with --disable-plugin

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113499

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-09
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Richard Biener  ---
Re-confirmed.  Can be reproduced both on a glibc 2.31 and glibc 2.38 system
with

../src/configure --enable-languages=rust --disable-bootstrap --disable-plugin

See GCC_ENABLE_PLUGIN which adjusts 'pluginlibs' but also causes symbols to
be exported from the executable.  You need to figure out what you need.  For
example the 'jit' frontend also requires this (--enable-host-shared), but
IIRC it doesn't require -ldl.

Some hosts may not support dynamically loading things.

[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461

2024-02-09 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #5 from Marek Polacek  ---
To fix the ICE we could do:

--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -4644,7 +4644,9 @@ static tree
 finish_type_pack_element (tree idx, tree types, tsubst_flags_t complain)
 {
   idx = maybe_constant_value (idx);
-  if (TREE_CODE (idx) != INTEGER_CST || !INTEGRAL_TYPE_P (TREE_TYPE (idx)))
+  if (TREE_CODE (idx) != INTEGER_CST
+  || !INTEGRAL_TYPE_P (TREE_TYPE (idx))
+  || !tree_fits_shwi_p (idx))
 {
   if (complain & tf_error)
error ("%<__type_pack_element%> index is not an integral constant");

[Bug target/113615] internal compiler error: in extract_insn, at recog.cc:2812

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113615

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Richard Biener  ---
This seems fixed now.

[Bug rtl-optimization/101188] [11/12/13 Regression] [postreload] Uses content of a clobbered register

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |law at gcc dot gnu.org
   Target Milestone|--- |11.5
 Status|REOPENED|ASSIGNED

[Bug rtl-optimization/101188] [11/12/13 Regression] [postreload] Uses content of a clobbered register

2024-02-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED
Summary|[postreload] Uses content   |[11/12/13 Regression]
   |of a clobbered register |[postreload] Uses content
   ||of a clobbered register

--- Comment #19 from Georg-Johann Lay  ---
Reopened for back-porting.

[Bug target/113847] [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug modula2/113848] modula2 doesn't build with clang

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848

--- Comment #1 from Richard Biener  ---
void * arithmetic is a GCC extension; I suggest changing that to char *.
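
A minimal sketch of the suggested change, assuming the goal is simply to
step an untyped pointer by a byte count.  The helper names advance_bytes
and nth_arg are hypothetical and do not come from the modula2 sources.

```cpp
#include <cassert>
#include <cstddef>

// Advance an untyped pointer by a byte count in standard C++.
// "base + bytes" on a void * relies on the GNU extension, so the
// arithmetic is done on a char * instead.
void *advance_bytes (void *base, std::size_t bytes)
{
  assert (base != nullptr);
  return static_cast<char *> (base) + bytes;
}

// Hypothetical helper with the same shape as the failing expression:
// fetch the n-th element of an argv-like array through a byte offset.
char *nth_arg (char **argv, std::size_t n)
{
  void *slot = advance_bytes (argv, n * sizeof (char *));
  return *static_cast<char **> (slot);
}
```

The cast to char * keeps the byte-offset semantics of the original
expression while staying within what other compilers accept.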

[Bug tree-optimization/113849] wrong code with _BitInt() arithmetics at -O1

2024-02-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113849

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-02-09

--- Comment #1 from Richard Biener  ---
Confirmed.

[Bug tree-optimization/113849] New: wrong code with _BitInt() arithmetics at -O1

2024-02-09 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113849

Bug ID: 113849
   Summary: wrong code with _BitInt() arithmetics at -O1
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57370
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57370&action=edit
reduced testcase

This testcase was originally attached to PR113774, but it still fails
after the fix.

Output:
$ x86_64-pc-linux-gnu-gcc -O1 testcase.c
$ ./a.out
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8897-20240209102321-g0a329ecf113-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-8897-20240209102321-g0a329ecf113-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240209 (experimental) (GCC)

[Bug modula2/113848] New: modula2 doesn't build with clang

2024-02-09 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113848

Bug ID: 113848
   Summary: modula2 doesn't build with clang
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: build
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: fkastl at suse dot cz
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

Building GCC using clang with modula2 language enabled raises this error.

m2/gm2-libs-boot/SArgs.c:93:90: error: arithmetic on a pointer to void
   93 |   ppc = static_cast ((void *) (((void *)
(UnixArgs_GetArgV ()))+(n*sizeof (SArgs_PtrToChar;
  | 
^

This started happening between

g:fbb569315a291d2d5b32ad0fdaf0c42da9f5e93b and
g:260a22de4fa3d4ad3bb0d3ef2cd45d7f03eb3160

The only commit touching ./gcc/m2/gm2-libs/SArgs.{def,mod} is

g:64b0130bb6702c67a13caefaae9facef23d6ac60

so I suppose that's the culprit commit.


The build is configured using
--disable-multilib --disable-libsanitizer --disable-bootstrap
--with-system-zlib --enable-languages=c,c++,fortran,go,jit,lto,rust,m2
--enable-host-shared

[Bug tree-optimization/113774] wrong code with _BitInt() arithmetics at -O2

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113774

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #9 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/113818] ICE: verify_gimple failed: missing 'PHI' def with -Os -fnon-call-exceptions -finstrument-functions-once and _BitInt()

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113818

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Jakub Jelinek  ---
Fixed.

[Bug middle-end/113415] ICE: RTL check: -mstringop-strategy=byte_loop vs inline-asm goto with block copies

2024-02-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113415

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Jakub Jelinek  ---
Fixed.

[Bug middle-end/113415] ICE: RTL check: -mstringop-strategy=byte_loop vs inline-asm goto with block copies

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113415

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:0ad1884089c0fad4dfc72516bc68ec508cba1832

commit r14-8896-g0ad1884089c0fad4dfc72516bc68ec508cba1832
Author: Jakub Jelinek 
Date:   Fri Feb 9 11:08:33 2024 +0100

expand: Fix asm goto expansion [PR113415]

The asm goto expansion ICEs on the following testcase (which normally
is rejected later), because expand_asm_stmt emits the code to copy
the large var out of the out operand to its memory location into
after_rtl_seq ... after_rtl_end sequence and because it is asm goto,
it duplicates the sequence on each successor edge of the asm goto.
The problem is that with -mstringop-strategy=byte_loop that sequence
contains loops, so CODE_LABELs, JUMP_INSNs, with other strategies
could contain CALL_INSNs etc.
But the copying is done using a loop doing
emit_insn (copy_insn (PATTERN (curr)));
which does the right thing solely for INSNs, it will do the wrong thing
for JUMP_INSNs, CALL_INSNs, CODE_LABELs (with RTL checking even ICE on
them), BARRIERs and the like.

The following patch partially fixes it (with the hope that such stuff only
occurs in asms that really can't be accepted; if one uses say "=rm" or
"=g" constraint then the operand uses the memory directly and nothing is
copied) by using the duplicate_insn_chain function which is used e.g. in
RTL loop unrolling and which can handle JUMP_INSNs, CALL_INSNs, BARRIERs
etc.
As it is meant to operate on sequences inside of basic blocks, it doesn't
handle CODE_LABELs (well, it skips them), so if we need a solution that
will be correct at runtime here for those cases, we'd need to do further
work (e.g. still use duplicate_insn_chain, but if we notice any
CODE_LABELs, walk the sequence again, add copies of the CODE_LABELs and
then remap references to the old CODE_LABELs in the copied sequence to the
new ones).
As it is now, if the code in one of the sequence copies (where the
CODE_LABELs have been left out) decides to jump to such a CODE_LABEL, it
will jump to the CODE_LABEL which has been in the original sequence (which
the code emits on the last edge; after all, duplicating the sequence
EDGE_COUNT times and throwing away the original was wasteful, compared to
doing that just EDGE_COUNT - 1 times and using the original).

2024-02-09  Jakub Jelinek  

PR middle-end/113415
* cfgexpand.cc (expand_asm_stmt): For asm goto, use
duplicate_insn_chain to duplicate after_rtl_seq sequence instead
of hand written loop with emit_insn of copy_insn and emit original
after_rtl_seq on the last edge.

* gcc.target/i386/pr113415.c: New test.
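
The "further work" mentioned in the commit message (copy the labels too,
then remap references to them) can be illustrated on a toy instruction
stream.  This is only a sketch of the idea under simplifying assumptions:
Kind, Insn and duplicate_with_labels are hypothetical names, and nothing
here uses or mirrors GCC's RTL data structures or duplicate_insn_chain.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy instruction stream, not GCC RTL: an instruction is a label
// definition, a jump to a label, or anything else.
enum class Kind { Label, Jump, Other };

struct Insn {
  Kind kind;
  int label;  // label id for Label and Jump, unused for Other
};

// Duplicate SEQ so that the copy is self-contained: each label defined in
// SEQ gets a fresh id (taken from NEXT_LABEL), and jumps that target such
// a label are redirected to the fresh copy.  Jumps to labels defined
// outside SEQ are left alone.
std::vector<Insn> duplicate_with_labels (const std::vector<Insn> &seq,
                                         int &next_label)
{
  std::map<int, int> remap;
  // First walk: allocate a new id for every label defined in SEQ.
  for (const Insn &insn : seq)
    if (insn.kind == Kind::Label)
      remap[insn.label] = next_label++;

  // Second walk: copy, rewriting label definitions and references.
  std::vector<Insn> copy;
  copy.reserve (seq.size ());
  for (Insn insn : seq)
    {
      if (insn.kind != Kind::Other)
        {
          auto it = remap.find (insn.label);
          if (it != remap.end ())
            insn.label = it->second;
        }
      copy.push_back (insn);
    }
  assert (copy.size () == seq.size ());
  return copy;
}
```

Without the remapping step, a jump inside a copy would still name the label
of the original sequence, which is the runtime problem the commit message
describes for the copies that had their labels skipped.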

[Bug tree-optimization/113818] ICE: verify_gimple failed: missing 'PHI' def with -Os -fnon-call-exceptions -finstrument-functions-once and _BitInt()

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113818

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:6c124873f5197ca8aac5acfada4b0e7fba49807f

commit r14-8895-g6c124873f5197ca8aac5acfada4b0e7fba49807f
Author: Jakub Jelinek 
Date:   Fri Feb 9 11:07:34 2024 +0100

lower-bitint: Fix up additions of EH edges [PR113818]

Due to -fnon-call-exceptions the bitint lowering adds new EH edges
in various places, so that the EH edge points from handling (e.g. load or
store) of each of the limbs.  The problem is that the EH edge destination
as shown in the testcase can have some PHIs.  If it is just a virtual
PHI, no big deal, the pass uses TODO_update_ssa_only_virtuals, but if
it has other PHIs, I think we need to copy the values from the preexisting
corresponding EH edge (which is from the original stmt to the EH pad)
to the newly added EH edge, so that the PHI arguments are the same rather
than missing (which ICEs during checking at the end of the pass).

This patch adds a function to do that and uses it whenever adding EH edges.

2024-02-09  Jakub Jelinek  

PR tree-optimization/113818
* gimple-lower-bitint.cc (add_eh_edge): New function.
(bitint_large_huge::handle_load,
bitint_large_huge::lower_mergeable_stmt,
bitint_large_huge::lower_muldiv_stmt): Use it.

* gcc.dg/bitint-89.c: New test.
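
As a rough illustration of the fix described above, here is a toy model of
giving a newly added edge the same PHI arguments as a preexisting edge into
the same block.  Phi, Block and add_edge_like are hypothetical names for
this sketch; the real add_eh_edge in gimple-lower-bitint.cc works on GCC's
own CFG and PHI representation, which is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG pieces, not GCC's data structures.  A PHI node in a block holds
// one argument per incoming edge, indexed by the edge's position.
struct Phi {
  std::vector<int> args;  // args[i] is the value arriving over edge i
};

struct Block {
  std::size_t num_preds = 0;
  std::vector<Phi> phis;
};

// Add a new incoming edge to DEST and give every PHI the same argument it
// already has for the existing edge EXISTING.  Returns the new edge index.
std::size_t add_edge_like (Block &dest, std::size_t existing)
{
  assert (existing < dest.num_preds);
  for (Phi &phi : dest.phis)
    {
      assert (phi.args.size () == dest.num_preds);
      int value = phi.args[existing];  // copy before growing the vector
      phi.args.push_back (value);
    }
  return dest.num_preds++;
}
```

The point of the sketch is the invariant: after the edge is added, every
PHI in the destination still has exactly one argument per incoming edge,
rather than a missing one.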

[Bug tree-optimization/113774] wrong code with _BitInt() arithmetics at -O2

2024-02-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113774

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:97e49bf00d1a7b7a2a02531a1c5362fad27348d9

commit r14-8894-g97e49bf00d1a7b7a2a02531a1c5362fad27348d9
Author: Jakub Jelinek 
Date:   Fri Feb 9 11:06:00 2024 +0100

lower-bitint: Attempt not to emit always true conditions in handle_cast
[PR113774]

The following patch is the optimization part of PR113774, where in
handle_cast we emit some conditionals which are always true and presumably
VRP would figure that out later and clean it up, except that instead
thread1 is invoked and threads everything through the conditions, so we end
up with really ugly code which is hard to clean up later, and then
run into the PR113831 VN bug and miscompile stuff.

handle_cast computes low and high as limb indexes.  For idx < low no
special treatment is needed, the code just uses the operand's limb.  For
idx >= high all the bits in the limb are an extension (so for an unsigned
widening cast all those bits are 0, for a signed widening cast all those
bits are equal to the sign mask computed in earlier code; narrowing casts
don't trigger this code).  The idx == low && idx < high case, if it
exists, needs special treatment (some bits are copied, others extended,
or all bits are copied but the sign mask needs to be computed).

The code already attempted to optimize away some unneeded casts, in the
first hunk below e.g. for the case like 257 -> 321 bit extension, where
low is 4 and high 5 and we use a loop handling the first 4 limbs (2
iterations) with m_upwards_2limb 4 - no special handling is needed in the
loop, and the special handling is done on the first limb after the loop
and then the last limb after the loop gets the extension only, or
in the second hunk where we can emit a single comparison instead of
2 e.g. for the low == high case - that must be a zero extension from
multiple of limb bits, say 192 -> 328, or for the case where we know
the idx == low case happens in the other limb processed in the loop, not
the current one.

But the testcase shows further cases where we always know some of the
comparisons can be folded to true/false, in particular there is
255 -> 257 bit zero extension, so low 3, high 4, m_upwards_2limb 4.
The loop handles 2 limbs at a time and for the first limb we were
emitting idx < low ? operand[idx] : 0; but because idx goes from 0
with step 2 for 2 iterations, idx < 3 is always true, so we can just
emit operand[idx].  This is handled in the first hunk.  In addition
to fixing it (that is the " - m_first" part in there) I've rewritten
it using low to make it more readable.

Similarly, in the other limb we were emitting
idx + 1 <= low ? (idx + 1 == low ? operand[idx] & 0x7ffff :
operand[idx]) : 0
but idx + 1 <= 3 is always true in the loop, so all we should emit is
idx + 1 == low ? operand[idx] & 0x7ffff : operand[idx],
Unfortunately for the latter, when single_comparison is true, we emit
just one comparison, but the code which fills the branches will fill it
with the operand[idx] and 0 cases (for zero extension, for sign extension
similarly), not the operand[idx] (aka copy) and operand[idx] & 0x7ffff
(aka most significant limb of the narrower precision) cases.  Instead
of making the code less readable by using single_comparison for that and
handling it in the code later differently I've chosen to just emit
a condition which will be always true and let cfg cleanup clean it up.

2024-02-09  Jakub Jelinek  

PR tree-optimization/113774
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): Don't
emit any comparison if m_first and low + 1 is equal to
m_upwards_2limb, simplify condition for that.  If not
single_comparison, not m_first and we can prove that the idx <= low
comparison will be always true, emit instead of idx <= low
comparison low <= low such that cfg cleanup will optimize it at
the end of the pass.

* gcc.dg/torture/bitint-57.c: New test.
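
The limb arithmetic in the commit message can be checked with a small
standalone sketch.  It assumes 64-bit limbs, and the formulas for low and
high are inferred from the examples in the message (257 bits gives low 4
and high 5, 255 bits gives low 3 and high 4, 192 bits gives low == high);
low_limb, high_limb, partial_mask and last_loop_idx are hypothetical
helper names, not functions from gimple-lower-bitint.cc.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: 64-bit limbs.
constexpr unsigned limb_bits = 64;

// Limbs below LOW are copied whole from the operand.
constexpr unsigned low_limb (unsigned src_prec)
{
  return src_prec / limb_bits;
}

// Limbs at or above HIGH consist only of extension bits.
constexpr unsigned high_limb (unsigned src_prec)
{
  return (src_prec + limb_bits - 1) / limb_bits;
}

// Mask selecting the source bits in limb LOW when LOW < HIGH.
inline std::uint64_t partial_mask (unsigned src_prec)
{
  unsigned rem = src_prec % limb_bits;
  assert (rem != 0);  // no partial limb if the precision is a limb multiple
  return (std::uint64_t{1} << rem) - 1;
}

// Largest idx seen by a loop that handles two limbs per iteration and
// covers the first UPWARDS_2LIMB limbs: idx = 0, 2, ..., upwards_2limb - 2.
inline unsigned last_loop_idx (unsigned upwards_2limb)
{
  assert (upwards_2limb >= 2 && upwards_2limb % 2 == 0);
  return upwards_2limb - 2;
}
```

For the 255 -> 257 bit case with m_upwards_2limb 4, low_limb (255) is 3 and
last_loop_idx (4) is 2, so both idx < low and idx + 1 <= low hold on every
iteration, which is why those comparisons can be dropped.  partial_mask
(255) keeps the low 63 bits of the limb, presumably the mask the message
abbreviates as 0x7ffff.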

[Bug c++/113830] GCC accepts invalid code when instantiating the local class inside a function

2024-02-09 Thread harald at gigawatt dot nl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113830

--- Comment #14 from Harald van Dijk  ---
(In reply to Bo Wang from comment #13)
> (In reply to Harald van Dijk from comment #12)
> > (In reply to Bo Wang from comment #11)
> > > I have read the working draft standard of C++20
> > > (https://github.com/cplusplus/draft/tree/c%2B%2B20).
> > > 
> > > Following the subsection "13.9.2 Explicit instantiation" in the section
> > > "13.9 Template instantiation and specialization", the statement `template
> > > void f();` is an explicit instantiation, which requires instantiating
> > > everything in the function.
> > 
> > Where are you getting "everything in the function" from? It seems to say
> > rather the opposite in [temp.explicit]p14:
> > 
> > > An explicit instantiation does not constitute a use of a default 
> > > argument, so default argument instantiation is not done.
> > 
> > Now, the example shows that this was intended to apply to default arguments
> > of the function itself, but the actual wording does not limit it to that, so
> > I actually think this is a bug in clang, by the current wording this must be
> > accepted?
> 
> Please refer to the example in Comment 9 which has no default arguments.

Okay, sure, but if we have established that the standard does not say
"everything in the function" needs to be instantiated, where does it say that
*this* needs to be instantiated?

> For the standard, I found this one in "13.9 Template instantiation and
> specialization" p6 of C++20, which requires access checking.

That explains that the special exception that generally applies to template
instantiations does not apply here. This means the usual rules apply, so for
instance, you can't refer to a private member of a class unless you're a
friend. But for templates, these usual rules apply upon instantiation, so we
still need to establish whether or not this is required to be instantiated.
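
For reference, the default-argument rule quoted from [temp.explicit]p14 can
be seen in a few lines.  This follows the shape of the example that
accompanies that paragraph in the standard; it does not reproduce the
testcase from this PR, and call_with_argument is a hypothetical helper
added only to exercise the instantiation.

```cpp
#include <cassert>

char *p = nullptr;

// The default argument "&p" has type char **, which cannot initialize an
// int parameter, so it would be ill-formed if it were ever instantiated
// for T = int.
template <class T> T g (T x = &p) { return x; }

// Explicit instantiation: per the quoted paragraph this does not use the
// default argument, so the definition is instantiated without it.
template int g<int> (int);

// Calling with an explicit argument never touches the default either.
int call_with_argument ()
{
  int r = g<int> (42);
  assert (r == 42);
  return r;
}
```

This only shows that an explicit instantiation does not instantiate
everything reachable from the declaration; whether the construct discussed
in this PR must be instantiated is the open question in the thread.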

[Bug target/113847] New: [14 Regression] 10% slowdown of 462.libquantum on AMD Ryzen 7700X and Ryzen 7900X

2024-02-09 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113847

Bug ID: 113847
   Summary: [14 Regression] 10% slowdown of 462.libquantum on AMD
Ryzen 7700X and Ryzen 7900X
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization, needs-bisection
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fkastl at suse dot cz
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=956.210.0

between commits

g:d826596acb02edf4

and

g:23cd2961bd2ff635

there is about a 10% slowdown in execution time of the SPEC 2006
462.libquantum benchmark.

The test is run with -O2 and lto on an AMD Ryzen 7700X.

I also reproduced the slowdown on an AMD Ryzen 7900X machine. However, I
wasn't able to reproduce the slowdown on an AMD EPYC machine - also Zen4
microarchitecture. So I suppose this slowdown occurs only on Zen4 Ryzen CPUs
or is maybe even more specific.

I'm not sure if we want to do anything about this. The same slowdown on the
same machine has already happened once, see pr112547. The benchmark results
eventually returned to the original values.
