[Bug libstdc++/105934] New: [9/10/11/12/13 Regression] C++11 pointer versions of atomic_fetch_add missing because of P0558

2022-06-11 Thread hstong at ca dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105934

Bug ID: 105934
   Summary: [9/10/11/12/13 Regression] C++11 pointer versions of
atomic_fetch_add missing because of P0558
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hstong at ca dot ibm.com
  Target Milestone: ---

P0558 removed the "pointer specializations" of `atomic_fetch_add`.

The replacements are not call-compatible for calls that use explicit template
arguments to guide conversions.

https://godbolt.org/z/aTT9EdP93

### SOURCE (<stdin>):
#include <atomic>
struct A { template <typename T> operator std::atomic<T *> *(); };
int *f(A *ap) { return std::atomic_fetch_add<int>(*ap, 1); }

### COMPILER INVOCATION:
g++ -fsyntax-only -std=c++11 -xc++ -

### ACTUAL OUTPUT:
<stdin>: In function 'int* f(A*)':
<stdin>:3:50: error: no matching function for call to
'atomic_fetch_add<int>(A&, int)'
In file included from <stdin>:1:
/opt/wandbox/gcc-head/include/c++/13.0.0/atomic:1525:5: note: candidate: '_ITp
std::atomic_fetch_add(atomic<_ITp>*, __atomic_diff_t<_ITp>) [with _ITp = int;
__atomic_diff_t<_ITp> = int]'
 1525 | atomic_fetch_add(atomic<_ITp>* __a,
  | ^~~~
/opt/wandbox/gcc-head/include/c++/13.0.0/atomic:1525:36: note:   no known
conversion for argument 1 from 'A' to 'std::atomic<int>*'
 1525 | atomic_fetch_add(atomic<_ITp>* __a,
  |  ~~^~~
/opt/wandbox/gcc-head/include/c++/13.0.0/atomic:1531:5: note: candidate: '_ITp
std::atomic_fetch_add(volatile atomic<_ITp>*, __atomic_diff_t<_ITp>) [with _ITp
= int; __atomic_diff_t<_ITp> = int]'
 1531 | atomic_fetch_add(volatile atomic<_ITp>* __a,
  | ^~~~
/opt/wandbox/gcc-head/include/c++/13.0.0/atomic:1531:45: note:   no known
conversion for argument 1 from 'A' to 'volatile std::atomic<int>*'
 1531 | atomic_fetch_add(volatile atomic<_ITp>* __a,
  |  ~~~^~~

### EXPECTED OUTPUT:
Clean compile

### COMPILER VERSION INFO (g++ -v):
Using built-in specs.
COLLECT_GCC=/opt/wandbox/gcc-head/bin/g++
COLLECT_LTO_WRAPPER=/opt/wandbox/gcc-head/libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../source/configure --prefix=/opt/wandbox/gcc-head
--enable-languages=c,c++ --disable-multilib --without-ppl --without-cloog-ppl
--enable-checking=release --disable-nls --enable-lto
LDFLAGS=-Wl,-rpath,/opt/wandbox/gcc-head/lib,-rpath,/opt/wandbox/gcc-head/lib64,-rpath,/opt/wandbox/gcc-head/lib32
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.0.0 20220611 (experimental) (GCC)
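
For comparison, a minimal sketch (not part of the reported test case) of a
call that does not use an explicit template argument. It assumes the argument
is a real std::atomic<int *> rather than a class with a conversion operator,
so the template parameter is deduced and the call should be accepted by both
the C++11 declarations and the P0558 ones. The helper name advance_one is
made up for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: advance an atomic pointer by one element and return
// the previous value. No explicit template argument is used, so the call
// does not depend on whether the parameter is declared as atomic<T*>*
// (C++11) or atomic<T>* (after P0558).
int *advance_one(std::atomic<int *> &p) {
    assert(p.load() != nullptr);  // precondition of this sketch only
    return std::atomic_fetch_add(&p, 1);
}
```

This sidesteps the problem only when the caller can drop the explicit
argument; it does not help code like the test case above, which relies on
that argument to select the conversion.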

[Bug c++/105931] [12 regression] ICE in cxx_eval_constant_expression

2022-06-11 Thread sam at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105931

--- Comment #2 from Sam James  ---
Interesting!

The minimised version of this is:
```
$ cat test.cxx
template  decltype(0 % ElemSize == 0)
```

```
$ g++ -o test.o -c -O2 test.cxx
new.cxx:1:52: internal compiler error: unexpected expression ‘ElemSize’ of kind
template_parm_index
1 | template  decltype(0 % ElemSize == 0)
  |   ~^~~~
0x6def33 cxx_eval_constant_expression
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:7587
0x6df310 cxx_eval_outermost_constant_expr
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:7824
0x6e154d potential_constant_expression_1
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9274
0x6e1ef5 potential_constant_expression_1(tree_node*, bool, bool, bool, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9550
0x6e1ef5 is_constant_expression(tree_node*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9607
0x6e1ef5 is_nondependent_constant_expression(tree_node*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9644
0x6e2b04 maybe_constant_value(tree_node*, tree_node*, bool)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:8071
0x74e35b fold_for_warn(tree_node*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/expr.cc:416
0x8c9552 shorten_compare(unsigned int, tree_node**, tree_node**, tree_node**,
tree_code*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/c-family/c-common.cc:3237
0x889d72 cp_build_binary_op(op_location_t const&, tree_code, tree_node*,
tree_node*, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/typeck.cc:6158
0x6bd96c build_new_op(op_location_t const&, tree_code, int, tree_node*,
tree_node*, tree_node*, tree_node*, tree_node**, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/call.cc:6935
0x88041b build_x_binary_op(op_location_t const&, tree_code, tree_node*,
tree_code, tree_node*, tree_code, tree_node*, tree_node**, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/typeck.cc:4563
0x81f7e7 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool,
bool)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/pt.cc:20369
0x82337a instantiate_non_dependent_expr_internal(tree_node*, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/pt.cc:6367
0x82337a instantiate_non_dependent_expr_sfinae(tree_node*, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/pt.cc:6388
0x85eec3 finish_decltype_type(tree_node*, bool, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/semantics.cc:11255
0x7e183f cp_parser_decltype
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:16540
0x7fa3e7 cp_parser_simple_type_specifier
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:19647
0x7d6cbd cp_parser_type_specifier
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:19424
0x7d7d81 cp_parser_decl_specifier_seq
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:15905
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
```

This minimised version is only slightly different to the one in the original
bug.

[Bug target/96463] [SVE] Optimise svld1rq from vectors

2022-06-11 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96463

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Prathamesh Kulkarni:

https://gcc.gnu.org/g:494bec025002df422f2faa947138bf3643d80b54

commit r13-1055-g494bec025002df422f2faa947138bf3643d80b54
Author: Prathamesh Kulkarni 
Date:   Sun Jun 12 08:50:16 2022 +0530

PR96463: Optimise svld1rq from vectors for little endian AArch64 targets.

The patch folds:
lhs = svld1rq({-1, -1, ...}, rhs)
into:
tmp = mem_ref [(elem_type * {ref-all}) rhs]
lhs = vec_perm_expr.
which is then expanded using aarch64_expand_sve_dupq.

Example:

svint32_t
foo (int32x4_t x)
{
  return svld1rq (svptrue_b8 (), &x[0]);
}

code-gen:
foo:
.LFB4350:
dup z0.q, z0.q[0]
ret

The patch relaxes type-checking for VEC_PERM_EXPR by allowing different
vector types for lhs and rhs provided:
(1) rhs3 is constant and has integer type element.
(2) len(lhs) == len(rhs3) and len(rhs1) == len(rhs2)
(3) lhs and rhs have same element type.

gcc/ChangeLog:
PR target/96463
* config/aarch64/aarch64-sve-builtins-base.cc: Include ssa.h.
(svld1rq_impl::fold): Define.
* config/aarch64/aarch64.cc (expand_vec_perm_d): Define new members
op_mode and op_vec_flags.
(aarch64_evpc_reencode): Initialize newd.op_mode and
newd.op_vec_flags.
(aarch64_evpc_sve_dup): New function.
(aarch64_expand_vec_perm_const_1): Gate existing calls to
aarch64_evpc_* functions under d->vmode == d->op_mode,
and call aarch64_evpc_sve_dup.
(aarch64_vectorize_vec_perm_const): Remove assert
d->vmode != d->op_mode, and initialize d.op_mode and
d.op_vec_flags.
* tree-cfg.cc (verify_gimple_assign_ternary): Allow different
vector types for lhs and rhs in VEC_PERM_EXPR if rhs3 is
constant.

gcc/testsuite/ChangeLog:
PR target/96463
* gcc.target/aarch64/sve/acle/general/pr96463-1.c: New test.
* gcc.target/aarch64/sve/acle/general/pr96463-2.c: Likewise.

[Bug lto/105933] New: LTO ltrans object files does not have proper st_bind and st_visibility

2022-06-11 Thread ishitatsuyuki at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105933

Bug ID: 105933
   Summary: LTO ltrans object files does not have proper st_bind
and st_visibility
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ishitatsuyuki at gmail dot com
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

It looks like the .ltrans.o object files emitted by the GCC gold plugin do not
specify the `st_bind` and `st_visibility` attributes properly, and instead
rely on the linker to somehow carry them over from the information passed
through `add_symbols`.

This is causing issues in mold (https://github.com/rui314/mold/issues/524), for
cases of TLS symbols which use GNU_UNIQUE instead of WEAK and cannot be
overridden. In mold we just throw away the IR object files and symbols once we
get the LTO output, doing the name resolution again --- and therefore the
assumption made for gold does not hold here.

I'd argue this design is not great for debugging either, as it creates an
object file that can only be linked while using the LTO plugin; if the
individual object files are passed directly to the linker on the command line,
the result is a duplicate symbol error.

On a separate topic, it looks like we're missing COMDAT information as well ---
which AFAIK cannot be passed through any of the gold plugin API, unlike
`add_symbols`. Due to weird interactions between weak symbols in the main
object files and strong symbols in a static library, I think mold requires the
COMDAT to deduplicate the sections beforehand in this case, even if it's not
strictly necessary by the spec. So we would appreciate if you could include
that information in the LTO output too.
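
For readers less familiar with the fields involved, here is a small sketch of
how binding and visibility are encoded in an ELF symbol table entry: the
binding is the upper four bits of `st_info` and the visibility is the low two
bits of `st_other`. The enum and function names are illustrative and not
taken from GCC or mold; the numeric values are the usual ELF constants.

```cpp
#include <cassert>
#include <cstdint>

// Binding values from the ELF gABI, plus the GNU extension STB_GNU_UNIQUE.
enum class Binding : std::uint8_t { Local = 0, Global = 1, Weak = 2, GnuUnique = 10 };

// Visibility values from the ELF gABI.
enum class Visibility : std::uint8_t { Default = 0, Internal = 1, Hidden = 2, Protected = 3 };

// Same computation as ELF64_ST_BIND: binding is the upper four bits of st_info.
constexpr Binding symbol_binding(std::uint8_t st_info) {
    return static_cast<Binding>(st_info >> 4);
}

// Same computation as ELF64_ST_VISIBILITY: low two bits of st_other.
constexpr Visibility symbol_visibility(std::uint8_t st_other) {
    return static_cast<Visibility>(st_other & 0x3);
}

// Usage example: 0xA6 is STB_GNU_UNIQUE (10) combined with STT_TLS (6).
inline void binding_example() {
    assert(symbol_binding(0xA6) == Binding::GnuUnique);
    assert(symbol_visibility(0x02) == Visibility::Hidden);
}
```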

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread sneves at dei dot uc.pt via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

Samuel Neves  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sneves at dei dot uc.pt

--- Comment #6 from Samuel Neves  ---
Based on that bisect commit, it is also possible to repro this issue in earlier
GCCs (11, 10, seems fine on <= 9) purely by taking away the -mno-sseX, which
triggers the same splitting as now on gcc-12: https://godbolt.org/z/KEcWGT9Yc

[Bug preprocessor/105732] [10/11 Regression] internal compiler error: unspellable token PADDING

2022-06-11 Thread linux_dr at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105732

--- Comment #20 from Loren Osborn  ---
great... thank you for the update.

[Bug target/105932] New: Small structures returned incorrectly in i386 Microsoft ABI

2022-06-11 Thread josephcsible at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105932

Bug ID: 105932
   Summary: Small structures returned incorrectly in i386
Microsoft ABI
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Keywords: ABI, wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: josephcsible at gmail dot com
  Target Milestone: ---

Consider this C code:

struct foo {
int x, y;
};

extern int x, y;

struct foo f(void) {
struct foo rv;
rv.x = x;
rv.y = y;
return rv;
}

When compiled with "-O2 -m32 -mabi=ms", it compiles to this:

f:
movd    x, %xmm0
movl    4(%esp), %eax
movd    y, %xmm1
punpckldq   %xmm1, %xmm0
movq    %xmm0, (%eax)
ret

Which expects to be passed a hidden parameter to hold the address of the return
value. But in the i386 Microsoft ABI, that's not how returns work for POD types
that are 64 bits or smaller. Here's what it should compile to instead:

f:
movl    x, %eax
movl    y, %edx
ret
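
To make the expected convention concrete, here is a small sketch that models
the EDX:EAX register pair as one 64-bit integer, with rv.x in the low half
(EAX) and rv.y in the high half (EDX), matching the expected assembly above.
The helper names are made up for illustration; this only models the data
layout and is not how the compiler implements the return.

```cpp
#include <cassert>
#include <cstdint>

struct foo {
    std::int32_t x, y;
};

// The register-return rule discussed above applies to 64 bits or fewer.
static_assert(sizeof(foo) == 8, "foo must fit in the EDX:EAX pair");

// Hypothetical model of the return registers: bits 0..31 stand for EAX
// (rv.x), bits 32..63 stand for EDX (rv.y).
std::uint64_t pack_edx_eax(foo rv) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(rv.y)) << 32) |
           static_cast<std::uint32_t>(rv.x);
}

// Inverse of pack_edx_eax: what the caller would read back from the pair.
foo unpack_edx_eax(std::uint64_t regs) {
    foo rv;
    rv.x = static_cast<std::int32_t>(static_cast<std::uint32_t>(regs));
    rv.y = static_cast<std::int32_t>(static_cast<std::uint32_t>(regs >> 32));
    return rv;
}

// Usage example: x = 1 lands in the low half, y = 2 in the high half.
inline void pack_example() {
    assert(pack_edx_eax(foo{1, 2}) == 0x0000000200000001ULL);
}
```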

[Bug middle-end/105905] A possible rounding error

2022-06-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105905

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|WAITING                     |RESOLVED

--- Comment #3 from Andrew Pinski  ---
So clang defaults to -ffp-contract=off (or maybe to "on", which for GCC is
actually the same as off), while GCC defaults to -ffp-contract=fast. With
-march=native, the FMA instructions are enabled, which allows GCC to contract
some floating-point operations and use FMA more.

Using -ffp-contract=off (or -ffp-contract=on) will get the behavior the
developer wants.
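
A small sketch of why contraction changes results, written independently of
the reporter's code. One function rounds the product to float before the
addition, the other uses std::fma, which rounds only once. The function names
and the input values are chosen for illustration: the exact product is
1 + 2^-11 + 2^-24, and its last term is lost when the product is rounded to
float on its own.

```cpp
#include <cassert>
#include <cmath>

// a*b + c with the product rounded to float before the addition, which is
// the behavior without contraction. The volatile store keeps the compiler
// from fusing the two operations in this sketch.
float mul_add_separate(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// a*b + c computed with a single rounding, as an FMA instruction would.
float mul_add_fused(float a, float b, float c) {
    return std::fma(a, b, c);
}

// Usage example: the two variants disagree on these inputs.
inline void contraction_example() {
    const float a = 1.0f + std::ldexp(1.0f, -12);
    const float c = -(1.0f + std::ldexp(1.0f, -11));
    assert(mul_add_separate(a, a, c) == 0.0f);
    assert(mul_add_fused(a, a, c) == std::ldexp(1.0f, -24));
}
```

Neither answer is a compiler bug; the two differ only in how many times the
intermediate result is rounded.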

[Bug middle-end/105905] A possible rounding error

2022-06-11 Thread zhonghao at pku dot org.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105905

--- Comment #2 from zhonghao at pku dot org.cn ---
A programmer answered me, and provided some details. Here, I copy his response:

"This code:

Vector2 v{Math::sin(37.0_degf), Math::cos(37.0_degf)};

Utility::print("{:.10}\n", v[1]);
Utility::print("{:.10}\n", (v*v.lengthInverted())[1]);
Utility::print("{:.10}\n", (v/v.length())[1]);
prints the following in a debug build (-march=native -g) and in a release build
without -march=native (so just -O3).

0.7986354828
0.7986354828
0.7986354828
However, it prints the following in a -march=native -O3 build.

0.7986354828
0.798635602
0.7986355424
Okay, so I thought it's some optimization kicking in, producing a different
result, but then I realized that this code:

Vector2 v{Math::sin(37.0_degf), Math::cos(37.0_degf)};

// Utility::print("{:.10}\n", v[1]);
Utility::print("{:.10}\n", (v*v.lengthInverted())[1]);
Utility::print("{:.10}\n", (v/v.length())[1]);
prints

0.7986354828
0.7986354828
even with -march=native -O3. So, ummm, the v[1] in combination with
Utility::print() causes that particular optimization to kick in, and if it's
not there, it doesn't optimize anything? If I change Utility::print() to
std::printf(), it also stops being strange and prints 0.7986354828 three times.
So I suppose there has to be sufficiently complex code around these operations
to make some optimization kick in? I tried to look at the disassembly, the
"strange" variant has a bunch of FMA calls, the non-strange variant has none,
but those calls could also have been somewhere else, I'm not that good at
understanding the compiler output.

I tested with GCC 10 as well, and it has the same weird behavior as 11.
Unfortunately I don't remember if I was at GCC 10 or 9 before that commit.
Clang prints 0.7986354828 always.
"

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread torvalds--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #5 from Linus Torvalds  ---
(In reply to Linus Torvalds from comment #4)
> 
> I'm  not proud of that hacky thing, but since gcc documentation is written
> in sanskrit, and mere mortals can't figure it out, it's the best I could do.

And by 'sanskrit' I mean 'texi'.

I understand that it's how GNU projects are supposed to work, but markdown (or
rst) is just *so* much more legible and you really can read it like text.

Anyway, that's my excuse for not knowing how to "just generate cc1" for a saner
git bisect run. What I did worked, but was just incredibly ugly. There must be
some better way gcc developers have when they want to bisect cc1 behavior.

[Bug c++/105756] [12 Regression] ICE in cxx_eval_constant_expression at cp/constexpr.cc:7586: unexpected expression ‘ElemSize’ of kind template_parm_index

2022-06-11 Thread sam at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105756

--- Comment #7 from Sam James  ---
Thanks a bunch. Unfortunately the original issue (not the reduced one) still
fails, but I've filed bug 105931 for that.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread torvalds--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #4 from Linus Torvalds  ---
So hey, since you guys use git now, I thought I might as well just bisect this.

Now, I have no idea what the best and most efficient way is to generate only
"cc1", so my bisection run was this unholy mess of "just run configure, and
then run 'make -j128' for long enough that 'host-x86_64-pc-linux-gnu/gcc/cc1'
gets generated, and test that".

I'm not proud of that hacky thing, but since gcc documentation is written in
sanskrit, and mere mortals can't figure it out, it's the best I could do.

And the problem bisects down to

  8ea4a34bd0b0a46277b5e077c89cbd86dfb09c48 is the first bad commit
  commit 8ea4a34bd0b0a46277b5e077c89cbd86dfb09c48
  Author: Roger Sayle 
  Date:   Sat Mar 5 08:50:45 2022 +

  PR 104732: Simplify/fix DI mode logic expansion/splitting on -m32.

so yes, this seems to be very much specific to the i386 target.

And yes, I also verified that reverting that commit on the current master
branch solves it for me.

Again: this was just a completely mindless bisection, with a "revert to verify"
on top of the current trunk, which for me happened to be commit cd02f15f1ae
("xtensa: Improve constant synthesis for both integer and floating-point").

I'm attaching the revert patch I used just so that you can see exactly what I
did. I probably shouldn't have actually removed the testsuite entry, but again:
ENTIRELY MINDLESS BISECTION RESULT.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread torvalds--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #3 from Linus Torvalds  ---
Created attachment 53123
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53123&action=edit
Mindless revert that fixes things for me

[Bug c++/105931] [12 regression] ICE in cxx_eval_constant_expression

2022-06-11 Thread sam at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105931

Sam James  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |herrtimson at yahoo dot de,
                   |                            |ppalka at gcc dot gnu.org,
                   |                            |slyfox at gcc dot gnu.org

--- Comment #1 from Sam James  ---
I'll try to reduce now but wanted to include original given the previous issue.

[Bug c++/105931] New: [12 regression] ICE in cxx_eval_constant_expression

2022-06-11 Thread sam at gentoo dot org via Gcc-bugs
rce/spidermonkey/mozjs-78.6.0/build-debug/dist/include/mozilla/Assertions.h:482:31:
note: in expansion of macro ‘MOZ_ASSERT_HELPER1’
  482 | #define MOZ_ASSERT_GLUE(a, b) a b
  |   ^
/var/tmp/portage/games-strategy/0ad-0.0.25b_alpha-r1/work/0ad-0.0.25b-alpha/libraries/source/spidermonkey/mozjs-78.6.0/build-debug/dist/include/mozilla/Assertions.h:490:5:
note: in expansion of macro ‘MOZ_ASSERT_GLUE’
  490 | MOZ_ASSERT_GLUE(   
\
  | ^~~
/var/tmp/portage/games-strategy/0ad-0.0.25b_alpha-r1/work/0ad-0.0.25b-alpha/libraries/source/spidermonkey/mozjs-78.6.0/js/src/util/TrailingArray.h:73:5:
note: in expansion of macro ‘MOZ_ASSERT’
   73 | MOZ_ASSERT((end - start) % ElemSize == 0);
  | ^~
0x6def33 cxx_eval_constant_expression
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:7587
0x6df310 cxx_eval_outermost_constant_expr
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:7824
0x6e154d potential_constant_expression_1
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9274
0x6e1ef5 potential_constant_expression_1(tree_node*, bool, bool, bool, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9550
0x6e1ef5 is_constant_expression(tree_node*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9607
0x6e1ef5 is_nondependent_constant_expression(tree_node*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:9644
0x6e2b04 maybe_constant_value(tree_node*, tree_node*, bool)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/constexpr.cc:8071
0x74e35b fold_for_warn(tree_node*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/expr.cc:416
0x8c9552 shorten_compare(unsigned int, tree_node**, tree_node**, tree_node**,
tree_code*)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/c-family/c-common.cc:3237
0x889d72 cp_build_binary_op(op_location_t const&, tree_code, tree_node*,
tree_node*, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/typeck.cc:6158
0x6bd96c build_new_op(op_location_t const&, tree_code, int, tree_node*,
tree_node*, tree_node*, tree_node*, tree_node**, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/call.cc:6935
0x88041b build_x_binary_op(op_location_t const&, tree_code, tree_node*,
tree_code, tree_node*, tree_code, tree_node*, tree_node**, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/typeck.cc:4563
0x81f7e7 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool,
bool)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/pt.cc:20369
0x82337a instantiate_non_dependent_expr_internal(tree_node*, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/pt.cc:6367
0x82337a instantiate_non_dependent_expr_sfinae(tree_node*, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/pt.cc:6388
0x85eec3 finish_decltype_type(tree_node*, bool, int)
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/semantics.cc:11255
0x7e183f cp_parser_decltype
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:16540
0x7fa3e7 cp_parser_simple_type_specifier
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:19647
0x7d6cbd cp_parser_type_specifier
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:19424
0x7e7eeb cp_parser_type_specifier_seq
   
/usr/src/debug/sys-devel/gcc-12.1.1_p20220611/gcc-12-20220611/gcc/cp/parser.cc:24362
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
```

```
# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/12.1.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-12.1.1_p20220611/work/gcc-12-20220611/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/12.1.1
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/12.1.1/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/12.1.1
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/12.1.1/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/12.1.1/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/12.1.1/include/g++-v12
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/12.1.1/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-inclu

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #2 from Andrew Pinski  ---
thumb1 (which has 16 registers but really only 8 are GPRs) does not have this
issue in GCC 12, so I suspect a target specific change caused this.

[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection
   Target Milestone|---                         |12.2
            Summary|Excessive stack spill       |[12/13 Regression]
                   |generation on 32-bit x86    |Excessive stack spill
                   |                            |generation on 32-bit x86
             Target|                            |i?86-*-*

[Bug target/105930] Excessive stack spill generation on 32-bit x86

2022-06-11 Thread torvalds--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #1 from Linus Torvalds  ---
Side note: it might be best to clarify that this is a regression specific to
gcc-12.

Gcc 11.3 doesn't have the problem, and generates code for this same test-case
with a stack frame of only 428 bytes. That's still bigger than clang, but not
"crazy bigger".

[Bug libquadmath/105101] incorrect rounding for sqrtq

2022-06-11 Thread already5chosen at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

--- Comment #19 from Michael_S  ---
(In reply to jos...@codesourcery.com from comment #18)
> libquadmath is essentially legacy code.  People working directly in C 
> should be using the C23 _Float128 interfaces and *f128 functions, as in 
> current glibc, rather than libquadmath interfaces (unless their code needs 
> to support old glibc or non-glibc C libraries that don't support _Float128 
> in C23 Annex H).  It would be desirable to make GCC generate *f128 calls 
> when appropriate from Fortran code using this format as well; see 
>  for 
> more discussion of the different cases involved.
> 


On MSYS2, _Float128 and __float128 appear to be mostly the same thing, mapped
to the same library routines, with the significant difference that _Float128 is
not accessible from C++. Since all my test benches are written in C++, I can't
even validate that what I wrote above is 100% true.

Also, according to my understanding of the glibc docs (not the clearest piece
of text that I have ever read), the relevant square root routine should be
named sqrtf128().
Unfortunately, nothing like that appears to be present in either math.h or the
library. Am I doing something wrong?


> Most of libquadmath is derived from code in glibc - some of it can now be 
> updated from the glibc code automatically (see update-quadmath.py), other 
> parts can't (although it would certainly be desirable to extend 
> update-quadmath.py to cover that other code as well).  See the commit 
> message for commit 4239f144ce50c94f2c6cc232028f167b6ebfd506 for a more 
> detailed discussion of what code comes from glibc and what is / is not 
> automatically handled by update-quadmath.py.  Since update-quadmath.py 
> hasn't been run for a while, it might need changes to work with more 
> recent changes to the glibc code.
> 
> sqrtq.c is one of the files not based on glibc code.  That's probably 
> because glibc didn't have a convenient generic implementation of binary128 
> sqrt to use when libquadmath was added - it has soft-fp implementations 
> used for various architectures, but those require sfp-machine.h for each 
> architecture (which maybe we do in fact have in libgcc for each relevant 
> architecture, but it's an extra complication).  Certainly making it 
> possible to use code from glibc for binary128 sqrt would be a good idea, 
> but while we aren't doing that, it should also be OK to improve sqrtq 
> locally in libquadmath.
> 
> The glibc functions for this format are generally *not* optimized for 
> speed yet (this includes the soft-fp-based versions of sqrt).  Note that 
> what's best for speed may depend a lot on whether the architecture has 
> hardware support for binary128 arithmetic; if it has such support, it's 
> more likely an implementation based on binary128 floating-point operations 
> is efficient; 

Not that simple.
Right now, there are only two [gcc] platforms with hw binary128: IBM POWER and
IBM z. I am not sure about the latter, but the former has the xssqrtqp
instruction, which is likely the right way to do sqrtq()/sqrtf128() on this
platform. If z is the same, which sounds likely, then an implementation based
on binary128 mul/add/fma currently has no use cases at all.


> if it doesn't, direct use of integer arithmetic, without 
> lots of intermediate packing / unpacking into the binary128 format, is 
> likely to be more efficient.  

It's not just redundant packing/unpacking. A direct integer implementation
does fewer arithmetic operations as well, mainly because it knows exactly which
parts of the 226-bit multiplication product are relevant and does not calculate
the parts that are irrelevant.
And with integer math it is much easier to achieve correct rounding in corner
cases that call for precision in excess of 226 bits, so even fmaq() is not
enough. And yes, there are one or two cases like that.
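
A toy illustration of the rounding point, scaled down to 64-bit integers
instead of the 113-bit significands of binary128, so it is only an analogy
and not the sqrtq code. With integer arithmetic the remainder n - r*r is
exact, so the round-to-nearest decision reduces to a single comparison.
The function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Floor of the square root, computed bit by bit with integer operations only.
std::uint64_t floor_isqrt(std::uint64_t n) {
    std::uint64_t r = 0;
    std::uint64_t bit = std::uint64_t{1} << 62;  // highest power of four
    while (bit > n) bit >>= 2;
    while (bit != 0) {
        if (n >= r + bit) {
            n -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return r;
}

// Square root rounded to the nearest integer. Since (r + 1/2)^2 equals
// r^2 + r + 1/4, the result is r + 1 exactly when the remainder exceeds r.
// A tie cannot occur because n is an integer.
std::uint64_t nearest_isqrt(std::uint64_t n) {
    const std::uint64_t r = floor_isqrt(n);
    const std::uint64_t remainder = n - r * r;
    return remainder > r ? r + 1 : r;
}

// Usage example: sqrt(6) is about 2.449 and sqrt(7) is about 2.646.
inline void isqrt_example() {
    assert(nearest_isqrt(6) == 2);
    assert(nearest_isqrt(7) == 3);
}
```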

> See the discussion starting at 
>  
> for more on this - glibc is a better place for working on most optimized 
> function implementations than GCC.  See also 
>  - those functions are aiming to 
> be correctly rounding, which is *not* a goal for most glibc libm 
> functions, but are still quite likely to be faster than the existing 
> non-optimized functions in glibc.
> 
> fma is a particularly tricky case because it *is* required to be correctly 
> rounding, in all rounding modes, and correct rounding implies correct 
> exceptions, *and* correct exceptions for fma includes getting right the 
> architecture-specific choice of whether tininess is detected before or 
> after rounding.

I suspect that by strict IEEE-754 rules sqrt() is the same as fma(), i.e. you
have to calculate the root to infinite precision and then round in accordance
with the current mode.

[O.T.]
The whole rounding modes business complicates things quite a lot 

[Bug c/105930] New: Excessive stack spill generation on 32-bit x86

2022-06-11 Thread torvalds--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

Bug ID: 105930
   Summary: Excessive stack spill generation on 32-bit x86
   Product: gcc
   Version: 12.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: torva...@linux-foundation.org
  Target Milestone: ---

Created attachment 53121
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53121&action=edit
Test-case extracted from the generic blake2b kernel code

Gcc-12 seems to generate a huge number of stack spills on this blake2b
test-case, to the point where it overflows the allowable kernel stack on 32-bit
x86.

This crypto thing has two 128-byte buffers, so a stack frame a bit larger than
256 is expected when the dataset doesn't fit in the register set.

Just as an example, on this code, clang-14.0.0 generates a stack frame that is
296 bytes. 

In contrast, gcc-12.1.1 generates a stack frame that is almost an order of
magnitude(!) larger, at 2620 bytes.

The trivial Makefile I used for this test-case is

# The kernel cannot just randomly use FP/MMX/AVX
CFLAGS := -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
CFLAGS += -m32
CFLAGS += -O2

test:
	gcc $(CFLAGS) -Wall -S blake2b.c
	grep "sub.*%[er]sp" blake2b.s

to easily test different flags and the end result, but as can be seen from
above, it really doesn't need any special flags except the ones that disable
MMX/AVX code generation.

And the generated code looks perfectly regular, except for the fact that it
uses almost 3kB of stack space.

Note that "-m32" is required to trigger this - the 64-bit case does much
better, presumably because it has more registers and thus needs fewer spills.
It gets worse with some added debug flags we use in the kernel, but not that
kind of "order of magnitude" worse.

Using -O1 or -Os makes no real difference.

This is presumably due to some newly triggered optimization in gcc-12, but I
can't even begin to guess at what we'd need to disable (or enable) to avoid
this horrendous stack growth. Some very aggressive instruction scheduling thing
that spreads out all the calculations and always wants to spill-and-reload the
subexpressions that it CSE'd? I dunno.

Please advise. The excessive stack literally causes build failures due to us using
-Werror-frame-larger-than= to make sure stack use remains sanely bounded. The
kernel stack is a rather limited resource.
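
For readers without the attachment, the shape of the code probably matters
more than the details. What follows is a minimal sketch of the BLAKE2b mixing
function G as specified in RFC 7693, not the attached test-case; the names
rotr64 and blake2b_g are illustrative only. The 16-word working state v and
the 16-word message block are each 16 * 8 = 128 bytes, which would account for
the two 128-byte buffers, and every operation is 64 bits wide, so on 32-bit
x86 each value needs a register pair.

```c
#include <stdint.h>

/* Rotate a 64-bit word right by n bits (valid for 0 < n < 64). */
static inline uint64_t rotr64(uint64_t w, unsigned n)
{
    return (w >> n) | (w << (64 - n));
}

/*
 * BLAKE2b mixing function G (RFC 7693, rotation constants 32/24/16/63).
 * v is the 16-word working state; a, b, c, d select four of its words;
 * x and y are two words taken from the message block.
 */
static inline void blake2b_g(uint64_t v[16], int a, int b, int c, int d,
                             uint64_t x, uint64_t y)
{
    v[a] = v[a] + v[b] + x;
    v[d] = rotr64(v[d] ^ v[a], 32);
    v[c] = v[c] + v[d];
    v[b] = rotr64(v[b] ^ v[c], 24);
    v[a] = v[a] + v[b] + y;
    v[d] = rotr64(v[d] ^ v[a], 16);
    v[c] = v[c] + v[d];
    v[b] = rotr64(v[b] ^ v[c], 63);
}
```

A full round calls G eight times over all 16 state words, and BLAKE2b runs 12
rounds, so once everything is inlined and unrolled there are far more live
64-bit values than 32-bit x86 has registers for.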

[Bug target/105929] New: [AArch64] armv8.4-a allows atomic stp. 64-bit constants can use 2 32-bit halves with _Atomic or volatile

2022-06-11 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105929

Bug ID: 105929
   Summary: [AArch64] armv8.4-a allows atomic stp. 64-bit
constants can use 2 32-bit halves with _Atomic or
volatile
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: peter at cordes dot ca
  Target Milestone: ---
Target: arm64-*-*

void foo(unsigned long *p) {
    *p = 0xdeadbeefdeadbeef;
}
// compiles nicely:  https://godbolt.org/z/8zf8ns14K
        mov     w1, 48879
        movk    w1, 0xdead, lsl 16
        stp     w1, w1, [x0]
        ret

But even with -Os -march=armv8.4-a the following doesn't:
void foo_atomic(_Atomic unsigned long *p) {
    __atomic_store_n(p, 0xdeadbeefdeadbeef, __ATOMIC_RELAXED);
}

        mov     x1, 48879
        movk    x1, 0xdead, lsl 16
        movk    x1, 0xbeef, lsl 32
        movk    x1, 0xdead, lsl 48
        stlr    x1, [x0]
        ret

ARMv8.4-a and later guarantees atomicity for aligned ldp/stp, according to
ARM's architecture reference manual: ARM DDI 0487H.a - ID020222, so we could
use the same asm as the non-atomic version.

> If FEAT_LSE2 is implemented, LDP, LDNP, and STP instructions that access 
> fewer than 16 bytes are single-copy atomic when all of the following 
> conditions are true:
> • All bytes being accessed are within a 16-byte quantity aligned to 16 bytes.
> • Accesses are to Inner Write-Back, Outer Write-Back Normal cacheable memory

(FEAT_LSE2 is the same CPU feature that gives 128-bit atomicity for aligned
ldp/stp x,x,mem)

Prior to that, apparently it wasn't guaranteed that stp of 32-bit halves merged
into a single 64-bit store. So without -march=armv8.4-a it wasn't a missed
optimization to construct the constant in a single register for _Atomic or
volatile.

But with ARMv8.4, we should use MOV/MOVK + STP.

Since there doesn't seem to be a release-store version of STP, 64-bit release
and seq_cst stores should still generate the full constant in a register,
instead of using STP + barriers.
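
The rule proposed above can be written down as a small decision function.
This is only a sketch of the reasoning in this report, not GCC code:
stp_halves_allowed and its parameters are hypothetical names, volatile
accesses are treated like relaxed ones, and it assumes the 8-byte store is
naturally aligned so that it falls inside an aligned 16-byte quantity.

```c
#include <stdbool.h>

/*
 * Returns true if a 64-bit constant store may be emitted as
 * "stp wN, wM, [addr]" (two 32-bit halves), false if the full constant
 * should be built in one x register and stored with a single str/stlr.
 *
 * atomic_or_volatile: the access is _Atomic or volatile
 * memorder:           one of the __ATOMIC_* macros predefined by GCC/Clang
 * has_lse2:           target implements FEAT_LSE2 (e.g. -march=armv8.4-a)
 */
static bool stp_halves_allowed(bool atomic_or_volatile, int memorder,
                               bool has_lse2)
{
    if (!atomic_or_volatile)
        return true;            /* plain store: no atomicity requirement */
    if (!has_lse2)
        return false;           /* stp not guaranteed single-copy atomic */
    /* No release-store form of stp, so only relaxed stores qualify. */
    return memorder == __ATOMIC_RELAXED;
}
```

With this reading, foo_atomic above would take the STP path only when built
for ARMv8.4-a or later, and release or seq_cst stores would keep the single
64-bit store.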


(Without ARMv8.4-a, or with a memory-order other than relaxed, see PR105928 for
generating 64-bit constants in 3 instructions instead of 4, at least for -Os,
with add x0, x0, x0, lsl 32)

[Bug target/105928] New: [AArch64] 64-bit constants with same high/low halves can use ADD lsl 32 (-Os at least)

2022-06-11 Thread peter at cordes dot ca via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105928

Bug ID: 105928
   Summary: [AArch64] 64-bit constants with same high/low halves
can use ADD lsl 32 (-Os at least)
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: peter at cordes dot ca
  Target Milestone: ---
Target: arm64-*-*

void foo(unsigned long *p) {
    *p = 0xdeadbeefdeadbeef;
}

cleverly compiles to https://godbolt.org/z/b3oqao5Kz

        mov     w1, 48879
        movk    w1, 0xdead, lsl 16
        stp     w1, w1, [x0]
        ret

But producing the value in a register uses more than 3 instructions:

unsigned long constant(){
    return 0xdeadbeefdeadbeef;
}

        mov     x0, 48879
        movk    x0, 0xdead, lsl 16
        movk    x0, 0xbeef, lsl 32
        movk    x0, 0xdead, lsl 48
        ret

At least with -Os, and maybe at -O2 or -O3 if it's efficient, we could be doing
a shifted ADD or ORR to broadcast a zero-extended 32-bit value to 64-bit.

        mov     x0, 48879
        movk    x0, 0xdead, lsl 16
        add     x0, x0, x0, lsl 32

Some CPUs may fuse sequences of movk, and shifted operands for ALU ops may take
extra time in some CPUs, so this might not actually be optimal for performance,
but it is smaller for -Os and -Oz.
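
The arithmetic behind the three-instruction sequence can be checked in plain
C. The sketch below mirrors each instruction with one statement; the function
name broadcast_deadbeef is made up for illustration, and the trick only
applies to constants whose two 32-bit halves are equal.

```c
#include <stdint.h>

/* Emulates: mov x0, 48879 ; movk x0, 0xdead, lsl 16 ; add x0, x0, x0, lsl 32 */
static uint64_t broadcast_deadbeef(void)
{
    uint64_t x = 48879;                                  /* mov: x = 0xbeef */
    /* movk: replace bits 16..31, keep all other bits */
    x = (x & ~(0xffffULL << 16)) | (0xdeadULL << 16);
    /* add with shifted operand: copy the low half into the high half */
    x = x + (x << 32);
    return x;
}
```

Because the low 32 bits of (x << 32) are zero and the high 32 bits of x are
zero, the ADD can never carry, which is why ORR would give the same result.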

We should also be using that trick for stores to _Atomic or volatile long*,
where we currently do MOV + 3x MOVK, then an STR, with ARMv8.4-a which
guarantees atomicity.


---

ARMv8.4-a and later guarantees atomicity for ldp/stp within an aligned 16-byte
chunk, so we should use MOV/MOVK / STP there even for volatile or
__ATOMIC_RELAXED.  But presumably that's a different part of GCC's internals,
so I'll report that separately.

[Bug target/105927] New: ICE: RTL check: expected code 'reg', have 'mem' in rhs_regno, at rtl.h:1932 with -mtune=k6-3 -msse

2022-06-11 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105927

Bug ID: 105927
   Summary: ICE: RTL check: expected code 'reg', have 'mem' in
rhs_regno, at rtl.h:1932 with -mtune=k6-3 -msse
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 53120
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53120&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O1 -fno-tree-dce -mtune=k6-3 -msse -m32 testcase.c
during RTL pass: combine
testcase.c: In function 'foo':
testcase.c:14:1: internal compiler error: RTL check: expected code 'reg', have
'mem' in rhs_regno, at rtl.h:1932
   14 | }
  | ^
0x772c50 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
/repo/gcc-trunk/gcc/rtl.cc:916
0xac528d rhs_regno
/repo/gcc-trunk/gcc/rtl.h:1932
0xac6013 rhs_regno
/repo/gcc-trunk/gcc/config/i386/predicates.md:699
0xac6013 register_no_elim_operand_1
/repo/gcc-trunk/gcc/config/i386/predicates.md:677
0xac6013 register_no_elim_operand(rtx_def*, machine_mode)
/repo/gcc-trunk/gcc/config/i386/predicates.md:685
0xac6013 nonmemory_no_elim_operand(rtx_def*, machine_mode)
/repo/gcc-trunk/gcc/config/i386/predicates.md:710
0x2041a01 recog_9
/repo/gcc-trunk/gcc/config/i386/i386.md:3675
0x225d3ca recog_14
/repo/gcc-trunk/gcc/config/i386/i386.md:1086
0x22dd73d recog_330
/repo/gcc-trunk/gcc/config/i386/sse.md:28926
0x2307381 recog_for_combine_1
/repo/gcc-trunk/gcc/combine.cc:11352
0x230e42e recog_for_combine
/repo/gcc-trunk/gcc/combine.cc:11622
0x2320776 try_combine
/repo/gcc-trunk/gcc/combine.cc:3543
0x2328674 combine_instructions
/repo/gcc-trunk/gcc/combine.cc:1287
0x2328674 rest_of_handle_combine
/repo/gcc-trunk/gcc/combine.cc:14976
0x2328674 execute
/repo/gcc-trunk/gcc/combine.cc:15021
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r13-1049-20220611004016-gfddb7f65129-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r13-1049-20220611004016-gfddb7f65129-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220611 (experimental) (GCC)

[Bug fortran/105924] false floating point exception when evaluating exponential function

2022-06-11 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105924

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||kargl at gcc dot gnu.org
 Resolution|--- |INVALID

--- Comment #1 from kargl at gcc dot gnu.org ---
Why do you think that you should not get an exception?

e = -400
e*e = 160000
-e*e = -160000
exp(-e*e) = exp(-160000)  <-- This is going to underflow to zero.

You specifically asked gfortran to signal an exception if
underflow occurs with the -ffpe-trap=underflow option.  The
underflow threshold occurs at x = -745 for exp(x).
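
The numbers can be worked out explicitly. This assumes real*8 is IEEE 754
binary64, which is the usual case on x86-64 but is an assumption here:

```latex
\begin{align*}
-e \cdot e &= -\left((-400)^2\right) = -160000 \\
\text{smallest positive subnormal} &= 2^{-1074} \approx 4.94 \times 10^{-324} \\
\ln\left(2^{-1074}\right) &= -1074 \ln 2 \approx -744.44 \\
\exp(-160000) &\ll 2^{-1074}
  \quad\Longrightarrow\quad \text{the result rounds to } 0
\end{align*}
```

So any argument much below roughly -745 cannot be represented even as a
subnormal, and -160000 is far past that point.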

[Bug rtl-optimization/7061] Access of bytes in struct parameters

2022-06-11 Thread david.bolvansky at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=7061

Dávid Bolvanský  changed:

   What|Removed |Added

 CC||david.bolvansky at gmail dot com

--- Comment #10 from Dávid Bolvanský  ---
llvm emits just:
im: # @im
shufps  xmm0, xmm0, 85  # xmm0 = xmm0[1,1,1,1]
ret

[Bug c++/105925] [11/12 Regression] Could not convert '{{0, 0.0}}' from '<brace-enclosed initializer list>' to 'X'

2022-06-11 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105925

Jonathan Wakely  changed:

   What|Removed |Added

            Summary|[11.1 Regression] Could not |[11/12 Regression] Could
                   |convert '{{0, 0.0}}' from   |not convert '{{0, 0.0}}'
                   |'<brace-enclosed            |from '<brace-enclosed
                   |initializer list>' to 'X'   |initializer list>' to 'X'
   Target Milestone|---                         |11.4

[Bug c++/105756] [12 Regression] ICE in cxx_eval_constant_expression at cp/constexpr.cc:7586: unexpected expression ‘ElemSize’ of kind template_parm_index

2022-06-11 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105756

Patrick Palka  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Patrick Palka  ---
(In reply to Sam James from comment #4)
> Thanks! Could you consider backporting to 12.x soonish, if possible? I ask
> because with this, the 12.x branch is then in a pretty good state for more
> widespread testing.
> 
> (Unfortunately, I got a bit unlucky and kept hitting ICEs when trying to
> build a bunch of common packages.)

Done.

[Bug c++/105756] [12 Regression] ICE in cxx_eval_constant_expression at cp/constexpr.cc:7586: unexpected expression ‘ElemSize’ of kind template_parm_index

2022-06-11 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105756

--- Comment #5 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:47ea22015c90df31eae763c6c9e3e4b1fb801c3a

commit r12-8472-g47ea22015c90df31eae763c6c9e3e4b1fb801c3a
Author: Patrick Palka 
Date:   Fri Jun 3 14:58:22 2022 -0400

c++: value-dep but not type-dep decltype expr [PR105756]

Here during ahead of time instantiation of the value-dependent but not
type-dependent decltype expression (5 % N) == 0, cp_build_binary_op folds
the operands of the == via cp_fully_fold, which performs speculative
constexpr evaluation, and from which we crash for (5 % N) due to the
value-dependence.

Since the operand folding performed by cp_build_binary_op appears to
be solely for sake of diagnosing overflow, and since these diagnostics
are suppressed when in an unevaluated context, this patch avoids this
crash by suppressing cp_build_binary_op's operand folding accordingly.

PR c++/105756

gcc/cp/ChangeLog:

* typeck.cc (cp_build_binary_op): Don't fold operands
when c_inhibit_evaluation_warnings.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype82.C: New test.

(cherry picked from commit 0ecb6b906f215ec56df1a555139abe9ad95414fb)

[Bug libstdc++/105926] Using a spaceship operator on an optional of a type derived from optional causes infinite constraint recursion

2022-06-11 Thread ville.voutilainen at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105926

Ville Voutilainen  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ville.voutilainen at gmail dot com

--- Comment #1 from Ville Voutilainen  ---
I have a patch for it, and this likely needs a LWG issue and probably a LEWG
ack. I'll submit the latter soon-ish, and will post the patch for review, too.

[Bug libstdc++/105926] New: Using a spaceship operator on an optional of a type derived from optional causes infinite constraint recursion

2022-06-11 Thread ville.voutilainen at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105926

Bug ID: 105926
   Summary: Using a spaceship operator on an optional of a type
derived from optional causes infinite constraint
recursion
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ville.voutilainen at gmail dot com
  Target Milestone: ---

Testcase:

#include <optional>

struct oink : std::optional {
};

bool operator<(const oink&, const oink) {return false;}
bool operator>(const oink&, const oink) {return true;}
bool operator<=(const oink&, const oink) {return false;}
bool operator>=(const oink&, const oink) {return true;}
bool operator==(const oink&, const oink) {return false;}
bool operator!=(const oink&, const oink) {return true;}

int main()
{
    oink a;
    std::optional<oink> b;
    b <=> a;
}

Ends up, eventually, with..

error: satisfaction of atomic constraint 'requires(const typename
std::remove_reference<_Tp>::type& __t, const typename
std::remove_reference<_Arg>::type& __u) {{__t < __u} -> decltype(auto)
[requires std::__detail::__boolean_testable<, >];{__t > __u} ->
decltype(auto) [requires std::__detail::__boolean_testable<,
>];{__t <= __u} -> decltype(auto) [requires
std::__detail::__boolean_testable<, >];{__t >= __u} ->
decltype(auto) [requires std::__detail::__boolean_testable<,
>];{__u < __t} -> decltype(auto) [requires
std::__detail::__boolean_testable<, >];{__u > __t} ->
decltype(auto) [requires std::__detail::__boolean_testable<,
>];{__u <= __t} -> decltype(auto) [requires
std::__detail::__boolean_testable<, >];{__u >= __t} ->
decltype(auto) [requires std::__detail::__boolean_testable<, >];}
[with _Up = oink; _Tp = oink]' depends on itself
  302 | = requires(const remove_reference_t<_Tp>& __t,
  |   ^~~~
  303 |const remove_reference_t<_Up>& __u) {
  |~
  304 |   { __t <  __u } -> __boolean_testable;
  |   ~
  305 |   { __t >  __u } -> __boolean_testable;
  |   ~
  306 |   { __t <= __u } -> __boolean_testable;
  |   ~
  307 |   { __t >= __u } -> __boolean_testable;
  |   ~
  308 |   { __u <  __t } -> __boolean_testable;
  |   ~
  309 |   { __u >  __t } -> __boolean_testable;
  |   ~
  310 |   { __u <= __t } -> __boolean_testable;
  |   ~
  311 |   { __u >= __t } -> __boolean_testable;
  |   ~
  312 | };

The relevant bit being "satisfaction of atomic constraint foo depends on
itself".

[Bug driver/100830] Multilib directory picking logic handles default arguments oddly

2022-06-11 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100830

Thomas Schwinge  changed:

   What|Removed |Added

 CC||kito at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #2 from Thomas Schwinge  ---
(Beware: I'm still learning GCC's multilib magic, but) has this, by chance,
been fixed by Kito's r11-5530-g3a5d8ed231a0329822b7c032ba0834991732d2a0 "Fix
print_multilib_info when default arguments appear in the option list with '!'"?
 (That's a GCC 11 commit; this PR is for "Version: 10.2.0".)

[Bug rtl-optimization/105747] Scheduler can take a long time on arm-linux sometimes

2022-06-11 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105747

--- Comment #8 from David Binderman  ---
(In reply to David Binderman from comment #7)
> Created attachment 53119 [details]
> C source code
> 
> Another one. Over 15 minutes this time.

Flag -O2 required.

[Bug rtl-optimization/105747] Scheduler can take a long time on arm-linux sometimes

2022-06-11 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105747

--- Comment #7 from David Binderman  ---
Created attachment 53119
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53119&action=edit
C source code

Another one. Over 15 minutes this time.

[Bug c++/105925] New: [11.1 Regression] Could not convert '{{0, 0.0}}' from '<brace-enclosed initializer list>' to 'X'

2022-06-11 Thread jehova at existiert dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105925

Bug ID: 105925
   Summary: [11.1 Regression] Could not convert '{{0, 0.0}}' from
'<brace-enclosed initializer list>' to 'X'
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jehova at existiert dot net
  Target Milestone: ---

struct V
{
    int i;
    double d;
};

struct X
{
    union
    {
        int x;
        V y;
    };
};

X foo()
{
    return {.y = {0, 0.0}};
}

Compilation with 'g++ -std=c++20' works in 11.1 but fails for 11.2 and newer,
particularly in 12.1.

See https://godbolt.org/z/foq9aEs57

The code is also accepted by other major compilers (clang and MSVC).

[Bug middle-end/101836] __builtin_object_size(P->M, 1) where M is an array and the last member of a struct fails

2022-06-11 Thread kees at outflux dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

--- Comment #17 from Kees Cook  ---
(In reply to qinzhao from comment #16)
> additional work are needed in order to make this task complete:
> 
> 1. add one more new gcc option:
> 
> -fstrict-flex-arrays
> 
> when it's on, only treat the following cases as flexing array:
> 
> trailing array with size 0;
> trailing array with size 1;
> trailing flexible array;
> 
> all other trailing arrays with size > 1 will be treated as normal arrays. 

Under -fstrict-flex-arrays, arrays of size 0 and 1 should *not* be treated as
flex arrays. Only "[]" should be a flexible array. Everything else should be
treated as having the literal size given.
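
To make the distinction concrete, here is a minimal C sketch of the two kinds
of declaration under discussion. The struct and function names are
hypothetical, and the comments describe the behavior argued for in this
comment, not what any released GCC does.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* True flexible array member (C99 "[]"): size is fixed at allocation time. */
struct msg_flex {
    size_t len;
    char data[];
};

/*
 * Trailing one-element array. Under the strict reading argued for above,
 * data is exactly 1 byte and must not be treated as a flexible array.
 */
struct msg_one {
    size_t len;
    char data[1];
};

/* Allocate a msg_flex holding a copy of n bytes from src; caller frees. */
static struct msg_flex *msg_flex_new(const char *src, size_t n)
{
    struct msg_flex *m = malloc(sizeof(*m) + n);
    if (m == NULL)
        return NULL;
    m->len = n;
    memcpy(m->data, src, n);
    return m;
}
```

With the strict behavior, only accesses through msg_flex.data would be allowed
to extend past the declared struct, and bounds checking on msg_one.data would
use its literal one-byte size.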

[Bug fortran/105924] New: false floating point exception when evaluating exponential function

2022-06-11 Thread yelinhui at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105924

Bug ID: 105924
   Summary: false floating point exception when evaluating
exponential function
   Product: gcc
   Version: 7.5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yelinhui at hotmail dot com
  Target Milestone: ---

$ gfortran --version
GNU Fortran (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.

$ cat test.f
  program test
  implicit none
  real*8 :: e
  e=-4.d2  
  print *, e, exp(-e*e)
  end program test

$ gfortran -O0 -g -fbacktrace -ffpe-trap=zero,overflow,underflow,invalid,denormal test.f

$ ./a.out

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic
operation.

Backtrace for this error:
#0  0x1554d03f82ed in ???
#1  0x1554d03f7503 in ???
#2  0x1554cfc8cf0f in ???
#3  0x1554d00c5ea4 in ???
#4  0x1554d004e1fe in ???
#5  0x55b38e68a9d5 in test
at /tmp/test.f:7
#6  0x55b38e68aa54 in main
at /tmp/test.f:9
Floating point exception (core dumped)