[Bug libstdc++/113504] High memory usage for parallel `std::sort`

2024-01-22 Thread ruben.laso at tuwien dot ac.at via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113504

--- Comment #2 from Ruben Laso  ---
(In reply to Jonathan Wakely from comment #1)
> The parallel algos are taken from
> https://github.com/llvm/llvm-project/tree/main/pstl so I would file an issue
> upstream rather than here. The Intel PSTL developers are the right people to
> ask.

I will ask there, then. Thank you!

[Bug libstdc++/113500] Using std::format with float or double based std::chrono::time_point causes error: no match for 'operator<<'

2024-01-22 Thread Hirthammer--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113500

--- Comment #11 from hirtham...@allterra-dno.de ---
(In reply to Jonathan Wakely from comment #7)
> (In reply to Jonathan Wakely from comment #6)
> > (In reply to Hirthammer from comment #5)
> > > This whole thing with std::format and std::chrono::time_point is
> > > currently a total minefield.
> > 
> > That seems like an exaggeration.
> > 
> > > In MSVC it is even more complicated and I already reported
> > > the bug in October 2023. See:
> > > 
> > > https://developercommunity.visualstudio.com/t/Using-std::format-with-unsigned-integer-/10501153
> > > 
> > > If you change the clock to utc_clock or gps_clock the code compiles with
> > > MSVC (but not with GCC) on Compiler Explorer.
> > 
> > It compiles fine with GCC for me.
> 
> Ah, maybe you mean your original example. The one in the MSVC bug report
> compiles fine with GCC using utc_clock and gps_clock.
> 
> Your original example doesn't, because formatting a utc_time or gps_time is
> specified in terms of a sys_time, and that's how libstdc++ implements it. So
> if the utc_time or gps_time uses a float rep, we're back to square one.
> 
> I'll ask the committee to clarify that too.

Yes, I was referring to my original example because it helped me to understand
which combinations worked and which did not. 

I wrote a wrapper class around std::chrono::time_point because I deal a lot
with different time formats. I also do multi-platform development, and during
the testing phase it turned out that no compiler was able to compile all
templated test cases (Clang uses libstdc++ unless you explicitly tell it not
to, as you pointed out on the llvm bug report). Since the failing parameter
combinations differed between MSVC and GCC, I had to use many
compiler-specific sections to get consistent and valid behavior in the tests.
That's why I called it a minefield. Maybe it was exaggerated ;).

Anyways, thank you a lot for your effort and clarifications and especially for
the fast fix.
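
For readers hitting the same error, here is a minimal sketch of one possible
workaround. It is an assumption on my part and was not proposed in this bug:
convert the floating-point-based time_point to an integer-based one before
formatting. The aliases and the function name are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical aliases for illustration; they do not come from the bug report.
using float_sys_time =
    std::chrono::time_point<std::chrono::system_clock,
                            std::chrono::duration<double>>;
using milli_sys_time =
    std::chrono::time_point<std::chrono::system_clock,
                            std::chrono::milliseconds>;

// Convert a double-based time_point to a millisecond-based one.
// For this conversion time_point_cast truncates toward zero.
inline milli_sys_time to_integral_rep(float_sys_time tp)
{
  // Converting NaN or infinity to an integer is undefined behaviour.
  assert(std::isfinite(tp.time_since_epoch().count()));
  return std::chrono::time_point_cast<std::chrono::milliseconds>(tp);
}
```

The result has an integer rep, so it avoids the float/double case this bug is
about; sub-millisecond precision is lost in the conversion.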

[Bug modula2/113511] lack of libm2 ABI compatibility on powerpc platforms

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113511

Richard Biener  changed:

           What    |Removed                     |Added

             Target|                            |powerpc*
           Keywords|                            |ABI

--- Comment #1 from Richard Biener  ---
There's also the question of compatibility with libgm2 from GCC 13.

I suppose the frontend could simply not allow changing the M2 language
"long double" (however it is called) with -mabi=... (which really only
changes the C language ABI!).  Of course calls to libm are subject to the
C language ABI.

Does the language standard have anything to say here?  I suppose there are
no ABI documents for M2 for the various targets, so perhaps the C
interoperability wording in the standard simply points to common sense?

[Bug target/109929] profiledbootstrap failure on aarch64-linux-gnu with graphite optimization

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109929

Richard Biener  changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Richard Biener  ---
let's close this then

[Bug tree-optimization/59859] [meta-bug] GRAPHITE issues

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59859
Bug 59859 depends on bug 109929, which changed state.

Bug 109929 Summary: profiledbootstrap failure on aarch64-linux-gnu with graphite optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109929

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

[Bug c/113538] New: [RISC-V] --param=riscv-vector-abi will fail some cases

2024-01-22 Thread yanzhang.wang at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113538

            Bug ID: 113538
           Summary: [RISC-V] --param=riscv-vector-abi will fail some cases
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yanzhang.wang at intel dot com
  Target Milestone: ---

While removing --param=riscv-vector-abi, I found that some test cases fail.
This can be reproduced by passing the option to a test, for example:

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c
index 1d82cc8de2d..0725ca69222 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3" } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3 --param=riscv-vector-abi" } */

 #include "riscv_vector.h"



The test result will be,

=== gcc tests ===

Schedule of variations:
riscv-sim/-march=rv64gcv_zvfh/-mabi=lp64d/-mcmodel=medlow

Running target riscv-sim/-march=rv64gcv_zvfh/-mabi=lp64d/-mcmodel=medlow
Using /mnt/install/toolchains/gnu/share/dejagnu/baseboards/riscv-sim.exp as board description file for target.
Using /mnt/install/toolchains/gnu/share/dejagnu/config/sim.exp as generic interface file for target.
Using /mnt/install/toolchains/gnu/share/dejagnu/baseboards/basic-sim.exp as board description file for target.
Using /home/yanzhang/workspace/toolchains/gnu/gcc/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /home/yanzhang/workspace/toolchains/gnu/gcc/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp ...
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vsetvli\\s+[a-x0-9]+,\\s*zero,\\s*e16,\\s*mf4,\\s*t[au],\\s*m[au] 8
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vsetvli\\s+[a-x0-9]+,\\s*zero,\\s*e16,\\s*mf2,\\s*t[au],\\s*m[au] 2
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vle16\\.v\\s+v[0-9]+,\\s*0\\([0-9ax]+\\) 7
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vse16\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 6
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vl1re16\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 1
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vl2re16\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 1
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vl4re16\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 3
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vl8re16\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 1
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vs2r\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 1
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vs4r\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 3
FAIL: gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c scan-assembler-times vs8r\\.v\\s+v[0-9]+,\\s*0\\([a-x][0-9]+\\) 5


Almost all of the failing test cases are in rvv/base, and they fail for the
same reason.

GCC commit: 57f611604e8bab67af6c0bcfe6ea88c001408412

[Bug debug/113519] [14 Regression] ICE: in replace_child, at dwarf2out.cc:5704 with -g -fdebug-types-section -fsso-struct=big-endian (or little-endian if the target is big-endian)

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113519

--- Comment #3 from Richard Biener  ---
Hmm, -fdebug-types-section ... mumbles sth about axing that.

[Bug rtl-optimization/111267] [14 Regression] Codegen regression from i386 argument passing changes

2024-01-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267

--- Comment #12 from Richard Sandiford  ---
I don't object to the patch, but for the record: the current heuristics go back
a long way.  Although I reworked the pass to use rtl-ssa a few years ago, I
tried as far as possible to preserve the old heuristics (tested by making sure
that there were no unexplained differences over a large set of targets).

I wouldn't characterise the old heuristics as a logic error.  Although I didn't
write them, my understanding is that they were being deliberately conservative,
in particular due to the risk of introducing excess register pressure.

So this change seems potentially quite invasive for stage 4.  Perhaps it'll
work out — if so, great!  But if there is some fallout, I think we should lean
towards reverting the patch and revisiting in GCC 15.

[Bug target/109929] profiledbootstrap failure on aarch64-linux-gnu with graphite optimization

2024-01-22 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109929

--- Comment #5 from Xi Ruoyao  ---
The first good commit is:

commit aa2ad77a9b3994fb679e5295d9570f6f8696540d
Author: Szabolcs Nagy 
Date:   Tue May 9 11:07:05 2023 +0100

aarch64: Do not force a stack frame for EH returns

but I cannot see how it's related to the issue...

And with the parent of this commit, the bootstrap fails in stagefeedback where
build/genhooks -d ../../gcc/gcc/doc/tm.texi.in > tmp-tm.texi segfaults.  I'll
do another round of bisection to see when the stagetrain error was "fixed".
(Just to understand the issue better.)

[Bug ipa/113520] ICE with mismatched types with LTO (tree check: expected array_type, have integer_type in array_ref_low_bound)

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113520

Richard Biener  changed:

           What    |Removed                     |Added

   Last reconfirmed|                            |2024-01-22
          Component|tree-optimization           |ipa
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jamborm at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
           Keywords|                            |lto

--- Comment #6 from Richard Biener  ---
Hmm, we are supposed to "handle" this during in-streaming.  This seems to work
but then for some reason it gets fiddled with again...

Ah, so this is reading of IPA CP summaries where it seems that we do not
apply these "tricks", gimple-streamer-out.cc has

  /* Wrap all uses of non-automatic variables inside MEM_REFs
     so that we do not have to deal with type mismatches on
     merged symbols during IL read in.  The first operand
     of GIMPLE_DEBUG must be a decl, not MEM_REF, though.  */
  if (!flag_wpa && op && (i || !is_gimple_debug (stmt)))
    {
      basep = &op;
      if (TREE_CODE (*basep) == ADDR_EXPR)
        basep = &TREE_OPERAND (*basep, 0);
      while (handled_component_p (*basep))
        basep = &TREE_OPERAND (*basep, 0);
      if (VAR_P (*basep)
          && !auto_var_in_fn_p (*basep, fn->decl)
          && !DECL_REGISTER (*basep))
        {
          bool volatilep = TREE_THIS_VOLATILE (*basep);
          tree ptrtype = build_pointer_type (TREE_TYPE (*basep));
          *basep = build2 (MEM_REF, TREE_TYPE (*basep),
                           build1 (ADDR_EXPR, ptrtype, *basep),
                           build_int_cst (ptrtype, 0));
          TREE_THIS_VOLATILE (*basep) = volatilep;
...

and gimple-streamer-in.cc undoes this when the prevailing decls are compatible:

  /* At LTO output time we wrap all global decls in MEM_REFs to
     allow seamless replacement with prevailing decls.  Undo this
     here if the prevailing decl allows for this.
     ???  Maybe we should simply fold all stmts.  */
  if (TREE_CODE (*opp) == MEM_REF
      && TREE_CODE (TREE_OPERAND (*opp, 0)) == ADDR_EXPR
      && integer_zerop (TREE_OPERAND (*opp, 1))
      && (TREE_THIS_VOLATILE (*opp)
          == TREE_THIS_VOLATILE
               (TREE_OPERAND (TREE_OPERAND (*opp, 0), 0)))
      && !TYPE_REF_CAN_ALIAS_ALL (TREE_TYPE (TREE_OPERAND (*opp, 1)))
      && (TREE_TYPE (*opp)
          == TREE_TYPE (TREE_TYPE (TREE_OPERAND (*opp, 1))))
      && (TREE_TYPE (*opp)
          == TREE_TYPE (TREE_OPERAND (TREE_OPERAND (*opp, 0), 0))))
    *opp = TREE_OPERAND (TREE_OPERAND (*opp, 0), 0);

I suppose we might want to split these out so summary streaming can apply
this to streamed trees as well?

[Bug ipa/113520] ICE with mismatched types with LTO (tree check: expected array_type, have integer_type in array_ref_low_bound)

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113520

Richard Biener  changed:

           What    |Removed                     |Added

           Keywords|                            |diagnostic

--- Comment #7 from Richard Biener  ---
It's also a missing WPA diagnostic (OTOH one decl is just external and IIRC
we kind-of allow builtin_names[] to refer to a single-element array
implemented as 'int builtin_names').

[Bug testsuite/113425] gcc.dg/fold-copysign-1.c fails on arm since g:7cbe41d35e6

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113425

Tamar Christina  changed:

           What    |Removed                     |Added

                 CC|                            |tnfchris at gcc dot gnu.org

--- Comment #2 from Tamar Christina  ---
It is updated for arm, but I need to know how the toolchain was configured. 
This is just a difference in default options.

So I need the configure flags to be able to do anything here.

[Bug target/82580] Optimize comparisons for __int128 on x86-64

2024-01-22 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82580

Uroš Bizjak  changed:

           What    |Removed                     |Added

             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #17 from Uroš Bizjak  ---
(In reply to Roger Sayle from comment #16)
> Advance warning that the testcase pr82580.c will start FAILing due to
> differences in register allocation following improvements to __int128
> parameter passing as explained in
> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623756.html.
> We might need additional reload alternatives/preferences to ensure that we
> don't generate a movzbl.  Hopefully, Jakub and/or Uros have some suggestions
> for how best this can be fixed.
> 
> Previously, the SUBREGs and CLOBBERs generated by middle-end RTL expansion
> (unintentionally) ensured that rdx and rax would never be used for __int128
> arguments, which conveniently allowed the use of xor eax,eax; setc al in
> peephole2 as AX_REG wasn't live.  Now reload has more freedom, it elects to
> use rax as at this point the backend hasn't expressed any preference that it
> would like eax reserved for producing the result.

A different regression happens with pr82580.c, f0 function. Without the patch,
the compiler generates:

f0:
        xorq    %rdi, %rdx
        xorq    %rcx, %rsi
        xorl    %eax, %eax
        orq     %rsi, %rdx
        sete    %al
        ret

But with the patch:

f0:
        xchgq   %rdi, %rsi
        movq    %rdx, %r8
        movq    %rcx, %rax
        movq    %rsi, %rdx
        movq    %rdi, %rcx
        xorq    %rax, %rcx
        xorq    %r8, %rdx
        xorl    %eax, %eax
        orq     %rcx, %rdx
        sete    %al
        ret

It looks to me that *concatditi3_3 ties two registers together so RA now tries
to satisfy *concatditi3_3 constraints *and* *cmpti_doubleword constraints.

The gcc.target/i386/pr43644-2.c mitigates this issue with
*addti3_doubleword_concat pattern that combines *addti3_doubleword with concat
insn, but doubleword compares (and other doubleword insn besides addti3) do not
provide these compound instructions.

So, without a common strategy to use doubleword_concat patterns for all double
word instructions, it is questionable if the complications with concat insn are
worth the pain of providing (many?) doubleword_concat patterns.

The real issue is with x86_64 doubleword arguments. Unfortunately, the ABI
specifies RDI/RSI to pass the double word argument, while the compiler regalloc
order sequence is RSI/RDI. IMO, we can try to swap RDI and RSI in the order and
RA would be able to allocate registers in the same optimal way as for x86_32
with -mregparm=3, even without synthetic concat patterns.
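
For reference, the xor/xor/or/sete sequence in the f0 listings corresponds to
a 128-bit equality test done on two 64-bit halves. The sketch below assumes
that f0 is a plain `==` on unsigned __int128 (the assembly suggests this, but
the testcase source is not quoted here); all names are hypothetical, and
__int128 is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

using u128 = unsigned __int128;  // compiler extension, not ISO C++

// Build a 128-bit value from its high and low 64-bit halves.
inline u128 make_u128(std::uint64_t hi, std::uint64_t lo)
{
  return (static_cast<u128>(hi) << 64) | lo;
}

// Direct comparison, as a compiler would see it in the source.
inline bool equal_u128(u128 a, u128 b) { return a == b; }

// The same test spelled out on halves: xor each pair of halves, or the two
// results together, and check for zero (the "sete" step).
inline bool equal_by_halves(u128 a, u128 b)
{
  const std::uint64_t lo =
      static_cast<std::uint64_t>(a) ^ static_cast<std::uint64_t>(b);
  const std::uint64_t hi =
      static_cast<std::uint64_t>(a >> 64) ^ static_cast<std::uint64_t>(b >> 64);
  return (lo | hi) == 0;
}

// Check that both formulations agree for one pair of inputs.
inline void check_agreement(u128 a, u128 b)
{
  assert(equal_u128(a, b) == equal_by_halves(a, b));
}
```

Only four 64-bit inputs are needed, which is why the extra moves in the second
listing are pure register-allocation overhead.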

[Bug lto/113521] ICE when building swi-prolog-9.1.2 with LTO in verify_gimple_in_cfg

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113521

--- Comment #3 from Richard Biener  ---
It's probably the same issue though - IPA summaries not being forgiving to
decl type changes.

[Bug libgcc/113401] Memory (resource) leak in -ftrampoline-impl=heap

2024-01-22 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401

--- Comment #4 from Florian Weimer  ---
(In reply to Iain Sandoe from comment #3)
> for platforms using pthreads as the underlying resource, then perhaps we can
> do this without thread_atexit (which I do not see in many places) by using
> pthread_cleanup_push ()

The current implementation already uses the same underlying mechanism as
pthread_cleanup_push if building with -fexceptions. It does not solve the leak
because the outermost handler deliberately does not perform a full deallocation
of the thread state.

[Bug testsuite/113425] gcc.dg/fold-copysign-1.c fails on arm since g:7cbe41d35e6

2024-01-22 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113425

--- Comment #3 from Christophe Lyon  ---
What I meant by arm-* is that we see the same issue on several of the
configurations we test, as can be seen on
https://linaro.atlassian.net/browse/GNU-1100

We have recently improved the extraction of the configure line, so now some of
the xxx/details.txt on that page include it.

The two "simplest" configurations we test are tcwg_gcc_check/master-arm and
tcwg_gnu_native_check_gcc, but both of them ran before the improvement
mentioned above; in these cases, the information is present inside
console.log.xz in the relevant CI step directory (03-build_abe-gcc for
tcwg_gcc_check/master-arm and 04-build_abe-gcc for
tcwg_gnu_native_check_gcc/master-arm, the "-gcc" suffix meaning it's the step
in which we build gcc).

Anyway, here is the GCC configure line for tcwg_gcc_check/master-arm:
/configure SHELL=/bin/bash 
--with-mpc=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/armv8l-unknown-linux-gnueabihf
--with-mpfr=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/armv8l-unknown-linux-gnueabihf
--with-gmp=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/armv8l-unknown-linux-gnueabihf
--with-gnu-as --with-gnu-ld --disable-libmudflap --enable-lto --enable-shared
--without-included-gettext --enable-nls --with-system-zlib
--disable-sjlj-exceptions --enable-gnu-unique-object --enable-linker-build-id
--disable-libstdcxx-pch --enable-c99 --enable-clocale=gnu
--enable-libstdcxx-debug --enable-long-long --with-cloog=no --with-ppl=no
--with-isl=no --disable-multilib --with-float=hard --with-fpu=neon-fp-armv8
--with-mode=thumb --with-arch=armv8-a --enable-threads=posix --enable-multiarch
--enable-libstdcxx-time=yes --enable-gnu-indirect-function
--enable-checking=yes --disable-bootstrap --enable-languages=default
--prefix=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/armv8l-unknown-linux-gnueabih

[Bug testsuite/113425] gcc.dg/fold-copysign-1.c fails on arm since g:7cbe41d35e6

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113425

--- Comment #4 from Tamar Christina  ---
(In reply to Christophe Lyon from comment #3)
> What I meant by arm-* is that we see the same issue on several of the
> configurations we test, as can be seen on
> https://linaro.atlassian.net/browse/GNU-1100
> 
> We have recently improved the extraction of the configure line, so now some
> of the xxx/details.txt on that page include it.
> 
> The two "simplest" configurations we test are tcwg_gcc_check/master-arm and
> tcwg_gnu_native_check_gcc, but both of them ran before the improvement
> mentioned above; in these cases, the information is present inside
> console.log.xz in the relevant CI step directory (03-build_abe-gcc for
> tcwg_gcc_check/master-arm and 04-build_abe-gcc for
> tcwg_gnu_native_check_gcc/master-arm, the "-gcc" suffix meaning it's the
> step in which we build gcc).
> 
> Anyway, here is the GCC configure line for tcwg_gcc_check/master-arm:
> /configure SHELL=/bin/bash 
> --with-mpc=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/
> armv8l-unknown-linux-gnueabihf
> --with-mpfr=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/
> armv8l-unknown-linux-gnueabihf
> --with-gmp=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/
> armv8l-unknown-linux-gnueabihf --with-gnu-as --with-gnu-ld
> --disable-libmudflap --enable-lto --enable-shared --without-included-gettext
> --enable-nls --with-system-zlib --disable-sjlj-exceptions
> --enable-gnu-unique-object --enable-linker-build-id --disable-libstdcxx-pch
> --enable-c99 --enable-clocale=gnu --enable-libstdcxx-debug
> --enable-long-long --with-cloog=no --with-ppl=no --with-isl=no
> --disable-multilib --with-float=hard --with-fpu=neon-fp-armv8
> --with-mode=thumb --with-arch=armv8-a --enable-threads=posix
> --enable-multiarch --enable-libstdcxx-time=yes
> --enable-gnu-indirect-function --enable-checking=yes --disable-bootstrap
> --enable-languages=default
> --prefix=/home/tcwg-buildslave/workspace/tcwg_gnu_1/abe/builds/destdir/
> armv8l-unknown-linux-gnueabih

Yes, but the reason I need the configure flags is because it doesn't fail with
the arm-none-linux-gnueabihf target our build scripts make.

I'll check with those options.  One big difference that stands out immediately
is the forcing of armv8 and thumb, which is likely the cause.

[Bug libgcc/113401] Memory (resource) leak in -ftrampoline-impl=heap

2024-01-22 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401

--- Comment #5 from Iain Sandoe  ---
(In reply to Florian Weimer from comment #4)
> (In reply to Iain Sandoe from comment #3)
> > for platforms using pthreads as the underlying resource, then perhaps we can
> > do this without thread_atexit (which I do not see in many places) by using
> > pthread_cleanup_push ()
> 
> The current implementation already uses the same underlying mechanism as
> pthread_cleanup_push if building with -fexceptions. It does not solve the
> leak because the outermost handler deliberately does not perform a full
> deallocation of the thread state.

hmm.. I'm slightly confused here.

We certainly make the __gcc_nested_func_ptr_deleted () call a cleanup attached
to scope exits and certainly the last page of trampolines is not deallocated
(as you note for the sake of avoiding churn in m-mapping).

However, in the current code the only pthread-specific stuff I see (in, say
config/i386/heap-trampoline.c) is specific to changing protections on the
mapped pages.

What I was thinking of is attaching a thread exit cleanup using
pthread_cleanup_push() for platforms with pthreads but without Libc support for
_thread_atexit - I guess I'm missing something :)

[Bug libgcc/113401] Memory (resource) leak in -ftrampoline-impl=heap

2024-01-22 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401

--- Comment #6 from Florian Weimer  ---
Sorry, pthread_cleanup_push is purely scope-based, like the existing handler.
It cannot be used to push a handler to some unscoped cleanup function list that
persists even after the current function returns. It's also implemented as a
macro, so it's not possible to emit it from builtin expansion.
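
To illustrate the scope-based behaviour described above, here is a minimal
sketch. The function names are hypothetical and nothing here is taken from
libgcc. The push and pop must appear as a matched pair in the same block, so
the handler cannot outlive the function that registered it.

```cpp
#include <cassert>
#include <pthread.h>

// Cleanup handler: bump the counter passed through the void* argument.
static void count_cleanup(void *arg)
{
  *static_cast<int *>(arg) += 1;
}

// Register a handler and pop it again within the same lexical scope.
// Returns how many times the handler ran.
inline int run_scoped_cleanup(bool execute_on_pop)
{
  int runs = 0;
  pthread_cleanup_push(count_cleanup, &runs);
  // ... protected region: the handler would also run if the thread were
  // cancelled or called pthread_exit() here ...
  pthread_cleanup_pop(execute_on_pop ? 1 : 0);
  // The handler is gone at this point; it ran only if requested.
  assert(runs == (execute_on_pop ? 1 : 0));
  return runs;
}
```

Once run_scoped_cleanup returns, no handler remains registered, which is why
this mechanism cannot stand in for a per-thread exit hook.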

[Bug libgcc/113401] Memory (resource) leak in -ftrampoline-impl=heap

2024-01-22 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401

--- Comment #7 from Iain Sandoe  ---
(In reply to Florian Weimer from comment #6)
> Sorry, pthread_cleanup_push is purely scope-based, like the existing
> handler. It cannot be used to push a handler to some unscoped cleanup
> function list that persists even after the current function returns. It's
> also implemented as a macro, so it's not possible to emit it from builtin
> expansion.

Ah, then we have a documentation issue, because man pthread_cleanup_push(3)
describes running the registered functions on thread cancellation or on
pthread_exit() [but not, unfortunately, if the thread exits by returning - so
still not ideal].

[Bug libgcc/113401] Memory (resource) leak in -ftrampoline-impl=heap

2024-01-22 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401

--- Comment #8 from Florian Weimer  ---
Which version of the manual page are you looking at?

https://man7.org/linux/man-pages/man3/pthread_cleanup_push.3.html seems pretty
clear about the scope-based nature (search for discussion of
break/return/goto).

[Bug tree-optimization/113539] New: [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-01-22 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539

            Bug ID: 113539
           Summary: [14 Regression] perlbench miscompiled on aarch64 since
                    r14-8223-g1c1853a70f
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

I'm seeing miscompares of perlbench (both from SPEC CPU 2006 and SPEC CPU 2017)
on aarch64 with recent trunk changes, a bisect pointed to
r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c :

commit 1c1853a70f9422169190e65e568dcccbce02d95c
Author: Richard Biener 
Date:   Thu Jan 18 10:22:34 2024

Fix memory leak in vect_analyze_loop_form

The miscompares are with the checkspam.pl workload, I see:

*** Miscompare of checkspam.2500.5.25.11.150.1.1.1.1.out

I've seen this with:

-flto=auto -fomit-frame-pointer -O3 -fno-strict-aliasing

and various -mcpu options (at least -mcpu=cortex-a72 and -mcpu=neoverse-v1).

[Bug libgcc/113401] Memory (resource) leak in -ftrampoline-impl=heap

2024-01-22 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113401

--- Comment #9 from Iain Sandoe  ---
(In reply to Florian Weimer from comment #8)
> Which version of the manual page are you looking at?
> 
> https://man7.org/linux/man-pages/man3/pthread_cleanup_push.3.html seems
> pretty clear about the scope-based nature (search for discussion of
> break/return/goto).

yeah, got it; one needs to read the union of the sections (the page I was
reading was slightly different but the same basic info).

I suppose it could work if we were able to create a wrapper around the thread
routine, with the cleanup being a no-op for cases without nested functions.

Otherwise, it looks a bit tricky for platforms without thread_atexit support.

Have to think some more.

[Bug tree-optimization/113539] [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539

Tamar Christina  changed:

           What    |Removed                     |Added

                 CC|                            |tnfchris at gcc dot gnu.org

--- Comment #1 from Tamar Christina  ---
If that's the commit that's miscomparing then it's probably a bug in
early-break vect.

So I'll take a look.

+ if ((integer_zerop (may_be_zero)
+  || integer_nonzerop (may_be_zero)

is odd though, isn't that basically accepting all values of may_be_zero?

[Bug tree-optimization/113539] [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539

Richard Biener  changed:

           What    |Removed                     |Added

   Target Milestone|---                         |14.0

--- Comment #2 from Richard Biener  ---
It accepts all constant known may_be_zero - we can handle all of those later.

I suspect this just triggers a latent issue (vectorizing now simply using
a different exit as canonical in one case).
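
A toy model of why the check is not a tautology, assuming (as the answer
above implies) that both predicates are false for an operand that is not a
compile-time constant. The struct and function names are hypothetical
stand-ins, not GCC's tree API.

```cpp
#include <cassert>

// Toy stand-in for a tree operand: either a known integer constant or an
// expression whose value is not known at compile time.
struct MaybeConstant {
  bool is_constant;
  long value;  // only meaningful when is_constant is true
};

inline bool is_zero(const MaybeConstant &e)
{
  return e.is_constant && e.value == 0;
}

inline bool is_nonzero(const MaybeConstant &e)
{
  return e.is_constant && e.value != 0;
}

// True for every known constant, false for anything symbolic.
inline bool is_known_constant(const MaybeConstant &e)
{
  assert(!(is_zero(e) && is_nonzero(e)));  // the two cases are disjoint
  return is_zero(e) || is_nonzero(e);
}
```

In this model a symbolic may_be_zero fails both tests, so the disjunction
filters out exactly the non-constant cases.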

[Bug tree-optimization/113539] [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539

Tamar Christina  changed:

           What    |Removed                     |Added

           Priority|P3                          |P1
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2024-01-22
     Ever confirmed|0                           |1
           Assignee|unassigned at gcc dot gnu.org  |tnfchris at gcc dot gnu.org

--- Comment #3 from Tamar Christina  ---
(In reply to Richard Biener from comment #2)
> It accepts all constant known may_be_zero - we can handle all of those later.
> 
> I suspect this just triggers a latent issue (vectorizing now simply using
> a different exit as canonical in one case).

Indeed, I'll take a look.

[Bug tree-optimization/113539] [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-01-22 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539

--- Comment #4 from Alex Coplan  ---
Reproduces with just -O3 -fno-strict-aliasing FWIW, no LTO or -mcpu needed.

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-01-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Tobias Burnus :

https://gcc.gnu.org/g:13127dac106724bef3a979539a878b368b79ce56

commit r14-8332-g13127dac106724bef3a979539a878b368b79ce56
Author: Tobias Burnus 
Date:   Mon Jan 22 12:17:12 2024 +0100

[gcn] mkoffload: Fix linking with "-g"; fix file deletion; improve diagnostic [PR111966]

With debugging enabled, '*.mkoffload.dbg.o' files are generated. The e_flags
header of all *.o files must be the same - otherwise, the linker complains.
Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f the -march= default
is now gfx900. If compiling without any -march= flag, the default value is
used by the compiler but not passed to mkoffload. Hence, mkoffload.cc uses
its own default for march - unfortunately, it still had gfx803/fiji as
default, leading to the linker error: 'incompatible mach'. Solution: Update
the default to gfx900.

While debugging it, I saw that /tmp/cc*.mkoffload.dbg.o kept accumulating;
there were a couple of issues with the handling:
* dbgobj was always added to files_to_cleanup
* If copy_early_debug_info returned true, dbgobj was added again
  -> pointless and in theory a race if the same file was added in the
     fraction of a second.
* If copy_early_debug_info returned false,
  - In exactly one case, it already deleted the file itself
    (same potential race as above)
  - The pointer dbgobj was freed - such that files_to_cleanup contained
    a dangling pointer - probably the reason that stale files remained.
Solution: Only if copy_early_debug_info returns true, dbgobj is added to
files_to_cleanup. If it returns false, the file is unlinked before freeing
the pointer.

When compiling, GCC warned about several fatal_error messages as having
no %<...%> or %qs quotes. This patch now silences several of those warnings
by using those quotes.

gcc/ChangeLog:

PR other/111966
* config/gcn/mkoffload.cc (elf_arch): Change default to gfx900
to match the compiler default.
(simple_object_copy_lto_debug_sections): Never unlink the outfile
on error as the caller does so.
(maybe_unlink, compile_native): Use %<...%> and %qs in fatal_error.
(main): Likewise. Fix 'mkoffload.dbg.o' cleanup.

Signed-off-by: Tobias Burnus 
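
The cleanup rule described in the commit message can be sketched roughly as
follows. This is an illustration only: the helper names are hypothetical and
the code is not taken from mkoffload.cc, which manages C strings rather than
std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Register the debug object for later cleanup only when the copy succeeded;
// otherwise delete it right away so that no stale entry is left behind.
inline bool keep_or_discard(const std::string &path, bool copy_succeeded,
                            std::vector<std::string> &files_to_cleanup)
{
  assert(!path.empty());
  if (copy_succeeded)
    {
      files_to_cleanup.push_back(path);  // registered exactly once
      return true;
    }
  std::remove(path.c_str());  // failure is ignored: the file may not exist
  return false;
}

// Small driver: how many entries does the cleanup list hold afterwards?
inline std::size_t cleanup_count_after(const std::string &path,
                                       bool copy_succeeded)
{
  std::vector<std::string> files_to_cleanup;
  keep_or_discard(path, copy_succeeded, files_to_cleanup);
  return files_to_cleanup.size();
}
```

Owning the path by value in the list also sidesteps the dangling-pointer
problem mentioned in the message.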

[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

2024-01-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495

--- Comment #27 from Robin Dapp  ---
Following up on this:

I'm seeing the same thing Patrick does.  We create a lot of large non-sparse
sbitmaps that amount to around 33G in total.

I did local experiments replacing all sbitmaps that are not needed for LCM by
regular bitmaps.  Apart from output differences vs the original version the
testsuite is unchanged.

As expected, wrf now takes longer to compile, 8 mins vs 4ish mins before, and
we still use 2.7G of RAM for this single file (likely because of the remaining
sbitmaps) compared to a max of 1.2ish G that the rest of the compilation uses.

One possibility to get the best of both worlds would be to threshold based on
num_bbs * num_exprs.  Once we exceed it switch to the bitmap pass, otherwise
keep sbitmaps for performance. 
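
A rough sketch of that thresholding idea, in standalone C.  Everything here is
an assumption for illustration: the names dense_bytes and use_dense_sets_p,
and in particular the DENSE_BIT_LIMIT cutoff, are hypothetical and not taken
from GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cutoff: number of (block, expression) bits above which
   the dense representation is considered too expensive.  2^30 bits is
   128 MiB per family of sets; the value is purely illustrative.  */
#define DENSE_BIT_LIMIT ((uint64_t) 1 << 30)

/* Bytes needed by one family of dense per-block bit sets: one bit per
   expression in every basic block, rounded up to whole bytes.  */
static uint64_t
dense_bytes (uint64_t num_bbs, uint64_t num_exprs)
{
  return num_bbs * ((num_exprs + 7) / 8);
}

/* Return true when the dense (sbitmap-like) representation should be
   kept, false when the sparse (bitmap-like) fallback should be used.  */
static bool
use_dense_sets_p (uint64_t num_bbs, uint64_t num_exprs)
{
  /* Guard against overflow of the product.  */
  assert (num_exprs == 0 || num_bbs <= UINT64_MAX / num_exprs);
  return num_bbs * num_exprs <= DENSE_BIT_LIMIT;
}
```

Small functions would stay on the fast dense path, and only degenerate inputs
like the wrf file would pay the slower sparse path.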

Messaging with Juzhe offline, his best guess for the LICM time is that he
enabled checking for dataflow which slows down this particular compilation by a
lot.  Therefore it doesn't look like a generic problem.

[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

2024-01-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495

--- Comment #28 from JuzheZhong  ---
(In reply to Robin Dapp from comment #27)
> Following up on this:
> 
> I'm seeing the same thing Patrick does.  We create a lot of large non-sparse
> sbitmaps that amount to around 33G in total.
> 
> I did local experiments replacing all sbitmaps that are not needed for LCM
> by regular bitmaps.  Apart from output differences vs the original version
> the testsuite is unchanged.
> 
> As expected, wrf now takes longer to compile, 8 mins vs 4ish mins before,
> and we still use 2.7G of RAM for this single file (likely because of the
> remaining sbitmaps) compared to a max of 1.2ish G that the rest of the
> compilation uses.
> 
> One possibility to get the best of both worlds would be to threshold based
> on num_bbs * num_exprs.  Once we exceed it switch to the bitmap pass,
> otherwise keep sbitmaps for performance. 
> 
> Messaging with Juzhe offline, his best guess for the LICM time is that he
> enabled checking for dataflow which slows down this particular compilation
> by a lot.  Therefore it doesn't look like a generic problem.

Thanks. I don't think replacing sbitmap is the best solution.
Let me first disable the DF check and reproduce the 33G memory consumption on
my local machine.

I think the best way to optimize the memory consumption is to optimize the
VSETVL PASS algorithm and code. I have an idea for how to optimize it, and
I am going to work on it.

Thanks for reporting.

[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495

--- Comment #29 from Richard Biener  ---
(In reply to rguent...@suse.de from comment #26)
> On Fri, 19 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495
> > 
> > --- Comment #22 from JuzheZhong  ---
> > (In reply to Richard Biener from comment #21)
> > > I once tried to avoid df_reorganize_refs and/or optimize this with the
> > > blocks involved but failed.
> > 
> > > I am considering whether we should disable LICM for RISC-V by default if
> > > vector is enabled?
> > > Since a 10x compile-time explosion is really horrible.
> 
> I think that's a bad idea.  It only explodes for some degenerate cases.
> The best would be to fix invariant motion to keep DF up-to-date so
> it can stop using df_analyze_loop and instead analyze the whole function.
> Or maybe change it to use the rtl-ssa framework instead.
> 
> There's already param_loop_invariant_max_bbs_in_loop:
> 
>   /* Process the loops, innermost first.  */
>   for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
>     {
>       curr_loop = loop;
>       /* move_single_loop_invariants for very large loops is time consuming
>          and might need a lot of memory.  For -O1 only do loop invariant
>          motion for very small loops.  */
>       unsigned max_bbs = param_loop_invariant_max_bbs_in_loop;
>       if (optimize < 2)
>         max_bbs /= 10;
>       if (loop->num_nodes <= max_bbs)
>         move_single_loop_invariants (loop);
>     }
> 
> it might be possible to restrict invariant motion to innermost loops
> when the overall number of loops is too large (with a new param
> for that).  And when the number of innermost loops also exceeds
> the limit avoid even that?  The above also misses a
> optimize_loop_for_speed_p (loop) check (probably doesn't make
> a difference, but you could try).

Ah, sorry - I was mis-matching LICM to invariant motion above, still
invariant motion is the biggest offender (might be due to DF checking
if you enabled that).

As for sbitmap vs. bitmap it's a difficult call.  When there are big
profile hits on individual bit operations (bitmap_bit_p, bitmap_set_bit)
it might pay off to use bitmap but with tree view.  There's also
sparseset but that requires even more memory.
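
For reference, the limit check quoted above and the suggested restriction to
innermost loops can be sketched as standalone C.  The first function mirrors
the quoted logic, with LIMIT playing the role of
param_loop_invariant_max_bbs_in_loop; the second is a hypothetical extension,
and its max_loops argument is an assumed new parameter that does not exist in
GCC today.

```c
#include <stdbool.h>

/* Mirror of the quoted check: LIMIT plays the role of
   param_loop_invariant_max_bbs_in_loop, OPT_LEVEL that of optimize.  */
static bool
loop_small_enough_p (unsigned num_nodes, unsigned limit, int opt_level)
{
  unsigned max_bbs = limit;
  if (opt_level < 2)
    max_bbs /= 10;
  return num_nodes <= max_bbs;
}

/* Hypothetical extension: when a function has more than MAX_LOOPS
   loops, only innermost loops are considered at all.  MAX_LOOPS is an
   assumed new parameter, not an existing GCC param.  */
static bool
should_process_loop_p (unsigned num_nodes, bool innermost,
                       unsigned total_loops, unsigned max_loops,
                       unsigned limit, int opt_level)
{
  if (total_loops > max_loops && !innermost)
    return false;
  return loop_small_enough_p (num_nodes, limit, opt_level);
}
```

This leaves the existing per-loop size limit untouched and only adds a second,
function-wide cutoff in front of it.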

[Bug tree-optimization/113466] ICE: verify_flow_info failed: returns_twice call is not first in basic block 7 with a __returns_twice__ function with _BitInt() argument

2024-01-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113466

Jakub Jelinek  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
The question is what we can do about it.

bitint_large_huge::lower_call wants, for the large/huge BITINT_TYPE SSA_NAME
call arguments (with the exception of uninitialized ones), to add a load before
the call, which loads the argument from some VAR_DECL or PARM_DECL etc.

And the CFG requirement for returns_twice calls is that there is an abnormal
edge from the .ABNORMAL_DISPATCHER block to the start of the call, so we can't
insert anything before the call.

Now, in fixes like PR109410 this was easy because reassoc is adding those
statements to the start of the function, so we can easily split the ENTRY ->
bb2 edge and insert stuff there.

But here it is much more complicated.
In the easier case, we have just one EDGE_FALLTHRU predecessor edge plus the
EDGE_ABNORMAL edge.
I guess we can in that case insert on that EDGE_FALLTHRU edge, but then there
is a question if one can just use the SSA_NAME in the return argument or not.
If there is just one call like in the #c0 case, that is most likely the case,
but what about say:
void foo (_BitInt(6321)) __attribute__((returns_twice));
void baz (void);

void
bar (_BitInt(6321) x)
{
  foo (x);
  baz ();
  foo (x + 1);
  baz ();
}
One can insert the load from x on the entry edge because that dominates the
.ABNORMAL_DISPATCHER bb, but guess for the _1 (x + 1) load we need some PHI and
it isn't clear to me what to use on the edge from the abnormal dispatcher (and
whether to use some PHI on the .ABNORMAL_DISPATCHER bb as well).

And, if the bb with the returns_twice call has multiple predecessor edges and,
even worse, say next to the .ABNORMAL_DISPATCHER abnormal edge some EDGE_EH or
similar incoming edges, we probably need to add some bb before the
returns_twice bb, but then I have no idea what to do with PHIs etc.

Or we could for the time being just sorry on returns_twice calls with
large/huge _BitInt arguments.

[Bug tree-optimization/113441] [13/14 Regression] Fail to fold the last element with multiple loop

2024-01-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #5 from JuzheZhong  ---
Confirmed: with the commit from Nov 1, the regression is not present.

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eac0917bd3d2ead4829d56c8f2769176087c7b3d

This commit is OK; it shows no regression.

Still bisecting manually.

[Bug tree-optimization/113441] [13/14 Regression] Fail to fold the last element with multiple loop

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #6 from Tamar Christina  ---
Hello,

I can bisect it if you want. it should only take a few seconds.

[Bug tree-optimization/113441] [13/14 Regression] Fail to fold the last element with multiple loop

2024-01-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #7 from JuzheZhong  ---
(In reply to Tamar Christina from comment #6)
> Hello,
> 
> I can bisect it if you want. it should only take a few seconds.

Ok. Thanks a lot ...

I spent 2 hours bisecting it manually but still didn't locate the exact
commit which causes the regression...

It's great that you can bisect it easily.

[Bug tree-optimization/113466] ICE: verify_flow_info failed: returns_twice call is not first in basic block 7 with a __returns_twice__ function with _BitInt() argument

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113466

--- Comment #3 from Richard Biener  ---
Well, this simply highlights that the CFG doesn't really match "returns-twice".
The "returns-twice" part is just

 (void) // no return value

but only the SJLJ __builtin_setjmp_setup/receiver has this properly handled.

If we wanted to apply this in a more general form then a function

T __attribute__((returns_twice)) fn (ARGS ...);

would have to be represented like

<bb 2>:
  fn (ARGS ...);

<bb 3>:
  T retval = .RECEIVE ();

where there's two incoming edges into BB 3 (one abnormal) and just a
fallthru from BB2 to BB3.  IIRC the two outgoing edges from the receive
part are just a code motion barrier.  So there should never be PHIs
necessary for the call arguments.

You could make sure to put the correct argument on the fallthru to the
call and simply put uninit SSA names on the abnormal entry.  I think that
should work as far as correctness is concerned.

[Bug middle-end/112600] Failed to optimize saturating addition using __builtin_add_overflow

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600

Tamar Christina  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org

--- Comment #5 from Tamar Christina  ---
Yeah, this is hurting us a lot on vectors as well:

https://godbolt.org/z/ecnGadxcG

The first one isn't vectorizable, and for the second one we generate overly
complicated code, as the pattern vec_cond is expanded to something quite
complicated.

It was too complicated for the intern we had at the time, but I think basically
we should still follow the conclusion of this thread, no?
https://www.mail-archive.com/gcc@gcc.gnu.org/msg95398.html

i.e. we should just make proper saturating IFNs.

The only remaining question is, should we make them optab backed or can we do
something reasonably better for most targets with better fallback code.

This seems to indicate yes since the REALPART_EXPR seems to screw things up a
bit.
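
For readers without the godbolt link at hand, the scalar idiom under
discussion looks roughly like the sketch below.  It is not the test case from
the link; the function names are made up here.  __builtin_add_overflow is the
GCC/Clang builtin that returns true when the exact sum does not fit in the
result object.

```c
#include <stdint.h>

/* Unsigned saturating add: clamp to UINT32_MAX on overflow.  */
static uint32_t
sat_add_u32 (uint32_t a, uint32_t b)
{
  uint32_t sum;
  if (__builtin_add_overflow (a, b, &sum))
    return UINT32_MAX;
  return sum;
}

/* Equivalent branch-free form: on wraparound SUM < A, so the negated
   comparison yields an all-ones mask that forces the result to the
   maximum value.  */
static uint32_t
sat_add_u32_branchless (uint32_t a, uint32_t b)
{
  uint32_t sum = a + b;
  return sum | -(uint32_t) (sum < a);
}
```

A dedicated saturating IFN would let both spellings be recognized as the same
operation instead of relying on each form being pattern-matched separately.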

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-01-22 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

--- Comment #3 from H.J. Lu  ---
(In reply to Kewen Lin from comment #2)
> Guessing /usr/local/bin/ld is a gnu ld? Based on what I heard before, gnu ld
> has some problems on aix, people pass object files to aix system and use aix
> ld there. Not sure if the understanding still holds.

I am building a cross compiler.  No AIX tools are involved.

[Bug target/109092] [13 Regression] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 when building libgcc on riscv64

2024-01-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109092

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Lehua Ding :

https://gcc.gnu.org/g:f625c017e60b6e05675b7d6280f2c7677ce691c3

commit r14-8333-gf625c017e60b6e05675b7d6280f2c7677ce691c3
Author: Juzhe-Zhong 
Date:   Mon Jan 22 17:05:07 2024 +0800

RISC-V: Fix regressions due to 86de9b66480b710202a2898cf513db105d8c432f

This patch fixes the recent regression:

FAIL: gcc.dg/torture/float32-tg-2.c   -O1  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg-2.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg-2.c   -O2  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg-2.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg-2.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error: in reg_or_subregno, at
jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg-2.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg-2.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error: in reg_or_subregno, at
jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg-2.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg-2.c   -O3 -g  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg-2.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg-2.c   -Os  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg-2.c   -Os  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg.c   -O1  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg.c   -O2  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error: in reg_or_subregno, at
jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error: in reg_or_subregno, at
jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg.c   -O3 -g  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/float32-tg.c   -Os  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/float32-tg.c   -Os  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -O1  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -O2  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error: in reg_or_subregno, at
jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error: in reg_or_subregno, at
jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -O3 -g  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/pr48124-4.c   -Os  (internal compiler error: in
reg_or_subregno, at jump.cc:1895)
FAIL: gcc.dg/torture/pr48124-4.c   -Os  (test for excess errors)

due to commit 86de9b66480b710202a2898cf513db105d8c432f.

The root cause is register_operand and reg_or_subregno are consistent so we
reach the assertion fail.

We shouldn't worry about subreg:...VL_REGNUM since it's impossible that we
can have such a situation; that is, we only have (set (reg) (reg:VL_REGNUM)),
which generates "csrr vl" ASM for first fault load instructions (vleff).
So, 

[Bug tree-optimization/113441] [13/14 Regression] Fail to fold the last element with multiple loop

2024-01-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #8 from JuzheZhong  ---
I believe a change between Nov and Dec caused the regression.

But I did not continue the bisection.

I hope this information helps with your bisection.

Thanks.

[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

2024-01-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495

--- Comment #30 from JuzheZhong  ---
Ok. I believe m_avl_def_in && m_avl_def_out can be removed with a better
algorithm.

Then the memory-hog should be fixed soon.

I am going to rewrite avl_vl_unmodified_between_p and trigger full coverage
testing, since it's going to be a big change there.

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087

--- Comment #37 from Robin Dapp  ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113206#c9
> Using 4a0a8dc1b88408222b88e10278017189f6144602, the spec run failed on:
> zvl128b (All runtime fails):
> 527.cam4 (Runtime)
> 531.deepsjeng (Runtime)
> 521.wrf (Runtime)
> 523.xalancbmk (Runtime)

I tried reproducing the xalanc fail first but with the current trunk I don't
see a runtime fail.  Going to try deepsjeng next.

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|13.3|14.0
Summary|[13/14 Regression] Fail to  |[14 Regression] Fail to
   |fold the last element with  |fold the last element with
   |multiple loop   |multiple loop

[Bug middle-end/113514] Wrong __builtin_dynamic_object_size when using a set local variable

2024-01-22 Thread siddhesh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113514

--- Comment #5 from Siddhesh Poyarekar  ---
What seems to be happening is that early_objsz bails out since the subobject
size at that point is not a constant; I remember concluding that it's safest to
stick to constant sizes here, but I can't remember why I came to that
conclusion.  Then in constant propagation (literally the next pass in -O2), the
reference gets folded into a MEM_REF and we have the classic case of the
subobject reference being lost, due to which we see the whole object size there
instead of the subobject size.

I need to try and remember why I decided against generating expressions in
early_objsz.

[Bug middle-end/113514] Imprecise __builtin_dynamic_object_size when using a set local variable

2024-01-22 Thread siddhesh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113514

Siddhesh Poyarekar  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-22
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

[Bug c/113518] ICE: in gimplify_expr, at gimplify.cc:18596 with atomic_fetch_or_explicit() on _BitInt()

2024-01-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113518

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Created attachment 57183
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57183&action=edit
gcc14-pr113518.patch

Untested fix.

[Bug middle-end/113540] New: missing -Warray-bounds warning with malloc and a simple loop

2024-01-22 Thread vincent-gcc at vinc17 dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113540

Bug ID: 113540
   Summary: missing -Warray-bounds warning with malloc and a
simple loop
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vincent-gcc at vinc17 dot net
  Target Milestone: ---

Consider the following code:

#include <stdlib.h>

int main (void)
{
  volatile char *t;
  t = malloc (4);
  for (int i = 0; i <= 4; i++)
    t[i] = 0;
  return 0;
}

With -O2 -Warray-bounds, I do not get any warning.

Replacing the loop by "t[4] = 0;" gives a warning "array subscript 4 is outside
array bounds of 'volatile char[4]'" as expected.

Or replacing the use of malloc() by "volatile char t[4];" also gives a warning.

Tested with gcc (Debian 20240117-1) 14.0.1 20240117 (experimental) [master
r14-8187-gb00be6f1576]. But previous versions do not give any warning either.

[Bug target/113087] [14] RISC-V rv64gcv vector: Runtime mismatch with rv64gc

2024-01-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113087

--- Comment #38 from Robin Dapp  ---
deepsjeng also looks ok here.

[Bug debug/112718] [11/12/13/14 Regression] ICE: in add_dwarf_attr, at dwarf2out.cc:4501 with -g -fdebug-types-section -flto -ffat-lto-objects

2024-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112718

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
I have a patch, but other issues with -fdebug-types-section and -flto will
prevail.

[Bug rtl-optimization/113495] RISC-V: Time and memory awful consumption of SPEC2017 wrf benchmark

2024-01-22 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113495

--- Comment #31 from JuzheZhong  ---
machine dep reorg  : 403.69 ( 56%)  23.48 ( 93%) 427.17 ( 57%)  5290k (  0%)

Confirmed: after removing RTL DF checking, LICM is no longer a compile-time hog.

The VSETVL PASS accounts for 56% of the compile time.

Even though I can't see the memory hog in the GGC numbers of -ftime-report, I
can see 33G memory usage in htop.

Confirmed: both the compile-time hog and the memory hog are VSETVL PASS issues.

I will work on optimizing compile time as well as memory usage of the VSETVL
PASS.

[Bug rtl-optimization/113533] [14 Regression] Code generation regression after change for pr111267

2024-01-22 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533

Roger Sayle  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-22
 Status|UNCONFIRMED |NEW
 CC||roger at nextmovesoftware dot com
 Ever confirmed|0   |1

--- Comment #6 from Roger Sayle  ---
To help diagnose the problem, I came up with this simple patch:
diff --git a/gcc/fwprop.cc b/gcc/fwprop.cc
index 7872609b336..dc563ac2ca1 100644
--- a/gcc/fwprop.cc
+++ b/gcc/fwprop.cc
@@ -492,6 +492,9 @@ try_fwprop_subst_pattern (obstack_watermark &attempt, insn_change &use_change,
   " (cost %d -> cost %d)\n", old_cost, new_cost);
ok = false;
  }
+   else if (dump_file)
+ fprintf (dump_file, "change is profitable"
+  " (cost %d -> cost %d)\n", old_cost, new_cost);
   }

   if (!ok)

which then helps reveal that on sh3-linux-gnu with -O1 we see:
propagating insn 6 into insn 12, replacing:
(set (reg:SI 174 [ _1 ])
(sign_extend:SI (reg:QI 169 [ *a_7(D) ])))
successfully matched this instruction to *extendqisi2_compact_snd:
(set (reg:SI 174 [ _1 ])
(sign_extend:SI (mem:QI (reg/v/f:SI 168 [ aD.1817 ]) [0 *a_7(D)+0 S1 A8])))
change is profitable (cost 4 -> cost 1)

which confirms Andrew's and Oleg's analyses above; the sh_rtx_costs function is
a little odd... Reading from memory is four times faster than using a pseudo!?
I'm investigating a "costs" patch for the SH backend.  My apologies for the
temporary inconvenience, and thanks to Jeff for catching/spotting this.

[Bug c++/113541] New: Rejects __attribute__((section)) on explicit instantiation declaration of ctor/dtor

2024-01-22 Thread arthur.j.odwyer at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113541

Bug ID: 113541
   Summary: Rejects __attribute__((section)) on explicit
instantiation declaration of ctor/dtor
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arthur.j.odwyer at gmail dot com
  Target Milestone: ---

// https://godbolt.org/z/34Wdj1ox8

template<class T>
struct S {
    S(int) {}
    void operator=(int) {}
    void f(int) {}
    ~S() {}
};
template __attribute__((section("TEST"))) S<int>::S(int); // error
template __attribute__((section("TEST"))) void S<int>::f(int); // OK
template __attribute__((section("TEST"))) void S<int>::operator=(int); // OK
template __attribute__((section("TEST"))) S<int>::~S(); // error

===

<source>: In instantiation of 'S<T>::S(int) [with T = int]':
<source>:9:56:   required from here
<source>:3:5: error: section of alias 'S<T>::S(int) [with T = int]' must match
section of its target
    3 |     S(int) {}
      |     ^

The problem seems to be only with the constructor and destructor, i.e., the two
kinds of functions that codegen two object-code definitions (base object xtor
and complete object xtor) for a single C++ declaration.

Somehow, giving `S` a virtual base class (`struct S : virtual B`) fixes the
problem. Then both codegenned xtors correctly wind up in the "TEST" section.

GCC 4.9.4 is happy with the code as written. The bug started happening with GCC
5.

(This was noted on Slack in June 2019, but never reported on Bugzilla AFAICT
until now: https://cpplang.slack.com/archives/C5GN4SP41/p1560800562026000 )

[Bug debug/113382] FAIL: gcc.dg/debug/btf/btf-bitfields-3.c scan-assembler-times [\t ]0x6000004[\t ]+[^\n]*btt_info 1

2024-01-22 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113382

--- Comment #4 from John David Anglin  ---
dtd->dtd_enum_unsigned is set in ctf_add_enum:
dtd->dtd_enum_unsigned = eunsigned;

  /* Generate a CTF type for the enumeration.  */
  enumeration_type_id = ctf_add_enum (ctfc, CTF_ADD_ROOT,
  enum_name, bit_size / 8,
  (signedness == DW_ATE_unsigned),
  enumeration);

signedness = 0

(gdb) bt
#0  ctf_add_enum (ctfc=0x83ffbfea7c00, flag=1,
name=0x83ffbfe6b188 "foo", size=4, eunsigned=false,
die=0x83ffbfea2320) at ../../gcc/gcc/ctfc.cc:591
#1  0x4082df34 in gen_ctf_enumeration_type (
enumeration=0x83ffbfea2320, ctfc=0x83ffbfea7c00)
at ../../gcc/gcc/dwarf2ctf.cc:762
#2  gen_ctf_type (ctfc=0x83ffbfea7c00, die=0x83ffbfea2320)
at ../../gcc/gcc/dwarf2ctf.cc:899
#3  0x4082e8b8 in ctf_do_die (die=0x83ffbfea2320)
at ../../gcc/gcc/dwarf2ctf.cc:978
#4  0x4088f9b0 in ctf_debug_do_cu (die=<optimized out>)
at ../../gcc/gcc/dwarf2out.cc:33017
#5  ctf_debug_do_cu (die=<optimized out>) at ../../gcc/gcc/dwarf2out.cc:33010
#6  dwarf2out_early_finish (filename=0x83ffbfea7c00 "▒\362\004\002")
at ../../gcc/gcc/dwarf2out.cc:33146
#7  0x407de578 in symbol_table::finalize_compilation_unit (
this=0x83ffbfe6b188) at ../../gcc/gcc/cgraphunit.cc:2579
#8  0x40d55338 in compile_file () at ../../gcc/gcc/toplev.cc:474

[Bug rtl-optimization/113542] New: gcc.target/arm/bics_3.c regression after change for pr111267

2024-01-22 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Bug ID: 113542
   Summary: gcc.target/arm/bics_3.c regression after change for
pr111267
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: roger at nextmovesoftware dot com
  Target Milestone: ---

This PR is a placeholder for tracking the reported failures of
FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
r[0-9]+ 2
FAIL: gcc.target/arm/bics_3.c scan-assembler-times bics\tr[0-9]+, r[0-9]+,
r[0-9]+, .sl #2 1
See https://linaro.atlassian.net/browse/GNU-1117

Alas, I've been unable to reproduce the failure on cross compilers to either
arm-linux-gnueabihf or armv8l-unknown-linux-gnueabihf, so I suspect that
there's some configuration option or compile-time flag I'm missing that's
required to trigger these failures (which I'm hoping is "missed optimization"
rather than "wrong code").

Hopefully, filing this PR provides a mechanism to allow folks to help me
investigate this issue.  My apologies for the temporary inconvenience.
Setting the component to "rtl-optimization" until this is confirmed to be a
target (ARM backend) issue.

[Bug tree-optimization/113239] [13/14 regression] After 822a11a1e64, bogus -Warray-bounds warnings in std::vector

2024-01-22 Thread fche at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113239

--- Comment #7 from Frank Ch. Eigler  ---
Wonder if this similar but different diagnostic is closely related:

https://kojipkgs.fedoraproject.org//work/tasks/6259/112176259/build.log

[...]
inlined from ‘mutatee::instrument_dynprobe_target(BPatch_object*,
dynprobe_target const&)’ at mutatee.cxx:444:22:
/usr/include/c++/14/bits/stl_algobase.h:438:30: error: ‘memmove’ writing
between 9 and 9223372036854775800 bytes into a region of size 0 overflows the
destination [-Werror=stringop-overflow=]
  438 | __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
  | ~^~~
In file included from
/usr/include/c++/14/x86_64-redhat-linux/bits/c++allocator.h:33,
 from /usr/include/c++/14/bits/allocator.h:46,
 from /usr/include/c++/14/string:43:
In member function ‘std::__new_allocator::allocate(unsigned
long, void const*)’,
[...]

where the c++ code in question is a straight

vector<> foo;
vector<> bar;
foo.insert(foo.end(), bar.begin(), bar.end());

[Bug c++/102626] [c++20] compiler crash when invoking constexpr function in the constructor of template class

2024-01-22 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102626

Hans-Peter Nilsson  changed:

   What|Removed |Added

 CC||hp at gcc dot gnu.org
   Last reconfirmed|2021-10-11 00:00:00 |2024-01-14 0:00

--- Comment #4 from Hans-Peter Nilsson  ---
Searching for a constexpr-related bug (not this one) I can confirm that (for
cris-elf at least) the bug is still there at r14-7232-gb468821eea8d
(the test-case in comment #2 with "-std=c++20 -O3")

[Bug rtl-optimization/113533] [14 Regression] Code generation regression after change for pr111267

2024-01-22 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113533

--- Comment #7 from Oleg Endo  ---
(In reply to Roger Sayle from comment #6)
> To help diagnose the problem, I came up with this simple patch:

Thanks for looking into it!

> which then helps reveal that on sh3-linux-gnu with -O1 we see:

I think this will also happen on all sh-elf sub-targets, not necessarily
sh3-linux... if it helps anything ... 

> propagating insn 6 into insn 12, replacing:
> (set (reg:SI 174 [ _1 ])
> (sign_extend:SI (reg:QI 169 [ *a_7(D) ])))
> successfully matched this instruction to *extendqisi2_compact_snd:
> (set (reg:SI 174 [ _1 ])
> (sign_extend:SI (mem:QI (reg/v/f:SI 168 [ aD.1817 ]) [0 *a_7(D)+0 S1
> A8])))
> change is profitable (cost 4 -> cost 1)
> 
> which confirms Andrew's and Oleg's analyses above; the sh_rtx_costs function
> is a little odd... Reading from memory is four times faster than using a
> pseudo!?
> I'm investigating a "costs" patch for the SH backend.

Looks like sh_rtx_costs function assumes that the costs of the whole RTX are
summed up outside in the caller.

In sh_rtx_costs SIGN_EXTEND, ZERO_EXTEND, the 'sh_address_cost' is returned
directly for the MEM_P case. It should probably have gone through COSTS_N_INSNS
to get it into the same scale as for the arith_reg_operand case.
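
To illustrate the scale mismatch, here is a small standalone sketch.  It is
not the SH backend code.  SKETCH_COSTS_N_INSNS is a local copy of how GCC's
COSTS_N_INSNS scales an instruction count (N * 4), and the raw address cost of
1 and the one-insn register cost are assumptions chosen to match the
"cost 4 -> cost 1" line in the dump above.

```c
/* Local mirror of GCC's COSTS_N_INSNS scaling (N * 4), redefined under
   a different name to keep this sketch standalone.  */
#define SKETCH_COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical raw address cost, on the small unscaled scale used for
   address costs.  The value 1 is an assumption.  */
static int
sketch_address_cost (void)
{
  return 1;
}

/* Cost of a sign/zero extend as described above: the register case is
   scaled, the memory case returns the raw address cost.  */
static int
extend_cost_unscaled (int from_mem)
{
  return from_mem ? sketch_address_cost () : SKETCH_COSTS_N_INSNS (1);
}

/* Same, but with the memory case brought onto the insn-cost scale.  */
static int
extend_cost_scaled (int from_mem)
{
  return from_mem
         ? SKETCH_COSTS_N_INSNS (sketch_address_cost ())
         : SKETCH_COSTS_N_INSNS (1);
}
```

With the unscaled variant the memory form looks four times cheaper than the
register form, which is what makes fwprop consider the substitution
profitable; with the scaled variant the two forms compare equal.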

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop

2024-01-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #9 from Tamar Christina  ---
So on SVE the change is cost modelling.

Bisect landed on g:33c2b70dbabc02788caabcbc66b7baeafeb95bcf which changed the
compiler's defaults to using the new throughput matched cost modelling used by
newer cores.

It looks like this changes which mode the compiler picks when using a fixed
register size.

This is because the new cost model (correctly) models the costs for FMAs and
promotions.

Before:

array1[0][_1] 1 times scalar_load costs 1 in prologue
(int) _2 1 times scalar_stmt costs 1 in prologue

after:

array1[0][_1] 1 times scalar_load costs 1 in prologue 
(int) _2 1 times scalar_stmt costs 0 in prologue 

and the cost goes from:

Vector inside of loop cost: 125

to

Vector inside of loop cost: 83 

so far, nothing sticks out, and in fact the profitability for VNx4QI drops from

Calculated minimum iters for profitability: 5

to

Calculated minimum iters for profitability: 3

This causes a clash, as this is now exactly the same cost as VNx2QI, which is
what it preferred before.

That in turn leads it to pick the higher VF.

In the end smaller VF shows:

;; Guessed iterations of loop 4 is 0.500488. New upper bound 1.

and now we get:

Vectorization factor 16 seems too large for profile prevoiusly believed to be
consistent; reducing.  
;; Guessed iterations of loop 4 is 0.500488. New upper bound 0.
;; Scaling loop 4 with scale 66.6% (guessed) to reach upper bound 0

which I guess is the big difference.

There is a weird costing going on in the PHI nodes though:

m_108 = PHI  1 times vector_stmt costs 0 in body 
m_108 = PHI  2 times scalar_to_vec costs 0 in prologue

They have collapsed to 0, which can't be right.

[Bug tree-optimization/113239] [13/14 regression] After 822a11a1e64, bogus -Warray-bounds warnings in std::vector

2024-01-22 Thread dimitry--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113239

--- Comment #8 from Dimitry Andric  ---
(In reply to Frank Ch. Eigler from comment #7)
> Wonder if this similar but different diagnostic is closely related:
...
> where the c++ code in question is a straight
> 
> vector<> foo;
> vector<> bar;
> foo.insert(foo.end(), bar.begin(), bar.end());

I can't reproduce the warning here with a vector example, the function is
entirely optimized away too. But even if I return the result, e.g.:

std::vector f(std::vector bar)
{
  std::vector foo;
  foo.insert(foo.end(), bar.begin(), bar.end());
  return foo;
}

still no warning. But I think you might need to reduce the mutatee.cxx case.

That said, the warning you show is triggered in a different place, and the
"between 9 and 9223372036854775800 bytes" is also different.

[Bug c++/113543] New: Poor codegen for bit-counting functions (countl_zero, countl_one, countr_zero, countr_one)

2024-01-22 Thread janschultke at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113543

Bug ID: 113543
   Summary: Poor codegen for bit-counting functions (countl_zero,
countl_one, countr_zero, countr_one)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: janschultke at googlemail dot com
  Target Milestone: ---

## Code to Reproduce (https://godbolt.org/z/qPeszhaPv)

#include <bit>

template <typename T>
T countr_zero(T x) {
    return std::countr_zero(x);
}

template unsigned char countr_zero(unsigned char);
template unsigned short countr_zero(unsigned short);
template unsigned int countr_zero(unsigned int);
template unsigned long countr_zero(unsigned long);
template unsigned long long countr_zero(unsigned long long);

template <typename T>
T countr_one(T x) {
    return std::countr_one(x);
}

template unsigned char countr_one(unsigned char);
template unsigned short countr_one(unsigned short);
template unsigned int countr_one(unsigned int);
template unsigned long countr_one(unsigned long);
template unsigned long long countr_one(unsigned long long);


template <typename T>
T countl_zero(T x) {
    return std::countl_zero(x);
}

template unsigned char countl_zero(unsigned char);
template unsigned short countl_zero(unsigned short);
template unsigned int countl_zero(unsigned int);
template unsigned long countl_zero(unsigned long);
template unsigned long long countl_zero(unsigned long long);

template <typename T>
T countl_one(T x) {
    return std::countl_zero(x);
}

template unsigned char countl_one(unsigned char);
template unsigned short countl_one(unsigned short);
template unsigned int countl_one(unsigned int);
template unsigned long countl_one(unsigned long);
template unsigned long long countl_one(unsigned long long);


## Summary

GCC consistently emits much more code for these functions than clang.
For example, GCC:

> unsigned int countl_one(unsigned int):
>   xor eax, eax
>   lzcnt   eax, edi
>   ret

Clang does not emit the extra xor instruction. I don't really know why. LZCNT
has a wide contract and should be equivalent to std::countl_zero.

It gets a lot worse though:

> unsigned short countl_zero(unsigned short):
>   mov eax, 16
>   test    di, di
>   je  .L23
>   movzx   edi, di
>   lzcnt   edi, edi
>   lea eax, [rdi-16]
> .L23:
>   ret

I don't really know what all of this schmutz is. Clang emits lzcnt and ret in
this case.
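
For readers puzzling over the listing above, its shape can be modelled in
plain C++. This is only a sketch under stated assumptions: `lzcnt32` is a
hypothetical loop-based stand-in for a 32-bit LZCNT (which returns the operand
width, 32, for a zero input), and the two `countl_zero16_*` functions and
`check_all_u16` are illustrative names, not what either compiler generates.

```cpp
#include <cassert>
#include <cstdint>

// Loop-based stand-in for a 32-bit LZCNT: counts leading zero bits and
// returns 32 when the input is zero.
unsigned lzcnt32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Follows the shape of the quoted GCC output: special-case zero,
// otherwise widen to 32 bits, count, and subtract 16.
unsigned countl_zero16_branchy(std::uint16_t x) {
  if (x == 0) return 16;
  return lzcnt32(x) - 16;
}

// Without the zero check: lzcnt32(0) is 32, so 32 - 16 already gives 16.
unsigned countl_zero16_branchless(std::uint16_t x) {
  return lzcnt32(x) - 16;
}

// Exhaustively compare both forms over every 16-bit value.
void check_all_u16() {
  for (unsigned v = 0; v <= 0xFFFFu; ++v)
    assert(countl_zero16_branchy(static_cast<std::uint16_t>(v)) ==
           countl_zero16_branchless(static_cast<std::uint16_t>(v)));
}
```

In this model the explicit zero test does not change any result, which
suggests the branch in the listing is redundant when LZCNT semantics are
available.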


Another bit of disappointing codegen is this:
> unsigned char countr_zero(unsigned char):
>   movzx   eax, dil
>   xor edx, edx
>   tzcnt   edx, eax
>   test    dil, dil
>   mov eax, 8
>   cmovne  eax, edx
>   ret

Clang emits:
>   or  edi, 256
>   tzcnt   eax, edi
>   ret

This clang codegen is very clever. It simply adds a bit on the left, so that
the 32-bit routine can be re-used with only one additional instruction.
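
The sentinel-bit idea can be checked with a small model. This is a sketch, not
compiler output: `tzcnt32` is a hypothetical loop-based stand-in for a 32-bit
TZCNT, and `countr_zero8_reference`, `countr_zero8_sentinel` and
`check_all_u8` are names made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Loop-based stand-in for a 32-bit TZCNT: counts trailing zero bits and
// returns 32 when the input is zero.
unsigned tzcnt32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1)
    ++n;
  return n;
}

// Reference: trailing zeros of an 8-bit value, 8 for a zero input.
unsigned countr_zero8_reference(std::uint8_t x) {
  unsigned n = 0;
  while (n < 8 && ((x >> n) & 1u) == 0) ++n;
  return n;
}

// Sentinel form: setting bit 8 (OR with 256) caps the 32-bit count at 8,
// so no separate zero check is needed.
unsigned countr_zero8_sentinel(std::uint8_t x) {
  return tzcnt32(static_cast<std::uint32_t>(x) | 0x100u);
}

// Exhaustively compare both forms over every 8-bit value.
void check_all_u8() {
  for (unsigned v = 0; v <= 0xFFu; ++v)
    assert(countr_zero8_reference(static_cast<std::uint8_t>(v)) ==
           countr_zero8_sentinel(static_cast<std::uint8_t>(v)));
}
```

Because the sentinel sits just above the value's top bit, it never affects the
count for non-zero inputs and supplies the answer 8 for zero.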

[Bug c++/113544] New: [14 Regression] bogus incomplete type error with dependent data member in local class in generic lambda since r14-278

2024-01-22 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113544

Bug ID: 113544
   Summary: [14 Regression] bogus incomplete type error with
dependent data member in local class in generic lambda
since r14-278
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ppalka at gcc dot gnu.org
  Target Milestone: ---

template<typename T>
void f() {
  [](auto parm) {
    struct type {
      decltype(parm) x;
    };
  };
}

template void f<int>();

: In instantiation of ‘struct f()type’:
:6:5:   required from ‘void f() [with T = int]’
:10:22:   required from here
:5:22: error: ‘f()type::x’ has incomplete type
:5:22: error: invalid use of dependent type ‘decltype (parm)’

[Bug c++/113544] [14 Regression] bogus incomplete type error with dependent data member in local class in generic lambda since r14-278

2024-01-22 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113544

Patrick Palka  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Target Milestone|--- |14.0
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-22

[Bug c++/113545] New: ICE in label_matches with constexpr function with switch-statement and converted (nonconstant, cast address) input

2024-01-22 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113545

Bug ID: 113545
   Summary: ICE in label_matches with constexpr function with
switch-statement and converted (nonconstant, cast
address) input
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hp at gcc dot gnu.org
  Target Milestone: ---

For the following test-case, g++ ICEs from at least gcc-10 and forward (i.e.
apparently not a regression):
```
char foo;

constexpr unsigned char bar(__UINTPTR_TYPE__ baz)
{
  switch (baz)
{
case 13:
  return 11;
case 14:
  return 78;
case 2048:
  return 13;
default:
  return 42;
}
}

__attribute__ ((__noipa__))
void xyzzy(int x)
{
  if (x != 42)
__builtin_abort ();
}

int main()
{
  unsigned const char c = bar(reinterpret_cast<__UINTPTR_TYPE__>(&foo));
  xyzzy(c);
}
```

Example backtrace with -std=c++20 -O3:

../n.cc: In function 'int main()':
../n.cc:27:30:   in 'constexpr' expansion of 'bar(((long unsigned int)(&
foo)))'
../n.cc:5:3: internal compiler error: in label_matches, at cp/constexpr.cc:6925
5 |   switch (baz)
  |   ^~
0x63894c label_matches
/gcctop/gcc/cp/constexpr.cc:6925
0xa0bb3d cxx_eval_constant_expression
/gcctop/gcc/cp/constexpr.cc:7319
0xa0d2b2 cxx_eval_statement_list
/gcctop/gcc/cp/constexpr.cc:6969
0xa0d2b2 cxx_eval_constant_expression
/gcctop/gcc/cp/constexpr.cc:8316
0xa1e782 cxx_eval_switch_expr
/gcctop/gcc/cp/constexpr.cc:7115
0xa0cb6b cxx_eval_constant_expression
/gcctop/gcc/cp/constexpr.cc:8412
0xa0aae6 cxx_eval_call_expression
/gcctop/gcc/cp/constexpr.cc:3288
0xa0c2ef cxx_eval_constant_expression
/gcctop/gcc/cp/constexpr.cc:7524
0xa17d9a cxx_eval_outermost_constant_expr
/gcctop/gcc/cp/constexpr.cc:8822
0xa1d28f maybe_constant_value(tree_node*, tree_node*, mce_value)
/gcctop/gcc/cp/constexpr.cc:9110
0xa49e40 cp_fully_fold
/gcctop/gcc/cp/cp-gimplify.cc:2831
0xa49ed9 cp_fully_fold
/gcctop/gcc/cp/cp-gimplify.cc:2825
0xa49ed9 cp_fully_fold_init(tree_node*)
/gcctop/gcc/cp/cp-gimplify.cc:2861
0xc7a204 store_init_value(tree_node*, tree_node*, vec**, int)
/gcctop/gcc/cp/typeck2.cc:926
0xa6ca32 check_initializer
/gcctop/gcc/cp/decl.cc:7810
0xa941be cp_finish_decl(tree_node*, tree_node*, bool, tree_node*, int,
cp_decomp*)
/gcctop/gcc/cp/decl.cc:8842
0xb95477 cp_parser_init_declarator
/gcctop/gcc/cp/parser.cc:23618
0xb6ac98 cp_parser_simple_declaration
/gcctop/gcc/cp/parser.cc:15890
0xb8f830 cp_parser_declaration_statement
/gcctop/gcc/cp/parser.cc:14926
0xb97215 cp_parser_statement
/gcctop/gcc/cp/parser.cc:12882
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug target/113543] Poor codegen for bit-counting functions (countl_zero, countl_one, countr_zero, countr_one)

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113543

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |target

--- Comment #1 from Andrew Pinski  ---
>Clang does not emit the extra xor instruction. I don't really know why. 

This is a performance erratum in some Intel cores, and GCC implements a
workaround for it while LLVM/clang does NOT. See PR 62011 on that.

[Bug target/113543] Poor codegen for bit-counting functions (countl_zero, countl_one, countr_zero, countr_one)

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113543

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
The rest are a dup of bug 110679.

*** This bug has been marked as a duplicate of bug 110679 ***

[Bug tree-optimization/110679] Missed optimization opportunity with countr_zero

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110679

Andrew Pinski  changed:

   What|Removed |Added

 CC||janschultke at googlemail dot 
com

--- Comment #2 from Andrew Pinski  ---
*** Bug 113543 has been marked as a duplicate of this bug. ***

[Bug other/111966] GCN '--with-arch=[...]' not considered for 'mkoffload' default 'elf_arch'

2024-01-22 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111966

Tobias Burnus  changed:

   What|Removed |Added

 CC||burnus at gcc dot gnu.org
 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Tobias Burnus  ---
FIXED on mainline/GCC 14.

[Bug c++/113544] [14 Regression] bogus incomplete type error with dependent data member in local class in generic lambda since r14-278

2024-01-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113544

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
 CC||jakub at gcc dot gnu.org

[Bug c++/102626] [c++20] compiler crash when invoking constexpr function in the constructor of template class

2024-01-22 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102626

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=86933,
   ||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=92969

--- Comment #5 from Patrick Palka  ---
PR86933, PR92969 and this all seem related.  GCC seems to mishandle a type
template parameter pack appearing in the return type of a pointer to data
member NTTP pack:

typename ...Ts, Ts S::* ...ms

[Bug rtl-optimization/113546] New: aarch64: bootstrap-debug-lean broken with -fcompare-debug failure

2024-01-22 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113546

Bug ID: 113546
   Summary: aarch64: bootstrap-debug-lean broken with
-fcompare-debug failure
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

I tried a bootstrap --with-build-config=bootstrap-debug-lean on aarch64 and it
failed with an -fcompare-debug failure building libiberty/regex.c:

make[3]: Entering directory '/data/ajc/toolchain/builds/bstrap-lean/libiberty'
if [ x"-fPIC" != x ]; then \
  /home/alecop01/toolchain/builds/bstrap-lean/./prev-gcc/xgcc
-B/home/alecop01/toolchain/builds/bstrap-lean/./prev-gcc/
-B/home/alecop01/toolchain/builds/bstrap-lean/aarch64-unknown-linux-gnu/bin/
-B/home/alecop01/toolchain/builds/bstrap-lean/aarch64-unknown-linux-gnu/bin/
-B/home/alecop01/toolchain/builds/bstrap-lean/aarch64-unknown-linux-gnu/lib/
-isystem
/home/alecop01/toolchain/builds/bstrap-lean/aarch64-unknown-linux-gnu/include
-isystem
/home/alecop01/toolchain/builds/bstrap-lean/aarch64-unknown-linux-gnu/sys-include
  -fchecking=1 -c -DHAVE_CONFIG_H -g -O2 -fchecking=1 -fcompare-debug  -I.
-I/home/alecop01/toolchain/src/gcc/libiberty/../include  -W -Wall
-Wwrite-strings -Wc++-compat -Wstrict-prototypes -Wshadow=local -pedantic 
-D_GNU_SOURCE  -fPIC /home/alecop01/toolchain/src/gcc/libiberty/regex.c -o
pic/regex.o; \
else true; fi
xgcc: error: /home/alecop01/toolchain/src/gcc/libiberty/regex.c:
‘-fcompare-debug’ failure
Makefile:1219: recipe for target 'regex.o' failed
make[3]: *** [regex.o] Error 1
make[3]: Leaving directory '/data/ajc/toolchain/builds/bstrap-lean/libiberty'
Makefile:11725: recipe for target 'all-stage3-libiberty' failed
make[2]: *** [all-stage3-libiberty] Error 2
make[2]: Leaving directory '/data/ajc/toolchain/builds/bstrap-lean'
Makefile:26292: recipe for target 'stage3-bubble' failed
make[1]: *** [stage3-bubble] Error 2
make[1]: Leaving directory '/data/ajc/toolchain/builds/bstrap-lean'
Makefile:1099: recipe for target 'all' failed
make: *** [all] Error 2

Here is a reduced testcase for that:

$ cat t.c
int x;
void f() {
fail:
  switch (x) { case 0: goto fail;; }
}
$ ./xgcc -B . -c t.c -fcompare-debug -O -S -o /dev/null
xgcc: error: t.c: ‘-fcompare-debug’ failure

[Bug c++/113545] ICE in label_matches with constexpr function with switch-statement and converted (nonconstant, cast address) input

2024-01-22 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113545

Hans-Peter Nilsson  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-01-22
 Ever confirmed|0   |1

[Bug target/113030] parsecpu.awk's chkarch/chkcpu commands is broken for aliases

2024-01-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113030

--- Comment #6 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:41caf6b0d603408a829b37f7f7fb09d64d814d48

commit r14-8337-g41caf6b0d603408a829b37f7f7fb09d64d814d48
Author: Andrew Pinski 
Date:   Sat Jan 20 23:12:31 2024 -0800

arm: Fix parsecpu.awk for aliases [PR113030]

So the problem here is that the two functions check_cpu and check_arch use
the wrong variable to check whether an alias is valid for that cpu/arch.
check_cpu uses cpu_optaliases instead of cpu_opt_alias. cpu_optaliases
is an array indexed by the cpuname that contains all of the valid aliases
for that cpu, whereas cpu_opt_alias is a doubly indexed array, indexed
by cpuname and alias, which provides the alias for that option.
The same thing happens for check_arch with arch_optaliases vs
arch_opt_alias.

Tested by running:
```
awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+simd"
config/arm/arm-cpus.in
awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon"
config/arm/arm-cpus.in
awk -f config/arm/parsecpu.awk -v cmd="chkarch armv7-a+neon-vfpv3"
config/arm/arm-cpus.in
```
None of them returns an error.

gcc/ChangeLog:

PR target/113030
* config/arm/parsecpu.awk (check_cpu): Use cpu_opt_alias
instead of cpu_optaliases.
(check_arch): Use arch_opt_alias instead of arch_optaliases.

Signed-off-by: Andrew Pinski 

[Bug target/113030] parsecpu.awk's chkarch/chkcpu commands is broken for aliases

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113030

Andrew Pinski  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |14.0

--- Comment #7 from Andrew Pinski  ---
Fixed.

[Bug fortran/113152] Fortran 2023 half-cycle trigonometric functions

2024-01-22 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113152

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

 CC||anlauf at gcc dot gnu.org

--- Comment #16 from anlauf at gcc dot gnu.org ---
(In reply to Steve Kargl from comment #14)
> On Sun, Jan 21, 2024 at 09:52:39PM +, anlauf at gcc dot gnu.org wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113152
> > 
> > I think that you cannot do
> > 
> > +  if (MPFR_HALF_CYCLE)
> > 
> > you really must use
> > 
> > #if MPFR_HALF_CYCLE
> > 
> 
> #include <stdio.h>
> #include "mpfr.h"
> 
> #define MPFR_HALF_CYCLE (MPFR_VERSION_MAJOR * 100 + MPFR_VERSION_MINOR >= 402)
> 
> int
> main(void)
> {
>if (MPFR_HALF_CYCLE)
>   printf("here\n");
>else
>   printf("there\n");
>return (0);
> }
> 
> % cc -o z -I/usr/local/include a.c && ./z

This does not test the right thing.

% cat sgk.cc
#include <stdio.h>

#define MPFR_HALF_CYCLE 0

int
main(void)
{
   if (MPFR_HALF_CYCLE)
  printf_not_declared_if_0 ("here\n");
   else
  printf ("there\n");
   return (0);
}

% g++ sgk.cc
sgk.cc: In function 'int main()':
sgk.cc:9:7: error: 'printf_not_declared_if_0' was not declared in this scope
   printf_not_declared_if_0 ("here\n");
   ^~~~

[Bug tree-optimization/113476] [14 Regression] irange::maybe_resize leaks memory via IPA VRP

2024-01-22 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113476

--- Comment #4 from Martin Jambor  ---
The right place to free stuff in lattices post-IPA would be
ipa_node_params::~ipa_node_params(), where we should iterate over the lattices
and deinitialize them, or perhaps destruct the array. ipcp_vr_lattice directly
contains Value_Range, which AFAIU directly contains int_range_max, which has a
virtual destructor, so it does not look like a POD anymore.  This escaped me
when I was looking at the IPA-VR changes, but hopefully it should not be too
difficult to deal with.
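
A minimal sketch of the general pattern, assuming the lattices live in raw
malloc-style storage and are constructed in place: `Range`, `Lattice`,
`make_lattices`, `destroy_lattices`, `check_no_leak` and the `live_buffers`
counter are hypothetical stand-ins, not GCC types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counts heap buffers currently owned by Range objects.
static int live_buffers = 0;

// Stand-in for a member type with a non-trivial destructor.
struct Range {
  int *extra;
  Range() : extra(new int[8]) { ++live_buffers; }
  ~Range() { delete[] extra; --live_buffers; }
  Range(const Range &) = delete;
  Range &operator=(const Range &) = delete;
};

// Containing a Range makes Lattice non-trivially destructible as well.
struct Lattice {
  Range r;
};

// Allocate raw storage and construct each element in place.
Lattice *make_lattices(std::size_t n) {
  void *raw = std::malloc(n * sizeof(Lattice));
  if (!raw) return nullptr;
  Lattice *p = static_cast<Lattice *>(raw);
  for (std::size_t i = 0; i < n; ++i) new (p + i) Lattice();
  return p;
}

// Run every destructor explicitly before releasing the storage; calling
// only std::free(p) would leave each Range's buffer allocated.
void destroy_lattices(Lattice *p, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) p[i].~Lattice();
  std::free(p);
}

// Build and tear down a small array, checking that nothing stays live.
void check_no_leak() {
  Lattice *p = make_lattices(4);
  assert(p != nullptr && live_buffers == 4);
  destroy_lattices(p, 4);
  assert(live_buffers == 0);
}
```

The point of the sketch is only that storage release and object destruction
are separate steps once the element type stops being a POD.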

[Bug fortran/113152] Fortran 2023 half-cycle trigonometric functions

2024-01-22 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113152

--- Comment #17 from Steve Kargl  ---
On Mon, Jan 22, 2024 at 05:35:41PM +, anlauf at gcc dot gnu.org wrote:
> --- Comment #16 from anlauf at gcc dot gnu.org ---
> (In reply to Steve Kargl from comment #14)
> > On Sun, Jan 21, 2024 at 09:52:39PM +, anlauf at gcc dot gnu.org wrote:
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113152
> > > 
> > > I think that you cannot do
> > > 
> > > +  if (MPFR_HALF_CYCLE)
> > > 
> > > you really must use
> > > 
> > > #if MPFR_HALF_CYCLE
> > > 
> > 
> > #include <stdio.h>
> > #include "mpfr.h"
> > 
> > #define MPFR_HALF_CYCLE (MPFR_VERSION_MAJOR * 100 + MPFR_VERSION_MINOR >= 402)
> > 
> > int
> > main(void)
> > {
> >if (MPFR_HALF_CYCLE)
> >   printf("here\n");
> >else
> >   printf("there\n");
> >return (0);
> > }
> > 
> > % cc -o z -I/usr/local/include a.c && ./z
> 
> This does not test the right thing.
> 
> % cat sgk.cc
> #include <stdio.h>
> 
> #define MPFR_HALF_CYCLE 0

This is not what the pre-processor should be doing
(on at least FreeBSD).  See below.


> int
> main(void)
> {
>if (MPFR_HALF_CYCLE)
>   printf_not_declared_if_0 ("here\n");
>else
>   printf ("there\n");
>return (0);
> }
> 
> % g++ sgk.cc
> sgk.cc: In function 'int main()':
> sgk.cc:9:7: error: 'printf_not_declared_if_0' was not declared in this scope
>printf_not_declared_if_0 ("here\n");
>^~~~

Of course, it will fail.  You need to actually have a
printf_not_declared_if_0 function defined during parsing.

#include <stdio.h>
#include <stdlib.h>

#define MPFR_HALF_CYCLE 1
#define printf_not_declared_if_0(a) abort()

int
main(void)
{
   if (MPFR_HALF_CYCLE)
  printf_not_declared_if_0 ("here\n");
   else
  printf ("there\n");
   return (0);
}

~/work/x/bin/g++ -I/usr/local/include -o z a.cc && ./z
Abort (core dumped)

Changing the MPFR_HALF_CYCLE define from 1 to 0:

 ~/work/x/bin/g++ -I/usr/local/include -o z a.cc && ./z
there

Going back to my original example and g++ from master, I'm seeing

% ~/work/x/bin/g++ -I/usr/local/include -E a.cc

int
main(void)
{
   if ((
# 9 "a.cc" 3
  4 
# 9 "a.cc"
  * 100 + 
# 9 "a.cc" 3
  2 
# 9 "a.cc"
  >= 402))
  printf("here\n");
   else
  printf("there\n");
   return (0);
}

and with clang++

% c++ -E -I/usr/local/include a.cc
int
main(void)
{
   if ((4 * 100 + 2 >= 402))
  printf("here\n");
   else
  printf("there\n");
   return (0);
}

Is there something that is different between your OS and FreeBSD?
Or is there some fundamental difference between C and C++ that
I am unaware of?

[Bug debug/113382] FAIL: gcc.dg/debug/btf/btf-bitfields-3.c scan-assembler-times [\t ]0x6000004[\t ]+[^\n]*btt_info 1

2024-01-22 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113382

--- Comment #5 from John David Anglin  ---
The problem seems to be DW_AT_encoding is not found in this call:
static ctf_id_t
gen_ctf_enumeration_type (ctf_container_ref ctfc, dw_die_ref enumeration)
{
  const char *enum_name = get_AT_string (enumeration, DW_AT_name);
  unsigned int bit_size = ctf_die_bitsize (enumeration);
  unsigned int signedness = get_AT_unsigned (enumeration, DW_AT_encoding);

get_AT() returns NULL.

This is because dwarf_strict is 1:
  if (!dwarf_strict)
add_AT_unsigned (type_die, DW_AT_encoding,
 TYPE_UNSIGNED (type)
 ? DW_ATE_unsigned
 : DW_ATE_signed);

I believe we need to add -gno-strict-dwarf option on hppa*64*-*-hpux*.

[Bug c++/113529] Incorrect result of requires-expression in case of function call ambiguity and `operator<=>`

2024-01-22 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113529

Patrick Palka  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot 
gnu.org
 CC||ppalka at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug rtl-optimization/113546] [13/14 Regression] aarch64: bootstrap-debug-lean broken with -fcompare-debug failure since r13-2921-gf1adf45b17f7f1

2024-01-22 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113546

Alex Coplan  changed:

   What|Removed |Added

   Keywords||compare-debug-failure
Summary|aarch64:|[13/14 Regression] aarch64:
   |bootstrap-debug-lean broken |bootstrap-debug-lean broken
   |with -fcompare-debug|with -fcompare-debug
   |failure |failure since
   ||r13-2921-gf1adf45b17f7f1
 Target||aarch64-*-*

--- Comment #1 from Alex Coplan  ---
The reduced testcase started failing with
r13-2921-gf1adf45b17f7f1ede463524d80032bb2ec866ead:

commit f1adf45b17f7f1ede463524d80032bb2ec866ead
Author: Eugene Rozenfeld 
Date:   Thu Apr 21 23:42:15 2022

Add instruction level discriminator support.

This is the first in a series of patches to enable discriminator support
in AutoFDO.

[Bug rtl-optimization/113546] [13/14 Regression] aarch64: bootstrap-debug-lean broken with -fcompare-debug failure since r13-2921-gf1adf45b17f7f1

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113546

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=107169

--- Comment #2 from Andrew Pinski  ---
(In reply to Alex Coplan from comment #1)
> The reduced testcase started failing with
> r13-2921-gf1adf45b17f7f1ede463524d80032bb2ec866ead:
> 
> commit f1adf45b17f7f1ede463524d80032bb2ec866ead
> Author: Eugene Rozenfeld 
> Date:   Thu Apr 21 23:42:15 2022
> 
> Add instruction level discriminator support.
> 
> This is the first in a series of patches to enable discriminator support
> in AutoFDO.

That means this is most likely a dup of bug 107169.

[Bug rtl-optimization/113546] [13/14 Regression] aarch64: bootstrap-debug-lean broken with -fcompare-debug failure since r13-2921-gf1adf45b17f7f1

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113546

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=100733

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> That means this is most likely a dup of bug 107169.

And PR 100733.

[Bug rtl-optimization/113546] [13/14 Regression] aarch64: bootstrap-debug-lean broken with -fcompare-debug failure since r13-2921-gf1adf45b17f7f1

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113546

--- Comment #4 from Andrew Pinski  ---
Note the reduced testcase might NOT be representative of the original issue
though ...

[Bug fortran/113152] Fortran 2023 half-cycle trigonometric functions

2024-01-22 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113152

--- Comment #18 from anlauf at gcc dot gnu.org ---
(In reply to Steve Kargl from comment #17)
> Is there something that is different between your OS and FreeBSD?
> Or is there some fundamental difference between C and C++ that
> I am unaware of?

You should not expect everybody to have the latest MPFR installed.
That's the whole point.

Please use #if / #else / #endif
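
A minimal sketch of the difference, under stated assumptions: the macro
`HAVE_HALF_CYCLE` and the functions `half_cycle_impl`, `pick_impl` and
`check_pick` are hypothetical names standing in for the MPFR version check and
the newer MPFR entry points discussed in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical feature macro; 0 models a build against an older library
// that does not provide the newer function at all.
#define HAVE_HALF_CYCLE 0

#if HAVE_HALF_CYCLE
// Declared only when the feature is available.
std::string half_cycle_impl();
#endif

std::string pick_impl() {
#if HAVE_HALF_CYCLE
  // Removed by the preprocessor when the macro is 0, so the compiler
  // never sees the call to the undeclared function.
  return half_cycle_impl();
#else
  return "fallback";
#endif
}

// With the macro set to 0 the fallback branch is the one that is compiled.
void check_pick() { assert(pick_impl() == "fallback"); }
```

Writing the same choice as `if (HAVE_HALF_CYCLE) return half_cycle_impl();`
would not compile here, because both branches of an ordinary `if` are parsed
and name lookup fails for the undeclared function, as in the sgk.cc error
shown earlier.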

[Bug ada/113536] valid reduction expression rejected by -gnatVo

2024-01-22 Thread devotus at yahoo dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113536

--- Comment #1 from Jack Perry  ---
Per Simon Wright, gcc 14.0.0 does not fail on this, whereas gcc 14.0.1 does, in
the same location, but with a different error: `expected type "Value"... found
type "Standard.Character"`

I edited his message to conform with the types I used in the example below, but
I've also observed it on godbolt's compiler explorer when using gnat "trunk".

[Bug debug/113382] FAIL: gcc.dg/debug/btf/btf-bitfields-3.c scan-assembler-times [\t ]0x6000004[\t ]+[^\n]*btt_info 1

2024-01-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113382

--- Comment #6 from GCC Commits  ---
The master branch has been updated by John David Anglin :

https://gcc.gnu.org/g:bc77c035c45bb224790b1c03d06a64c8a1cc51c5

commit r14-8338-gbc77c035c45bb224790b1c03d06a64c8a1cc51c5
Author: John David Anglin 
Date:   Mon Jan 22 19:07:32 2024 +

Add -gno-strict-dwarf to dg-options in various btf enum tests

The -gno-strict-dwarf option is needed to ensure enum signedness
is added to type_die.

2024-01-22  John David Anglin  

gcc/testsuite/ChangeLog:

PR debug/113382
* gcc.dg/debug/btf/btf-bitfields-3.c: Add -gno-strict-dwarf
option to dg-options.
* gcc.dg/debug/btf/btf-enum-1.c: Likewise.
* gcc.dg/debug/btf/btf-enum-small.c: Likewise.
* gcc.dg/debug/btf/btf-enum64-1.c: Likewise.

[Bug debug/113382] FAIL: gcc.dg/debug/btf/btf-bitfields-3.c scan-assembler-times [\t ]0x6000004[\t ]+[^\n]*btt_info 1

2024-01-22 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113382

John David Anglin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from John David Anglin  ---
Fixed on trunk.

[Bug c++/113547] New: c++: In function ‘std::vector package_b_info()’: cc1plus: internal compiler error: Segmentation fault

2024-01-22 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113547

Bug ID: 113547
   Summary: c++: In function ‘std::vector package_b_info()’:
cc1plus: internal compiler error: Segmentation fault
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: csfore at posteo dot net
  Target Milestone: ---

Created attachment 57184
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57184&action=edit
original preprocessed file

Originally reported downstream in Gentoo at https://bugs.gentoo.org/920322 when
building package =dev-util/build2-0.14.0

Command line required to trigger for the original:

x86_64-pc-linux-gnu-g++ -std=c++20 -c -fdirectives-only manifest-utility.o.ii

Command line required for the minimized version:

x86_64-pc-linux-gnu-g++ manifest-utility.o.ii


$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/13/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-13.2.1_p20240113-r1/work/gcc-13-20240113/configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/13
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/13/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/13
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/13/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/13/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/13/include/g++-v13
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/13/python
--enable-languages=c,c++,go,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --enable-nls --without-included-gettext
--disable-libunwind-exceptions --enable-checking=release
--with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo
13.2.1_p20240113-r1 p12' --with-gcc-major-version-only --enable-libstdcxx-time
--enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix
--enable-__cxa_atexit --enable-clocale=gnu --enable-multilib
--with-multilib-list=m32,m64 --disable-fixed-point --enable-targets=all
--enable-libgomp --disable-libssp --disable-libada --disable-cet
--disable-systemtap --disable-valgrind-annotations --disable-vtable-verify
--disable-libvtv --with-zstd --without-isl --enable-default-pie
--enable-default-ssp
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.1 20240113 (Gentoo 13.2.1_p20240113-r1 p12)

[Bug c++/113547] c++: In function ‘std::vector package_b_info()’: cc1plus: internal compiler error: Segmentation fault

2024-01-22 Thread csfore at posteo dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113547

--- Comment #1 from Christopher Fore  ---
Created attachment 57185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57185&action=edit
minimized file with cvise

[Bug c++/113547] [13 Regression] c++: In function ‘std::vector package_b_info()’: cc1plus: internal compiler error: Segmentation fault

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113547

Andrew Pinski  changed:

           What    |Removed                     |Added

   Target Milestone|---                         |13.3
            Summary|c++: In function            |[13 Regression] c++: In
                   |‘std::vector                |function ‘std::vector
                   |package_b_info()’: cc1plus: |package_b_info()’: cc1plus:
                   |internal compiler error:    |internal compiler error:
                   |Segmentation fault          |Segmentation fault

[Bug c++/113547] [13 Regression] c++: In function ‘std::vector package_b_info()’: cc1plus: internal compiler error: Segmentation fault

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113547

Andrew Pinski  changed:

           What    |Removed                     |Added

           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113347

--- Comment #2 from Andrew Pinski  ---
Most likely a dup of bug 113347.

[Bug libgomp/113513] [OpenMP] libgomp: cuCtxGetDevice error with OMP_DISPLAY_ENV=true OMP_TARGET_OFFLOAD="mandatory" for libgomp.c/target-52.c

2024-01-22 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113513

--- Comment #2 from Tobias Burnus  ---
Patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643648.html

[Bug c++/113541] Rejects __attribute__((section)) on explicit instantiation declaration of ctor/dtor

2024-01-22 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113541

Marek Polacek  changed:

           What    |Removed                     |Added

     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |mpolacek at gcc dot gnu.org
   Last reconfirmed|                            |2024-01-22

--- Comment #1 from Marek Polacek  ---
The error started with r5-1210-ge257a17cb9cc4d.

[Bug tree-optimization/113548] New: ICE vect-ifcvt-19 in build2, at tree.cc:5097

2024-01-22 Thread nightstrike at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113548

            Bug ID: 113548
           Summary: ICE vect-ifcvt-19 in build2, at tree.cc:5097
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nightstrike at gmail dot com
  Target Milestone: ---

Created attachment 57186
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57186&action=edit
Preprocessed source from -freport-bug

ICE during testsuite run for vect-ifcvt-19.  There are many similarly titled
bugs, and I think the current title of this one is equally unhelpful, so feel
free to change this PR title to something better.

This is with a 64-bit Linux to 64-bit Windows cross compiler; it fails on
GCC 11, 12, 13, and 14.  I didn't test prior versions.

Backtrace:

0x8bb855 build2(tree_code, tree_node*, tree_node*, tree_node*)
        ../../gcc/tree.cc:5097
0xa4dd1f build2_loc(unsigned int, tree_code, tree_node*, tree_node*, tree_node*)
        ../../gcc/tree.h:4750
0xa4dd1f c_parser_gimple_binary_expression
        ../../gcc/c/gimple-parser.cc:1068
0xa4ec71 c_parser_gimple_statement
        ../../gcc/c/gimple-parser.cc:878
0xa4f95a c_parser_gimple_compound_statement
        ../../gcc/c/gimple-parser.cc:669
0xa51a58 c_parser_parse_gimple_body(c_parser*, char*, c_declspec_il, profile_count)
        ../../gcc/c/gimple-parser.cc:253
0xa3d3f4 c_parser_declaration_or_fndef
        ../../gcc/c/c-parser.cc:3011
0xa4764b c_parser_external_declaration
        ../../gcc/c/c-parser.cc:2046
0xa48035 c_parser_translation_unit
        ../../gcc/c/c-parser.cc:1900
0xa48035 c_parse_file()
        ../../gcc/c/c-parser.cc:26815
0xabf271 c_common_parse_file()
        ../../gcc/c-family/c-opts.cc:1301

[Bug testsuite/113548] gcc.dg/vect/vect-ifcvt-19.c ICEs on LLP64 target

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113548

Andrew Pinski  changed:

           What    |Removed                     |Added

          Component|tree-optimization           |testsuite
           Keywords|                            |testsuite-fail
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108954
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
            Summary|ICE vect-ifcvt-19 in        |gcc.dg/vect/vect-ifcvt-19.c
                   |build2, at tree.cc:5097     |ICEs on LLP64 target
             Target|x86_64-w64-mingw32          |*-*-mingw
   Last reconfirmed|                            |2024-01-22

--- Comment #1 from Andrew Pinski  ---
Note this bug is just for the testcase issue rather than the ICE; the ICE
is PR 108954.

We should change the type of _2 and _1 from `unsigned long` to __SIZE_TYPE__,
as size_t on mingw (and some other targets) is NOT the same size as
`unsigned long`.
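
To illustrate the mismatch, here is a minimal sketch (not the actual testcase
change). It assumes a GCC-compatible compiler that predefines __SIZE_TYPE__;
the helper names are made up for illustration. On an LP64 target such as
x86_64 Linux both widths are 64 bits, while on an LLP64 target such as
x86_64-w64-mingw32 `unsigned long` is 32 bits and the size type is 64 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* __SIZE_TYPE__ is predefined by GCC-compatible compilers as the type
   behind size_t, so the two must always have the same width. */
static_assert(sizeof(__SIZE_TYPE__) == sizeof(size_t),
              "__SIZE_TYPE__ must match size_t");

/* Width in bits of `unsigned long` on the current target. */
unsigned ulong_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Width in bits of the compiler's size type on the current target. */
unsigned size_type_bits(void)
{
    return (unsigned)(sizeof(__SIZE_TYPE__) * CHAR_BIT);
}

/* Hypothetical helper: returns 1 when code that spells the size type as
   `unsigned long` would use the wrong width on this target, else 0. */
int unsigned_long_mismatches_size_type(void)
{
    return ulong_bits() != size_type_bits();
}
```

Spelling the type as __SIZE_TYPE__ keeps the declaration correct on both data
models without pulling in any header.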

[Bug target/109929] profiledbootstrap failure on aarch64-linux-gnu with graphite optimization

2024-01-22 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109929

--- Comment #6 from Xi Ruoyao  ---
The first commit deferring the failure to stagefeedback is:

commit 575858508090b18dcbc176db285c9f55227ca4c0
Author: Richard Sandiford 
Date:   Tue Oct 17 23:46:33 2023 +0100

aarch64: Use vecs to store register save order

[Bug testsuite/113548] gcc.dg/vect/vect-ifcvt-19.c ICEs on LLP64 target

2024-01-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113548

Andrew Pinski  changed:

           What    |Removed                     |Added

             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
I will fix this testcase today or tomorrow.  It should not be hard.

[Bug c++/109642] False Positive -Wdangling-reference with std::span-like classes

2024-01-22 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109642

--- Comment #14 from Marek Polacek  ---
(In reply to Miro Palmu from comment #11)
> I'm not sure if this is useful information, but using span with a view in a
> range-based for loop triggers a false positive -Wdangling-reference on gcc
> 14.0.1 20240117 but not on gcc 13.2.
> 
> // On godbolt: https://godbolt.org/z/x9jKh4MoW
> #include <ranges>
> #include <span>
> #include <vector>
> 
> int main() {
>     const auto vec = std::vector{ 1, 2, 3 };
>     const auto s = std::span{vec};
> 
>     // -Wdangling-reference on gcc 14.0.1 20240117 but not on gcc 13.2
>     for ([[maybe_unused]] auto _ : s | std::views::take(2)) { }
> 
>     // No warning
>     for ([[maybe_unused]] auto _ : vec | std::views::take(2)) { }
> 
>     // No warning
>     const auto s_view = s | std::views::take(2);
>     for ([[maybe_unused]] auto _ : s_view) { }
> }

This should be fixed now.  I'm going to expand Wdangling-reference17.C with
this test though.  Thanks.

[Bug target/109929] profiledbootstrap failure on aarch64-linux-gnu with graphite optimization

2024-01-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109929

--- Comment #7 from Richard Sandiford  ---
Hmm, yeah, like you say, neither of those commits should have made a difference
to whether bootstrap works.  I guess the problem is just latent now.

[Bug target/113549] New: float simd crash on windows in gcc.dg/vect/vect-simd-clone-16b.c

2024-01-22 Thread nightstrike at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113549

            Bug ID: 113549
           Summary: float simd crash on windows in
                    gcc.dg/vect/vect-simd-clone-16b.c
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nightstrike at gmail dot com
  Target Milestone: ---

Created attachment 57187
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57187&action=edit
Assembly output

The vect-simd-clone-16b.c test runs the vect-simd-clone-16.c test with the TYPE
set to float.  The default type is int, which works fine.  Reducing that
testcase yields the following:


```
#define TYPE float
#pragma omp declare simd inbranch
TYPE __attribute__((noinline))
foo (TYPE a)
{
  return a + 1;
}

void
masked_fixed (TYPE * a, TYPE * b)
{
  #pragma omp simd
  for (int i = 0; i < 128; i++)
    b[i] = a[i]<1 ? foo(a[i]) : a[i];
}

int main() {
  TYPE a[1024] = {0};
  TYPE b[1024] = {0};
  masked_fixed(a, b);
  return 0;
}
```

The noipa attribute and the __restrict keywords were removed from masked_fixed.
 noinline is required on foo.


Minimal set of compile arguments required to trigger the problem:
$ x86_64-w64-mingw32-gcc  a.c -fopenmp-simd -O2 -mavx

Note that dropping to -O1 or removing -mavx avoids the crash.
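
For comparison, the values the reduced loop is expected to produce can be
computed with a plain scalar version. This is only a sketch and not part of
the testsuite: `foo_ref`, `masked_fixed_ref` and `first_mismatch` are
hypothetical names, and the code assumes TYPE is float as in the 16b variant.

```
#include <assert.h>
#include <stddef.h>

/* Scalar stand-in for foo() from the reduced testcase, written without
   the `declare simd` pragma. */
static float foo_ref(float a)
{
    return a + 1;
}

/* Hypothetical reference loop: same selection as masked_fixed(), written
   without `#pragma omp simd`.  Elements with a[i] >= 1 copy the input. */
void masked_fixed_ref(const float *a, float *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        b[i] = a[i] < 1 ? foo_ref(a[i]) : a[i];
}

/* Hypothetical checker: returns the index of the first element of `got`
   that differs from the scalar reference, or n if all n elements match. */
size_t first_mismatch(const float *a, const float *got, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float want = a[i] < 1 ? foo_ref(a[i]) : a[i];
        if (got[i] != want)
            return i;
    }
    return n;
}
```

With the all-zero input from the reduced main(), the first 128 elements of b
should all become 1, so a build that runs to completion could be checked by
passing its output to `first_mismatch`.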

Assembly from -save-temps -fverbose-asm attached.

This is technically running under wine 8.0.  This is the backtrace provided by
wine:

```
wine: Unhandled page fault on read access to  at address
00014000163F (thread 0024), starting debugger...
Unhandled exception: page fault on read access to 0x in 64-bit
code (0x014000163f).
Register dump:
 rip:00014000163f rsp:0021dc50 rbp:0021dcd0 eflags:00010246
(  R- --  I  Z- -P- )
 rax: rbx: rcx:0021dcf0
rdx:0021dcd0
 rsi:0021ed70 rdi:0021dd70  r8:0021dcb0 
r9:00c92000 r10:00c90330
 r11: r12:0021dcb0 r13:0021dcf0
r14: r15:
Stack dump:
0x21dc50:   
0x21dc60:   
0x21dc70:   
0x21dc80:   
0x21dc90:   
0x21dca0:   
0x21dcb0:   
0x21dcc0:   
0x21dcd0:   
0x21dce0:   
0x21dcf0:   
0x21dd00:   
Backtrace:
=>0 0x014000163f in a (+0x163f) (0x21dcd0)
  1 0x0140003384 in a (+0x3384) (0x21fdf0)
  2 0x0140001340 in a (+0x1340) (0x21fdf0)
  3 0x0140001146 in a (+0x1146) (0x21fe30)
  4 0x007b647b51 BaseThreadInitThunk+0x11(unknown=,
entry=, arg=)
[H:\home\user\p\gcc\src\wine-8.0-rc4p2p3\dlls\kernel32\thread.c:61] in kernel32
(0x0
00021fe60)
0x014000163f a+0x163f: ldsl %esp,%edi
```

[Bug target/113549] float simd crash on windows in gcc.dg/vect/vect-simd-clone-16b.c

2024-01-22 Thread nightstrike at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113549

--- Comment #1 from nightstrike  ---
Created attachment 57188
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57188&action=edit
Failing source for easier copying
