[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #8 from Sergei Trofimovich  ---
Looks like it's mainly -O0.

Why not use at least -O1 for bootstrap? Perhaps -O0 was a safe default to
work around host compiler bugs in the C days.

But nowadays gcc uses -std=c++11 with quite a few abstractions that are not
removed at -O0. Maybe a -O1 (or even -O2) default that can be disabled would
be better?

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #7 from Andrew Pinski  ---
I am not sure there is much to be done here really, since the issue is that
profiledbootstrap uses -O0 for stage1 to make sure we don't run into bugs
in the host compiler (though we still run into issues there).

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #6 from Sergei Trofimovich  ---
And here, for completeness, is the default checking with CC='gcc -g -O2'
CXX='g++ -g -O2':

$ ~/dev/git/gcc/configure --disable-multilib --enable-languages=c,c++ 'CC=gcc
-g -O2' 'CXX=g++ -g -O2'

$ /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/ -v
Reading specs from /tmp/gb/./prev-gcc/specs
COLLECT_GCC=/tmp/gb/./prev-gcc/xg++
COLLECT_LTO_WRAPPER=/tmp/gb/./prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++ CC='gcc -g -O2' CXX='g++ -g -O2'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Result is 1m57s:

$ time /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g
-O2 -fno-checking -gtoggle -fprofile-generate -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/slyfox/dev/git/gcc/gcc -I/home/slyfox/dev/git/gcc/gcc/.
-I/home/slyfox/dev/git/gcc/gcc/../include
-I/home/slyfox/dev/git/gcc/gcc/../libcpp/include
-I/home/slyfox/dev/git/gcc/gcc/../libcody
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libbacktrace -o insn-recog.o -MT insn-recog.o
-MMD -MP -MF ./.deps/insn-recog.TPo insn-recog.cc

real    1m57,549s
user    1m56,617s
sys     0m0,780s

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #5 from Sergei Trofimovich  ---
(In reply to Andrew Pinski from comment #3)
> Note prev-gcc/cc1plus is compiled at -O0 also which definitely makes things
> worse here.

Also tried with: '--enable-checking=release -O2 -g' as:

$ ~/dev/git/gcc/configure --disable-multilib --enable-languages=c,c++
--enable-checking=release 'CC=gcc -g -O2' 'CXX=g++ -g -O2'

$ /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/ -v
Reading specs from /tmp/gb/./prev-gcc/specs
COLLECT_GCC=/tmp/gb/./prev-gcc/xg++
COLLECT_LTO_WRAPPER=/tmp/gb/./prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++ --enable-checking=release CC='gcc -g -O2' CXX='g++ -g
-O2'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Result is a lot better: 1m55s:

$ time /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g
-O2 -fno-checking -gtoggle -fprofile-generate -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/slyfox/dev/git/gcc/gcc -I/home/slyfox/dev/git/gcc/gcc/.
-I/home/slyfox/dev/git/gcc/gcc/../include
-I/home/slyfox/dev/git/gcc/gcc/../libcpp/include
-I/home/slyfox/dev/git/gcc/gcc/../libcody
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libbacktrace -o insn-recog.o -MT insn-recog.o
-MMD -MP -MF ./.deps/insn-recog.TPo insn-recog.cc

real    1m55,334s
user    1m54,146s
sys     0m0,993s

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #4 from Sergei Trofimovich  ---
(In reply to Andrew Pinski from comment #2)
> Can you also try with --enable-checking=release to double check that it is
> not the extra compile time checks which is causing issues ...

Added --enable-checking=release:

$ /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/ -v
Reading specs from /tmp/gb/./prev-gcc/specs
COLLECT_GCC=/tmp/gb/./prev-gcc/xg++
COLLECT_LTO_WRAPPER=/tmp/gb/./prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++ --enable-checking=release
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

Result did not change much:

$ time /tmp/gb/./prev-gcc/xg++ -B/tmp/gb/./prev-gcc/
-B/usr/local/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g
-O2 -fno-checking -gtoggle -fprofile-generate -DIN_GCC -fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -Werror -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/slyfox/dev/git/gcc/gcc -I/home/slyfox/dev/git/gcc/gcc/.
-I/home/slyfox/dev/git/gcc/gcc/../include
-I/home/slyfox/dev/git/gcc/gcc/../libcpp/include
-I/home/slyfox/dev/git/gcc/gcc/../libcody
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I../libdecnumber
-I/home/slyfox/dev/git/gcc/gcc/../libbacktrace -o insn-recog.o -MT insn-recog.o
-MMD -MP -MF ./.deps/insn-recog.TPo insn-recog.cc

real    12m18,994s
user    12m17,085s
sys     0m1,001s
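
For scale: the 12m18,994s measured here against the 1m55,334s from comment #5 (same --enable-checking=release, but with the host compiler at -g -O2) is roughly a 6.4x difference. A minimal sketch of that arithmetic follows. It assumes the `time` output format shown in this thread (minutes, `m`, seconds, a decimal comma, exactly three millisecond digits), and `parse_elapsed` and `slowdown` are hypothetical helper names, not part of any tool mentioned here.

```c
#include <stdio.h>

/* Hypothetical helper: converts text such as "12m18,994s" to seconds.
   Assumes exactly three digits after the decimal comma.
   Returns a negative value if the text does not match that shape. */
double parse_elapsed(const char *text) {
    int minutes, seconds, millis;
    if (sscanf(text, "%dm%d,%d", &minutes, &seconds, &millis) != 3)
        return -1.0;
    return minutes * 60 + seconds + millis / 1000.0;
}

/* Ratio of two elapsed times, slow over fast. */
double slowdown(const char *slow, const char *fast) {
    return parse_elapsed(slow) / parse_elapsed(fast);
}
```

Calling `slowdown("12m18,994s", "1m55,334s")` gives the ratio between the two `real` times quoted above.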

[Bug target/111533] [14 Regression] ICE: RTL check: expected code 'reg', have 'const_int' in rhs_regno, at rtl.h:1934

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111533

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Li Xu :

https://gcc.gnu.org/g:110ffb2d8d3a64b32dd56ac995c2e30e8f64d4dc

commit r14-4301-g110ffb2d8d3a64b32dd56ac995c2e30e8f64d4dc
Author: xuli 
Date:   Thu Sep 28 01:29:12 2023 +

RISC-V: Bugfix for RTL check[PR111533]

Consider the following situation:
BB5: local_dem(RVV Insn 1, AVL(reg zero))
RVV Insn 1: vmv.s.x, AVL (const_int 1)
RVV Insn 2: vredsum.vs, AVL(reg zero)

vmv.s.x has a vl operand, so the following code will get
avl (const_int) from RVV Insn 1.
rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn->rtl ())
   : dem.get_avl ();

If REGNO is then used on the const_int, the compiler will crash:

during RTL pass: vsetvl
res_debug.c: In function '__dn_count_labels':
res_debug.c:1050:1: internal compiler error: RTL check: expected code 'reg',
have 'const_int' in rhs_regno, at rtl.h:1934
 1050 | }
  | ^
0x8fb169 rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int,
char const*)
../.././gcc/gcc/rtl.cc:770
0x1399818 rhs_regno(rtx_def const*)
../.././gcc/gcc/rtl.h:1934
0x1399818 anticipatable_occurrence_p
../.././gcc/gcc/config/riscv/riscv-vsetvl.cc:348

So in this case avl should be obtained from dem.

Another issue is caused by the following code:
HOST_WIDE_INT diff = INTVAL (builder.elt (i)) - i;

during RTL pass: expand
../../.././gcc/libgfortran/generated/matmul_c4.c: In function 'matmul_c4':
../../.././gcc/libgfortran/generated/matmul_c4.c:2906:39: internal compiler
error: RTL check: expected code 'const_int', have 'const_poly_int' in
expand_const_vector, at config/riscv/riscv-v.cc:1149

The builder.elt (i) can be either const_int or const_poly_int.

PR target/111533

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Fix bug.
* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix
bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111533-1.c: New test.
* gcc.target/riscv/rvv/base/pr111533-2.c: New test.
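
The first crash described in the commit message is an accessor being applied to a value of the wrong kind. The sketch below shows the general pattern of checking the tag before reading the payload. It is a hypothetical miniature, not GCC's `rtx` or the actual fix: `enum code`, `struct value` and `try_get_regno` are names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical miniature of a tagged value; this is not GCC's rtx. */
enum code { CODE_REG, CODE_CONST_INT };

struct value {
    enum code code;
    union {
        unsigned regno;   /* meaningful only when code == CODE_REG */
        long     intval;  /* meaningful only when code == CODE_CONST_INT */
    } u;
};

/* Reads the register number only after checking the tag.
   Returns false and leaves *regno untouched for non-register values,
   instead of failing the way an unchecked accessor would. */
bool try_get_regno(const struct value *v, unsigned *regno) {
    if (v->code != CODE_REG)
        return false;
    *regno = v->u.regno;
    return true;
}
```

A caller that may see either kind of value can branch on the return value and take a different source for the operand when it is not a register.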

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-09-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin  changed:

            What|Removed                      |Added

        Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
          Status|NEW                          |ASSIGNED

--- Comment #15 from Kewen Lin  ---
(In reply to Richard Biener from comment #14)
> (In reply to Kewen Lin from comment #13)
> > Thanks again for the reduced test case and the information!
> > 
> > I tried to bisect it but encountered some build failures on _Float32 error
> > etc., through grepping the log I switched to start from r13-2887 (good) to
> > r13-7206 (bad).
> > 
> > The bisection shows the culprit commit is r13-3378-gf6c168f8c06047 which was
> > backported to GCC-12, it seems to match the observation new gcc-12 fail
> > while gcc-11 pass.
> 
> Note this change likely triggers a latent issue but it might help analyzing
> the issue.

Thanks for the hint! Yeah, I tried -fdisable-tree-esra and -fdisable-tree-sra,
and the failure is still there, though I had supposed that commit only takes
effect when SRA is enabled. I'll continue to investigate it. By the way, I'm
just starting a two-week vacation, so I may respond slowly.

[Bug target/111466] RISC-V: redundant sign extensions despite ABI guarantees

2023-09-27 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111466

--- Comment #1 from Vineet Gupta  ---
So there are various aspects to tackling this issue.

#1. REE reports failure as "missing definition(s)".

This is because function args don't have an explicit def; they are just there.

Cannot eliminate extension:
(insn 12 6 13 2 (set (reg:DI 16 a6 [orig:138 n.1_15 ] [138])
(sign_extend:DI (reg:SI 11 a1 [orig:141 n ] [141])))  {extendsidi2}
 (nil))
 because of missing definition(s)

#2. At expand time there's an explicit sign_extend for the incoming function
arg, which is not needed per the RISC-V ABI. Not generating these to begin with
would require fewer fixups in REE and/or CSE.

(insn 3 2 4 2 (set (reg/v:DI 141 [ n ])
(reg:DI 11 a1 [ n ]))

(insn 12 6 13 2 (set (reg:DI 138 [ n.1_15 ])
(sign_extend:DI (subreg/u:SI (reg/v:DI 141 [ n ]) 0)))

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #3 from Andrew Pinski  ---
Note prev-gcc/cc1plus is compiled at -O0 also which definitely makes things
worse here.

[Bug rtl-optimization/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #2 from Andrew Pinski  ---
Can you also try with --enable-checking=release to double check that it is not
the extra compile time checks which is causing issues ...

[Bug target/111619] [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

--- Comment #1 from Sergei Trofimovich  ---
-ftime-report breakdown:

time /tmp/gb/./prev-gcc/cc1plus -quiet -nostdinc++ -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include -I
/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++ -I . -I . -I
/home/slyfox/dev/git/gcc/gcc -I /home/slyfox/dev/git/gcc/gcc/. -I
/home/slyfox/dev/git/gcc/gcc/../include -I
/home/slyfox/dev/git/gcc/gcc/../libcpp/include -I
/home/slyfox/dev/git/gcc/gcc/../libcody -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I ../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libbacktrace -iprefix
/tmp/gb/prev-gcc/../lib/gcc/x86_64-pc-linux-gnu/14.0.0/ -isystem
/tmp/gb/./prev-gcc/include -isystem /tmp/gb/./prev-gcc/include-fixed -MMD
insn-recog.d -MF ./.deps/insn-recog.TPo -MP -MT insn-recog.o -D_GNU_SOURCE -D
IN_GCC -D HAVE_CONFIG_H insn-recog.cc -quiet -dumpbase insn-recog.cc
-dumpbase-ext .cc -mtune=generic -march=x86-64 -g -gtoggle -O2 -Wextra -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wsuggest-attribute=format
-Wconditionally-supported -Woverloaded-virtual=2 -Wpedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-checking
-fprofile-generate -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-fno-common -fno-PIE -o /run/user/1000/ccQK54tL.s -ftime-report

Time variable                     usr           sys          wall            GGC
 phase setup                   :   0.00 (  0%)  0.00 (  0%)    0.01 (  0%)  1892k (  0%)
 phase parsing                 :  22.49 (  3%)  1.58 ( 35%)   24.09 (  3%)   903M ( 22%)
 phase lang. deferred          :   0.06 (  0%)  0.01 (  0%)    0.07 (  0%)  2268k (  0%)
 phase opt and generate        : 791.23 ( 97%)  2.90 ( 65%)  794.84 ( 97%)  3111M ( 77%)
 |name lookup                  :   1.20 (  0%)  0.09 (  2%)    1.23 (  0%)  3296k (  0%)
 |overload resolution          :   3.40 (  0%)  0.18 (  4%)    3.69 (  0%)   107M (  3%)
 garbage collection            :   5.82 (  1%)  0.08 (  2%)    5.86 (  1%)      0 (  0%)
 dump files                    :   0.24 (  0%)  0.00 (  0%)    0.15 (  0%)      0 (  0%)
 callgraph construction        :   4.41 (  1%)  0.14 (  3%)    4.74 (  1%)   329M (  8%)
 callgraph optimization        :   1.01 (  0%)  0.03 (  1%)    1.02 (  0%)  2938k (  0%)
 callgraph functions expansion : 734.71 ( 90%)  2.08 ( 46%)  737.44 ( 90%)  2238M ( 56%)
 callgraph ipa passes          :  50.35 (  6%)  0.71 ( 16%)   51.10 (  6%)   437M ( 11%)
 ipa function summary          :   1.89 (  0%)  0.00 (  0%)    1.90 (  0%)  5969k (  0%)
 ipa dead code removal         :   0.22 (  0%)  0.00 (  0%)    0.22 (  0%)      0 (  0%)
 ipa devirtualization          :   0.01 (  0%)  0.00 (  0%)    0.00 (  0%)      0 (  0%)
 ipa cp                        :   0.55 (  0%)  0.00 (  0%)    0.56 (  0%)  3831k (  0%)
 ipa inlining heuristics       :   0.57 (  0%)  0.03 (  1%)    0.46 (  0%)    20M (  1%)
 ipa comdats                   :   0.01 (  0%)  0.00 (  0%)    0.01 (  0%)      0 (  0%)
 ipa reference                 :   0.03 (  0%)  0.00 (  0%)    0.03 (  0%)      0 (  0%)
 ipa profile                   :   5.98 (  1%)  0.07 (  2%)    6.11 (  1%)   108M (  3%)
 ipa pure const                :   0.57 (  0%)  0.01 (  0%)    0.55 (  0%)   1080 (  0%)
 ipa icf                       :   1.37 (  0%)  0.00 (  0%)    1.37 (  0%)    45k (  0%)
 ipa SRA                       :   4.22 (  1%)  0.01 (  0%)    4.27 (  1%)  6213k (  0%)
 ipa free lang data            :   0.01 (  0%)  0.00 (  0%)    0.00 (  0%)      0 (  0%)
 ipa free inline summary       :   0.01 (  0%)  0.00 (  0%)    0.01 (  0%)      0 (  0%)
 ipa modref                    :   1.33 (  0%)  0.00 (  0%)    1.33 (  0%)  1893k (  0%)
 cfg construction              :   0.19 (  0%)  0.00 (  0%)    0.13 (  0%)    12M (  0%)
 cfg cleanup                   :   3.35 (  0%)  0.00 (  0%)    3.71 (  0%)  9974k (  0%)
 trivially dead code           :   0.90 (  0%)  0.01 (  0%)    0.77 (  0%)      0 (  0%)
 df scan insns                 :   1.45 (  0%)  0.00 (  0%)    1.39 (  0%)    95k (  0%)
 df reaching defs              :   1.79 (  0%)  0.00 (  0%)    1.83 (  0%)      0 (  0%)
 df live regs                  :   6.03 (  1%)  0.01 (  0%)    5.78 (  1%)      0 (  0%)
 df live regs                  :   2.55 (  0%)  0.00 (  0%)    2.49 (  0%)      0 (  0%)
 df must-initialized regs      :   0.19 (  0%)  0.00 (  0%)    0.20 (  0%)      0 (  0%)
 df use-def / def-use chains   :   1.13 (  0%)  0.00 (  0%)    1.05 (  0%)      0 (  0%)
 df reg dead/unused notes      :   2.89 (  0%)  0.01 (  0%)    2.79 (  0%)    34M (  1%)
 register information          :   0.45 (  0%)  0.00 (  0%)

[Bug fortran/111618] ICE in associate construction

2023-09-27 Thread antoine.lemoine--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111618

--- Comment #1 from Antoine Lemoine  ---
Error message using gfortran 13.2 on Compiler Explorer:

f951: internal compiler error: Segmentation fault
0x1bec57e internal_error(char const*, ...)
???:0
0x7d4c85 gfc_expression_rank(gfc_expr*)
???:0
0x7d4e62 gfc_op_rank_conformable(gfc_expr*, gfc_expr*)
???:0
0x79a447 gfc_match_expr(gfc_expr**)
???:0
0x790f48 gfc_match(char const*, ...)
???:0
0x792f01 gfc_match_assignment()
???:0
0x7c7795 gfc_parse_file()
???:0

Looks like a duplicate of pr109948.

[Bug target/111619] New: [14 regression] 'make profiledbootstrap' makes 10+ minutes on insn-recog.cc (x86_64-linux)

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111619

          Bug ID: 111619
         Summary: [14 regression] 'make profiledbootstrap' makes 10+
                  minutes on insn-recog.cc (x86_64-linux)
         Product: gcc
         Version: 14.0
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: target
        Assignee: unassigned at gcc dot gnu.org
        Reporter: slyfox at gcc dot gnu.org
Target Milestone: ---

The reproducer on gcc from r14-4300-g1fab05a885a308:

$ ~/dev/git/gcc/configure --disable-multilib --enable-languages=c,c++
$ make profiledbootstrap

insn-recog.o takes ~13 min to build on `AMD Ryzen 9 5950X` CPU:

$ time /tmp/gb/./prev-gcc/cc1plus -quiet -nostdinc++ -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I
/tmp/gb/prev-x86_64-pc-linux-gnu/libstdc++-v3/include -I
/home/slyfox/dev/git/gcc/libstdc++-v3/libsupc++ -I . -I . -I
/home/slyfox/dev/git/gcc/gcc -I /home/slyfox/dev/git/gcc/gcc/. -I
/home/slyfox/dev/git/gcc/gcc/../include -I
/home/slyfox/dev/git/gcc/gcc/../libcpp/include -I
/home/slyfox/dev/git/gcc/gcc/../libcody -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libdecnumber/bid -I ../libdecnumber -I
/home/slyfox/dev/git/gcc/gcc/../libbacktrace -iprefix
/tmp/gb/prev-gcc/../lib/gcc/x86_64-pc-linux-gnu/14.0.0/ -isystem
/tmp/gb/./prev-gcc/include -isystem /tmp/gb/./prev-gcc/include-fixed -MMD
insn-recog.d -MF ./.deps/insn-recog.TPo -MP -MT insn-recog.o -D_GNU_SOURCE -D
IN_GCC -D HAVE_CONFIG_H insn-recog.cc -quiet -dumpbase insn-recog.cc
-dumpbase-ext .cc -mtune=generic -march=x86-64 -g -gtoggle -O2 -Wextra -Wall
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wsuggest-attribute=format
-Wconditionally-supported -Woverloaded-virtual=2 -Wpedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -fno-checking
-fprofile-generate -fno-exceptions -fno-rtti -fasynchronous-unwind-tables
-fno-common -fno-PIE -o /run/user/1000/ccQK54tL.s

real    13m39,864s
user    13m38,263s
sys     0m0,823s

`insn-recog.cc` is 8.3MB.

$ ./prev-gcc/xgcc -Bprev-gcc -v
Reading specs from prev-gcc/specs
COLLECT_GCC=./prev-gcc/xgcc
COLLECT_LTO_WRAPPER=prev-gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/slyfox/dev/git/gcc/configure --disable-multilib
--enable-languages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230926 (experimental) (GCC)

[Bug fortran/111618] New: ICE in associate construction

2023-09-27 Thread antoine.lemoine--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111618

          Bug ID: 111618
         Summary: ICE in associate construction
         Product: gcc
         Version: 13.2.1
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: fortran
        Assignee: unassigned at gcc dot gnu.org
        Reporter: antoine.lemo...@bordeaux-inp.fr
Target Milestone: ---

An ICE occurs with this code:

program prog
   implicit none
   type foo
  double precision, dimension(3) :: long_a
  double precision, dimension(3) :: long_b
   end type
   type(foo) :: the_foo
   double precision :: d
   associate(a => the_foo%long_a, b => the_foo%long_b)
  a = 2d0
  b = 1d0
  d = hypot(b(1), b(2)) ! No ICE without this line.
  b = a - b
   end associate
end program

No ICE when writing 'b = a - b(:)' or commenting out 'd = hypot(b(1), b(2))'.

[Bug fortran/67740] Wrong association status of allocatable character pointer in derived types

2023-09-27 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67740

--- Comment #7 from anlauf at gcc dot gnu.org ---
The following snippet in gfc_trans_pointer_assignment looks suspicious:

  if (expr1->ts.type == BT_CHARACTER
  && expr1->symtree->n.sym->ts.deferred
  && expr1->symtree->n.sym->ts.u.cl->backend_decl
  && VAR_P (expr1->symtree->n.sym->ts.u.cl->backend_decl))
{
  tmp = expr1->symtree->n.sym->ts.u.cl->backend_decl;
  if (expr2->expr_type != EXPR_NULL)
gfc_add_modify (, tmp,
fold_convert (TREE_TYPE (tmp), strlen_rhs));
  else
gfc_add_modify (, tmp, build_zero_cst (TREE_TYPE (tmp)));
}

I wonder whether it should read:

  if (expr1->ts.type == BT_CHARACTER
  && expr1->ts.deferred
...

Furthermore, expr1->ts.u.cl->backend_decl appears not set properly,
and I fail to see why.

[Bug libgcc/109685] [13/14 Regression] Memory leak in `__deregister_frame`

2023-09-27 Thread markus.boeck02 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109685

Markus Böck  changed:

            What|Removed    |Added

          Status|UNCONFIRMED|RESOLVED
      Resolution|---        |FIXED

--- Comment #7 from Markus Böck  ---
Fixed

[Bug target/111617] Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Andrew Pinski  changed:

            What|Removed    |Added

        See Also|           |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92821,
                |           |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46942
          Status|UNCONFIRMED|RESOLVED
      Resolution|---        |INVALID

--- Comment #3 from Andrew Pinski  ---
Also see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92821#c2. clang/LLVM is
still not following the ABI after years of reporting to them that they are wrong.

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread markus at oberhumer dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #6 from Markus F.X.J. Oberhumer  ---
@Andrew Pinski Many thanks for cleaning up the bug case!

cvise (https://github.com/marxin/cvise) did correctly reduce the original from
~5 lines to 18 lines, but the result looked extremely strange...

[Bug target/111617] Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Andrew Pinski  changed:

            What|Removed|Added

        See Also|       |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98425

--- Comment #2 from Andrew Pinski  ---
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98425#c3 also.

[Bug target/111617] Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

Andrew Pinski  changed:

            What|Removed|Added

          Target|       |x86_64-linux-gnu
        Keywords|       |ABI
       Component|c      |target

--- Comment #1 from Andrew Pinski  ---
GCC and clang disagree on the ABI, and even on what the ABI says. GCC assumes
the upper bits are not zero/sign extended, while clang assumes they are.
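
At the language level both operands of `x == y` are promoted to `int` before the comparison, so the signed operand is sign-extended and the unsigned one zero-extended. The disagreement is only about whether the callee may assume the caller has already done that to the argument registers. A minimal sketch of the language semantics, with hypothetical function names:

```c
#include <stdbool.h>

/* Same shape as function `a` in the report: both operands are promoted
   to int before the comparison. */
bool eq_mixed(signed char x, unsigned char y) {
    return x == y;
}

/* Hypothetical helper that spells the promotions out explicitly. */
bool eq_mixed_explicit(signed char x, unsigned char y) {
    int xs = (int)x;   /* sign extension, the job movsx does */
    int yz = (int)y;   /* zero extension, the job movzx does */
    return xs == yz;
}
```

With these definitions, `eq_mixed(-1, 255)` compares -1 against 255 as `int` values, so the two arguments do not compare equal even though their low 8 bits match.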

[Bug c/111617] New: Unnecessary instructions generated when comparing mixed-sign small integers

2023-09-27 Thread davidfromonline at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111617

          Bug ID: 111617
         Summary: Unnecessary instructions generated when comparing
                  mixed-sign small integers
         Product: gcc
         Version: unknown
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: c
        Assignee: unassigned at gcc dot gnu.org
        Reporter: davidfromonline at gmail dot com
Target Milestone: ---

Compiling with `-std=c2x -O3`

```c
bool a(signed char x, unsigned char y) {
    return x == y;
}

bool b(short x, unsigned short y) {
    return x == y;
}

bool c(int x, unsigned y) {
    return x == y;
}

bool d(long x, unsigned long y) {
    return x == y;
}

bool e(long long x, unsigned long long y) {
    return x == y;
}
```

causes gcc to generate

```asm
a:
        movsx   edi, dil
        movzx   esi, sil
        cmp     edi, esi
        sete    al
        ret
b:
        movsx   edi, di
        movzx   esi, si
        cmp     edi, esi
        sete    al
        ret
c:
        cmp     edi, esi
        sete    al
        ret
d:
        cmp     rdi, rsi
        sete    al
        ret
e:
        cmp     rdi, rsi
        sete    al
        ret
```

The `movsx` and `movzx` seem unnecessary, and are not emitted by clang.

See it live: https://godbolt.org/z/dfc93f7Pv

[Bug tree-optimization/111614] [14 Regression] ICE at -O2: verify_gimple failed since r14-2282-gf703d2fd3f0

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111614

Andrew Pinski  changed:

            What|Removed    |Added

  Ever confirmed|0          |1
          Status|UNCONFIRMED|NEW
Last reconfirmed|           |2023-09-27

--- Comment #1 from Andrew Pinski  ---
Looks like a latent bug in reassoc:
```

  vector(2) unsigned int _11;
  intD.6 _6;
  intD.6 _8;

  vect__15.28_19 = VIEW_CONVERT_EXPR(_3);
  _4 = BIT_FIELD_REF ;
  _35 = BIT_FIELD_REF ;
  _30 = _35 & _4;
  _28 = _30 & d_lsm.16_31;
...
  _62 = VIEW_CONVERT_EXPR(vect_i_7.24_45);
  _11 = _62 + { 11, 11 };
  _6 = BIT_FIELD_REF <_11, 32, 0>;
  _8 = BIT_FIELD_REF <_11, 32, 32>;
  _21 = _8 & _6;
  _34 = _21 & _28;
```

basically BIT_FIELD_REF has a type of `int` but the inner vector type of _11 is
`vector unsigned int`.

The VCE was removed by the following match pattern:
```
(simplify
 (BIT_FIELD_REF (view_convert @0) @1 @2)
 (BIT_FIELD_REF @0 @1 @2))
```

Which is what you would expect, really.

Confirmed.

[Bug tree-optimization/111614] [14 Regression] ICE at -O2: verify_gimple failed since r14-2282-gf703d2fd3f0

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111614

Andrew Pinski  changed:

            What|Removed|Added

        Keywords|       |ice-checking, ice-on-valid-code
Target Milestone|---    |14.0

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #5 from Andrew Pinski  ---
Actually here is the full backtrace:
#2  0x03524383 in error_recursion (context=0x4d471a0
) at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:2265
#3  0x035217be in diagnostic_report_diagnostic (context=0x4d471a0
, diagnostic=0x7fff9e20) at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:1543
#4  0x03522102 in diagnostic_impl (richloc=0x7fff9f00,
metadata=0x0, opt=-1, gmsgid=0x39302e0 "explicit instantiation of %qD but no
definition available", ap=0x7fff9ee8, kind=DK_PERMERROR) at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:1770
#5  0x03523434 in permerror (location=2147483649, gmsgid=0x39302e0
"explicit instantiation of %qD but no definition available") at
/home/apinski/src/upstream-gcc/gcc/gcc/diagnostic.cc:2037
#6  0x010eb12c in instantiate_decl (d=,
defer_ok=false, expl_inst_class_mem_p=false) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:27362
#7  0x00ec2acd in maybe_instantiate_decl (decl=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:
#8  0x00ebee57 in decl_constant_var_p (decl=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:4569
#9  0x00ef4a05 in constant_value_1 (decl=, strict_p=true, return_aggregate_cst_ok_p=true, unshare_p=false) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2526
#10 0x00ef4e59 in decl_really_constant_value (decl=, unshare_p=false) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2617
#11 0x00dced8b in cxx_eval_constant_expression (ctx=0x7fffa7b0,
t=, lval=vc_prvalue,
non_constant_p=0x7fffa730, overflow_p=0x7fffa731, jump_target=0x0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7216
#12 0x00dd1261 in cxx_eval_constant_expression (ctx=0x7fffa7b0,
t=, lval=vc_prvalue, non_constant_p=0x7fffa730,
overflow_p=0x7fffa731, jump_target=0x0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7827
#13 0x00dd4f51 in cxx_eval_outermost_constant_expr (t=, allow_non_constant=true, strict=true,
manifestly_const_eval=mce_value::mce_true, constexpr_dtor=false, object=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8517
#14 0x00dd60be in maybe_constant_value (t=,
decl=, manifestly_const_eval=mce_value::mce_true) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8806
#15 0x00dd6736 in fold_non_dependent_expr (t=,
complain=0, manifestly_const_eval=true, object=) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8945
#16 0x011bf9c9 in check_narrowing (type=, init=, complain=0, const_only=true) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/typeck2.cc:993
#17 0x00d5a284 in convert_like_internal (convs=0x4dea410,
expr=, fn=, argnum=0,
issue_conversion_warnings=true, c_cast_p=false, nested_p=false, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8879
#18 0x00d5a5ca in convert_like (convs=0x4dea410, expr=, fn=, argnum=0, issue_conversion_warnings=true,
c_cast_p=false, nested_p=false, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8944
#19 0x00d5a63b in convert_like (convs=0x4dea410, expr=, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8959
#20 0x00d4a9ee in build_converted_constant_expr_internal
(type=, expr=,
flags=5, complain=0) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4808
#21 0x00d4aa85 in build_converted_constant_expr (type=, expr=, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4838
#22 0x01084794 in convert_nontype_argument (type=, expr=, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:7414
#23 0x01089897 in convert_template_argument (parm=, arg=, args=, complain=0, i=0, in_decl=) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:8713
#24 0x0108b976 in coerce_template_parms (parms=, args=, in_decl=, complain=0, require_all_args=true) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9205
#25 0x0108e57f in lookup_template_class (d1=, arglist=, in_decl=, context=, entering_scope=1, complain=0) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9980
#26 0x010a2e5c in tsubst_aggr_type_1 (t=, args=, complain=0, in_decl=,
entering_scope=1) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:14055
#27 0x010a2ca5 in tsubst_aggr_type (t=, args=, complain=16384, in_decl=,
entering_scope=1) at /home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:14019
#28 0x010b3d52 in tsubst (t=,
args=, complain=0, in_decl=) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:16589
#29 0x00ecdda9 in dump_template_bindings (pp=0x4b8f220
, parms=, args=,
typenames=0x77409ca8 = {...}) at
/home/apinski/src/upstream-gcc/gcc/gcc/cp/error.cc:492
#30 0x00ed47ac in dump_substitution (pp=0x4b8f220
, t=,
template_parms=, template_args=, flags=132) at

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #4 from Andrew Pinski  ---
(In reply to Markus F.X.J. Oberhumer from comment #0)
> Test case has been reduced by cvise.
> 
> Might be related to / duplicate of
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90747
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100557

Looks unrelated to those two.

Full backtrace:
t.c: In instantiation of ‘const bool integral_constant<1>::value’:
t.c:10:3:   required by substitution of ‘template Span::Span(U,
typename enable_if::value>::type) [with U = int]’
t.c:13:11:   required from here
t.c:10:3: error: explicit instantiation of ‘integral_constant<1>::value’ but no
definition available [-fpermissive]
   10 |   Span(U, typename enable_if::value>::type
= 1){}
  |   ^~~~
‘
internal compiler error: error reporting routines re-entered.
0x10eb12b instantiate_decl(tree_node*, bool, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:27362
0xec2acc maybe_instantiate_decl(tree_node*)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:
0xebee56 decl_constant_var_p(tree_node*)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/decl2.cc:4569
0xef4a04 constant_value_1
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2526
0xef4e58 decl_really_constant_value(tree_node*, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/init.cc:2617
0xdced8a cxx_eval_constant_expression
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7216
0xdd1260 cxx_eval_constant_expression
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:7827
0xdd4f50 cxx_eval_outermost_constant_expr
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8517
0xdd60bd maybe_constant_value(tree_node*, tree_node*, mce_value)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8806
0xdd6735 fold_non_dependent_expr(tree_node*, int, bool, tree_node*)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/constexpr.cc:8945
0x11bf9c8 check_narrowing(tree_node*, tree_node*, int, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/typeck2.cc:993
0xd5a283 convert_like_internal
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8879
0xd5a5c9 convert_like
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8944
0xd5a63a convert_like
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:8959
0xd4a9ed build_converted_constant_expr_internal
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4808
0xd4aa84 build_converted_constant_expr(tree_node*, tree_node*, int)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/call.cc:4838
0x1084793 convert_nontype_argument
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:7414
0x1089896 convert_template_argument
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:8713
0x108b975 coerce_template_parms(tree_node*, tree_node*, tree_node*, int, bool)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9205
0x108e57e lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int, int)
/home/apinski/src/upstream-gcc/gcc/gcc/cp/pt.cc:9980
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

--- Comment #8 from Andreas Schwab  ---
Native on HiFive Unleashed.

[Bug c++/111606] [11/12/13/14 Regression] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.5
      Known to fail|                            |6.3.0
            Summary|[ICE] internal compiler     |[11/12/13/14 Regression]
                   |error: error reporting      |[ICE] internal compiler
                   |routines re-entered.        |error: error reporting
                   |                            |routines re-entered.
     Ever confirmed|0                           |1
           Severity|normal                      |trivial
   Last reconfirmed|                            |2023-09-27
      Known to work|                            |6.2.0

--- Comment #3 from Andrew Pinski  ---
Confirmed.

[Bug c++/111606] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #2 from Andrew Pinski  ---
Created attachment 56005
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56005&action=edit
Reduced further

Attached is the testcase reduced further, with some code added back to make it
more valid.

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

palmer at gcc dot gnu.org changed:

   What|Removed |Added

 CC||palmer at gcc dot gnu.org,
   ||vineetg at gcc dot gnu.org

--- Comment #7 from palmer at gcc dot gnu.org ---
(In reply to Andreas Schwab from comment #3)
> Here are the build times of the stage1 compiler:
> 
> 20230714  21573
> 20230722  19932   -7.6%
> 20230728  21608   +8.4%
> 20230804  21841   +1.0%
> 20230811  25016   +14.5%
> 20230818  25429   +1.7%
> 20230825  25872   +1.7%
> 20230901  25965   +0.4%
> 20230908  28824   +11.0%
> 20230915  30926   +7.3%
> 20230922  40180   +30.0%

Did anything else change?  The latest binutils has better debug support, so I
could imagine us ending up with some longer compiler times as a result -- there
has to be more than just that here, though.

Aside from that we have had a ton of vector codegen go in over the last few
months, but this is a pretty huge increase so I agree it's worrisome.  I'm
adding Vineet to the CC list, as he's been doing some SPEC runs.  I don't think
we've had any major runtime regressions, but looks like dwarf2out.cc times have
crept up a bit which is also worrisome.

Also what exactly are you timing?  Native bootstraps on QEMU?
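The week-over-week percentages in the quoted table can be recomputed with a
few lines of C++. This is only a sketch: the helper names percent_change and
week_over_week are made up for illustration, and the inputs are the stage1
build times quoted above. Between the first entry (21573) and the last (40180)
the cumulative increase is roughly 86%.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: relative change from prev to curr, in percent.
inline double percent_change(double prev, double curr)
{
    return (curr - prev) / prev * 100.0;
}

// Hypothetical helper: percent change between each pair of consecutive
// measurements, e.g. the weekly stage1 build times from the table.
inline std::vector<double> week_over_week(const std::vector<double>& times)
{
    std::vector<double> deltas;
    for (std::size_t i = 1; i < times.size(); ++i)
        deltas.push_back(percent_change(times[i - 1], times[i]));
    return deltas;
}
```

Calling week_over_week on the first three measurements (21573, 19932, 21608)
gives values close to the -7.6% and +8.4% shown in the table.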

[Bug target/111609] Zero shift in ARM NEON vshll_n_s8 intrinsic produces an error

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111609

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||12.1.0, 4.5.4
   Last reconfirmed||2023-09-27
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug libstdc++/111511] Incorrect ADL in std::to_array in GCC 11/12/13

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111511

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |13.3

--- Comment #10 from Jonathan Wakely  ---
std::to_array is fixed on all branches, thanks for the reports.

[Bug libstdc++/111511] Incorrect ADL in std::to_array in GCC 11/12/13

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111511

--- Comment #9 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5

commit r11-11021-g97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5
Author: Jonathan Wakely 
Date:   Thu Sep 21 09:14:57 2023 +0100

libstdc++: Prevent unwanted ADL in std::to_array [PR111512]

Qualify the calls to the __to_array helper to prevent ADL, so we don't
try to complete associated classes.

libstdc++-v3/ChangeLog:

PR libstdc++/111511
PR c++/111512
* include/std/array (to_array): Qualify calls to __to_array.
* testsuite/23_containers/array/creation/111512.cc: New test.

(cherry picked from commit 77cf3773021b0a20d89623e09d620747a05588ec)
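As a simplified illustration of why qualifying a helper call matters, the
sketch below shows an unqualified call inside a template being redirected by
argument-dependent lookup, while the qualified call always reaches the
library's own helper. The namespaces lib and user and all function names are
hypothetical. The actual problem fixed in the commit is that ADL tried to
complete associated classes, which this example does not reproduce.

```cpp
namespace lib {
    // The library's internal helper (stand-in for something like __to_array).
    template <typename T>
    int helper(const T&) { return 1; }

    // Unqualified call: ADL also searches the namespaces associated with T.
    template <typename T>
    int call_unqualified(const T& t) { return helper(t); }

    // Qualified call: only lib::helper is considered, ADL is suppressed.
    template <typename T>
    int call_qualified(const T& t) { return lib::helper(t); }
}

namespace user {
    struct Widget {};

    // Same name as the library helper; found by ADL for Widget arguments and
    // preferred over the template because it is a non-template exact match.
    inline int helper(const Widget&) { return 2; }
}
```

With a user::Widget argument, call_unqualified ends up in user::helper, while
call_qualified stays inside lib.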

[Bug c++/111512] GCC's __builtin_memcpy can trigger ADL

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111512

--- Comment #6 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5

commit r11-11021-g97a33ab114187e7c6cd6c6c0f06cd8225e8aeef5
Author: Jonathan Wakely 
Date:   Thu Sep 21 09:14:57 2023 +0100

libstdc++: Prevent unwanted ADL in std::to_array [PR111512]

Qualify the calls to the __to_array helper to prevent ADL, so we don't
try to complete associated classes.

libstdc++-v3/ChangeLog:

PR libstdc++/111511
PR c++/111512
* include/std/array (to_array): Qualify calls to __to_array.
* testsuite/23_containers/array/creation/111512.cc: New test.

(cherry picked from commit 77cf3773021b0a20d89623e09d620747a05588ec)

[Bug libstdc++/111102] illegal pointer arithmetic invoked by std::format("L{:65536}",1)

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111102

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #5 from Jonathan Wakely  ---
Fixed for 13.3, thanks for the report and patches.

[Bug libstdc++/108046] The dot in the floating-point alternative form has wrong position

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108046

Jonathan Wakely  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jonathan Wakely  ---
Fixed for 13.3, thanks for the report.

[Bug libstdc++/111102] illegal pointer arithmetic invoked by std::format("L{:65536}",1)

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111102

--- Comment #4 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:9853ad876bd3d9d4685126466f74402e567664b3

commit r13-7918-g9853ad876bd3d9d4685126466f74402e567664b3
Author: Paul Dreik 
Date:   Thu Aug 24 11:43:43 2023 +0100

libstdc++: Add test for illegal pointer arithmetic in format [PR111102]

libstdc++-v3/ChangeLog:

PR libstdc++/111102
* testsuite/std/format/string.cc: Check wide character format
strings with out-of-range widths.

(cherry picked from commit 7564fe98657ad5ede34bd08f5279778fa8698865)

[Bug c++/59526] [C++11] Defaulted special member functions don't accept noexcept if a member has a non-trivial noexcept operator in the corresponding special member function

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59526

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:0547f663ee09aa5887dcd1bb0ea48eba24a30485

commit r13-7917-g0547f663ee09aa5887dcd1bb0ea48eba24a30485
Author: François Dumont 
Date:   Wed Aug 23 19:15:43 2023 +0200

libstdc++: [_GLIBCXX_INLINE_VERSION] Fix <format> friend declaration

GCC does not consider the inline namespace in friend function declarations.
This is PR c++/59526; we need to make this namespace explicit.

libstdc++-v3/ChangeLog:

* include/std/format (std::__format::_Arg_store): Explicit version
namespace on make_format_args friend declaration.

(cherry picked from commit 92456291849fe88303bbcab366f41dcd4a885ad5)

[Bug libstdc++/111102] illegal pointer arithmetic invoked by std::format("L{:65536}",1)

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111102

--- Comment #3 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:183eea6029be2f6c9f416d6ffe751c469237ff2d

commit r13-7916-g183eea6029be2f6c9f416d6ffe751c469237ff2d
Author: Paul Dreik 
Date:   Thu Aug 24 11:43:43 2023 +0100

libstdc++: fix illegal pointer arithmetic in format [PR111102]

When parsing a format string, the width is parsed into an unsigned short
but the result is not checked in the case the format string is not a
char string (such as a wide string). In case the parse fails, a null
pointer is returned which is used for pointer arithmetic which is
undefined behaviour.

Signed-off-by: Paul Dreik 

libstdc++-v3/ChangeLog:

PR libstdc++/111102
* include/std/format (__format::__parse_integer): Check for
non-null pointer.

(cherry picked from commit dd4bdb9eea436bf06f175d8dbfc2190377455be4)
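The failure mode described in the commit message above (a parser signalling
failure with a null pointer that is then used in pointer arithmetic) can be
sketched as follows. This is not the libstdc++ code: parse_u16 and
consumed_width are hypothetical names, and the only point is that the returned
pointer is tested before it takes part in a subtraction.

```cpp
#include <cstddef>

// Hypothetical parser: reads decimal digits from [first, last) into out.
// Returns a pointer past the last digit, or nullptr if there are no digits
// or the value does not fit in an unsigned short (assumed 16 bits here).
template <typename CharT>
const CharT* parse_u16(const CharT* first, const CharT* last,
                       unsigned short& out)
{
    unsigned long value = 0;
    const CharT* p = first;
    for (; p != last && *p >= CharT('0') && *p <= CharT('9'); ++p) {
        value = value * 10 + static_cast<unsigned long>(*p - CharT('0'));
        if (value > 65535UL)
            return nullptr;            // out of range: report failure
    }
    if (p == first)
        return nullptr;                // no digits at all
    out = static_cast<unsigned short>(value);
    return p;
}

// Number of characters consumed by the width, or 0 on failure. The null
// check must come before "end - first"; arithmetic on a null pointer is
// undefined behaviour.
template <typename CharT>
std::size_t consumed_width(const CharT* first, const CharT* last,
                           unsigned short& out)
{
    const CharT* end = parse_u16(first, last, out);
    if (end == nullptr)
        return 0;
    return static_cast<std::size_t>(end - first);
}
```

The template works for both char and wchar_t input, which matters here because
the bug was only reachable with non-char format strings.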

[Bug libstdc++/108046] The dot in the floating-point alternative form has wrong position

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108046

--- Comment #5 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:da1ba03245c212ef1ba100e7806588802f3ad46f

commit r13-7914-gda1ba03245c212ef1ba100e7806588802f3ad46f
Author: Jonathan Wakely 
Date:   Thu Jul 27 14:07:09 2023 +0100

libstdc++: Fix std::format alternate form for floating-point [PR108046]

A decimal point was being added to the end of the string for {:#.0}
because the __expc character was not being set, for the _Pres_none
presentation type, so __s.find(__expc) didn't find the 'e' in "1e+01" and so
we created "1e+01." by appending the radix char to the end.

This can be fixed by ensuring that __expc='e' is set for the _Pres_none
case. I realized we can also set __expc='P' and __expc='E' when needed,
to save a call to std::toupper later.

For the {:#.0g} format, __expc='e' was being set and so the 'e' was
found in "1e+10" but then __z = __prec - __sigfigs would wraparound to
SIZE_MAX. That meant we would decide not to add a radix char because the
number of extra characters to insert would be 1+SIZE_MAX i.e. zero.

This can be fixed by using __z == 0 when __prec == 0.

libstdc++-v3/ChangeLog:

PR libstdc++/108046
* include/std/format (__formatter_fp::format): Ensure __expc is
always set for all presentation types. Set __z correctly for
zero precision.
* testsuite/std/format/functions/format.cc: Check problem cases.

(cherry picked from commit 50bc490c090cc95175e6068ed7438788d7fd7040)
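The wraparound mentioned in the commit message above is ordinary unsigned
arithmetic and can be reproduced in isolation. In the sketch below the
parameter names mirror __prec and __sigfigs, but the functions are
hypothetical and are not the libstdc++ implementation. The saturating variant
is a generic guard that happens to cover the zero-precision case; the actual
commit sets __z to 0 when __prec is 0.

```cpp
#include <cstddef>
#include <cstdint>

// Unchecked difference: wraps modulo 2^N when sigfigs > prec, so
// prec == 0 and sigfigs == 1 yields SIZE_MAX.
constexpr std::size_t extra_zeros_unchecked(std::size_t prec,
                                            std::size_t sigfigs)
{
    return prec - sigfigs;
}

// Saturating difference: never wraps, returns 0 when sigfigs >= prec.
constexpr std::size_t extra_zeros_saturating(std::size_t prec,
                                             std::size_t sigfigs)
{
    return prec > sigfigs ? prec - sigfigs : 0;
}
```

With the unchecked version, adding one more character to the count wraps again
(1 + SIZE_MAX is 0), which is why the code concluded that nothing needed to be
inserted.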

[Bug fortran/90608] Inline non-scalar minloc/maxloc calls

2023-09-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Tamar Christina  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org,
   ||toon at gcc dot gnu.org

--- Comment #6 from Tamar Christina  ---
This is the ticket I meant, Toon.

Do you or Thomas have any ideas how we can inline this?

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #6 from Ben Gardner  ---
(In reply to Andrew Pinski from comment #5)
> extern void *memmem (const void *__haystack, size_t __haystacklen,
>const void *__needle, size_t __needlelen)
>  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__pure__))
> __attribute__ ((__nonnull__ (1, 3)));
> 
> 
> memmem is declared with nonnull for the 1st and 3rd argument. If those
> arguments are null, the behavior is undefined and the values of those
> arguments can be assumed as not null afterwards too.
> 
> If you don't want that behavior you can use -fno-delete-null-pointer-checks .
> 
> Otherwise the behavior you are seeing is correct behavior based on well
> defined code.

Thanks for the info. That makes sense. I didn't check the header file, so I
didn't know that memmem() was declared with nonnull.

Also, thanks for the tip about -fno-delete-null-pointer-checks.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #5 from Andrew Pinski  ---
extern void *memmem (const void *__haystack, size_t __haystacklen,
   const void *__needle, size_t __needlelen)
 __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__pure__))
__attribute__ ((__nonnull__ (1, 3)));


memmem is declared with nonnull for the 1st and 3rd argument. If those
arguments are null, the behavior is undefined and the values of those arguments
can be assumed as not null afterwards too.

If you don't want that behavior you can use -fno-delete-null-pointer-checks .

Otherwise the behavior you are seeing is correct behavior based on well defined
code.
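For code that may legitimately hold null pointers, one option is to test them
before the call, so that a null pointer never reaches the nonnull-annotated
function. The wrapper below is a minimal sketch under that assumption:
checked_memmem is a hypothetical name, and the example assumes a libc that
declares memmem (a GNU extension that g++ exposes by default on glibc).

```cpp
#include <cstddef>
#include <string.h>   // memmem: GNU extension, assumed to be available

// Hypothetical wrapper: the null checks happen before memmem is called, so
// the compiler has no basis for assuming the pointers are non-null at the
// point where they are tested.
inline const void* checked_memmem(const void* haystack,
                                  std::size_t haystack_len,
                                  const void* needle,
                                  std::size_t needle_len)
{
    if (haystack == nullptr || needle == nullptr)
        return nullptr;
    return memmem(haystack, haystack_len, needle, needle_len);
}
```

A null check placed after the memmem call is the pattern that can be
optimized away, as explained above.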

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #4 from Jonathan Wakely  ---
Anything passed to memmem (or memcmp, or memcpy, etc.) is considered to be a
non-null pointer, because that's a requirement of those functions. And so if
it's a non-null pointer, any null checks for it can be removed.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #3 from Ben Gardner  ---
The issue isn't with memmem(). It is with the value passed into pr_str() from
the structure. I suspect memmem() is a distraction.
I'll try to further reduce the test case to eliminate memmem(), if possible.

[Bug middle-end/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #2 from Andrew Pinski  ---
I don't think this is a bug. memmem is defined such that passing a null pointer
argument is undefined behavior even if the length is 0.

[Bug target/111616] New: On Zen2 7% 519.lbm_r regression between g:1d17d58c284fa8c3 (2023-09-14 02:39) and g:c8e9a75085f9725c (2023-09-18 13:09)

2023-09-27 Thread fkastl at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111616

Bug ID: 111616
   Summary: On Zen2 7% 519.lbm_r regression between
g:1d17d58c284fa8c3 (2023-09-14 02:39) and
g:c8e9a75085f9725c (2023-09-18 13:09)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: needs-bisection
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fkastl at suse dot cz
CC: mjambor at suse dot cz
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

On x86_64 AMD Zen2 machine with Ofast LTO PGO march=native mtune=native between
commits g:1d17d58c284fa8c3 (2023-09-14 02:39) and g:c8e9a75085f9725c
(2023-09-18 13:09) there is a 519.lbm_r 7% execution time regression.

Here is a plot of recent measurements:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=286.477.0

I confirmed this on another Zen2 machine. This time I measured 9% slowdown.

[Bug c/111615] NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

--- Comment #1 from Ben Gardner  ---
Created attachment 56004
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56004&action=edit
Build script.

[Bug c/111615] New: NULL check incorrectly skipped at O2 and O3

2023-09-27 Thread gardner.ben at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111615

Bug ID: 111615
   Summary: NULL check incorrectly skipped at O2 and O3
   Product: gcc
   Version: 11.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gardner.ben at gmail dot com
  Target Milestone: ---

Created attachment 56003
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56003&action=edit
Source file that produces the issue.

The attached source code has a function (pr_str()) that prints something
different if the parameter is NULL.
When passed a NULL (const char *) value from a static const structure, the NULL
check is skipped and the first printf() is executed.


static void pr_str(const char *s)
{
   /* BUG: this NULL check is skipped/wrong at O2 and O3 for
* vec->haystack and vec->needle.
*/
   if (s != NULL)
   {
  printf("'%s' %p %d", s, s, (int)(intptr_t)s);
   }
   else
   {
  printf("(nil)");
   }
}


This occurs at O2 and O3, but not at O0, O1, or Os.
If the program prints "h=(nil)" on the 2nd to last line when executed, then it
worked.
If the program prints "h='(null)' (nil) 0" on the 2nd to last line, then it
failed.


Build script:
#!/bin/sh
build_it() {
OP=$1
gcc -g -Wall -O$OP -c -o memmem_test.O$OP.o memmem_test.c
gcc memmem_test.O$OP.o -o memmem_test.O$OP
}
build_it 0
build_it 1
build_it 2
build_it 3
build_it s


GCC detailed info:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

[Bug ipa/111283] [14 Regression] gnat profilebootstrap broken on trunk 20230902 on 32bit targets

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111283

--- Comment #9 from Sergei Trofimovich  ---
Proposed conservative fix as
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631526.html

[Bug gcov-profile/111559] [14 regression] ICE when building Python with PGO

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111559

--- Comment #7 from Sergei Trofimovich  ---
Proposed conservative fix as
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631526.html

[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

2023-09-27 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #13 from JuzheZhong  ---
Hi, Richi. This is my draft approach to finding more potential conditional
reductions.

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index a8c915913ae..c25d2038f16 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1790,8 +1790,72 @@ is_cond_scalar_reduction (gimple *phi, gimple **reduc,
tree arg_0, tree arg_1,
   std::swap (r_op1, r_op2);
   std::swap (r_nop1, r_nop2);
 }
-  else if (r_nop1 != PHI_RESULT (header_phi))
-return false;
+  else if (r_nop1 == PHI_RESULT (header_phi))
+;
+  else
+{
+  /* Analyze the statement chain of STMT so that we could teach generate
+better if-converison code sequence.  We are trying to catch this
+following situation:
+
+  loop-header:
+  reduc_1 = PHI <..., reduc_2>
+  ...
+  if (...)
+  tmp1 = reduc_1 + rhs1;
+  tmp2 = tmp1 + rhs2;
+  tmp3 = tmp2 + rhs3;
+  ...
+  reduc_3 = tmpN-1 + rhsN-1;
+
+  reduc_2 = PHI 
+
+  and convert to
+
+  reduc_2 = PHI <0, reduc_1>
+  tmp1 = rhs1 + rhs2;
+  tmp2 = tmp1 + rhs3;
+  tmp3 = tmp2 + rhs4;
+  ...
+  tmpN-1 = tmpN-2 + rhsN;
+  ifcvt = cond_expr ? tmpN-1 : 0
+  reduc_1 = tmpN-1 +/- ifcvt;  */
+  if (num_imm_uses (PHI_RESULT (header_phi)) != 2)
+   return false;
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, PHI_RESULT (header_phi))
+   {
+ gimple *use_stmt = USE_STMT (use_p);
+ if (is_gimple_assign (use_stmt))
+   {
+ if (gimple_assign_rhs_code (use_stmt) != reduction_op)
+   return false;
+ if (TREE_CODE (gimple_assign_lhs (use_stmt)) != SSA_NAME)
+   return false;
+
+ bool visited_p = false;
+ while (!visited_p)
+   {
+ use_operand_p use;
+ if (!single_imm_use (gimple_assign_lhs (use_stmt), &use,
+  &use_stmt)
+ || gimple_bb (use_stmt) != gimple_bb (stmt)
+ || !is_gimple_assign (use_stmt)
+ || TREE_CODE (gimple_assign_lhs (use_stmt)) != SSA_NAME
+ || gimple_assign_rhs_code (use_stmt) != reduction_op)
+   return false;
+
+ if (gimple_assign_lhs (use_stmt) == gimple_assign_lhs (stmt))
+   {
+ r_op2 = r_op1;
+ r_op1 = PHI_RESULT (header_phi);
+ visited_p = true;
+   }
+   }
+   }
+ else if (use_stmt != phi)
+   return false;
+   }
+}


My approach is doing the check as follows:

   tmp1 = reduc_1 + rhs1;
   tmp2 = tmp1 + rhs2;
   tmp3 = tmp2 + rhs3;
   ...
   reduc_3 = tmpN-1 + rhsN-1;

Start the iteration check from "tmp1 = reduc_1 + rhs1;" and continue until
"reduc_3 = tmpN-1 + rhsN-1;".

Make sure each statement is a PLUS_EXPR for the reduction sum.
Does it look reasonable?

With this change the loop is vectorized successfully.
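At the source level, the kind of loop being discussed and its if-converted
form look roughly like the two functions below. This is only an illustration
of the transformation sketched in the patch comment, with a chain of two
additions. The function names are made up, and unsigned arithmetic is used so
that reassociating the sum is well defined for all inputs.

```cpp
#include <cstddef>

// Conditional reduction with a chain of additions guarded by a branch:
//   tmp1 = reduc + rhs1;  reduc = tmp1 + rhs2;
inline unsigned sum_branchy(const unsigned* rhs1, const unsigned* rhs2,
                            const bool* cond, std::size_t n)
{
    unsigned reduc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (cond[i]) {
            unsigned tmp1 = reduc + rhs1[i];
            reduc = tmp1 + rhs2[i];
        }
    }
    return reduc;
}

// If-converted form: the addends are summed unconditionally, the condition
// selects between that sum and 0, and a single add updates the reduction.
inline unsigned sum_if_converted(const unsigned* rhs1, const unsigned* rhs2,
                                 const bool* cond, std::size_t n)
{
    unsigned reduc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        unsigned tmp = rhs1[i] + rhs2[i];
        unsigned ifcvt = cond[i] ? tmp : 0u;
        reduc += ifcvt;
    }
    return reduc;
}
```

Both functions compute the same value; the second has a branch-free loop body.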

[Bug ipa/111613] [12/13/14 Regression] Bit field stores can be incorrectly optimized away when -fstore-merging is in effect

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

--- Comment #2 from Richard Biener  ---
It's the late IPA modref that mis-analyzes the store-merged sequence I think.

[Bug libstdc++/111589] Use relaxed atomic increment (but not decrement!) in shared_ptr

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111589

--- Comment #2 from Jonathan Wakely  ---
The interesting question is whether all of these can be relaxed or if we need
to stop using __atomic_add_dispatch for shared_ptr copies:

include/bits/cow_string.h: 
__gnu_cxx::__atomic_add_dispatch(>_M_refcount, 1);
include/bits/cow_string.h:   
__gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1);
include/bits/ios_base.h:  _M_add_reference() {
__gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); }
include/bits/locale_classes.h:{
__gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); }
include/bits/locale_classes.h:{
__gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); }
include/bits/shared_ptr_base.h:  {
__gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }
include/bits/shared_ptr_base.h:  {
__gnu_cxx::__atomic_add_dispatch(&_M_weak_count, 1); }
include/ext/atomicity.h:  // __atomic_add_dispatch
include/ext/atomicity.h:  __atomic_add_dispatch(_Atomic_word* __mem, int __val)
include/ext/pool_allocator.h:   __atomic_add_dispatch(&_S_force_new,
1);
include/ext/pool_allocator.h:   __atomic_add_dispatch(&_S_force_new,
-1);
include/ext/rc_string_base.h: __atomic_add_dispatch(&_M_info._M_refcount,
1);
include/tr1/shared_ptr.h:  {
__gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }
include/tr1/shared_ptr.h:  {
__gnu_cxx::__atomic_add_dispatch(&_M_weak_count, 1); }
libsupc++/eh_atomics.h:__gnu_cxx::__atomic_add_dispatch (__count, 1);
src/c++98/ios_init.cc:  __gnu_cxx::__atomic_add_dispatch(&_S_refcount, 1);
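For reference, the pattern named in the bug title (relaxed increment, stronger
decrement) looks like this when written with std::atomic. This is a generic
sketch, not the libstdc++ implementation: RefCount and its members are
hypothetical names, and __atomic_add_dispatch is deliberately not used.

```cpp
#include <atomic>

// Hypothetical reference counter illustrating the memory orderings.
class RefCount {
    std::atomic<long> count_{1};

public:
    // Taking another reference needs no ordering with other memory
    // operations: the caller already holds a reference, so the object
    // cannot be destroyed concurrently.
    void add_ref() noexcept
    {
        count_.fetch_add(1, std::memory_order_relaxed);
    }

    // Dropping a reference must not be relaxed: acq_rel makes all earlier
    // writes to the object visible to the thread that destroys it.
    // Returns true when the caller released the last reference.
    bool release() noexcept
    {
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long use_count() const noexcept
    {
        return count_.load(std::memory_order_relaxed);
    }
};
```

Whether every call site listed above can tolerate a relaxed increment is
exactly the open question in this comment; the sketch only shows the shape of
the idea for a plain shared-ownership counter.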

[Bug ipa/111613] [12/13/14 Regression] Bit field stores can be incorrectly optimized away when -fstore-merging is in effect

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |wrong-code
   Target Milestone|---                         |12.4
     Ever confirmed|0                           |1
           Priority|P3                          |P1
            Summary|Bit field stores can be     |[12/13/14 Regression] Bit
                   |incorrectly optimized away  |field stores can be
                   |when -fstore-merging is in  |incorrectly optimized away
                   |effect                      |when -fstore-merging is in
                   |                            |effect
   Last reconfirmed|                            |2023-09-27
          Component|c                           |ipa
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |jamborm at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
Confirmed.

[Bug c++/85861] g++ -Wconversion misses int to size_t

2023-09-27 Thread albertnetymk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85861

Albert Netymk  changed:

   What|Removed |Added

 CC||albertnetymk at gmail dot com

--- Comment #16 from Albert Netymk  ---
I am aware that, in C++, it is intended that -Wconversion doesn't imply
-Wsign-conversion, and the corresponding documentation is super clear about
this fact.

According to the documentation
(https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html), "Warnings about
conversions between signed and unsigned integers are disabled by default in C++
unless -Wsign-conversion is explicitly enabled."

However, from an end-user perspective, this inconsistency (implying
sign-conversion in C but not in C++) is quite surprising. Additionally, in
clang, -Wconversion always implies -Wsign-conversion.

Therefore, having -Wconversion unconditionally imply -Wsign-conversion would
provide a more consistent interface, and the documentation wouldn't need to
point out the inconsistency anymore.
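A small example of the conversion in question: the first function relies on an
implicit int to size_t conversion, which per the documentation quoted above
g++ only reports in C++ when -Wsign-conversion is enabled. The second is one
possible explicit alternative. Both function names are hypothetical.

```cpp
#include <cstddef>

// Implicit signed-to-unsigned conversion: a negative n silently becomes a
// very large size_t value.
inline std::size_t implicit_to_size(int n)
{
    return n;
}

// Explicit, checked conversion: rejects negative input instead of wrapping.
// Returns false and leaves out untouched when n is negative.
inline bool checked_to_size(int n, std::size_t& out)
{
    if (n < 0)
        return false;
    out = static_cast<std::size_t>(n);
    return true;
}
```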

[Bug target/111610] Cannot build cross compiler to darwin targets after r14-4108-g47346acb72b50d

2023-09-27 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111610

Iain Sandoe  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #5 from Iain Sandoe  ---
So, this should be fixed.

[Bug target/111610] Cannot build cross compiler to darwin targets after r14-4108-g47346acb72b50d

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111610

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Iain D Sandoe:

https://gcc.gnu.org/g:2ecab2f32b9e9a75bf563f80752d5b44dcd26b98

commit r14-4298-g2ecab2f32b9e9a75bf563f80752d5b44dcd26b98
Author: Iain Sandoe 
Date:   Wed Sep 27 11:05:31 2023 +0100

Darwin, configure: Allow for an unrecognisable dsymutil [PR111610].

We had a catch-all configuration case for missing or unrecognised dsymutil
but it was setting the dsymutil source to "UNKNOWN" which is not usable in
this context (since it clashes with an existing enum).  We rename this to
DET_UNKNOWN (for Darwin External Toolchain).

PR target/111610

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Rename the missing dsymutil case to "DET_UNKNOWN".

Signed-off-by: Iain Sandoe 

[Bug tree-optimization/111614] New: [14 Regression] ICE at -O2: verify_gimple failed since r14-2282-gf703d2fd3f0

2023-09-27 Thread shaohua.li at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111614

Bug ID: 111614
   Summary: [14 Regression] ICE at -O2: verify_gimple failed since
r14-2282-gf703d2fd3f0
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: shaohua.li at inf dot ethz.ch
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---

gcc at -O2 crashes.

Bisected to r14-2282-gf703d2fd3f0

Compiler explorer: https://godbolt.org/z/xG5Tosvnc

$ cat a.c
int a, b, c, d, e;
static void f() {
  int *g = 
  b = 1;
  for (; b >= 0; b--) {
    c = 0;
    for (; c <= 1; c++)
      e = 0;
    for (; e <= 1; e++) {
      int h, i = h = 13;
      for (; h; h--)
        i = i << a;
      d &= i + c + 9 + *g;
    }
  }
}
int main() {
  f();
  for (;;)
    ;
}
$
$ gcc -O3 a.c
a.c: In function ‘main’:
a.c:17:5: error: type mismatch in binary expression
   17 | int main() {
  | ^~~~
vector(2) int

vector(2) int

vector(2) unsigned int

_15 = vect__15.28_19 & _11;
during GIMPLE pass: reassoc
a.c:17:5: internal compiler error: verify_gimple failed
0x7f6a86416082 __libc_start_main
../csu/libc-start.c:308
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
$

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

--- Comment #6 from Andreas Schwab  ---
$ wc -l gcc-*/Build/gcc/insn-opinit.cc
   6996 gcc-20230714/Build/gcc/insn-opinit.cc
   6591 gcc-20230722/Build/gcc/insn-opinit.cc
   6809 gcc-20230728/Build/gcc/insn-opinit.cc
   6967 gcc-20230804/Build/gcc/insn-opinit.cc
  10451 gcc-20230811/Build/gcc/insn-opinit.cc
  11227 gcc-20230818/Build/gcc/insn-opinit.cc
  11464 gcc-20230825/Build/gcc/insn-opinit.cc
  11664 gcc-20230901/Build/gcc/insn-opinit.cc
  12166 gcc-20230908/Build/gcc/insn-opinit.cc
  13016 gcc-20230915/Build/gcc/insn-opinit.cc
  16788 gcc-20230922/Build/gcc/insn-opinit.cc
$ wc -l gcc-*/Build/gcc/insn-output.cc
   858964 gcc-20230714/Build/gcc/insn-output.cc
   708403 gcc-20230722/Build/gcc/insn-output.cc
   753003 gcc-20230728/Build/gcc/insn-output.cc
   753971 gcc-20230804/Build/gcc/insn-output.cc
   879098 gcc-20230811/Build/gcc/insn-output.cc
   903026 gcc-20230818/Build/gcc/insn-output.cc
   920033 gcc-20230825/Build/gcc/insn-output.cc
   948627 gcc-20230901/Build/gcc/insn-output.cc
   993341 gcc-20230908/Build/gcc/insn-output.cc
  1213716 gcc-20230915/Build/gcc/insn-output.cc
  1527729 gcc-20230922/Build/gcc/insn-output.cc
$ wc -l gcc-*/Build/gcc/insn-emit.cc
   633220 gcc-20230714/Build/gcc/insn-emit.cc
   521442 gcc-20230722/Build/gcc/insn-emit.cc
   551484 gcc-20230728/Build/gcc/insn-emit.cc
   553655 gcc-20230804/Build/gcc/insn-emit.cc
   695596 gcc-20230811/Build/gcc/insn-emit.cc
   715442 gcc-20230818/Build/gcc/insn-emit.cc
   723656 gcc-20230825/Build/gcc/insn-emit.cc
   740419 gcc-20230901/Build/gcc/insn-emit.cc
   776695 gcc-20230908/Build/gcc/insn-emit.cc
   860129 gcc-20230915/Build/gcc/insn-emit.cc
  1093024 gcc-20230922/Build/gcc/insn-emit.cc

[Bug target/111610] Cannot build cross compiler to darwin targets after r14-4108-g47346acb72b50d

2023-09-27 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111610

--- Comment #3 from Iain Sandoe  ---
(In reply to Martin Jambor from comment #2)
> (In reply to Iain Sandoe from comment #1)
> > As a matter of record, we do not really support cross-compilers targeting an
> > unknown Darwin version (the idea of xxx-apple-darwin [without a specific
> > version] was to support building natively on macOS).  What will happen is
> > you will get the earliest supported OS version for the target arch (which
> > might not really be very representative)
> > 
> > It would likely be more representative/useful to choose some suitable OS
> > version:
> > 
> > e.g. powerpc-apple-darwin9 (latest) i686-apple-darwin17 (last 32b support)
> > x86_64-apple-darwin21 (up to date) .. and eventually aarch64-apple-darwin2x
> > 
> > 
> > Of course, the build should not fail so we must fix it - but just pointing
> > out that the results from the current builds are from a configuration that
> > will be issuing warnings about choice of OS version.
> 
> IIUC, the test script takes all targets listed in contrib/config-list.mk and
> tries the above configuration and make all-host steps on all of those
> targets that are not explicitly excluded (currently only powerpc-freebsd13
> because of PR 108491). I don't really know how (or if) the list in that file
> is maintained, but it looks like if they should be removed, they should be
> removed from there?  Of course, we can exclude anything on our end too.

OK. Perhaps that list should be edited to reflect modern practice - but, if not
it's still better to have an old configuration tested than nothing (after all
it found this issue).  As it happens, we did already check for the
missing/unknown case for dsymutil but the enumeration clashes with another RTL
use of "UNKNOWN".  I'll land the fix shortly.

[Bug c/111611] Auto-Vectorize Compiler Optimization Causing Exception / Crash

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111611

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Richard Biener  ---
GCC 10.2.1 is quite old, the newest GCC 10 based release is GCC 10.5 and we've
stopped maintaining that.

Your code is buggy, you declare rbtdb_version_t as having 64 byte alignment but
do not ensure pointers coming from malloc are properly aligned.

Use posix_memalign.
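
A minimal sketch of that advice, assuming a POSIX system. The struct below is a hypothetical stand-in for rbtdb_version_t; only the 64-byte alignment is taken from the comment above, and the helper names are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign and free (POSIX)

// Hypothetical over-aligned type; the payload size is arbitrary.
struct alignas(64) version_like {
  unsigned char payload[128];
};

// True if p is a multiple of the requested alignment.
bool is_aligned(const void *p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocate storage that honours the declared alignment. Plain malloc only
// guarantees the alignment of the fundamental types, which is less than 64.
version_like *alloc_version() {
  void *p = nullptr;
  if (posix_memalign(&p, alignof(version_like), sizeof(version_like)) != 0)
    return nullptr;
  assert(is_aligned(p, alignof(version_like)));
  return static_cast<version_like *>(p);
}
```

Memory obtained this way is released with free() as usual. With the alignment guaranteed, an aligned vector store into the object is valid.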

[Bug gcov-profile/111559] [14 regression] ICE when building Python with PGO

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111559

--- Comment #6 from Sergei Trofimovich  ---
Uninitialized value comes from `ipa_merge_profiles()` for our `rule1_same()`
alias and `rule1()` functions:

  // in gcc/ipa-icf.cc:

  else if (create_alias)
    {
      alias->icf_merged = true;

      /* Remove the function's body.  */
      ipa_merge_profiles (original, alias);
      // ...

If I comment out the `ipa_merge_profiles (original, alias);` call to leave
`original` as is, then the failure does not happen. Which means at least
`original`'s profile is fine.

Tracing through `ipa_merge_profiles()`, we generate an uninitialized probability
when we divide by zero at:

  // in gcc/ipa-utils.cc

void
ipa_merge_profiles (struct cgraph_node *dst,
                    struct cgraph_node *src,
                    bool preserve_body)
// ...

  /* TODO: merge also statement histograms.  */
  FOR_ALL_BB_FN (srcbb, srccfun)
    {
      unsigned int i;

      if (copy_counts) { /* snip: irrelevant */ }
      else
        {
          for (i = 0; i < EDGE_COUNT (srcbb->succs); i++)
            {
              edge srce = EDGE_SUCC (srcbb, i);
              edge dste = EDGE_SUCC (dstbb, i);
              dste->probability =
                dste->probability * dstbb->count.ipa ().probability_in
                                                     (dstbb->count.ipa ()
                                                      + srccount.ipa ())
                + srce->probability * srcbb->count.ipa ().probability_in
                                                     (dstbb->count.ipa ()
                                                      + srccount.ipa ());
            }
          dstbb->count = dstbb->count.ipa () + srccount.ipa ();
        }
    }
// ...

Here `dstbb->count.ipa () + srccount.ipa ()` is zero.
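
To make the failure mode concrete, here is a simplified model of the weighted merge. It is a sketch under stated assumptions: std::optional<double> stands in for profile_probability (with nullopt playing the role of "uninitialized"), plain integers stand in for profile_count, and the guarded variant is just one conceivable policy, not a proposed fix.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Simplified stand-in for profile_probability; nullopt models "uninitialized".
using probability = std::optional<double>;

// Share of `part` in `whole`; undefined when `whole` is zero.
probability probability_in(std::uint64_t part, std::uint64_t whole) {
  if (whole == 0)
    return std::nullopt;
  assert(part <= whole);
  return static_cast<double>(part) / static_cast<double>(whole);
}

// Count-weighted average of two edge probabilities, as in the loop above.
probability merge_edge_probability(double dst_prob, std::uint64_t dst_count,
                                   double src_prob, std::uint64_t src_count) {
  const std::uint64_t den = dst_count + src_count;
  const probability dst_weight = probability_in(dst_count, den);
  const probability src_weight = probability_in(src_count, den);
  if (!dst_weight || !src_weight)
    return std::nullopt;  // both counts are zero: nothing to weight by
  return dst_prob * *dst_weight + src_prob * *src_weight;
}

// Hypothetical guard: keep the destination probability when no counts exist.
probability merge_edge_probability_guarded(double dst_prob,
                                           std::uint64_t dst_count,
                                           double src_prob,
                                           std::uint64_t src_count) {
  if (dst_count + src_count == 0)
    return dst_prob;
  return merge_edge_probability(dst_prob, dst_count, src_prob, src_count);
}
```

With both counts at zero the unguarded merge has no defined result, which mirrors an `always` probability being overwritten by an undefined one.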

This assert should expose it as well:

--- a/gcc/ipa-utils.cc
+++ b/gcc/ipa-utils.cc
@@ -651,13 +651,15 @@ ipa_merge_profiles (struct cgraph_node *dst,
{
  edge srce = EDGE_SUCC (srcbb, i);
  edge dste = EDGE_SUCC (dstbb, i);
+
+ profile_count den = dstbb->count.ipa () + srccount.ipa ();
+ gcc_assert(den.nonzero_p());
+
  dste->probability =
dste->probability * dstbb->count.ipa ().probability_in
-(dstbb->count.ipa ()
- + srccount.ipa ())
+(den)
+ srce->probability * srcbb->count.ipa ().probability_in
-(dstbb->count.ipa ()
- + srccount.ipa ());
+(den);
}
  dstbb->count = dstbb->count.ipa () + srccount.ipa ();
}

If we attach `gdb` it agrees we exercise these edges 0 times.

(gdb) call dstbb->count.debug()
0 (precise)
(gdb) call srccount.ipa ().debug()
0 (precise)

For comparison we are trying to clobber `always` probability with `undefined`:

(gdb) call dste->probability.debug()
always

What edge is that?

(gdb) call debug_edge(srce)
edge (bb_3, bb_4)

__attribute__((noinline))
void rule1 ()
{
  int p.0_1;

  <bb 2> [count: 2]:
  p.0_1 = p;
  if (p.0_1 != 0)
    goto <bb 3>; [0.00%]
  else
    goto <bb 4>; [100.00%]

  <bb 3> [count: 0]:
  edge ();

  <bb 4> [count: 2]:
  return;

}

`always` should be valid for `bb_3->bb_4`. But for our data input it's `never`.

[Bug c/111613] New: Bit field stores can be incorrectly optimized away when -fstore-merging is in effect

2023-09-27 Thread gcc at kempniu dot pl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

Bug ID: 111613
   Summary: Bit field stores can be incorrectly optimized away
when -fstore-merging is in effect
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at kempniu dot pl
  Target Milestone: ---

Created attachment 56002
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56002&action=edit
Preprocessed reproducer

This is the simplest reproducer I could come up with:

$ cat bitfield.c
#include <stdio.h>
#include <stdlib.h>

struct bitfield {
unsigned int field1 : 1;
unsigned int field2 : 1;
unsigned int field3 : 1;
};

__attribute__((noinline)) static void
set_field1_and_field2(struct bitfield *b) {
b->field1 = 1;
b->field2 = 1;
}

__attribute__((noinline)) static struct bitfield *
new_bitfield(void) {
struct bitfield *b = (struct bitfield *)malloc(sizeof(*b));
b->field3 = 1;
set_field1_and_field2(b);
return b;
}

int main(void) {
struct bitfield *b = new_bitfield();
printf("%d\n", b->field3);
return 0;
}
$ gcc -O2 -o bitfield bitfield.c
$ ./bitfield
0
$ gcc -O2 -fno-store-merging -o bitfield bitfield.c
$ ./bitfield
1

Removing one of the stores from set_field1_and_field2() makes the issue
go away.  Moving "b->field3 = 1;" below the call to
set_field1_and_field2() also makes the issue go away.

This was originally found for the following GCC version:

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/13.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure
--enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-bootstrap
--prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--with-build-config=bootstrap-lto --with-linker-hash-style=gnu
--with-system-zlib --enable-__cxa_atexit --enable-cet=auto
--enable-checking=release --enable-clocale=gnu --enable-default-pie
--enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object
--enable-libstdcxx-backtrace --enable-link-serialization=1
--enable-linker-build-id --enable-lto --enable-multilib --enable-plugin
--enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch
--disable-werror
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.1 20230801 (GCC) 

However, I have subsequently confirmed that it also happens on the
current master branch.

A git bisect points at the following commit as the culprit (first
included in GCC 12.1.0):

commit 22c242342e38ebffa6bbf7e86e7a1e4abdf0d686
Author: Martin Liska 
Date:   Thu Nov 18 17:50:19 2021 +0100

IPA: fix reproducibility in IPA MOD REF

gcc/ChangeLog:

* ipa-modref.c (analyze_function): Do not execute the code
only if dump_file != NULL.

Reverting this change on top of current master makes the issue
disappear, so it looks legit to me.

Disassembly of new_bitfield() for "gcc -O2":

   0x1190 <+0>:     sub    $0x8,%rsp
   0x1194 <+4>:     mov    $0x4,%edi
   0x1199 <+9>:     call   0x1040
   0x119e <+14>:    mov    %rax,%rdi
   0x11a1 <+17>:    call   0x1180
   0x11a6 <+22>:    add    $0x8,%rsp
   0x11aa <+26>:    ret

Disassembly of new_bitfield() for "gcc -O2 -fno-store-merging":

   0x1190 <+0>:     sub    $0x8,%rsp
   0x1194 <+4>:     mov    $0x4,%edi
   0x1199 <+9>:     call   0x1040
   0x119e <+14>:    orb    $0x4,(%rax)
   0x11a1 <+17>:    mov    %rax,%rdi
   0x11a4 <+20>:    call   0x1180
   0x11a9 <+25>:    add    $0x8,%rsp
   0x11ad <+29>:    ret
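
For readers comparing the two listings, the following sketch models the expected behaviour with explicit bit masks instead of bit-fields. The mask values assume the x86_64 layout implied by the `orb $0x4` instruction (field3 in bit 2), and all names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions of the three one-bit fields within the first byte.
constexpr std::uint8_t FIELD1 = 1u << 0;
constexpr std::uint8_t FIELD2 = 1u << 1;
constexpr std::uint8_t FIELD3 = 1u << 2;  // the 0x4 in "orb $0x4,(%rax)"

// Merged form of "b->field1 = 1; b->field2 = 1;": a single
// read-modify-write that must leave every other bit untouched.
void set_field1_and_field2(std::uint8_t *byte) {
  *byte = static_cast<std::uint8_t>(*byte | FIELD1 | FIELD2);
}

// Models new_bitfield(): `initial` plays the role of whatever malloc returned.
std::uint8_t new_bitfield_byte(std::uint8_t initial) {
  std::uint8_t b = initial;
  b = static_cast<std::uint8_t>(b | FIELD3);  // the store that went missing
  set_field1_and_field2(&b);
  assert((b & FIELD3) != 0);                  // field3 must survive the call
  return b;
}
```

Since the callee only ORs bits into the byte, it reads the byte it was handed, so the earlier store to field3 is not dead and should not be removed.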

[Bug middle-end/111612] New: GCC twice as slow as Clang for minisweep (SPEC HPC 2021)

2023-09-27 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111612

Bug ID: 111612
   Summary: GCC twice as slow as Clang for minisweep (SPEC HPC
2021)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
  Target Milestone: ---

This came up at this year's GNU Tools Cauldron during the
OpenMP/OpenACC/offloading talks, i.e.
https://gcc.gnu.org/wiki/cauldron2023#cauldron2023talks.openacc_openmp_offloading_and_gcc

In that talk, using MPI with 8 ranks gave the following
(--define model=mpi --ranks 8):

3855 s (~1.071 h) - Nvidia HPC SDK  23.5 (May 2023): 
4076 s (~1.132 h) - LLVM 17 (pre) commit 34cf263e6 (2023-08-07):
4900 s (~1.361 h) or/up to 6624 s (~1.840 h) - GCC og13 commit b003e6511
(2023-07-19)

* * *

I just tried it myself as follows - using the non SPEC-ified version
and a modified input from how-to-run readme. I have not checked whether there
are any gotchas, but it should be identical and without OpenMP, MPI or similar.

Namely:

  git clone https://github.com/wdj/minisweep.git
  cmake -DCMAKE_C_FLAGS=-O2 -DCMAKE_C_COMPILER=/usr/bin/clang-14 ../..

And likewise for GCC mainline, also with -O2.

Running then:
time ./sweep --ncell_x  4 --ncell_y 8 --ncell_z 32

GCC mainline:
Normsq result: 2.82234163e+12  diff: 0.000e+00  PASS  time: 7.817  GF/s: 0.315
real 0m8,124s / user 0m7,943s / sys 0m0,180s

Clang/LLVM-14:
Normsq result: 2.82234163e+12  diff: 0.000e+00  PASS  time: 3.036  GF/s: 0.812
real 0m3,223s / user 0m3,085s / sys 0m0,137s


Using -O3 -flto, I get: 2.070s (GCC) vs. 1.053s (Clang/LLVM)

[Bug c/111611] New: Auto-Vectorize Compiler Optimization Causing Exception / Crash

2023-09-27 Thread markus.vervier--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111611

Bug ID: 111611
   Summary: Auto-Vectorize Compiler Optimization Causing Exception
/ Crash
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: markus.verv...@x41-dsec.de
  Target Milestone: ---

Created attachment 56001
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56001&action=edit
poc for a crash due to unaligned memory

The gcc compiler produces a binary that crashes with a segmentation fault due
to unaligned memory access in a vector instruction when using the compiler
flags `-march=native -ftree-slp-vectorize`.

On a system with a i7-1255U CPU, a crash can be reproduced reliably when
compiling and executing the attached test program with the following command:

   gcc -Wall -Wextra -O1 -march=native -ftree-slp-vectorize  rbtdb.c  -o test
&& ./test

It was found that an unaligned pointer is used in an x86_64 vector
instruction:

   vmovdqa ymmword ptr [rbx + 0x20], ymm0

Further investigation suggests that this is a miscompilation caused by the
automatic vectorization enabled by the flags -march=native
-ftree-slp-vectorize, which make the compiler use the native instruction
set of the detected architecture and apply auto-vectorization performance
optimizations.

Hardware: 12th Gen Intel(R) Core(TM) i7-12700H
System: Debian Linux 11
Output of `gcc -v`:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 10.2.1-6'
--with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-10
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib
--enable-libphobos-checking=release --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-gcn/usr,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
--with-build-config=bootstrap-lto-lean --enable-link-mutex
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.1 20210110 (Debian 10.2.1-6)

[Bug libstdc++/111588] Provide opt-out of shared_ptr single-threaded optimization

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111588

--- Comment #3 from Jonathan Wakely  ---
If we do want to do it, I think we'd just need something like this (and docs):

--- a/libstdc++-v3/include/ext/atomicity.h
+++ b/libstdc++-v3/include/ext/atomicity.h
@@ -48,6 +48,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 #ifndef __GTHREADS
 return true;
+#elif defined(_GLIBCXX_ASSUME_NEVER_SINGLE_THREADED)
+return false;
 #elif __has_include(<sys/single_threaded.h>)
 return ::__libc_single_threaded;
 #else
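
To show the effect of such a switch outside libstdc++, here is a self-contained sketch of the dispatch. Everything in it is hypothetical: SKETCH_ASSUME_NEVER_SINGLE_THREADED mirrors the proposed macro, the global flag stands in for glibc's __libc_single_threaded, and exchange_and_add is a simplified model rather than the library's real helper.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical opt-out for this sketch. Uncomment to force the atomic path.
// #define SKETCH_ASSUME_NEVER_SINGLE_THREADED 1

// Stand-in for glibc's __libc_single_threaded flag.
bool sketch_single_threaded = true;

bool is_single_threaded() noexcept {
#if defined(SKETCH_ASSUME_NEVER_SINGLE_THREADED)
  return false;
#else
  return sketch_single_threaded;
#endif
}

// Adds delta and returns the previous value.
int exchange_and_add(std::atomic<int> &count, int delta) noexcept {
  if (is_single_threaded()) {
    // Cheap path: a separate load and store, no atomic read-modify-write.
    const int old = count.load(std::memory_order_relaxed);
    count.store(old + delta, std::memory_order_relaxed);
    return old;
  }
  // Safe path once other threads may exist.
  return count.fetch_add(delta, std::memory_order_acq_rel);
}
```

Defining the macro makes the first branch unreachable, so every update takes the atomic path regardless of what the runtime flag reports.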

[Bug tree-optimization/104165] [12 Regression] -Warray-bounds for unreachable code inlined from std::sort()

2023-09-27 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104165

--- Comment #10 from Fedor Chelnokov  ---
This issue happens in GCC 13.2 as well:
https://godbolt.org/z/TfGx3YccG

[Bug libstdc++/111589] Use relaxed atomic increment (but not decrement!) in shared_ptr

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111589

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2023-09-27
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Jonathan Wakely  ---
Oh yes, this can be relaxed.
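
As an illustration of the memory orders in question (a generic reference count, not the libstdc++ implementation; the class and member names are invented):

```cpp
#include <atomic>
#include <cassert>

// Minimal reference count in the style the bug title describes.
class ref_count {
  std::atomic<long> n_{1};

public:
  // A new reference is always created from an existing one, so the
  // increment does not need to order any other memory accesses.
  void acquire() noexcept { n_.fetch_add(1, std::memory_order_relaxed); }

  // The decrement must publish this owner's writes and, for the last
  // owner, observe everyone else's before the object is destroyed.
  // Returns true when the caller dropped the last reference.
  bool release() noexcept {
    return n_.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

  long use_count() const noexcept {
    return n_.load(std::memory_order_relaxed);
  }
};
```

The asymmetry matches the bug title: the increment is relaxed, the decrement is not.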

[Bug c++/111608] Cannot declare partial specialization after full specialization

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111608

--- Comment #3 from Jonathan Wakely  ---
(In reply to Julien Bernard from comment #0)
> which seems incorrect since their have different levels of specialization.

But they're not the same thing. The first one is a specialization of the member
function, and defining that triggers the implicit instantiation of the
enclosing class template. That uses the primary template, but it would have
used the partial specialization of X if that had been seen already.

I suspect this is covered by [temp.point] p7:

"If two different points of instantiation give a template specialization
different meanings according to the one-definition rule (6.3), the program is
ill-formed, no diagnostic required."
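
For contrast, here is a small example of the ordering that avoids the problem: the partial specialization is declared before anything can instantiate the class. This is not the reporter's code; the template, its members and the return values are invented for illustration.

```cpp
#include <cassert>

// Primary template.
template <class T, class U>
struct X {
  int f() { return 0; }
};

// Partial specialization, declared before anything instantiates X<int, int>.
template <class T>
struct X<T, int> {
  int f() { return 1; }
};

// Explicit specialization of a member function. The enclosing class
// X<int, int> is implicitly instantiated here, and because the partial
// specialization is already visible, it is the one that gets used.
template <>
int X<int, int>::f() { return 2; }
```

If the last two declarations were swapped, the member specialization would instantiate X<int, int> from the primary template first, and the later partial specialization would then change which template that class should have come from.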

[Bug c++/111608] Cannot declare partial specialization after full specialization

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111608

--- Comment #2 from Jonathan Wakely  ---
EDG rejects this too:

"spec.cc", line 13: error: this partial specialization would have been used to
  instantiate class "X"
  struct X {
 ^

1 error detected in the compilation of "spec.cc".

[Bug target/111610] Cannot build cross compiler to darwin targets after r14-4108-g47346acb72b50d

2023-09-27 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111610

--- Comment #2 from Martin Jambor  ---
(In reply to Iain Sandoe from comment #1)
> As a matter of record, we do not really support cross-compilers targeting an
> unknown Darwin version (the idea of xxx-apple-darwin [without a specific
> version] was to support building natively on macOS).  What will happen is
> you will get the earliest supported OS version for the target arch (which
> might not really be very representative)
> 
> It would likely be more representative/useful to choose some suitable OS
> version:
> 
> e.g. powerpc-apple-darwin9 (latest) i686-apple-darwin17 (last 32b support)
> x86_64-apple-darwin21 (up to date) .. and eventually aarch64-apple-darwin2x
> 
> 
> Of course, the build should not fail so we must fix it - but just pointing
> out that the results from the current builds are from a configuration that
> will be issuing warnings about choice of OS version.

IIUC, the test script takes all targets listed in contrib/config-list.mk and
tries the above configuration and make all-host steps on all of those targets
that are not explicitly excluded (currently only powerpc-freebsd13 because of
PR 108491). I don't really know how (or if) the list in that file is
maintained, but it looks like if they should be removed, they should be removed
from there?  Of course, we can exclude anything on our end too.

[Bug tree-optimization/111478] [12/13/14 regression] aarch64 SVE ICE: in compute_live_loop_exits, at tree-ssa-loop-manip.cc:250

2023-09-27 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111478

ktkachov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org
   Target Milestone|14.0                        |12.4
           Priority|P3                          |P1

--- Comment #3 from ktkachov at gcc dot gnu.org ---
Marking as P1. We hit this with a Fortran reproducer:
      SUBROUTINE REPRODUCER( M, A, LDA )
      IMPLICIT NONE
      INTEGER            LDA, M, I
      COMPLEX            A( LDA, * )
      DO I = 2, M
        A( I, 1 ) = A( I, 1 ) / A( 1, 1 )
      END DO
      RETURN
      END

on aarch64 with -march=armv8-a+sve -O3
The ICE triggers on 12.3, but the code compiles fine with 12.2.

[Bug target/111610] Cannot build cross compiler to darwin targets after r14-4108-g47346acb72b50d

2023-09-27 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111610

Iain Sandoe  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2023-09-27
             Status|UNCONFIRMED                   |NEW
           Assignee|unassigned at gcc dot gnu.org |iains at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #1 from Iain Sandoe  ---
Hmm - so I guess you have no host-side dsymutil (e.g. built from LLVM) and do
something like symlink /usr/bin/true -> dsymutil?

I guess we have to fix configure to return an "unknown" for that.



As a matter of record, we do not really support cross-compilers targeting an
unknown Darwin version (the idea of xxx-apple-darwin [without a specific
version] was to support building natively on macOS).  What will happen is you
will get the earliest supported OS version for the target arch (which might not
really be very representative)

It would likely be more representative/useful to choose some suitable OS
version:

e.g. powerpc-apple-darwin9 (latest) i686-apple-darwin17 (last 32b support)
x86_64-apple-darwin21 (up to date) .. and eventually aarch64-apple-darwin2x


Of course, the build should not fail so we must fix it - but just pointing out
that the results from the current builds are from a configuration that will be
issuing warnings about choice of OS version.

[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

2023-09-27 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #12 from JuzheZhong  ---
(In reply to Richard Biener from comment #11)
> I don't think strip_nop_cond_scalar_reduction is the place to adjust here,
> maybe it's the caller.  I don't have time to dig into the specific issue
> right now but if we require scalar code adjustments then we need to perform
> those in if-conversion.
> 
> But to me it looks like allowing
> 
> > > STMT 1. tmp = a[i] + x;
> > > STMT 2. tmp2 = tmp + result_ssa_1;
> > > STMT 3. result_ssa_2 = mask ? tmp2 : result_ssa_1;
> 
> in vect_is_simple_reduction might also be a reasonable approach.  The
> use in the COND_EXPR isn't really a use - it's a conditional update.

Thanks Richi.

Enhancing vect_is_simple_reduction in the loop vectorizer is also a good
approach. But I think it's better to recognize the scalar conditional
reduction (in if-conversion) as early as possible. Obviously, the current
if-conversion fails to recognize it as a feasible conditional reduction.

I think enhancing vect_is_simple_reduction is the approach for the case where
we are unlikely to be able to simplify the scalar code in if-conversion to fit
the current loop vectorizer.

I believe we will eventually have to enhance both if-conversion and the loop
vectorizer. I prefer improving if-conversion and will work on that. Will keep
you posted.

Thanks a lot!

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #14 from Richard Biener  ---
(In reply to Kewen Lin from comment #13)
> Thanks again for the reduced test case and the information!
> 
> I tried to bisect it but encountered some build failures on _Float32 error
> etc., through grepping the log I switched to start from r13-2887 (good) to
> r13-7206 (bad).
> 
> The bisection shows the culprit commit is r13-3378-gf6c168f8c06047 which was
> backported to GCC-12, it seems to match the observation new gcc-12 fail
> while gcc-11 pass.

Note this change likely triggers a latent issue but it might help analyzing the
issue.

[Bug target/111610] New: Cannot build cross compiler to darwin targets after r14-4108-g47346acb72b50d

2023-09-27 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111610

Bug ID: 111610
   Summary: Cannot build cross compiler to darwin targets after
r14-4108-g47346acb72b50d
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
CC: iains at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-apple-darwin

We periodically try building cross-compilers (from x86_64-linux) to
most available targets in order to detect early when they don't build.
Recently we have detected failures building cross-compiler to
i686-apple-darwin, i686-apple-darwin9, i686-apple-darwin10, powerpc-darwin8,
powerpc-darwin7, powerpc64-darwin and x86_64-apple-darwin.

On x86_64-apple-darwin, I have bisected the problem to
r14-4108-g47346acb72b50d (Darwin,debug : Switch to DWARF 3 or 4 when
dsymutil supports it):

Darwin,debug : Switch to DWARF 3 or 4 when dsymutil supports it.

The main reason that Darwin has been using DWARF2 only as debug is that
earlier debug linkers (dsymutil) did not support any extensions to this
so that the default "non-strict" mode used in GCC would cause tool errors.

There are two sources for dsymutil, those based off a closed source base
"dwarfutils" and those based off LLVM.

For dsymutil versions based off LLVM-7+ we can use up to DWARF-4, and for
versions based on dwarfutils 121+ we can use DWARF-3.

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config/darwin-protos.h (enum darwin_external_toolchain): New.
* config/darwin.cc (DSYMUTIL_VERSION): New.
(darwin_override_options): Choose the default debug DWARF version
depending on the configured dsymutil version.


We configure GCC with:

../src/configure --prefix=/tmp/some/prefix --enable-languages=c,c++
--enable-checking=yes --disable-bootstrap --disable-multilib --enable-obsolete
--target=x86_64-apple-darwin

and then check by running:  make -j64 all-host

The failure is:

g++  -fno-PIE -c   -g -O2   -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE  
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing
-Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H
-fno-PIE -I. -I. -I/home/mjambor/gcc/mine/src/gcc
-I/home/mjambor/gcc/mine/src/gcc/. -I/home/mjambor/gcc/mine/src/gcc/../include 
-I/home/mjambor/gcc/mine/src/gcc/../libcpp/include
-I/home/mjambor/gcc/mine/src/gcc/../libcody 
-I/home/mjambor/gcc/mine/src/gcc/../libdecnumber
-I/home/mjambor/gcc/mine/src/gcc/../libdecnumber/dpd -I../libdecnumber
-I/home/mjambor/gcc/mine/src/gcc/../libbacktrace   -o darwin.o -MT darwin.o
-MMD -MP -MF ./.deps/darwin.TPo /home/mjambor/gcc/mine/src/gcc/config/darwin.cc
In file included from ./config.h:6,
 from /home/mjambor/gcc/mine/src/gcc/config/darwin.cc:21:
./auto-host.h:106:26: error: cannot convert ‘rtx_code’ to
‘darwin_external_toolchain’ in initialization
  106 | #define DSYMUTIL_VERSION UNKNOWN,0,0,0
  |  ^~~
  |  |
  |  rtx_code
/home/mjambor/gcc/mine/src/gcc/config/darwin.cc:128:23: note: in expansion of
macro ‘DSYMUTIL_VERSION’
  128 | } dsymutil_version = {DSYMUTIL_VERSION};
  |   ^~~~

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-09-27 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |msebor at gcc dot gnu.org

--- Comment #13 from Kewen Lin  ---
Thanks again for the reduced test case and the information!

I tried to bisect it but encountered some build failures on _Float32 error
etc., through grepping the log I switched to start from r13-2887 (good) to
r13-7206 (bad).

The bisection shows the culprit commit is r13-3378-gf6c168f8c06047, which was
backported to GCC-12; this seems to match the observation that new gcc-12 fails
while gcc-11 passes.

[Bug driver/111605] Cross compilation doesn't work with `-fuse-ld=mold`

2023-09-27 Thread rui314 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111605

--- Comment #12 from Rui Ueyama  ---
> Hmm, if you configure the cross target with --with-ld=ld.mold does that then
> work (when not specifying -fuse-ld=mold)?

Sorry, I don't know, but in either case, that wouldn't solve the user-facing
problem when `-fuse-ld=mold` is explicitly passed.

> I suppose we could adjust how the driver(?) behaves, noting whether the linker
> is multi-arch or not and at least allowing as fallback to use the "host"
> specified linker.

Yes, please :)

[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #11 from Richard Biener  ---
I don't think strip_nop_cond_scalar_reduction is the place to adjust here,
maybe it's the caller.  I don't have time to dig into the specific issue right
now but if we require scalar code adjustments then we need to perform those in
if-conversion.

But to me it looks like allowing

> > STMT 1. tmp = a[i] + x;
> > STMT 2. tmp2 = tmp + result_ssa_1;
> > STMT 3. result_ssa_2 = mask ? tmp2 : result_ssa_1;

in vect_is_simple_reduction might also be a reasonable approach.  The
use in the COND_EXPR isn't really a use - it's a conditional update.
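
For reference, the three statements correspond to a scalar loop of roughly the following shape. This is an illustrative sketch: the function names and the use of a bool mask array are assumptions, and the second function is only there to show the source-level form the if-converted code is equivalent to.

```cpp
#include <cassert>

// If-converted form: the addition is done unconditionally and a select
// decides whether the accumulator is updated.
int cond_reduction(const int *a, const bool *mask, int n, int x) {
  int result = 0;
  for (int i = 0; i < n; ++i) {
    int tmp = a[i] + x;                // STMT 1
    int tmp2 = tmp + result;           // STMT 2
    result = mask[i] ? tmp2 : result;  // STMT 3: conditional update
  }
  return result;
}

// Source-level form of the same reduction.
int cond_reduction_ref(const int *a, const bool *mask, int n, int x) {
  int result = 0;
  for (int i = 0; i < n; ++i)
    if (mask[i])
      result += a[i] + x;
  return result;
}
```

In STMT 3 the old accumulator value appears only as the "not taken" arm of the select, which is why it can be viewed as a conditional update rather than an ordinary use.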

[Bug c++/111608] Cannot declare partial specialization after full specialization

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111608

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |13.2.0, 14.0, 7.5.0
           Keywords|                            |rejects-valid
   Last reconfirmed|                            |2023-09-27
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener  ---
clang accepts this.  Note you can use -fpermissive to demote the error to a
warning.

[Bug driver/111605] Cross compilation doesn't work with `-fuse-ld=mold`

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111605

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |14.0
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-09-27

--- Comment #11 from Richard Biener  ---
Hmm, if you configure the cross target with --with-ld=ld.mold does that then
work (when not specifying -fuse-ld=mold)?

I suppose we could adjust how the driver(?) behaves, noting whether the linker
is multi-arch or not and at least allowing as fallback to use the "host"
specified linker.

[Bug c++/105606] [12 Regression] std::pair with nested struct and NSDMI and -std=c++20

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105606

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:3ba882c7b51ab1f14c62c748e989415834ccd9ce

commit r14-4293-g3ba882c7b51ab1f14c62c748e989415834ccd9ce
Author: Jakub Jelinek 
Date:   Wed Sep 27 10:38:54 2023 +0200

remove workaround for GCC 4.1-4.3 [PR105606]

While looking into vec.h, I've noticed we still have a workaround for
GCC 4.1-4.3 bugs.
As we now use C++11 and thus need to be built by GCC 4.8 or later,
I think this is now never used.

2023-09-27  Jakub Jelinek  

PR c++/105606
* system.h (BROKEN_VALUE_INITIALIZATION): Don't define.
* vec.h (vec_default_construct): Remove BROKEN_VALUE_INITIALIZATION
workaround.
* function.cc (assign_parm_find_data_types): Likewise.

[Bug target/111590] RISC-V: Multiple ICE in gfortran regression with 'V' Extension enabled

2023-09-27 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111590

JuzheZhong  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from JuzheZhong  ---
fixed

[Bug target/111590] RISC-V: Multiple ICE in gfortran regression with 'V' Extension enabled

2023-09-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111590

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Pan Li :

https://gcc.gnu.org/g:073849da3dfd5cabbfd4492a40a17b207b4a7630

commit r14-4291-g073849da3dfd5cabbfd4492a40a17b207b4a7630
Author: Juzhe-Zhong 
Date:   Wed Sep 27 06:47:12 2023 +0800

DSE: Fix ICE when the mode with access_size don't exist on the
target[PR111590]

When doing fortran test with 'V' extension enabled on RISC-V port.
I saw multiple ICE: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111590

The root cause is on DSE:

internal compiler error: in smallest_mode_for_size, at stor-layout.cc:356
0x1918f70 smallest_mode_for_size(poly_int<2u, unsigned long>, mode_class)
../../../../gcc/gcc/stor-layout.cc:356
0x11f75bb smallest_int_mode_for_size(poly_int<2u, unsigned long>)
../../../../gcc/gcc/machmode.h:916
0x3304141 find_shift_sequence
../../../../gcc/gcc/dse.cc:1738
0x3304f1a get_stored_val
../../../../gcc/gcc/dse.cc:1906
0x3305377 replace_read
../../../../gcc/gcc/dse.cc:2010
0x3306226 check_mem_read_rtx
../../../../gcc/gcc/dse.cc:2310
0x330667b check_mem_read_use
../../../../gcc/gcc/dse.cc:2415

After investigations, DSE is trying to do optimization like this following
codes:

(insn 86 85 87 9 (set (reg:V4DI 168)
(mem/u/c:V4DI (reg/f:DI 171) [0  S32 A128])) "bug.f90":6:18 discrim
6 1167 {*movv4di}
 (expr_list:REG_EQUAL (const_vector:V4DI [
(const_int 4 [0x4])
(const_int 1 [0x1]) repeated x2
(const_int 3 [0x3])
])
(nil)))

(set (mem) (reg:V4DI 168))

It then ICEs on: auto new_mode = smallest_int_mode_for_size (access_size *
BITS_PER_UNIT);

The access_size may be 24 or 32. We don't have integer modes of
these sizes, so it ICEs.

TODO: The better way maybe make DSE use native_encode_rtx/native_decode_rtx
  but I don't know how to do that.  So let's quickly fix this issue, we
  can improve the fix later.

PR target/111590

gcc/ChangeLog:

* dse.cc (find_shift_sequence): Check the mode with access_size
exist on the target.
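
The idea behind the fix, checking that a suitable integer mode exists before using it, can be sketched with a toy model. This is only an illustration: the mode table and both function names below are hypothetical and are not GCC's machine-mode API.

```c
#include <stddef.h>

/* Hypothetical list of integer mode widths, in bits, that a target provides.
   Real GCC targets describe their modes through machine-mode tables. */
static const unsigned int_mode_bits[] = { 8, 16, 32, 64, 128 };

/* Return the width of the smallest integer mode holding at least BITS bits,
   or 0 when no such mode exists.  Returning a "not found" value lets the
   caller skip the optimization instead of aborting. */
unsigned smallest_int_mode_bits_or_zero(unsigned bits)
{
    for (size_t i = 0; i < sizeof int_mode_bits / sizeof int_mode_bits[0]; i++)
        if (int_mode_bits[i] >= bits)
            return int_mode_bits[i];
    return 0;
}

/* ACCESS_SIZE is in bytes.  Returns 1 if an integer mode of that size
   exists in the toy table, 0 if the caller should give up. */
int can_use_int_mode(unsigned access_size)
{
    return smallest_int_mode_bits_or_zero(access_size * 8) != 0;
}
```

With this table, access sizes of 24 and 32 bytes (192 and 256 bits) have no matching integer mode, so `can_use_int_mode` returns 0 for both.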

[Bug target/111609] New: Zero shift in ARM NEON vshll_n_s8 intrinsic produces an error

2023-09-27 Thread power at pobox dot sk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111609

Bug ID: 111609
   Summary: Zero shift in ARM NEON vshll_n_s8 intrinsic produces
an error
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: power at pobox dot sk
  Target Milestone: ---

Using this test program:
#include <stdint.h>
#include <arm_neon.h>

void test(int8_t *src, int16_t *dst)
{
int8x8_t nvalue1;
int16x8_t nvalue2;

nvalue1 = vld1_s8(src);
nvalue2 = vshll_n_s8(nvalue1, 0);
vst1q_s16(dst, nvalue2);
}

The compiler produces this error:
test.s:26: Error: immediate value out of range -- `vshll.s8 q8,d16,#0'

Tested on multiple gcc versions.

According to official ARM documentation, zero shift is valid in vshll_n_s8
intrinsic. The same goes for other vshll_n_XXX intrinsics.
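
To illustrate why a zero shift is a meaningful request, here is a portable scalar model of a widening left shift on eight signed 8-bit lanes. It is a sketch only: the function name is hypothetical, the shift range 0..8 is an assumption, and the code does not use NEON.

```c
#include <stdint.h>

/* Scalar model of a widening left shift on eight signed 8-bit lanes.
   N is assumed to be in the range 0..8.  Multiplication is used instead of
   "<<" because left-shifting a negative value is undefined behaviour in C. */
void widening_shift_left_s8(const int8_t src[8], int16_t dst[8], int n)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (int16_t)(src[i] * (1 << n));
}
```

With `n == 0` the model reduces to a plain sign-extending widening of each lane, which is the result the test program above expects.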

[Bug c++/111608] New: Cannot declare partial specialization after full specialization

2023-09-27 Thread raplonu.jb at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111608

Bug ID: 111608
   Summary: Cannot declare partial specialization after full
specialization
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: raplonu.jb at gmail dot com
  Target Milestone: ---

The following code fails to compile with GCC 13.2 and under.

// primary template
template
struct X {
void f();
};

// definition of full template specialization before partial template specialization
template<>
void X::f() {}

// partial template specialization declaration
template
struct X {
void f();
};

int main() {
X{}.f();
}

gcc gives the following message:

error: partial specialization of 'struct X' after instantiation of 'struct
X

which seems incorrect since they have different levels of specialization.

[Bug gcov-profile/111559] [14 regression] ICE when building Python with PGO

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111559

--- Comment #5 from Sergei Trofimovich  ---
Slightly shorter example that does not rely on inline:

// $ cat bug.c
__attribute__((noipa)) static void edge(void) {}

int p = 0;

__attribute__((noinline))
static void rule1(void) { if (p) edge(); }

__attribute__((noinline))
static void rule1_same(void) { if (p) edge(); }

__attribute__((noipa)) int main(void) {
rule1();
rule1_same();
}



bug.c: In function 'rule1':
bug.c:6:13: error: probability of edge 3->4 not initialized
6 | static void rule1(void) { if (p) edge(); }
  | ^
during GIMPLE pass: fixup_cfg
bug.c:6:13: internal compiler error: verify_flow_info failed

[Bug gcov-profile/111559] [14 regression] ICE when building Python with PGO

2023-09-27 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111559

--- Comment #4 from Sergei Trofimovich  ---
Looks like identical code folding creates uninitialized profile counters if
there are any edges in folded functions.

I think cvise did a decent job extracting the reproducer below. Here is a
single-file trigger on `--enable-checking=yes` `gcc` from `master`:

```
// $ cat bug.c
__attribute__((noipa)) static void edge(void) {}

static void rule1(int *p) {
edge();
if (*p) edge();
}

static void rule1_same(int *p) {
edge();
if (*p) edge();
}

__attribute__((noipa)) int main(void) {
int p = 0;
rule1(&p);
rule1_same(&p);
}
```

Trigger:

```
$ echo PG
$ gcc -O2 -fprofile-generate bug.c -o b -fopt-info
$ echo RUN
$ ./b
$ echo PU
$ gcc -O2 -fprofile-use -fprofile-correction bug.c -o b -fopt-info
```

Running:

```
PG
$ gcc -O2 -fprofile-generate bug.c -o b -fopt-info
bug.c:15:5: optimized:  Inlined rule1.constprop/28 into main/3 which now has
time 75.28 and size 51, net change of -6.
bug.c:16:5: optimized:  Inlined rule1_same.constprop/27 into main/3 which now
has time 94.56 and size 72, net change of -6.

RUN
$ ./b

PU
$ gcc -O2 -fprofile-use -fprofile-correction bug.c -o b -fopt-info
bug.c:3:13: optimized: Semantic equality hit:rule1/1->rule1_same/2
bug.c:3:13: optimized: Assembler symbol names:rule1/1->rule1_same/2
bug.c:15:5: optimized:  Inlined rule1.constprop/5 into main/3 which now has
time 26.00 and size 10, net change of +2.
bug.c:16:5: optimized:  Inlined rule1.constprop/4 into main/3 which now has
time 27.00 and size 12, net change of -6.

bug.c: In function 'main':
bug.c:13:28: error: probability of edge 3->4 not initialized
   13 | __attribute__((noipa)) int main(void) {
  |^~~~
bug.c:13:28: error: probability of edge 5->6 not initialized
during IPA pass: inline
bug.c:13:28: internal compiler error: verify_flow_info failed
```

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

--- Comment #5 from Richard Biener  ---
On x86_64-linux, the compile-time slowdown for the same dwarf2out.ii
between r14-2510-g3d0ca8b55b9a88 and r14-4258-gc9837443075277
is less than 2.5% (three runs, fastest vs slowest).

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

--- Comment #4 from Andreas Schwab  ---
Created attachment 56000
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56000&action=edit
dwarf2out.ii

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

--- Comment #3 from Andreas Schwab  ---
Here are the build times of the stage1 compiler:

20230714  21573
20230722  19932   -7.6%
20230728  21608   +8.4%
20230804  21841   +1.0%
20230811  25016   +14.5%
20230818  25429   +1.7%
20230825  25872   +1.7%
20230901  25965   +0.4%
20230908  28824   +11.0%
20230915  30926   +7.3%
20230922  40180   +30.0%
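
The percentage column appears to be the relative change from the previous row. A minimal sketch of that arithmetic, assuming this interpretation (the function name is hypothetical):

```c
/* Percentage change from PREV to CUR.  For the first two rows above,
   21573 -> 19932 gives roughly -7.6. */
double percent_change(double prev, double cur)
{
    return (cur - prev) / prev * 100.0;
}
```

Applied to the first and last rows, the same formula gives an overall increase of roughly 86% between 20230714 and 20230922.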

[Bug driver/111605] Cross compilation doesn't work with `-fuse-ld=mold`

2023-09-27 Thread rui314 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111605

--- Comment #10 from Rui Ueyama  ---
> This is only a problem when using a cross gcc, so why should mold proactively 
> create symlinks for dozens of targets when mold is installed?

It's because there are too many and we don't have an exhaustive list of all
possible triples. In particular, the vendor part of a triple (e.g. "none" in
"arm-none-eabi") can be anything IIUC, so we can't make an exhaustive list of
all triples.

> It seems to me that a single symlink only needs to be created by the person 
> building the cross-gcc, and installed alongside $target-gcc as part of that 
> toolchain. This is not mold's problem, and could just be documented as part 
> of gcc's installation docs.

That's the correct solution if mold is bundled as part of the cross toolchain.
But if a user of the cross gcc toolchain wants to use the system-installed
mold, they currently have to create a symbolic link in a $PATH directory
themselves. I think that's pretty inconvenient.

> Although it probably does make sense for gcc to just fallback to using 
> ld.mold without the target prefix.

Yeah, that's exactly what I want...

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600

Richard Biener  changed:

   What|Removed |Added

Summary|[14.0 regression] RISC-V|[14 Regression] RISC-V
   |bootstrap time regression   |bootstrap time regression
 CC||rguenth at gcc dot gnu.org
   Target Milestone|--- |14.0
   Keywords||needs-bisection

--- Comment #2 from Richard Biener  ---
Can you attach the dwarf2out.cc preprocessed source?

[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

2023-09-27 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

--- Comment #10 from JuzheZhong  ---
(In reply to Richard Biener from comment #9)
> (In reply to JuzheZhong from comment #8)
> > It's because the order of the operations we are doing:
> > 
> > For code as follows:
> > 
> > result += mask ? a[i] + x : 0;
> > 
> > GCC:
> > result_ssa_1 = PHI 
> > ...
> > STMT 1. tmp = a[i] + x;
> > STMT 2. tmp2 = tmp + result_ssa_1;
> > STMT 3. result_ssa_2 = mask ? tmp2 : result_ssa_1;
> > 
> > Here we can see both STMT 2 and STMT 3 are using 'result_ssa_1',
> > we end up with 2 uses of the PHI result. Then, we failed to vectorize.
> > 
> > Whereas LLVM:
> > 
> > result_ssa_1 = PHI 
> > ...
> > IR 1. tmp = a[i] + x;
> > IR 2. tmp2 = mask ? tmp : 0;
> > IR 3. result_ssa_2 = tmp2 + result_ssa_1.
> 
> For floating point these are not equivalent (adding zero isn't a no-op).


Yes, I agree these are not equivalent for floating-point.
But I think they are equivalent if we specify -ffast-math.

I have double-checked LLVM: by default it also fails to vectorize conditional
floating-point reductions.

However, if we pass -ffast-math to LLVM, it generates the same
if-conversion IR sequence as for integers, and vectorization then succeeds.


> 
> > LLVM only has 1 use.
> > 
> > Is it reasonable to swap the order in match.pd ?
> 
> if-conversion could be taught to swap this (it's if-conversion creating
> the IL for conditional reductions) when valid.  IIRC Robin Dapp also has
> a patch to make if-conversion emit .COND_ADD instead which should make
> it even better to vectorize.

I know that patch; Robin is trying to fix the issue (in-order reduction) that I
posted.

I have confirmed that the patch doesn't help here, since it doesn't modify the
code for this case; we still end up with multiple uses in the conditional
reduction.

The reduction failed since:

  /* If this isn't a nested cycle or if the nested cycle reduction value
 is used ouside of the inner loop we cannot handle uses of the reduction
 value.  */
  if (nlatch_def_loop_uses > 1 || nphi_def_loop_uses > 1)
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "reduction used in loop.\n");
  return NULL;
}

When nphi_def_loop_uses > 1, we fail to vectorize.

I have checked the LLVM code, and I think we can extend this function:

strip_nop_cond_scalar_reduction

We should be able to strip all the statements until we reach the
use of the PHI result, like this:

LLVM is able to handle this case:

for ()
  if (cond)
result += a[i] + b[i] + c[i] +  

No matter how many variables are added in the conditional reduction,
LLVM handles it well because it keeps iterating over the statements until
it reaches the result:

result_ssa_1 = PHI <>
tmp1 = result_ssa_1 + a[i];
tmp2 = tmp1 + b[i];
tmp3 = tmp2 + c[i];


We keep iterating until we find result_ssa_1, which holds the reduction variable.

Is LLVM's approach reasonable for GCC?

If so, I can port the LLVM code to GCC.

Thanks.
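
The chain walk described above can be sketched on a toy IR. This is a hypothetical model for illustration only: `struct add_stmt`, `find_def`, and `chain_length_to_phi` are invented names, and neither GCC's gimple nor LLVM's IR looks like this.

```c
#include <stddef.h>

/* Hypothetical toy IR: every statement is "lhs = op0 + op1" on integer ids.
   Ids that no statement defines stand for loads such as a[i] or for the
   PHI result. */
struct add_stmt {
    int lhs;
    int op0;
    int op1;
};

/* Return the statement defining ID, or NULL if ID is not defined by an add. */
const struct add_stmt *find_def(const struct add_stmt *stmts, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (stmts[i].lhs == id)
            return &stmts[i];
    return NULL;
}

/* Walk the addition chain backwards from LAST until a statement uses PHI_ID
   directly.  Returns the number of statements in the chain, or 0 if the
   chain never reaches the PHI result. */
size_t chain_length_to_phi(const struct add_stmt *stmts, size_t n,
                           int last, int phi_id)
{
    size_t len = 0;
    int cur = last;
    while (len < n) {   /* bounded, so a malformed cycle cannot loop forever */
        const struct add_stmt *def = find_def(stmts, n, cur);
        if (def == NULL)
            return 0;
        len++;
        if (def->op0 == phi_id || def->op1 == phi_id)
            return len;
        /* Follow whichever operand is itself defined by an add. */
        if (find_def(stmts, n, def->op0) != NULL)
            cur = def->op0;
        else if (find_def(stmts, n, def->op1) != NULL)
            cur = def->op1;
        else
            return 0;
    }
    return 0;
}
```

For the tmp1/tmp2/tmp3 sequence shown above, starting the walk at tmp3 visits three statements before reaching result_ssa_1, regardless of how many addends the source expression had.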

[Bug c++/111607] New: False positive -Wdangling-reference

2023-09-27 Thread fiesh at zefix dot tv via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111607

Bug ID: 111607
   Summary: False positive -Wdangling-reference
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fiesh at zefix dot tv
  Target Milestone: ---

The following code triggers a `-Wdangling-reference` warning:

t.cpp: In function ‘consteval auto f(const V&)’:
t.cpp:19:22: warning: possibly dangling reference to a temporary
[-Wdangling-reference]
   19 | auto const & s = std::visit([](auto const & v) -> S const & {
return v.s; }, v);
  |  ^
t.cpp:19:36: note: the temporary was destroyed at the end of the full
expression ‘std::visit, const
variant&>(f(const V&)::(), (*
& v))’
   19 | auto const & s = std::visit([](auto const & v) -> S const & {
return v.s; }, v);
  | 
~~^~~~




#include <variant>

struct S {
constexpr S(int i_) : i(i_) {}
S(S const &) = delete;
S & operator=(S const &) = delete;
S(S &&) = delete;
S & operator=(S &&) = delete;
int i;
};

struct A {
S s{0};
};

using V = std::variant;

consteval auto f(V const & v) {
auto const & s = std::visit([](auto const & v) -> S const & { return v.s; }, v);
return s.i;
}

int main() {
constexpr V a{std::in_place_type};
constexpr auto i = f(a);
return i;
}


The code makes sure the warning is wrong by:

* having S be non-copyable
* evaluating everything at compile time where UB is not allowed to happen

[Bug driver/111605] Cross compilation doesn't work with `-fuse-ld=mold`

2023-09-27 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111605

--- Comment #9 from Jonathan Wakely  ---
This is only a problem when using a cross gcc, so why should mold proactively
create symlinks for dozens of targets when mold is installed?

It seems to me that a single symlink only needs to be created by the person
building the cross-gcc, and installed alongside $target-gcc as part of that
toolchain. This is not mold's problem, and could just be documented as part of
gcc's installation docs.

Although it probably does make sense for gcc to just fallback to using ld.mold
without the target prefix.
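
The fallback suggested here could look roughly like the following sketch. Everything in it is an assumption for illustration: the `<target>-ld.mold` naming, the `exists_fn` callback, and the function names are hypothetical and are not taken from the GCC driver.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if NAME can be found on PATH. */
typedef int (*exists_fn)(const char *name);

/* Pick a linker name for -fuse-ld=mold: prefer "<target>-ld.mold", then
   fall back to plain "ld.mold".  Writes the chosen name into OUT and
   returns 1, or returns 0 if neither is found or OUT is too small. */
int pick_mold(const char *target, exists_fn exists, char *out, size_t out_size)
{
    int len = snprintf(out, out_size, "%s-ld.mold", target);
    if (len > 0 && (size_t)len < out_size && exists(out))
        return 1;
    len = snprintf(out, out_size, "ld.mold");
    if (len > 0 && (size_t)len < out_size && exists(out))
        return 1;
    if (out_size > 0)
        out[0] = '\0';
    return 0;
}

/* Fake lookups used only to exercise the sketch. */
int only_plain_mold(const char *name) { return strcmp(name, "ld.mold") == 0; }
int nothing_installed(const char *name) { (void)name; return 0; }
```

With `only_plain_mold` as the lookup, a request for target "arm-none-eabi" ends up selecting "ld.mold", which matches the system-installed case discussed in this bug.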

[Bug tree-optimization/109088] GCC does not always vectorize conditional reduction

2023-09-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109088

Richard Biener  changed:

   What|Removed |Added

 CC||rdapp at gcc dot gnu.org

--- Comment #9 from Richard Biener  ---
(In reply to JuzheZhong from comment #8)
> It's because the order of the operations we are doing:
> 
> For code as follows:
> 
> result += mask ? a[i] + x : 0;
> 
> GCC:
> result_ssa_1 = PHI 
> ...
> STMT 1. tmp = a[i] + x;
> STMT 2. tmp2 = tmp + result_ssa_1;
> STMT 3. result_ssa_2 = mask ? tmp2 : result_ssa_1;
> 
> Here we can see both STMT 2 and STMT 3 are using 'result_ssa_1',
> we end up with 2 uses of the PHI result. Then, we failed to vectorize.
> 
> Whereas LLVM:
> 
> result_ssa_1 = PHI 
> ...
> IR 1. tmp = a[i] + x;
> IR 2. tmp2 = mask ? tmp : 0;
> IR 3. result_ssa_2 = tmp2 + result_ssa_1.

For floating point these are not equivalent (adding zero isn't a no-op).
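
A small sketch of the difference, assuming IEEE 754 arithmetic with the default rounding mode and no -ffast-math. The function names are hypothetical; the two loops mirror the GCC and LLVM statement orders quoted above.

```c
#include <math.h>    /* signbit */
#include <stddef.h>

/* GCC-style order: the whole sum is either selected or discarded. */
double reduce_select_sum(const double *a, const int *mask, size_t n,
                         double x, double init)
{
    double result = init;
    for (size_t i = 0; i < n; i++) {
        double tmp2 = (a[i] + x) + result;
        result = mask[i] ? tmp2 : result;
    }
    return result;
}

/* LLVM-style order: the addend is selected, then always added. */
double reduce_select_addend(const double *a, const int *mask, size_t n,
                            double x, double init)
{
    double result = init;
    for (size_t i = 0; i < n; i++) {
        double tmp2 = mask[i] ? (a[i] + x) : 0.0;
        result = tmp2 + result;
    }
    return result;
}

/* Returns 1 when the two forms disagree on the sign of a zero result. */
int forms_differ_on_negative_zero(void)
{
    const double a[1] = { 1.0 };
    const int mask[1] = { 0 };            /* the only element is masked out */
    double r1 = reduce_select_sum(a, mask, 1, 2.0, -0.0);     /* stays -0.0 */
    double r2 = reduce_select_addend(a, mask, 1, 2.0, -0.0);  /* 0.0 + -0.0 */
    return signbit(r1) != 0 && signbit(r2) == 0;
}
```

Starting from -0.0 with every element masked out, the first form leaves the accumulator untouched while the second computes 0.0 + (-0.0), which is +0.0, so the two results differ in sign.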

> LLVM only has 1 use.
> 
> Is it reasonable to swap the order in match.pd ?

if-conversion could be taught to swap this (it's if-conversion creating
the IL for conditional reductions) when valid.  IIRC Robin Dapp also has
a patch to make if-conversion emit .COND_ADD instead which should make
it even better to vectorize.

[Bug c++/111606] [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread markus at oberhumer dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

--- Comment #1 from Markus F.X.J. Oberhumer  ---
Created attachment 55999
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55999&action=edit
bug.cpp

Added attachment bug.cpp

[Bug c++/111606] New: [ICE] internal compiler error: error reporting routines re-entered.

2023-09-27 Thread markus at oberhumer dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111606

Bug ID: 111606
   Summary: [ICE] internal compiler error: error reporting
routines re-entered.
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: markus at oberhumer dot com
  Target Milestone: ---

Link at Compiler Explorer:

  https://godbolt.org/z/EbPWr3qxx

I stumbled on this while compiling some invalid code during refactoring.

Test case has been reduced by cvise.

Might be related to / duplicate of

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100557