[Bug analyzer/113253] gcc -g causes -fanalyzer to issue false positive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113253 David Malcolm changed: What|Removed |Added Last reconfirmed||2024-01-31 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED
[Bug target/113684] Cross compiler without assembler and linker should assume that all assembler and linker features are available
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113684 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug c++/113687] -Warray-bounds is not emitted inside class method
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113687 Andrew Pinski changed: What|Removed |Added Keywords||diagnostic --- Comment #1 from Andrew Pinski --- The warning only happens if the vague linkage function is used. and IIRC that is by design.
[Bug c++/113674] [11/12/13/14 Regression] [[____attr____]] causes internal compiler error: in decl_attributes, at attribs.cc:776
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113674 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Maybe we shouldn't canonicalize attribute names like this, because they can then be canonicalized multiple times, each time removing the __ and __ pair, which could lead to inconsistencies. Now, no standard nor supported attribute name starts with _, so perhaps it could even just punt if the name is prefixed with ___ instead of __ (though, I guess one can use -Wno-attributes=something::_foo ).
[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403 --- Comment #4 from Xi Ruoyao --- After r14-5545 this issue became latent. And at some point before r14-5545 this issue became nondeterministic: a compiled program *sometimes* crashes. Really strange...
[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682 --- Comment #2 from Andrew Pinski --- I should note that on x86, 2 cmov in a row might be an issue and worse than branches. There is a cost model and the x86 backend rejects that. There are some cores where it is worse. I don't know if it applies to recent ones though.
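For context, a branchless binary search usually has the loop shape sketched below. This is not the testcase from the report (which is not part of this digest); the function name and the lower-bound formulation are assumptions chosen for illustration. The conditional select in the loop body is the kind of statement one would hope to see lowered to cmov/csel rather than a branch, subject to the cost-model caveats in the comment above.

```cpp
#include <cstddef>

// Illustrative sketch only: returns the index of the first element that is
// >= key in the sorted range sorted[0..n), or n if there is none.
std::size_t branchless_lower_bound(const int* sorted, std::size_t n, int key) {
    if (n == 0) return 0;
    const int* base = sorted;
    std::size_t len = n;
    while (len > 1) {
        const std::size_t half = len / 2;
        // A data-dependent select with no side effects: a candidate for
        // cmov/csel instead of a conditional branch.
        base = (base[half - 1] < key) ? base + half : base;
        len -= half;
    }
    // One element is left; step past it if it is still smaller than key.
    return static_cast<std::size_t>(base - sorted) + (*base < key ? 1 : 0);
}
```

The loop trip count depends only on `n`, not on the data, which is what makes the select form attractive when comparisons are hard to predict.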
[Bug c++/113674] [11/12/13/14 Regression] [[____attr____]] causes internal compiler error: in decl_attributes, at attribs.cc:776
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113674 --- Comment #4 from Marek Polacek --- Note that [[pure]] int g (int i) { return i; } doesn't crash: pure isn't a standard attribute. The crash seems to occur only with an attribute that is registered twice: the GNU version and the standard version.
[Bug fortran/113503] [14 Regression] xtb test miscompilation starting with r14-870
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113503 --- Comment #4 from Jakub Jelinek --- It is the

PR fortran/82774
(alloc_scalar_allocatable_subcomponent): Shorten the function name and replace the symbol argument with the se string length. If a deferred length character length is either not present or is not a variable, give the typespec a variable and assign the string length to that. Use gfc_deferred_strlen to find the hidden string length component.
(gfc_trans_subcomponent_assign): Convert the expression before the call to alloc_scalar_allocatable_subcomponent so that a good string length is provided.
(gfc_trans_structure_assign): Remove the unneeded derived type symbol from calls to gfc_trans_subcomponent_assign.

part of the changes that cause this, reverting those hunks (had to revert one manually as tree size; declaration has been added there later) makes the testcase not warn anymore or in the other case not ICE anymore.
[Bug c++/113674] [11/12/13/14 Regression] [[____attr____]] causes internal compiler error: in decl_attributes, at attribs.cc:776
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113674 --- Comment #5 from Joseph S. Myers --- C supports _Noreturn (and thus ___Noreturn__) as an attribute name, so that code with "#define noreturn _Noreturn" (probably from stdnoreturn.h) works with C23 [[noreturn]].
[Bug target/113686] [RISC-V] TLS (Local Exec) relaxation on structures (LE)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113686

palmer at gcc dot gnu.org changed:

What|Removed |Added
CC||nelsonc1225 at sourceware dot org, palmer at gcc dot gnu.org
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Last reconfirmed||2024-01-31

--- Comment #1 from palmer at gcc dot gnu.org ---
(In reply to H. Peter Anvin from comment #0)
> When the Local Exec TLS model is in use, gcc generates inefficient code for
> accessing the member of a structure:
>
> struct foobar {
>     int alpha;
>     int beta;
> };
>
> _Thread_local struct foobar foo;
>
> void func(int bar)
> {
>     foo.beta = bar;
> }
>
> # Version 1
> lui a1,%tprel_hi(foo)
> add a1,a1,tp,%tprel_add(foo)
> addi a1,a1,%tprel_lo(foo)
> sw a0,4(a1)
>
> However, in this case it could be generated as:
>
> # Version 2
> lui a1,%tprel_hi(sym+4)
> add a1,a1,tp,%tprel_add(sym+4)
> sw a0,%tprel_lo(sym+4)(a1)
>
> ... which, if %tprel_hi(sym+4) == 0, as it often is for small embedded
> software, the linker can relax to a simple (tp) reference:
>
> # Version 2a (post-relaxation with small .tbss)
> sw a0,%tprel_lo(sym+4)(tp)
>
> The linker will *not* relax version 1 all the way; leaving an unnecessary mv:
>
> # Version 1a (post-relaxation with small .tbss)
> mv a1,tp
> sw a0,%tprel_lo(sym+4)(tp)
>
> It is of course trickier for the case of multiple subsequent references to
> the structure if the structure is not aligned, as gcc can't know a priori
> where the 4K breaks are[*]. The version 1 code is more efficient in that
> case (3 instructions + 1 instruction/field as opposed to 3
> instructions/field.)
>
> However, if the structure *is* aligned, gcc will still not optimize 1 into 2.
>
> There are at least a few options I see:
>
> 1. gcc option: gcc can generate version 2 code for a single field reference,
> or if the alignment is such that all fields are guaranteed to fall inside
> the same 4K window.

IIUC we could do this without adding anything to the linker or psABI, it's just better code from GCC (we already have TPREL_LO12_S for the stores). That's just better code so it seems uncontroversial to me.

> 2. gcc and optional ABI option: introduce a "TLS TE-tiny" model for deep
> embedded use, where the combined size of the TSS area is limited to 4K
> equivalent to the way direct gp references [or zero, if the global pointer
> is 0] work. Thus, direct (tp) references can be used.

Unless I'm missing something, we never emit direct GP references from GCC right now. We rely on the linker to relax them.

> NOTE: With the current binutils, this will error unless .option norelax is
> in effect. It might be desirable to instead have a new relocation type,
> which would require binutils support. Alternatively, ld should recognize
> that the TLS offset is within +/- 2K and suppress the warning in that case
> (since at that point the address is available to the linker.)
>
> The linker could be further optimized by allowing the TLS to offset;
> presumably equivalently to the __global_pointer$ symbol.
>
> 3. binutils option: teach ld to relax these kinds of chained pointer
> references.

I'd favor adding better support for relaxing TP-relative sequences to the linker where we can; it avoids the need for a new code model and we've already got most of the linker complexity as it's required for GP. So I think we can essentially just call these LD missed optimizations. Nelson might be out for a bit, but I added him to the CC list.

> [*] Rant: in my opinion, the lui/auipc instructions are fundamentally
> misdesigned by not having an overlap bit to guarantee a sizable window.

I agree we've got auipc issues, it bites us all over the place (we essentially can't share a hi* between multiple lo*s, as we don't know when overflow is going to happen). There'd been some vague proposals to add a third relocation in the chain to align things, but I think they fizzled out because it'd require talking to the psABI folks. I think we're broadly safe for lui, though, so not sure if I'm missing something there? The low bits are always 0 so the intermediate alignment is known.
[Bug middle-end/110659] Error from linker: .eh_frame_hdr refers to overlapping FDEs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110659 John David Anglin changed: What|Removed |Added CC||danglin at gcc dot gnu.org --- Comment #10 from John David Anglin --- I hit this error in stage1 after a small change to pa.cc. It seems to have gone after updating to Debian binutils 2.42-2. I just rebuilt without any changes to gcc tree. The error occurred with binutils 2.41.90.20240122-1.
[Bug target/53929] [meta-bug] -masm=intel with global symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53929 Eric Gallager changed: What|Removed |Added CC||egallager at gcc dot gnu.org --- Comment #27 from Eric Gallager --- is this really a meta-bug? Normally meta-bugs depend on other bugs...
[Bug tree-optimization/113691] New: ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 with function declaration with no parameter specification
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113691 Bug ID: 113691 Summary: ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 with function declaration with no parameter specification Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57272 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57272=edit reduced testcase I don't know if the code is valid or not, but GCC does not reject the code unless -pedantic-errors or -std=c23 is passed. Compiler output: $ x86_64-pc-linux-gnu-gcc -O testcase.c during GIMPLE pass: bitintlower testcase.c: In function 'bar': testcase.c:3:6: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5444 3 | void bar() { foo(bar_i); } | ^~~ 0xd8cab5 lower_stmt /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:5444 0x2720009 gimple_lower_bitint /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:6564 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240131 (experimental) (GCC)
[Bug tree-optimization/113693] ICE: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647 with _BitInt() at -O2 -fdbg-cnt=vect_loop:1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113693 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-01-31 --- Comment #1 from Andrew Pinski --- Confirmed. Though I wonder if -fdbg-cnt is just broken for the vectorizer now ...
[Bug target/113686] New: [RISC-V] TLS (Local Exec) relaxation on structures (LE)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113686

Bug ID: 113686
Summary: [RISC-V] TLS (Local Exec) relaxation on structures (LE)
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hpa at zytor dot com
Target Milestone: ---

When the Local Exec TLS model is in use, gcc generates inefficient code for accessing the member of a structure:

struct foobar {
    int alpha;
    int beta;
};

_Thread_local struct foobar foo;

void func(int bar)
{
    foo.beta = bar;
}

# Version 1
lui a1,%tprel_hi(foo)
add a1,a1,tp,%tprel_add(foo)
addi a1,a1,%tprel_lo(foo)
sw a0,4(a1)

However, in this case it could be generated as:

# Version 2
lui a1,%tprel_hi(sym+4)
add a1,a1,tp,%tprel_add(sym+4)
sw a0,%tprel_lo(sym+4)(a1)

... which, if %tprel_hi(sym+4) == 0, as it often is for small embedded software, the linker can relax to a simple (tp) reference:

# Version 2a (post-relaxation with small .tbss)
sw a0,%tprel_lo(sym+4)(tp)

The linker will *not* relax version 1 all the way; leaving an unnecessary mv:

# Version 1a (post-relaxation with small .tbss)
mv a1,tp
sw a0,%tprel_lo(sym+4)(tp)

It is of course trickier for the case of multiple subsequent references to the structure if the structure is not aligned, as gcc can't know a priori where the 4K breaks are[*]. The version 1 code is more efficient in that case (3 instructions + 1 instruction/field as opposed to 3 instructions/field.)

However, if the structure *is* aligned, gcc will still not optimize 1 into 2.

There are at least a few options I see:

1. gcc option: gcc can generate version 2 code for a single field reference, or if the alignment is such that all fields are guaranteed to fall inside the same 4K window.

2. gcc and optional ABI option: introduce a "TLS TE-tiny" model for deep embedded use, where the combined size of the TSS area is limited to 4K equivalent to the way direct gp references [or zero, if the global pointer is 0] work. Thus, direct (tp) references can be used.

NOTE: With the current binutils, this will error unless .option norelax is in effect. It might be desirable to instead have a new relocation type, which would require binutils support. Alternatively, ld should recognize that the TLS offset is within +/- 2K and suppress the warning in that case (since at that point the address is available to the linker.)

The linker could be further optimized by allowing the TLS to offset; presumably equivalently to the __global_pointer$ symbol.

3. binutils option: teach ld to relax these kinds of chained pointer references.

[*] Rant: in my opinion, the lui/auipc instructions are fundamentally misdesigned by not having an overlap bit to guarantee a sizable window.
[Bug c/113438] ICE (segfault) in dwarf2out_decl with -g -std=c23 on c23-tag-composite-2.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113438 --- Comment #6 from GCC Commits --- The master branch has been updated by Martin Uecker : https://gcc.gnu.org/g:f6ba433d3c30c20fadd393eed31617a4da81789c commit r14-8666-gf6ba433d3c30c20fadd393eed31617a4da81789c Author: Martin Uecker Date: Tue Jan 23 13:33:34 2024 +0100 Fix ICE with -g and -std=c23 when forming composite types [PR113438] Set TYPE_STUB_DECL to an artificial decl when creating a new structure as a composite type. PR c/113438 gcc/c/ * c-typeck.cc (composite_type_internal): Set TYPE_STUB_DECL. gcc/testsuite/ * gcc.dg/pr113438.c: New test.
[Bug c/113492] ICE: in composite_type_internal, at c/c-typeck.cc:557 with -std=c2x -funsigned-bitfields since r14-6808
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113492 uecker at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #13 from uecker at gcc dot gnu.org --- Fixed on trunk.
[Bug rtl-optimization/113690] New: [13/14 Regression] ICE: in as_a, at machmode.h:381 with -O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113690 Bug ID: 113690 Summary: [13/14 Regression] ICE: in as_a, at machmode.h:381 with -O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57271 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57271=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc -O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops testcase.c during RTL pass: cse2 testcase.c: In function 'foo': testcase.c:10:1: internal compiler error: in as_a, at machmode.h:381 10 | } | ^ 0x814e46 scalar_int_mode as_a(machine_mode) /repo/gcc-trunk/gcc/machmode.h:381 0x8196e0 scalar_mode as_a(machine_mode) /repo/gcc-trunk/gcc/simplify-rtx.cc:3179 0x8196e0 wi::int_traits >::get_precision(std::pair const&) /repo/gcc-trunk/gcc/rtl.h:2283 0x8196e0 unsigned int wi::get_precision >(std::pair const&) /repo/gcc-trunk/gcc/wide-int.h:2168 0x8196e0 wide_int_ref_storage::wide_int_ref_storage >(std::pair const&) /repo/gcc-trunk/gcc/wide-int.h:1089 0x8196e0 generic_wide_int >::generic_wide_int >(std::pair const&) /repo/gcc-trunk/gcc/wide-int.h:847 0x8196e0 simplify_context::simplify_binary_operation_1(rtx_code, machine_mode, rtx_def*, rtx_def*, rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/simplify-rtx.cc:3338 0x14dd0ad simplify_context::simplify_binary_operation(rtx_code, machine_mode, rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/simplify-rtx.cc:2667 0x26c4a74 simplify_binary_operation(rtx_code, machine_mode, rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/rtl.h:3493 0x26c4a74 fold_rtx /repo/gcc-trunk/gcc/cse.cc:3719 0x26c5ff2 canonicalize_insn /repo/gcc-trunk/gcc/cse.cc:4461 0x26c6511 cse_insn /repo/gcc-trunk/gcc/cse.cc:4544 0x26cc0e3 
cse_extended_basic_block /repo/gcc-trunk/gcc/cse.cc:6574 0x26cc0e3 cse_main /repo/gcc-trunk/gcc/cse.cc:6719 0x26ccb00 rest_of_handle_cse2 /repo/gcc-trunk/gcc/cse.cc:7617 0x26ccb00 execute /repo/gcc-trunk/gcc/cse.cc:7672 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240131 (experimental) (GCC)
[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403 --- Comment #5 from Xi Ruoyao --- (In reply to Xi Ruoyao from comment #4) > After r14-5545 this issue became latent. > > And at some point before r14-5545 this issue became nondeterministic: a > compiled program *sometimes* crashes. Really strange... At r14-5544, if running the compiled program in GDB it *always* crashes, but if running it directly it *sometimes* crashes. What's the difference between running under GDB or not?!
[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281 Jakub Jelinek changed:
What|Removed |Added
Priority|P1 |P2
Target Milestone|14.0 |11.5
Summary|[14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027 |[11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
[Bug c++/113683] explicit template instantiation wrongly checks private base class accessibility
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113683 --- Comment #1 from Andrew Pinski --- MSVC rejects this also for the same reason: (10): error C2243: 'static_cast': conversion from 'const B *' to 'const A *' exists, but is inaccessible clang accepts it though ...
[Bug libgcc/113402] Incorrect symbol versions for __builtin_nested_func_ptr_created, __builtin_nested_func_ptr in libgcc_s.so.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113402 John David Anglin changed: What|Removed |Added CC||danglin at gcc dot gnu.org --- Comment #6 from John David Anglin --- I'm now seeing:
In file included from ../../../gcc/libgcc/libgcc2.c:56:
../../../gcc/libgcc/libgcc2.h:32:13: warning: conflicting types for built-in function '__gcc_nested_func_ptr_created'; expected 'void(void *, void *, void *)' [-Wbuiltin-declaration-mismatch]
   32 | extern void __gcc_nested_func_ptr_created (void *, void *, void **);
      |             ^
[Bug c++/113687] New: -Warray-bounds is not emitted inside class method
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113687

Bug ID: 113687
Summary: -Warray-bounds is not emitted inside class method
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: nmmm at nmmm dot nu
Target Milestone: ---

No -Warray-bounds warning is shown when the code is inside a class method that is defined inside the class body, or defined outside it if the method is inline or constexpr.

struct S{
    int p(){
        int x[2] = {0, 0};
        return x[3]; // no warning shown
    }

    static int ps(){
        int x[2] = {0, 0};
        return x[3]; // no warning shown
    }

    int ps2inl();
    int ps2();
};

inline int S::ps2inl(){
    int x[2] = {0, 0};
    return x[3]; // no warning shown
}

int S::ps2(){
    int x[2] = {0, 0};
    return x[3]; // warning shown correctly
}

int f(){
    int x[2] = {0, 0};
    return x[3]; // warning shown correctly
}

int main(){
}
[Bug tree-optimization/113692] New: ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 at -O with _BitInt() in a condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113692 Bug ID: 113692 Summary: ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 at -O with _BitInt() in a condition Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57273 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57273=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc testcase.c -O during GIMPLE pass: bitintlower testcase.c: In function 'foo': testcase.c:4:1: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5444 4 | foo(void) | ^~~ 0xd8cab5 lower_stmt /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:5444 0x2720009 gimple_lower_bitint /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:6564 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240131 (experimental) (GCC)
[Bug c/112571] [13/14 Regression] ICE with nested redefinition of enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112571 Joseph S. Myers changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jsm28 at gcc dot gnu.org Known to work||14.0 Status|NEW |ASSIGNED --- Comment #4 from Joseph S. Myers --- Fixed for GCC 14 so far.
[Bug middle-end/93509] Stack protector should offer trap-only handling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93509 Matheus Afonso Martins Moreira changed: What|Removed |Added CC||matheus.a.m.moreira at gmail dot com --- Comment #2 from Matheus Afonso Martins Moreira --- I also need this feature. I'm writing freestanding Linux applications which are compiled with -ffreestanding -nostdlib. I would not need to implement __stack_chk_fail if GCC could be configured to emit traps instead of calling a function. The sanitizers already have trapping modes:

-fsanitize-trap[=opts]
The -fsanitize-trap= option instructs the compiler to report for sanitizers mentioned in comma-separated list of opts undefined behavior using __builtin_trap rather than a libubsan library routine. The advantage of this is that the libubsan library is not needed and is not linked in, so this is usable even in freestanding environments.

This feature would be the stack smashing protector equivalent.
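For readers unfamiliar with the workaround the comment alludes to, here is a minimal sketch of a user-supplied failure handler. It shows the status quo, not the requested feature, and assumes GCC or Clang (for `__builtin_trap`) and a build with one of the -fstack-protector options enabled.

```cpp
// Sketch of a hand-written stack-protector failure handler.
// Assumption: GCC or Clang, which provide __builtin_trap().
extern "C" {

// Instrumented functions call this symbol when the stack canary check fails.
[[noreturn]] void __stack_chk_fail() {
    __builtin_trap();  // stop immediately; no runtime library is required
}

}  // extern "C"
```

A fully freestanding build typically also has to provide the canary value itself on targets that read it from a global symbol; that part is omitted here. A trap-only mode in the compiler would make the handler above unnecessary.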
[Bug modula2/111627] modula2: Excess test fails with a case-preserving-case-insensitive source tree.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111627 Gaius Mulley changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from Gaius Mulley --- Bootstrapped on a jfs case-preserving-case-insensitive gnu-linux system. All regressions pass. Optimistically closing the PR.
[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403 --- Comment #6 from Andreas Schwab --- GDB disables ASLR.
[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403 --- Comment #7 from Xi Ruoyao --- (In reply to Andreas Schwab from comment #6) > GDB disables ASLR. Indeed, with "setarch -R" it always crashes.
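One way to observe the effect being discussed is to print a stack address on each run. The sketch below is illustrative only (the function name is made up): with ASLR active the printed value usually differs from run to run, while under GDB or `setarch -R` it is typically stable, which fits a bug whose outcome depends on address layout.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: reports where a local variable lives on the stack.
std::uintptr_t report_stack_address() {
    int local = 0;
    const auto addr = reinterpret_cast<std::uintptr_t>(&local);
    std::printf("stack address: %#llx\n",
                static_cast<unsigned long long>(addr));
    return addr;
}
```

Calling `report_stack_address()` from `main` and running the binary a few times, with and without `setarch -R`, is enough to compare the two situations.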
[Bug target/113684] Cross compiler without assembler and linker should assume that all assembler and linker features are available
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113684 --- Comment #2 from H.J. Lu --- (In reply to Andrew Pinski from comment #1) > This usecase is only for GCC developers and it might even confuse regular > users of GCC ... True, such cross compilers are only for GCC developers. Regular users won't run into it.
[Bug libgcc/113402] Incorrect symbol versions for __builtin_nested_func_ptr_created, __builtin_nested_func_ptr in libgcc_s.so.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113402 --- Comment #7 from Jakub Jelinek --- See https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644521.html
[Bug analyzer/113253] gcc -g causes -fanalyzer to issue false positive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113253 --- Comment #2 from David Malcolm --- I'm testing a fix. The bug observably affects trunk and gcc 13.2. It is probably also present but latent on gcc 12, 11, and 10 (-Wanalyzer-deref-before-check was added in gcc 13).
[Bug tree-optimization/113692] ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 at -O with _BitInt() in a condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113692 --- Comment #1 from Andrew Pinski --- My bet is you could get the same error with:

_BitInt(129) i;

void *
foo(void)
{
  void *ret = 0;
  ret = (void *)(__SIZETYPE__)(i & 1);
  return ret;
}
[Bug target/113618] [14 Regression] AArch64: memmove idiom regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113618 Wilco changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org --- Comment #4 from Wilco --- (In reply to Alex Coplan from comment #1) > Confirmed. > > (In reply to Wilco from comment #0) > > A possible fix would be to avoid emitting LDP/STP in memcpy/memmove/memset > > expansions. > > Yeah, so I had posted > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636855.html for that > but held off from committing it at the time as IMO there wasn't enough > evidence to show that this helps in general (and the pass could in theory > miss opportunities which would lead to regressions). > > But perhaps this is a good argument for going ahead with that change (of > course it will need rebasing). Yes I have a patch based on current trunk + my outstanding memset cleanup patch. It's slightly faster but causes a small codesize regression. This appears mostly due to GCC being overly aggressive in changing loads/stores with a zero offset into indexing, a non-zero offset or a lo_sym. This not only blocks LDP opportunities but also increases register pressure and spilling.
[Bug libstdc++/90276] PSTL tests fail in Debug Mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276 --- Comment #9 from frs.dumont at gmail dot com --- Here is the reason of the 20_util/specialized_algorithms/pstl/uninitialized_copy_move.cc FAIL. Maybe it fixes some other tests too, I need to run all of them. libstdc++: Do not forward arguments several times [PR90276] Forwarding several times the same arguments results in UB. It is detected by the _GLIBCXX_DEBUG mode as an attempt to use a singular iterator which has been moved. libstdc++-v3/ChangeLog PR libstdc++/90276 * testsuite/util/pstl/test_utils.h: Remove std::forward<> calls when done several times on the same arguments. Ok to commit ? François On 31/01/2024 14:11, redi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276 > > Jonathan Wakely changed: > > What|Removed |Added > > See Also||https://github.com/llvm/llv > ||m-project/issues/80136 >
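The pattern behind this fix can be illustrated with a small sketch. The types and names below are hypothetical and are not taken from test_utils.h; the point is only that an argument bound to an rvalue may be moved from by the first `std::forward`, so any later use sees a moved-from object (for Debug Mode iterators, a singular one).

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical consumer that takes its argument by value.
struct Sink {
    std::vector<std::string> items;
    void take(std::string s) { items.push_back(std::move(s)); }
};

// Problematic pattern (shown as a comment only): forwarding the same
// argument twice. If `value` binds to an rvalue, the first call may move
// from it, and the second call then receives a moved-from object.
//
//   a.take(std::forward<T>(value));
//   b.take(std::forward<T>(value));

// Safer pattern: pass the argument as an lvalue everywhere except at its
// final use, so it is forwarded at most once.
template <typename T>
void feed_both(Sink& a, Sink& b, T&& value) {
    a.take(value);                   // copies; `value` stays intact
    b.take(std::forward<T>(value));  // may move; this is the last use
}
```

With this shape, `feed_both(a, b, std::string("abc"))` leaves both sinks holding "abc", at the cost of one extra copy.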
[Bug tree-optimization/113693] New: ICE: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647 with _BitInt() at -O2 -fdbg-cnt=vect_loop:1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113693 Bug ID: 113693 Summary: ICE: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647 with _BitInt() at -O2 -fdbg-cnt=vect_loop:1 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57274 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57274=edit reduced testcase (from gcc.dg/pr68766.c) Compiler output: $ x86_64-pc-linux-gnu-gcc -O2 -fdbg-cnt=vect_loop:1 testcase.c ***dbgcnt: lower limit 1 reached for vect_loop.*** ***dbgcnt: upper limit 1 reached for vect_loop.*** during GIMPLE pass: vect testcase.c: In function 'fn1': testcase.c:4:1: internal compiler error: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647 4 | fn1(void) | ^~~ 0x865e0d check_loop_closed_ssa_def /repo/gcc-trunk/gcc/tree-ssa-loop-manip.cc:647 0x1695c07 check_loop_closed_ssa_bb /repo/gcc-trunk/gcc/tree-ssa-loop-manip.cc:672 0x1695fa6 verify_loop_closed_ssa(bool, loop*) /repo/gcc-trunk/gcc/tree-ssa-loop-manip.cc:697 0x1695fa6 verify_loop_closed_ssa(bool, loop*) /repo/gcc-trunk/gcc/tree-ssa-loop-manip.cc:681 0x13ca929 execute_function_todo /repo/gcc-trunk/gcc/passes.cc:2106 0x13cad2e execute_todo /repo/gcc-trunk/gcc/passes.cc:2142 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240131 (experimental) (GCC)
[Bug tree-optimization/113692] ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 at -O with _BitInt() in a condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113692 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-01-31 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #2 from Andrew Pinski --- (In reply to Andrew Pinski from comment #1)
> My bet you could get the same error with:
> _BitInt(129) i;
>
> void *
> foo(void)
> {
>   void *ret = 0;
>   ret = (void *)(__SIZETYPE__)(i & 1);
>   return ret;
> }

The above works. The question is whether an extra cast in the IR between _BitInt and void* is required or not. PHIOPT/match-and-simplify does:
```
phiopt match-simplify trying:
        _2 != 0 ? 1B : 0B
Matching expression match.pd:2274, gimple-match-3.cc:23
Matching expression match.pd:2823, gimple-match-2.cc:35
Matching expression match.pd:2826, gimple-match-1.cc:66
Matching expression match.pd:2833, gimple-match-2.cc:96
Matching expression match.pd:2274, gimple-match-3.cc:23
Applying pattern match.pd:3396, gimple-match-6.cc:2527
Applying pattern match.pd:5327, gimple-match-9.cc:17991
phiopt match-simplify back:
_5 = _2 != 0;
_6 = (void *) _2;
result: _6
```
Notice how there is just a cast between _2 and `void*`.
[Bug middle-end/113682] Branches in branchless binary search rather than cmov/csel/csinc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682 Andrew Pinski changed: What|Removed |Added Component|rtl-optimization|middle-end --- Comment #3 from Andrew Pinski --- From the looks of it, jump threading is getting in the way of conditional move generation.
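For context, a "branchless" binary search is usually written so that each step is a select rather than a data-dependent branch, which the compiler can then lower to cmov/csel. The sketch below is not the reporter's testcase (the attachment is not shown in this digest); `lower_bound_branchless` is a hypothetical name, and whether a given GCC version actually emits a conditional move for it depends on target and flags.

```c
#include <stddef.h>

/* Hypothetical illustration, not the testcase from the bug report.
   Returns the index of the first element >= key in the sorted array
   base[0..n), or n if no such element exists.  */
size_t
lower_bound_branchless (const int *base, size_t n, int key)
{
  const int *p = base;

  while (n > 1)
    {
      size_t half = n / 2;
      /* Written as a select so it can become cmov/csel; the loop
         trip count depends only on n, not on the data.  */
      p = (p[half - 1] < key) ? p + half : p;
      n -= half;
    }
  /* At most one element is left: step past it if it is still too small.  */
  if (n == 1 && *p < key)
    p++;
  return (size_t) (p - base);
}
```

The only data-dependent decision inside the loop is the conditional expression assigned to `p`. If jump threading turns that select back into control flow, the loop is no longer branch-free, which is the effect the comment describes.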
[Bug testsuite/113685] New: [14 regression] xxx fails after yyy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113685 Bug ID: 113685 Summary: [14 regression] xxx fails after yyy Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: --- g:d45ddc2c04e471d0dcee016b6edacc00b8341b16, r14-4089-gd45ddc2c04e471 make -k check-gcc RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-117.c" FAIL: gcc.dg/vect/vect-117.c scan-tree-dump-not optimized "Invalid sum" FAIL: gcc.dg/vect/vect-117.c -flto -ffat-lto-objects scan-tree-dump-not optimized "Invalid sum" # of expected passes 8 # of unexpected failures 2 Note this was also reported in pr111462 but the fix for it did not fix this failure. commit d45ddc2c04e471d0dcee016b6edacc00b8341b16 (HEAD) Author: Richard Biener Date: Thu Sep 14 13:06:51 2023 +0200 tree-optimization/111294 - backwards threader PHI costing * gcc.dg/vect/vect-117.c: Make scan for not Invalid sum conditional on lp64.
[Bug fortran/104908] [11/12/13/14 Regression] incorrect Fortran out-of-bound runtime error.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104908 --- Comment #10 from GCC Commits --- The releases/gcc-13 branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:5741e5fc53161ccf2056e0670334ea528431feb7 commit r13-8266-g5741e5fc53161ccf2056e0670334ea528431feb7 Author: Harald Anlauf Date: Sat Jan 27 17:41:43 2024 +0100 Fortran: fix bounds-checking errors for CLASS array dummies [PR104908] Commit r11-1235 addressed issues with bounds of unlimited polymorphic array dummies. However, using the descriptor from sym->backend_decl does break the case of CLASS array dummies. The obvious solution is to restrict the fix to the unlimited polymorphic case, thus keeping the original descriptor in the ordinary case. gcc/fortran/ChangeLog: PR fortran/104908 * trans-array.cc (gfc_conv_array_ref): Restrict use of transformed descriptor (sym->backend_decl) to the unlimited polymorphic case. gcc/testsuite/ChangeLog: PR fortran/104908 * gfortran.dg/pr104908.f90: New test. (cherry picked from commit ce61de1b8a1bb3a22118e900376f380768f2ba59)
[Bug middle-end/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607 --- Comment #24 from GCC Commits --- The master branch has been updated by Robin Dapp : https://gcc.gnu.org/g:8123f3ca3fd891034a8366518e756f161c4ff40d commit r14-8668-g8123f3ca3fd891034a8366518e756f161c4ff40d Author: Robin Dapp Date: Tue Jan 30 18:39:08 2024 +0100 match: Fix vcond into conditional op folding [PR113607]. In PR113607 we see an invalid fold of _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... }); vect_prephitmp_129.51_282 = _429; vect_iftmp.55_287 = VEC_COND_EXPR ; to Applying pattern match.pd:9607, gimple-match-10.cc:3817 gimple_simplified to vect_iftmp.55_287 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... }); where we essentially use COND_SHL's else instead of VEC_COND_EXPR's. This patch adjusts the corresponding match.pd pattern and makes it only match when the else values are the same. That, however, causes the exact test case for which this pattern was introduced for to fail. Therefore XFAIL it for now. gcc/ChangeLog: PR middle-end/113607 * match.pd: Make sure else values match when folding a vec_cond into a conditional operation. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/pre_cond_share_1.c: XFAIL. * gcc.target/riscv/rvv/autovec/pr113607-run.c: New test. * gcc.target/riscv/rvv/autovec/pr113607.c: New test.
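To see why the else values matter, a one-lane scalar model may help. It is only a sketch: `cond_shl` and `vec_cond` are hypothetical stand-ins for a single lane of `.COND_SHL` and `VEC_COND_EXPR`, not GCC internals, and the two separate masks are an assumption, since the operands of the VEC_COND_EXPR are not visible in the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* One-lane scalar stand-ins (hypothetical names, not GCC internals).  */

/* Models one lane of .COND_SHL (mask, a, b, else_val).  The shift
   count is masked only to keep this C model free of undefined
   behaviour.  */
static uint32_t
cond_shl (bool mask, uint32_t a, uint32_t b, uint32_t else_val)
{
  return mask ? a << (b & 31) : else_val;
}

/* Models one lane of VEC_COND_EXPR <mask, then_val, else_val>.  */
static uint32_t
vec_cond (bool mask, uint32_t then_val, uint32_t else_val)
{
  return mask ? then_val : else_val;
}

/* The original two-statement sequence for one lane.  */
uint32_t
unfolded (bool op_mask, bool sel_mask, uint32_t a, uint32_t b,
          uint32_t op_else, uint32_t sel_else)
{
  return vec_cond (sel_mask, cond_shl (op_mask, a, b, op_else), sel_else);
}

/* A fold into a single conditional op that keeps the op's own else
   value and drops the select's else value.  */
uint32_t
folded_keep_op_else (bool op_mask, bool sel_mask, uint32_t a, uint32_t b,
                     uint32_t op_else)
{
  return cond_shl (op_mask && sel_mask, a, b, op_else);
}
```

With `op_else = 0` and `sel_else = 7`, a lane where `sel_mask` is false yields 7 from `unfolded` but 0 from `folded_keep_op_else`. When both else values are equal, the two functions agree for every mask combination, which is the condition the patch adds to the pattern.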
[Bug c/113438] ICE (segfault) in dwarf2out_decl with -g -std=c23 on c23-tag-composite-2.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113438 uecker at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #7 from uecker at gcc dot gnu.org --- I filed PR113688 for the verify_type failures with -g. This bug is fixed on trunk.
[Bug c/113688] verify_type fails for compatible structs with FAM in C23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688 uecker at gcc dot gnu.org changed: What|Removed |Added Last reconfirmed||2024-01-31 Status|UNCONFIRMED |NEW Ever confirmed|0 |1
[Bug libstdc++/108636] [10 Regression] C++20 undefined reference to `std::filesystem::__cxx11::path::_List::type(std::filesystem::__cxx11::path::_Type)' with -fkeep-inline-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108636 Mark Bourgeault changed: What|Removed |Added CC||Mark_B53 at yahoo dot com --- Comment #9 from Mark Bourgeault --- In 12.3, GCC still fails with -std=c++20 -fkeep-inline-functions, but passes with -std=c++17 -fkeep-inline-functions. Curiously, the other versions (10.5, 11.4, 13.1) pass with both option sets.
[Bug target/113684] Cross compiler without assembler and linker should assume that all assembler and linker features are available
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113684 Andrew Pinski changed: What|Removed |Added Keywords||internal-improvement --- Comment #1 from Andrew Pinski --- This usecase is only for GCC developers and it might even confuse regular users of GCC ...
[Bug target/113689] New: wrong code with unused _BitInt() division with -O2 -fprofile -mcmodel=large -mavx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113689 Bug ID: 113689 Summary: wrong code with unused _BitInt() division with -O2 -fprofile -mcmodel=large -mavx Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 57270 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57270=edit reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -O2 -fprofile -mcmodel=large -mavx testcase.c $ ./a.out Aborted "b" (and then "x") seems to be garbage. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.1 20240131 (experimental) (GCC)
[Bug c/113688] New: verify_type fails for compatible structs with FAM in C23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688 Bug ID: 113688 Summary: verify_type fails for compatible structs with FAM in C23 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: uecker at gcc dot gnu.org Target Milestone: --- Extracted from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113438#c1 FAIL: gcc.dg/gnu23-tag-1.c (internal compiler error: 'verify_type' failed) FAIL: gcc.dg/gnu23-tag-4.c (internal compiler error: 'verify_type' failed) FAIL: gcc.dg/gnu23-tag-alias-1.c (internal compiler error: 'verify_type' failed)
[Bug c/113688] verify_type fails for compatible structs with FAM in C23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113688 uecker at gcc dot gnu.org changed: What|Removed |Added Target Milestone|--- |14.0 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=113438
[Bug c/112571] [13/14 Regression] ICE with nested redefinition of enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112571 --- Comment #3 from GCC Commits --- The master branch has been updated by Joseph Myers : https://gcc.gnu.org/g:d22d1a9346f27db41459738c6eb404f8f0956e6f commit r14-8669-gd22d1a9346f27db41459738c6eb404f8f0956e6f Author: Joseph Myers Date: Wed Jan 31 21:39:53 2024 + c: Fix ICE for nested enum redefinitions with/without fixed underlying type [PR112571] Bug 112571 reports an ICE-on-invalid for cases where an enum is defined, without a fixed underlying type, inside the enum type specifier for a definition of that same enum with a fixed underlying type. The ultimate cause is attempting to access ENUM_UNDERLYING_TYPE in a case where it is NULL. Avoid this by clearing ENUM_FIXED_UNDERLYING_TYPE_P in the case of inconsistent definitions. Bootstrapped with no regressions for x86_64-pc-linux-gnu. PR c/112571 gcc/c/ * c-decl.cc (start_enum): Clear ENUM_FIXED_UNDERLYING_TYPE_P when defining without a fixed underlying type an enumeration previously declared with a fixed underlying type. gcc/testsuite/ * gcc.dg/c23-enum-9.c, gcc.dg/c23-enum-10.c: New tests.
[Bug target/113689] wrong code with unused _BitInt() division with -O2 -fprofile -mcmodel=large -mavx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113689 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2024-01-31 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Confirmed. s/511/(256-63)/ is the min to reproduce this issue. Basically just enough to auto-vectorize. I get the feeling that mcount causes alignment differences ...
[Bug c/113696] New: RISC-V: ineffective vsetvl behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113696 Bug ID: 113696 Summary: RISC-V: ineffective vsetvl behavior Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pan2.li at intel dot com Target Milestone: ---

Given the sample code below, built with '-march=rv64gcv -O3 -g0':

#include "riscv_vector.h"

void f (int32_t * restrict in, int32_t * restrict out, size_t n, size_t cond, size_t cond2)
{
  for (size_t i = 0; i < n; i++)
    {
      if (i == cond) {
        vint8mf8_t v = *(vint8mf8_t*)(in + i + 100);
        *(vint8mf8_t*)(out + i + 100) = v;
      } else if (i == cond2) {
        vfloat32mf2_t v = *(vfloat32mf2_t*)(in + i + 200);
        *(vfloat32mf2_t*)(out + i + 200) = v;
      } else if (i == (cond2 - 1)) {
        vuint16mf2_t v = *(vuint16mf2_t*)(in + i + 300);
        *(vuint16mf2_t*)(out + i + 300) = v;
      } else {
        vint8mf4_t v = *(vint8mf4_t*)(in + i + 400);
        *(vint8mf4_t*)(out + i + 400) = v;
      }
    }
}

The asm code is as below; the vsetvl insn placement is somewhat ineffective and can be refined to some extent.

f:
.LFB0:
        .cfi_startproc
        beq     a2,zero,.L12
        addi    a7,a0,400
        addi    a6,a1,400
        addi    a0,a0,1600
        addi    a1,a1,1600
        li      a5,0
        addi    t6,a4,-1
        vsetvli t3,zero,e8,mf8,ta,ma
.L7:
        beq     a3,a5,.L15
        beq     a4,a5,.L16
        beq     t6,a5,.L17
        vsetvli t1,zero,e8,mf4,ta,ma
        vle8.v  v1,0(a0)
        vse8.v  v1,0(a1)
        vsetvli t3,zero,e8,mf8,ta,ma
.L4:
        addi    a5,a5,1
        addi    a7,a7,4
        addi    a6,a6,4
        addi    a0,a0,4
        addi    a1,a1,4
        bne     a2,a5,.L7
.L12:
        ret
.L15:
        vle8.v  v1,0(a7)
        vse8.v  v1,0(a6)
        j       .L4
.L17:
        vsetvli t1,zero,e8,mf4,ta,ma
        addi    t5,a0,-400
        addi    t4,a1,-400
        vle16.v v1,0(t5)
        vse16.v v1,0(t4)
        vsetvli t3,zero,e8,mf8,ta,ma
        j       .L4
.L16:
        addi    t5,a0,-800
        addi    t4,a1,-800
        vle32.v v1,0(t5)
        vse32.v v1,0(t4)
        j       .L4
[Bug libgomp/113698] GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698 --- Comment #1 from Andrew Pinski --- https://www.openmp.org/spec-html/5.0/openmpse52.html > Otherwise, the execution environment should not move OpenMP threads between > OpenMP places, thread affinity is enabled, and __the initial thread is bound > to the first place in the OpenMP place list prior to the first active > parallel region__. Hmm. Maybe LLVM's libomp does not follow the openmp spec.
[Bug tree-optimization/113691] ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 with function declaration with no parameter specification
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113691 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=113692 Last reconfirmed||2024-01-31 Status|UNCONFIRMED |NEW Keywords||ice-on-valid-code --- Comment #1 from Andrew Pinski --- ``` intD.6 * pD.2782; _BitInt(129) i.0_1; ;; basic block 2, loop depth 0, count 1073741824 (estimated locally, freq 1.), maybe hot ;;prev block 0, next block 1, flags: (NEW, REACHABLE, VISITED) ;;pred: ENTRY [always] count:1073741824 (estimated locally, freq 1.) (FALLTHRU,EXECUTABLE) # VUSE <.MEM_2(D)> i.0_1 = iD.2768; # PT = nonlocal escaped null p_4 = (intD.6 *) i.0_1; ``` Confirmed. The problem is the same as recorded in PR 113692. The cast between _BitInt and a pointer is not expected ..
[Bug rtl-optimization/113690] [13/14 Regression] ICE: in as_a, at machmode.h:381 with -O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113690 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-02-01 Status|UNCONFIRMED |NEW Target Milestone|--- |13.3 --- Comment #1 from Andrew Pinski --- Confirmed. Note there are some missed optimizations on the gimple level too ...
[Bug target/113657] [14 Regression] ICE Segmentation fault with -mstrict-align and __arm_data512_t since r14-1187-gd6b756447cd58b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113657 --- Comment #4 from GCC Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:dbf847d2c8d1c910948ba34c9338939c67323273 commit r14-8671-gdbf847d2c8d1c910948ba34c9338939c67323273 Author: Andrew Pinski Date: Tue Jan 30 00:50:56 2024 -0800 aarch64: -mstrict-align vs __arm_data512_t [PR113657] After r14-1187-gd6b756447cd58b, simplify_gen_subreg can return NULL for "unaligned" memory subreg. Since V8DI has an alignment of 8 bytes, using TImode causes simplify_gen_subreg to return NULL. This fixes the issue by using DImode instead for the loop. And then we will have later on the STP/LDP pass combine it back into STP/LDP if needed. Since strict align is less important (usually used for firmware and early boot only), not doing LDP/STP here is ok. Built and tested for aarch64-linux-gnu with no regressions. PR target/113657 gcc/ChangeLog: * config/aarch64/aarch64-simd.md (split for movv8di): For strict aligned mode, use DImode instead of TImode. gcc/testsuite/ChangeLog: * gcc.target/aarch64/acle/ls64_strict_align.c: New test. Signed-off-by: Andrew Pinski
[Bug middle-end/113694] New: Allow renaming stack smashing protector symbols
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113694 Bug ID: 113694 Summary: Allow renaming stack smashing protector symbols Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: matheus.a.m.moreira at gmail dot com Target Milestone: --- I'm writing freestanding Linux applications which are compiled with -ffreestanding -nostdlib. In order to integrate with the stack smashing protector, I need to provide the following symbols: __stack_chk_guard __stack_chk_fail It would be nice if there was a compiler option to rename these symbols to something else. I want to make them consistent with the rest of my code, thereby improving its quality. A couple of options would be enough: -fstack-protector-canary=my_variable -fstack-protector-handler=my_function Attributes could work too: uintptr_t my_canary __attribute__((stack_protector_canary)); __attribute__((noreturn, stack_protector_handler)) void __stack_chk_fail(void);
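For readers unfamiliar with the setup described, this is roughly what a freestanding program has to provide today under the fixed names. It is a minimal sketch: the guard value is an arbitrary placeholder (a real program would set it from an entropy source at startup), and whether the compiler reads `__stack_chk_guard` or a thread-local slot depends on the target and options. The `-fstack-protector-canary=` and `-fstack-protector-handler=` options and the two attributes in the request are proposals, not existing GCC features, so they are not used here.

```c
#include <stdint.h>

/* Canary value compared on function exit.  Placeholder constant for
   illustration only; it should be initialised from an entropy source
   early at startup.  */
uintptr_t __stack_chk_guard = (uintptr_t) 0x2f6d8b1cu;

/* Called by instrumented functions when the canary was overwritten.
   It must not return.  */
__attribute__ ((noreturn)) void
__stack_chk_fail (void)
{
  __builtin_trap ();
}
```

The request amounts to letting a project choose its own names for these two definitions instead of the reserved `__stack_chk_*` identifiers.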
[Bug analyzer/113253] gcc -g causes -fanalyzer to issue false positive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113253 --- Comment #3 from GCC Commits --- The master branch has been updated by David Malcolm : https://gcc.gnu.org/g:cc7aebff74d8967563fd9af5cb958dfcc8c111e8 commit r14-8670-gcc7aebff74d8967563fd9af5cb958dfcc8c111e8 Author: David Malcolm Date: Wed Jan 31 18:26:26 2024 -0500 analyzer: fix skipping of debug stmts [PR113253] PR analyzer/113253 reports a case where the analyzer output varied with and without -g enabled. The root cause was that debug stmts were in the FOR_EACH_IMM_USE_FAST list for SSA names, leading to the analyzer's state purging logic differing between the -g and non-debugging cases, and thus leading to differences in the exploration of the user's code. Fix by skipping such stmts in the state-purging logic, and removing debug stmts when constructing the supergraph. gcc/analyzer/ChangeLog: PR analyzer/113253 * region-model.cc (region_model::on_stmt_pre): Add gcc_unreachable for debug statements. * state-purge.cc (state_purge_per_ssa_name::state_purge_per_ssa_name): Skip any debug stmts in the FOR_EACH_IMM_USE_FAST list. * supergraph.cc (supergraph::supergraph): Don't add debug stmts to the supernodes. gcc/testsuite/ChangeLog: PR analyzer/113253 * gcc.dg/analyzer/deref-before-check-pr113253.c: New test. Signed-off-by: David Malcolm
[Bug target/113657] [14 Regression] ICE Segmentation fault with -mstrict-align and __arm_data512_t since r14-1187-gd6b756447cd58b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113657 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #5 from Andrew Pinski --- Fixed.
[Bug target/113249] RISC-V: regression testsuite errors -mtune=generic-ooo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113249 --- Comment #5 from GCC Commits --- The master branch has been updated by Edwin Lu : https://gcc.gnu.org/g:4b799a16ae59fc0f508c5931ebf1851a3446b707 commit r14-8674-g4b799a16ae59fc0f508c5931ebf1851a3446b707 Author: Edwin Lu Date: Wed Jan 31 10:45:43 2024 -0800 RISC-V: Use default cost model for insn scheduling Use default cost model scheduling on these test cases. All these tests introduce scan dump failures with -mtune generic-ooo. Since the vector cost models are the same across all three tunes, some of the tests in PR113249 will be fixed with this patch series. PR target/113249 gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/bug-1.C: use default scheduling * gcc.target/riscv/rvv/autovec/reduc/reduc_call-2.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-102.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-108.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-114.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-119.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-12.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-16.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-17.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-19.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-21.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-23.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-25.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-27.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-29.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-31.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-33.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-35.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-4.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-40.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-44.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-50.c: ditto * 
gcc.target/riscv/rvv/base/binop_vx_constraint-56.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-62.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-68.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-74.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-79.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-8.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-84.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-90.c: ditto * gcc.target/riscv/rvv/base/binop_vx_constraint-96.c: ditto * gcc.target/riscv/rvv/base/float-point-dynamic-frm-30.c: ditto * gcc.target/riscv/rvv/base/pr108185-1.c: ditto * gcc.target/riscv/rvv/base/pr108185-2.c: ditto * gcc.target/riscv/rvv/base/pr108185-3.c: ditto * gcc.target/riscv/rvv/base/pr108185-4.c: ditto * gcc.target/riscv/rvv/base/pr108185-5.c: ditto * gcc.target/riscv/rvv/base/pr108185-6.c: ditto * gcc.target/riscv/rvv/base/pr108185-7.c: ditto * gcc.target/riscv/rvv/base/shift_vx_constraint-1.c: ditto * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-17.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-11.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-4.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-5.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-6.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-7.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-8.c: ditto * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-9.c: 
ditto * gfortran.dg/vect/vect-8.f90: ditto Signed-off-by: Edwin Lu
[Bug c/113697] New: RISC-V: Redundant vsetvl insn in function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113697 Bug ID: 113697 Summary: RISC-V: Redundant vsetvl insn in function Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pan2.li at intel dot com Target Milestone: ---

Given the sample code below, built with -march=rv64gcv -O3 -g0:

int foo (int * __restrict a, int n)
{
  int result = 0;
  for (int i = 0; i < n; i++)
    result += a[i];
  return result;
}

The asm code looks like below; there is one redundant vsetvl insn here.

foo:
.LFB0:
        .cfi_startproc
        ble     a1,zero,.L4
        vsetvli a5,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L3:
        vsetvli a5,a1,e32,m1,tu,ma
        slli    a4,a5,2
        sub     a1,a1,a5
        vle32.v v2,0(a0)
        add     a0,a0,a4
        vadd.vv v1,v2,v1
        bne     a1,zero,.L3
        li      a5,0
        vsetivli        zero,1,e32,m1,ta,ma
        vmv.s.x v2,a5
        vsetvli a5,zero,e32,m1,ta,ma    <== redundant vsetvl
        vredsum.vs      v1,v1,v2
        vmv.x.s a0,v1
        ret
.L4:
        li      a0,0
        ret
[Bug target/113690] [13/14 Regression] ICE: in as_a, at machmode.h:381 with -O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113690 Andrew Pinski changed: What|Removed |Added Component|rtl-optimization|target --- Comment #2 from Andrew Pinski --- (gdb) p debug_rtx(tem) (expr_list:REG_EQUAL (mult:V1TI (reg:V1TI 101 [ _12 ]) (const_int 3 [0x3])) (nil)) That seems wrong ... It was introduced by stv1 pass with: (insn 32 31 33 2 (set (reg:V1TI 132 [ _20 ]) (reg:V1TI 101 [ _12 ])) "t5.c":9:5 2025 {movv1ti_internal} (expr_list:REG_DEAD (reg:V1TI 130 [ BIT_FIELD_REF ]) (expr_list:REG_DEAD (reg:V1TI 129 [ BIT_FIELD_REF ]) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_EQUAL (mult:V1TI (reg:V1TI 101 [ _12 ]) (const_int 3 [0x3])) (nil)) Which was before it: (insn 32 31 33 2 (set (reg:TI 132 [ _20 ]) (reg:TI 101 [ _12 ])) "t5.c":9:5 83 {*movti_internal} (expr_list:REG_DEAD (reg:TI 130 [ BIT_FIELD_REF ]) (expr_list:REG_DEAD (reg:TI 129 [ BIT_FIELD_REF ]) (expr_list:REG_UNUSED (reg:CC 17 flags) (expr_list:REG_EQUAL (mult:TI (reg:TI 101 [ _12 ]) (const_int 3 [0x3])) (nil)) it should have been replaced with `(const_vect (const_int 3))`. So this is a target specific pass which is causing the issue.
[Bug c/113695] New: RISC-V: Sources with different EEW must use different registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113695 Bug ID: 113695 Summary: RISC-V: Sources with different EEW must use different registers Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: juzhe.zhong at rivai dot ai Target Milestone: ---

The same issue was reported for LLVM: https://github.com/llvm/llvm-project/issues/80099

RVV ISA: A vector register cannot be used to provide source operands with more than one EEW for a single instruction. A mask register source is considered to have EEW=1 for this constraint. An encoding that would result in the same vector register being read with two or more different EEWs, including when the vector register appears at different positions within two or more vector register groups, is reserved.

#include
#include
void foo(vuint64m2_t colidx, uint32_t* base_addr, size_t vl) {
  vuint32m1_t values = __riscv_vget_v_u32m2_u32m1(__riscv_vreinterpret_v_u64m2_u32m2 (colidx), 0);
  __riscv_vsuxei64_v_u32m1(base_addr, colidx, values, vl);
}

foo:
        vsetvli zero,a1,e32,m1,ta,ma
        vsuxei64.v      v8,(a0),v8
        ret

This is incorrect: the two input operands have different EEWs and must not use the same register (v8). The current GCC RTL machine description and constraints do not allow us to fix it. Even though it is a bug, I think we can only revisit it in GCC-15.
[Bug c/113695] RISC-V: Sources with different EEW must use different registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113695 --- Comment #1 from JuzheZhong --- Since both operands are input operands, the early-clobber "&" constraint cannot help.
[Bug libgomp/113698] GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698 --- Comment #2 from Andrew Pinski --- Actually I misread the testcase, because it looks like LLVM's libomp follows the same as GCC's. And it looks like the OpenMP spec is specific about that behavior too.
[Bug middle-end/113694] Allow renaming stack smashing protector symbols
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113694 Andrew Pinski changed: What|Removed |Added Severity|normal |enhancement
[Bug analyzer/113253] gcc -g causes -fanalyzer to issue false positive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113253 --- Comment #4 from David Malcolm --- Should be fixed on trunk for gcc 14 by the above patch. Keeping open to backport to other branches.
[Bug middle-end/93509] Stack protector should offer trap-only handling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93509 --- Comment #3 from Matheus Afonso Martins Moreira --- Equivalent feature request in the LLVM issue tracker: https://github.com/llvm/llvm-project/issues/80236
[Bug libgomp/113698] New: GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113698 Bug ID: 113698 Summary: GNU OpenMP with OMP_PROC_BIND alters thread affinity in a way that negatively affects performance Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: kugan at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Created attachment 57275 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57275=edit testcase When OMP_PROC_BIND=true, it seems gomp sets the affinity even before main() starts. In particular, the main thread gets affinity 0x1 (i.e. pinned to the first core). For the attached testcase, I get $ OMP_NUM_THREADS=72 ./a.out [main thread affinity right after main()]. tid:ae511020 aff:... duration: 402.949 msec $ OMP_PROC_BIND=true OMP_NUM_THREADS=72 ./a.out [main thread affinity right after main()]. tid:fffdded50020 aff:...0001 duration: 7879.59 msec $ OMP_PROC_BIND=true OMP_NUM_THREADS=72 ./a.out [main thread affinity right after main()]. tid:ae54c020 aff:...0001 duration: 311219 msec Compiler options used: gcc -O0 -fopenmp repro.c gcc -v: Using built-in specs. 
COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/aarch64-linux-gnu/11/lto-wrapper Target: aarch64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.3.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=aarch64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --enable-fix-cortex-a53-843419 --disable-werror --enable-checking=release --build=aarch64-linux-gnu --host=aarch64-linux-gnu --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)
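The attached testcase is not reproduced in this message. A rough idea of how the main thread's affinity can be inspected right after entering main() on Linux is sketched below. `allowed_cpu_count` is a hypothetical helper, and it relies on the glibc-specific `sched_getaffinity` and `CPU_COUNT`, hence the `_GNU_SOURCE` definition.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

/* Hypothetical helper, not the attached testcase.  Returns how many
   CPUs the calling thread may run on, or -1 on error.  */
int
allowed_cpu_count (void)
{
  cpu_set_t set;

  CPU_ZERO (&set);
  /* A pid of 0 means "the calling thread".  */
  if (sched_getaffinity (0, sizeof set, &set) != 0)
    return -1;
  return CPU_COUNT (&set);
}
```

Called first thing in main() of a program linked with -fopenmp, this would be expected, going by the report, to return 1 when OMP_PROC_BIND=true is set (the aff:...0001 mask) and a larger count otherwise.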
[Bug tree-optimization/113699] New: during GIMPLE pass: bitintlower ICE: SIGSEGV in var_to_partition (tree-ssa-live.h:163) with _BitInt() used in __asm__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113699

Bug ID: 113699
Summary: during GIMPLE pass: bitintlower ICE: SIGSEGV in var_to_partition (tree-ssa-live.h:163) with _BitInt() used in __asm__
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
CC: jakub at gcc dot gnu.org
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 57276 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57276=edit
reduced testcase (from gcc.target/i386/pr41985.c)

Compiler output:
$ x86_64-pc-linux-gnu-gcc testcase.c -wrapper valgrind,-q
==30321== Invalid read of size 8
==30321==    at 0x271DC7F: var_to_partition (tree-ssa-live.h:163)
==30321==    by 0x271DC7F: lower_asm (gimple-lower-bitint.cc:5203)
==30321==    by 0x271DC7F: (anonymous namespace)::bitint_large_huge::lower_stmt(gimple*) (gimple-lower-bitint.cc:5240)
==30321==    by 0x2720009: gimple_lower_bitint() (gimple-lower-bitint.cc:6564)
==30321==    by 0x13CDCBA: execute_one_pass(opt_pass*) (passes.cc:2646)
==30321==    by 0x13CE5AF: execute_pass_list_1(opt_pass*) (passes.cc:2755)
==30321==    by 0x13CE5E8: execute_pass_list(function*, opt_pass*) (passes.cc:2766)
==30321==    by 0xFCDC05: expand (cgraphunit.cc:1843)
==30321==    by 0xFCDC05: cgraph_node::expand() (cgraphunit.cc:1796)
==30321==    by 0xFCEB19: output_in_order (cgraphunit.cc:2193)
==30321==    by 0xFCEB19: symbol_table::compile() [clone .part.0] (cgraphunit.cc:2397)
==30321==    by 0xFD1AC7: compile (cgraphunit.cc:2313)
==30321==    by 0xFD1AC7: symbol_table::finalize_compilation_unit() (cgraphunit.cc:2585)
==30321==    by 0x150FF61: compile_file() (toplev.cc:474)
==30321==    by 0xDE840B: do_compile (toplev.cc:2152)
==30321==    by 0xDE840B: toplev::main(int, char**) (toplev.cc:2308)
==30321==    by 0xDE9BEA: main (main.cc:39)
==30321==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==30321==
during GIMPLE pass: bitintlower0
testcase.c: In function 'foo':
testcase.c:2:1: internal compiler error: Segmentation fault
    2 | foo (void)
      | ^~~
0x150fa7f crash_signal
        /repo/gcc-trunk/gcc/toplev.cc:317
0x271dc7f var_to_partition(_var_map*, tree_node*)
        /repo/gcc-trunk/gcc/tree-ssa-live.h:163
0x271dc7f lower_asm
        /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:5203
0x271dc7f lower_stmt
        /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:5240
0x2720009 gimple_lower_bitint
        /repo/gcc-trunk/gcc/gimple-lower-bitint.cc:6564
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-8665-20240131161256-g3fed1609f61-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240131 (experimental) (GCC)
[Bug target/113560] Strange code generated when optimizing a multiplication on x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560

--- Comment #8 from GCC Commits ---
The master branch has been updated by Roger Sayle :

https://gcc.gnu.org/g:2f14c0dbb789852947cb58fdf7d3162413f053fa

commit r14-8680-g2f14c0dbb789852947cb58fdf7d3162413f053fa
Author: Roger Sayle
Date: Thu Feb 1 06:10:42 2024 +

PR target/113560: Enhance is_widening_mult_rhs_p.

This patch resolves PR113560, a code quality regression from GCC12 affecting x86_64, by enhancing the middle-end's tree-ssa-math-opts.cc to recognize more instances of widening multiplications.

The widening multiplication perception code identifies cases like:

    _1 = (unsigned __int128) x;
    __res = _1 * 100;

but in the reported test case, the original input looks like:

    _1 = (unsigned long long) x;
    _2 = (unsigned __int128) _1;
    __res = _2 * 100;

which gets optimized by constant folding during tree-ssa to:

    _2 = x & 18446744073709551615; // x & 0x
    __res = _2 * 100;

where the BIT_AND_EXPR hides (has consumed) the extension operation. This reveals the more general deficiency (missed optimization opportunity) in widening multiplication perception that additionally both

    __int128 foo(__int128 x, __int128 y) {
      return (x & 1000) * (y & 1000);
    }

and

    unsigned __int128 bar(unsigned __int128 x, unsigned __int128 y) {
      return (x >> 80) * (y >> 80);
    }

should be recognized as widening multiplications. Hence rather than test explicitly for BIT_AND_EXPR (as in the first version of this patch) the more general solution is to make use of range information, as provided by tree_non_zero_bits.
As a demonstration of the observed improvements, function foo above currently with -O2 compiles on x86_64 to:

    foo:    movq    %rdi, %rsi
            movq    %rdx, %r8
            xorl    %edi, %edi
            xorl    %r9d, %r9d
            andl    $1000, %esi
            andl    $1000, %r8d
            movq    %rdi, %rcx
            movq    %r9, %rdx
            imulq   %rsi, %rdx
            movq    %rsi, %rax
            imulq   %r8, %rcx
            addq    %rdx, %rcx
            mulq    %r8
            addq    %rdx, %rcx
            movq    %rcx, %rdx
            ret

with this patch, GCC recognizes the w* and instead generates:

    foo:    movq    %rdi, %rsi
            movq    %rdx, %r8
            andl    $1000, %esi
            andl    $1000, %r8d
            movq    %rsi, %rax
            imulq   %r8
            ret

which is perhaps easier to understand at the tree-level where

    __int128 foo (__int128 x, __int128 y)
    {
      __int128 _1;
      __int128 _2;
      __int128 _5;

      [local count: 1073741824]:
      _1 = x_3(D) & 1000;
      _2 = y_4(D) & 1000;
      _5 = _1 * _2;
      return _5;
    }

gets transformed to:

    __int128 foo (__int128 x, __int128 y)
    {
      __int128 _1;
      __int128 _2;
      __int128 _5;
      signed long _7;
      signed long _8;

      [local count: 1073741824]:
      _1 = x_3(D) & 1000;
      _2 = y_4(D) & 1000;
      _7 = (signed long) _1;
      _8 = (signed long) _2;
      _5 = _7 w* _8;
      return _5;
    }

2023-02-01 Roger Sayle Richard Biener

gcc/ChangeLog
    PR target/113560
    * tree-ssa-math-opts.cc (is_widening_mult_rhs_p): Use range information via tree_non_zero_bits to check if this operand is suitably extended for a widening (or highpart) multiplication.
    (convert_mult_to_widen): Insert explicit casts if the RHS or LHS isn't already of the claimed type.

gcc/testsuite/ChangeLog
    PR target/113560
    * g++.target/i386/pr113560.C: New test case.
    * gcc.target/i386/pr113560.c: Likewise.
    * gcc.dg/pr87954.c: Update test case.
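The equivalence the commit relies on can be checked by hand. The following is only a sketch, not code from the patch: `foo_full` is foo as quoted in the commit message, while `foo_widening` and `widening_mismatches` are hypothetical names introduced here. It assumes a compiler that provides `__int128` (GCC or Clang on x86_64). Because `x & 1000` can only have bits set that are also set in 1000, both operands fit in 64 bits, so a 64x64->128 multiplication yields the same product as the full 128-bit one.

```c
#include <assert.h>
#include <stdint.h>

/* foo as written in the commit message: a full 128-bit multiplication. */
__int128 foo_full(__int128 x, __int128 y)
{
    return (x & 1000) * (y & 1000);
}

/* Hypothetical hand-written counterpart of the transformed GIMPLE:
   narrow both masked operands to 64 bits, then do a widening multiply. */
__int128 foo_widening(__int128 x, __int128 y)
{
    __int128 mx = x & 1000;
    __int128 my = y & 1000;
    /* The mask bounds both operands, which is the range information
       the patch obtains from tree_non_zero_bits. */
    assert(mx >= 0 && mx <= 1000 && my >= 0 && my <= 1000);
    int64_t a = (int64_t)mx;
    int64_t b = (int64_t)my;
    return (__int128)a * b;
}

/* Compare both versions on inputs with bits set far above bit 63.
   Returns the number of inputs on which they disagree. */
int widening_mismatches(void)
{
    const __int128 hi_x = (__int128)1 << 70;
    const __int128 hi_y = (__int128)1 << 90;
    int bad = 0;
    for (int i = -2000; i <= 2000; i++) {
        for (int j = -2000; j <= 2000; j += 7) {
            __int128 x = (__int128)i * hi_x + i;
            __int128 y = (__int128)j * hi_y - j;
            if (foo_full(x, y) != foo_widening(x, y))
                bad++;
        }
    }
    return bad;
}
```

Calling `widening_mismatches()` from a small `main` should report no disagreement; the sampled inputs are arbitrary and do not constitute a proof.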
[Bug tree-optimization/113692] ICE: in lower_stmt, at gimple-lower-bitint.cc:5444 at -O with _BitInt() in a condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113692 --- Comment #3 from Richard Biener --- integer to pointer conversions are not constrained in GIMPLE, only pointer-to-int widening conversions are.
[Bug middle-end/113669] -fsanitize=undefined failed to check a signed integer overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113669

--- Comment #3 from Jiajing_Zheng ---
(In reply to Jakub Jelinek from comment #1)
> This is because already the FE optimizes it, when it sees that
> ((int)(g_B * g_A[1])) & (g_A[1] & g_A[0]) | g_A[0]
> is just being added to unsigned char element, the upper bits of it aren't
> needed, so the multiplication and & and | are all performed in unsigned char
> rather than wider types.

Thanks for your reply. I then used 'gcc -O2 mutation.c -fsanitize=undefined -S' to generate mutation.s. As shown below, the instruction 'addl %r13d, %r13d' shows that the statement 'g_A[0] += temp & (g_A[1] & g_A[0]) | g_A[0];' in the loop is optimized to 'g_A[0] += g_A[0];'.

    .L8:
            addl    %r13d, %r13d
            movslq  %ebx, %rsi
            movb    %r13b, g_A(%rip)
            cmpq    $4, %rsi
            jnb     .L12

Is that what you mean by "the FE optimizes it"? If I want to see what file.c looks like after FE optimization, should I look at the corresponding assembly file.s?
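Why the statement can collapse to 'g_A[0] += g_A[0];' follows from plain bit arithmetic: every bit of `temp & (g_A[1] & g_A[0])` is also set in `g_A[0]`, so OR-ing it with `g_A[0]` gives `g_A[0]` again. The sketch below checks this exhaustively for all `unsigned char` pairs. The function names are hypothetical, `temp` is treated as an arbitrary `int` because the thread does not show its definition, and the code illustrates only the value identity, not where in GCC the simplification happens.

```c
#include <assert.h>
#include <limits.h>

/* The update as written in the report, with temp an arbitrary int. */
unsigned char update_as_written(unsigned char a0, unsigned char a1, int temp)
{
    int masked = temp & (a1 & a0);
    /* The mask keeps only bits that are also set in a0. */
    assert(masked >= 0 && masked <= UCHAR_MAX);
    return (unsigned char)(a0 + (masked | a0));
}

/* The update as it appears in the generated assembly. */
unsigned char update_as_compiled(unsigned char a0)
{
    return (unsigned char)(a0 + a0);
}

/* Returns the number of (a0, a1, temp) combinations that disagree. */
int absorption_mismatches(void)
{
    static const int temps[] = { 0, 1, -1, 12345, INT_MAX, INT_MIN };
    int bad = 0;
    for (int a0 = 0; a0 <= UCHAR_MAX; a0++)
        for (int a1 = 0; a1 <= UCHAR_MAX; a1++)
            for (unsigned t = 0; t < sizeof temps / sizeof temps[0]; t++)
                if (update_as_written((unsigned char)a0, (unsigned char)a1,
                                      temps[t])
                    != update_as_compiled((unsigned char)a0))
                    bad++;
    return bad;
}
```

As for inspecting the front end's output rather than the assembly: GCC's `-fdump-tree-original` option writes a dump of the tree as it leaves the front end, which is probably closer to what the question asks for than file.s.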
[Bug target/113700] libgcc_s does not include symbols for _Float16 and __bf16 on Solaris/Illumos even though gcc generates code for _Float16 and __bf16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113700

--- Comment #2 from Niclas Rosenvik ---
(In reply to Andrew Pinski from comment #1)
> >I tried to add the gcc12 and up parts of
>
> It is correct except it should just use GCC 14 I think.

I forgot to mention that the problem with _Float16 (the missing __extendhfdf2) occurs with GCC 12 as well as GCC 14. If it were possible to select more than one GCC version, I would have marked GCC 12 through 14.
[Bug middle-end/113699] during GIMPLE pass: bitintlower ICE: SIGSEGV in var_to_partition (tree-ssa-live.h:163) with _BitInt() used in __asm__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113699

Andrew Pinski changed:

What|Removed |Added
Last reconfirmed||2024-02-01
Component|tree-optimization |middle-end
Keywords||ice-on-valid-code
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW

--- Comment #1 from Andrew Pinski ---
Confirmed. This happens with an uninitialized variable only.
Here is a valid code testcase:
```
void foo (void)
{
  _BitInt(129) i;
  __asm__ ("": :"rm" (i));
}
```
[Bug testsuite/113685] [14 regression] gcc.dg/vect/vect-117.c fails profile checking with Invalid sum after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113685 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-01 CC||hubicka at gcc dot gnu.org Ever confirmed|0 |1 Target Milestone|--- |14.0 Keywords||testsuite-fail Summary|[14 regression] xxx fails |[14 regression] |after yyy |gcc.dg/vect/vect-117.c ||fails profile checking with ||Invalid sum after ||r14-4089-gd45ddc2c04e471 Status|UNCONFIRMED |NEW --- Comment #1 from Richard Biener --- As said in the other PR, this is more for Honza who thought checking we do not end with invalid profiles for all vect testcases is a good thing ;) Btw, the wrong count pops up in DOM3: t.c.203t.dom3:;; Invalid sum of incoming counts 138435014 (estimated locally, freq 3.0936), should be 134239200 (estimated locally, freq 2.) so it seems to be a jump threading issue. It's gone with -fno-thread-jumps. Very likely a latent issue, but of course the change triggering this does have an effect on jump threading. Confirmed.
[Bug middle-end/113699] during GIMPLE pass: bitintlower ICE: SIGSEGV in var_to_partition (tree-ssa-live.h:163) with _BitInt() used in __asm__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113699 --- Comment #2 from Andrew Pinski --- The code for lowering asm is not expecting an SSA_NAME_IS_DEFAULT_DEF here ...
[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403

--- Comment #8 from Chenghui Pan ---
(In reply to Xi Ruoyao from comment #4)
> After r14-5545 this issue became latent.
>
> And at some point before r14-5545 this issue became nondeterministic: a
> compiled program *sometimes* crashes. Really strange...

After this commit, GCC no longer applies loop peeling when processing the OpenMP reduction directive, which I think is where the problematic code originates (according to Guo Jie's sample).
[Bug rust/113553] rust fails to build on sparc64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113553 --- Comment #2 from Andrew Pinski --- Which glibc are you using? See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82346 and maybe https://sourceware.org/bugzilla/show_bug.cgi?id=22146
[Bug libgcc/113700] New: libgcc_s does not include symbols for _Float16 and __bf16 on Solaris/Illumos even though gcc generates code for _Float16 and __bf16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113700

Bug ID: 113700
Summary: libgcc_s does not include symbols for _Float16 and __bf16 on Solaris/Illumos even though gcc generates code for _Float16 and __bf16
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgcc
Assignee: unassigned at gcc dot gnu.org
Reporter: youremailsarecrap at gmail dot com
Target Milestone: ---

Created attachment 57277 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57277=edit
_Float16 and __bf16 that breaks on Illumos

When compiling the code included in the .ii files on illumos, the linker reports that it can't find the symbols __extendhfdf2, __extendbfsf2 and __truncsfbf2.

Command line and output:
g++ -march=native -shared -shared-libgcc -fPIC -Wl,--no-undefined -o f16-bf16.so f16-bf16.cc
Undefined                       first referenced
 symbol                             in file
__extendhfdf2                       f16-bf16.so-f16-bf16.o
__extendbfsf2                       f16-bf16.so-f16-bf16.o
__truncsfbf2                        f16-bf16.so-f16-bf16.o
ld: fatal: symbol referencing errors. No output written to f16-bf16.so
collect2: error: ld returned 1 exit status

-Wl,--no-undefined is used here so that the error shows up without having to create an executable that links against the .so file. -march=native is used because SSE2 seems to be needed for f16 and bf16 on x86 platforms, if I understand the half-precision documentation in GCC correctly; I compiled this on a system with SSE2.

My own fix, which may not be correct: I added the gcc12-and-up parts of libgcc/config/i386/libgcc-glibc.ver to libgcc/config/i386/libgcc-sol2.ver, rebuilt gcc, and then it linked correctly. Could a maintainer take a look at that?
[Bug target/113684] Cross compiler without assembler and linker should assume that all assembler and linker features are available
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113684 --- Comment #3 from Richard Biener --- I'm usually having cross assembler/linker around as they are easy to build.
[Bug c++/113687] -Warray-bounds is not emitted inside class method
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113687

Richard Biener changed:

What|Removed |Added
Blocks||56456
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Last reconfirmed||2024-02-01

--- Comment #2 from Richard Biener ---
(In reply to Andrew Pinski from comment #1)
> The warning only happens if the vague linkage function is used. and IIRC
> that is by design.

Yeah, we try to avoid diagnosing things in "dead" code, and here the whole functions are dead. IIRC even -fanalyzer runs after cgraph removes unreachable functions. It would still be nice to diagnose these kinds of trivial cases.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
[Bug 56456] [meta-bug] bogus/missing -Warray-bounds
[Bug target/113700] libgcc_s does not include symbols for _Float16 and __bf16 on Solaris/Illumos even though gcc generates code for _Float16 and __bf16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113700 --- Comment #1 from Andrew Pinski --- >I tried to add the gcc12 and up parts of It is correct except it should just use GCC 14 I think.
[Bug target/113690] [13/14 Regression] ICE: in as_a, at machmode.h:381 with -O2 -fno-dce -fno-forward-propagate -fno-split-wide-types -funroll-loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113690 Richard Biener changed: What|Removed |Added Priority|P3 |P2
[Bug middle-end/113669] -fsanitize=undefined failed to check a signed integer overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113669 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-01-31 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- So confirmed.
[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-01-31 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #3 from Richard Biener --- I'll hunt it down.
[Bug go/113668] [14 Regression] libgo soname bump needed for the GCC 14 release?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113668 Richard Biener changed: What|Removed |Added Keywords||ABI CC||rguenth at gcc dot gnu.org Target Milestone|--- |14.0
[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 --- Comment #14 from JuzheZhong --- Thanks Richard. It seems that we can't fix this issue for now. Is that right ? If I understand correctly, do you mean we should wait after SLP representations are finished and then revisit this PR?
[Bug c++/113674] [11/12/13/14 Regression] [[____attr____]] causes internal compiler error: in decl_attributes, at attribs.cc:776
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113674 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-01-31
[Bug tree-optimization/113678] SLP misses up vec_concat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113678

Richard Biener changed:

What|Removed |Added
Last reconfirmed||2024-01-31
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1

--- Comment #1 from Richard Biener ---
I think the SLP tree we discover is sound:

t2.c:11:14: note: node 0x5db76f0 (max_nunits=8, refcnt=2) vector(8) char
t2.c:11:14: note: op template: *a_7(D) = _1;
t2.c:11:14: note: stmt 0 *a_7(D) = _1;
t2.c:11:14: note: stmt 1 MEM[(char *)a_7(D) + 1B] = _2;
t2.c:11:14: note: stmt 2 MEM[(char *)a_7(D) + 2B] = _3;
t2.c:11:14: note: stmt 3 MEM[(char *)a_7(D) + 3B] = _4;
t2.c:11:14: note: stmt 4 MEM[(char *)a_7(D) + 4B] = _1;
t2.c:11:14: note: stmt 5 MEM[(char *)a_7(D) + 5B] = _2;
t2.c:11:14: note: stmt 6 MEM[(char *)a_7(D) + 6B] = _3;
t2.c:11:14: note: stmt 7 MEM[(char *)a_7(D) + 7B] = _4;
t2.c:11:14: note: children 0x5db7778
t2.c:11:14: note: node 0x5db7778 (max_nunits=8, refcnt=2) vector(8) char
t2.c:11:14: note: op template: _1 = *b_6(D);
t2.c:11:14: note: stmt 0 _1 = *b_6(D);
t2.c:11:14: note: stmt 1 _2 = MEM[(char *)b_6(D) + 1B];
t2.c:11:14: note: stmt 2 _3 = MEM[(char *)b_6(D) + 2B];
t2.c:11:14: note: stmt 3 _4 = MEM[(char *)b_6(D) + 3B];
t2.c:11:14: note: stmt 4 _1 = *b_6(D);
t2.c:11:14: note: stmt 5 _2 = MEM[(char *)b_6(D) + 1B];
t2.c:11:14: note: stmt 6 _3 = MEM[(char *)b_6(D) + 2B];
t2.c:11:14: note: stmt 7 _4 = MEM[(char *)b_6(D) + 3B];
t2.c:11:14: note: load permutation { 0 1 2 3 0 1 2 3 }

the issue is as so often

t2.c:11:14: note: ==> examining statement: _1 = *b_6(D);
t2.c:11:14: missed: BB vectorization with gaps at the end of a load is not supported
t2.c:3:19: missed: not vectorized: relevant stmt not supported: _1 = *b_6(D);
t2.c:11:14: note: Building vector operands of 0x5db7778 from scalars instead

where we are not applying much non-ad-hoc work to deal with those "out-of-bound" accesses. The choice here would be obvious in doing a single vector(4) load instead.
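For readers who do not have the test case at hand, the dump can be read back into scalar C. The following is a reconstruction from the dump only, not the attached t2.c; the function names are made up here. `copy_scalar` mirrors the eight stores with load permutation { 0 1 2 3 0 1 2 3 }, and `copy_one_load` sketches the "single vector(4) load" alternative by reading the four bytes of b once and storing them twice.

```c
#include <assert.h>
#include <string.h>

/* Scalar form of the SLP tree in the dump: four loads from b,
   eight stores to a, with the loaded lanes repeated. */
void copy_scalar(char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    char t1 = b[0], t2 = b[1], t3 = b[2], t4 = b[3];
    a[0] = t1; a[1] = t2; a[2] = t3; a[3] = t4;
    a[4] = t1; a[5] = t2; a[6] = t3; a[7] = t4;
}

/* Sketch of the suggested strategy: one 4-byte load, then the lanes
   are duplicated into both halves of the 8-byte destination. */
void copy_one_load(char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    char lanes[4];
    memcpy(lanes, b, sizeof lanes);     /* the single 4-element load */
    memcpy(a, lanes, sizeof lanes);     /* lanes 0..3 */
    memcpy(a + 4, lanes, sizeof lanes); /* lanes 0..3 again */
}

/* Returns 1 when both variants produce the same eight bytes. */
int slp_variants_agree(void)
{
    const char b[4] = { 'w', 'x', 'y', 'z' };
    char a1[8], a2[8];
    copy_scalar(a1, b);
    copy_one_load(a2, b);
    return memcmp(a1, a2, sizeof a1) == 0;
}
```

Neither function reads past `b[3]`, which is the property the "gaps at the end of a load" diagnostic is concerned with.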
[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676 --- Comment #2 from Magnus Hokland Hegdahl --- Hi, here's a version that doesn't need -std=c++20 or argv: https://godbolt.org/z/Y9ooY998e #include constexpr auto bit_ceil(unsigned x) -> unsigned { if (x <= 1) return 1U; int w = 32 - __builtin_clz(x - 1); return 1U << w; } int main(int argc, char **) { auto rounded_n = bit_ceil(static_cast(argc + 1)); auto a = std::vector(2UL * rounded_n); for (std::size_t i = rounded_n; i-- > 1;) { if (!(0 < i && i < rounded_n)) __builtin_unreachable(); a[i] = 0; } } Exact compile command used with g++-12 (GCC) 12.3.0 on arch linux, x86_64: g++-12 -O1 -ftree-vrp main.cpp
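The `__builtin_unreachable()` in the reproducer is only valid if the guarded condition can never be false, so it is worth confirming that the source itself is sound. Below is a C port of just the arithmetic, under the assumption that `argc + 1` stays at or below 2^31 so that the shift in `bit_ceil` is defined; `bit_ceil_u`, `count_violations` and `total_violations` are names chosen here, and the `std::vector` store is left out.

```c
#include <assert.h>
#include <stddef.h>

/* C port of bit_ceil from the reproducer (needs __builtin_clz,
   available in GCC and Clang). */
unsigned bit_ceil_u(unsigned x)
{
    if (x <= 1)
        return 1U;
    assert(x <= 0x80000000u); /* keeps the shift count below 32 */
    int w = 32 - __builtin_clz(x - 1);
    return 1U << w;
}

/* Runs the reproducer's loop and counts iterations in which the
   condition guarded by __builtin_unreachable() would be false. */
size_t count_violations(int argc)
{
    unsigned rounded_n = bit_ceil_u((unsigned)(argc + 1));
    size_t violations = 0;
    for (size_t i = rounded_n; i-- > 1;) {
        if (!(0 < i && i < rounded_n))
            violations++;
    }
    return violations;
}

/* Sums the violations for argc = 0 .. max_argc. */
size_t total_violations(int max_argc)
{
    size_t total = 0;
    for (int argc = 0; argc <= max_argc; argc++)
        total += count_violations(argc);
    return total;
}
```

Inside the loop body `i` always lies in 1 .. rounded_n - 1, so the condition holds on every iteration and the unreachable annotation in the original is legitimate for these inputs.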
[Bug target/113633] FAIL: gcc.dg/bf-ms-attrib.c execution test, wrong size for ms_struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113633 LIU Hao changed: What|Removed |Added CC||lh_mouse at 126 dot com --- Comment #1 from LIU Hao --- My suggestion is that following what MSVC produces is the only way to go.
[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 --- Comment #19 from rguenther at suse dot de --- On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 > > --- Comment #18 from JuzheZhong --- > (In reply to rguent...@suse.de from comment #17) > > On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 > > > > > > --- Comment #16 from JuzheZhong --- > > > (In reply to rguent...@suse.de from comment #15) > > > > On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote: > > > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 > > > > > > > > > > --- Comment #14 from JuzheZhong --- > > > > > Thanks Richard. > > > > > > > > > > It seems that we can't fix this issue for now. Is that right ? > > > > > > > > > > If I understand correctly, do you mean we should wait after SLP > > > > > representations > > > > > are finished and then revisit this PR? > > > > > > > > Yes. > > > > > > It seems to be a big refactor work. > > > > It's not too bad if people wouldn't continue to add features not > > implementing SLP ... > > > > > I wonder I can do anything to help with SLP representations ? > > > > I hope to get back to this before stage1 re-opens and will post > > another request for testing. It's really mostly going to be making > > sure all paths have coverage which means testing all the various > > architectures - I can only easily test x86. There's a branch > > I worked on last year, refs/users/rguenth/heads/vect-force-slp, > > which I use to hunt down cases not supporting SLP (it's a bit > > overeager to trigger, and it has known holes so it's not really > > a good starting point yet for folks to try other archs). > > Ok. It seems that you almost done with that but needs more testing in > various targets. 
>
> So, if I want to work on optimizing vectorization (start with TSVC),
> I should avoid touching the failed vectorized due to data reference/dependence
> analysis (e.g. this PR case, s116).

It depends on the actual case - the one in this bug at least looks like half of it might be dealt with by the refactoring.

> and avoid adding new features into loop vectorizer, e.g. min/max reduction
> with
> index (s315).

It's fine to add features if they work with SLP as well ;) Note that in the future SLP will also do the "single lane" case, but it doesn't do that on trunk. Some features are difficult with multi-lane SLP and probably not important in practice for that case; still, handling single-lane SLP will be important as otherwise the feature is lost.

> To not to make your SLP refactoring work heavier.
>
> Am I right ?

Yes. I've got early break vectorization to chase now, I was "finished" with the parts I could exercise on x86_64 in autumn ...
[Bug c/113679] New: long long minus double with gcc -m32 produces different results than other compilers or gcc -m64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

Bug ID: 113679
Summary: long long minus double with gcc -m32 produces different results than other compilers or gcc -m64
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: dilyan.palauzov at aegee dot org
Target Milestone: ---

diff.c is:

#include <stdio.h>
int main(void) {
  long long l = 9223372036854775806;
  double d = 9223372036854775808.0;
  printf("%f\n", (double)l - d);
  return 0;
}

With gcc (GCC) 13.2.1 20231205 (Red Hat 13.2.1-6), gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, clang 16.0.4 and clang 17.0.5:

$ gcc -m64 -o diff diff.c && ./diff
0.00
$ gcc -m32 -o diff diff.c && ./diff
-2.00
$ clang -m64 -o diff diff.c && ./diff
0.00
$ clang -m32 -o diff diff.c && ./diff
0.00

With cl.exe 19.29.30153 (first is x86, 32 bit; second is 64 bit):

C:\> CALL "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" x86 10.0.17763.0
C:\> cl diff.c >nul 2>nul & .\diff.exe
0.00
C:\> CALL "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64 10.0.17763.0
C:\> cl diff.c >nul 2>nul & .\diff.exe
0.00

gcc -m32 produces a different result from gcc -m64, clang 17 (32 and 64 bit), and MSVC from Visual Studio 2019 (32 and 64 bit).
[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679 --- Comment #4 from Jakub Jelinek --- Yeah, it is, that is how excess precision behaves. Due to the cast applying just to l rather than l - d it returns 0.0 with -fexcess-precision=standard, but if you change it to (double)(l - d) then it will return -2.0 at all optimization levels with -fexcess-precision=standard. -fexcess-precision=fast behaves depending on what instructions are actually used and where the conversions to float or double happen due to storing of expressions or subexpressions into memory as documented. If you don't like excess precision and have SSE2, you can use -msse2 -mfpmath=sse.
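The two results can be reproduced without -m32 by making the evaluation format explicit. This is a sketch under the assumption that `long double` is the x87 80-bit format (64-bit significand), as on x86 Linux; the function names are invented for illustration. `diff_in_double` rounds the converted operand to double first, as the cast does under -fexcess-precision=standard, and `diff_in_long_double` models a subtraction carried out with excess precision.

```c
#include <assert.h>
#include <float.h>

/* Round (double)l to a real double before subtracting; the volatile
   store forces the rounding even when excess precision is in use. */
double diff_in_double(long long l, double d)
{
    volatile double rounded = (double)l;
    return rounded - d;
}

/* Model of evaluating the subtraction in the wider x87 format. */
long double diff_in_long_double(long long l, double d)
{
    return (long double)l - (long double)d;
}

/* 1 when long double can represent every 64-bit integer exactly. */
int long_double_is_wider(void)
{
    return LDBL_MANT_DIG >= 64;
}

/* Usage example with the values from the report. */
void excess_precision_demo(void)
{
    const long long l = 9223372036854775806LL; /* 2^63 - 2 */
    const double d = 9223372036854775808.0;    /* 2^63 */
    /* (double)l rounds up to 2^63, so the difference vanishes. */
    assert(diff_in_double(l, d) == 0.0);
    /* With a 64-bit significand 2^63 - 2 is exact, so -2 survives. */
    if (long_double_is_wider())
        assert(diff_in_long_double(l, d) == -2.0L);
}
```

On targets where `long double` is no wider than `double` (MSVC, for example) the second check is skipped, which is consistent with the 0.00 reported for cl.exe.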
[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

--- Comment #4 from GCC Commits ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:924137b9012cee5603482242de08fbf0b2030f6a

commit r14-8645-g924137b9012cee5603482242de08fbf0b2030f6a
Author: Richard Biener
Date: Wed Jan 31 09:09:50 2024 +0100

tree-optimization/113670 - gather/scatter to/from hard registers

The following makes sure we're not taking the address of hard registers when vectorizing apparent gathers or scatters to/from them.

    PR tree-optimization/113670
    * tree-vect-data-refs.cc (vect_check_gather_scatter): Make sure we can take the address of the reference base.
    * gcc.target/i386/pr113670.c: New testcase.
[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 --- Comment #15 from rguenther at suse dot de --- On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395 > > --- Comment #14 from JuzheZhong --- > Thanks Richard. > > It seems that we can't fix this issue for now. Is that right ? > > If I understand correctly, do you mean we should wait after SLP > representations > are finished and then revisit this PR? Yes.
[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670 Richard Biener changed: What|Removed |Added Known to fail|14.0| Target Milestone|--- |14.0 Resolution|--- |FIXED Status|ASSIGNED|RESOLVED Known to work||14.0 --- Comment #5 from Richard Biener --- Fixed for trunk.
[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676 Jakub Jelinek changed: What|Removed |Added CC||aldyh at gcc dot gnu.org, ||amacleod at redhat dot com, ||jakub at gcc dot gnu.org Keywords|needs-bisection | --- Comment #3 from Jakub Jelinek --- Bisection with -O2 -ftree-vrp #include unsigned bit_ceil (unsigned x) { if (x <= 1) return 1U; int w = 32 - __builtin_clz (x - 1); return 1U << w; } int main (int argc, char **) { unsigned rounded_n = bit_ceil ((unsigned) (argc + 1)); auto a = std::vector (2UL * rounded_n); for (long unsigned int i = rounded_n; i-- > 1;) { if (!(0 < i && i < rounded_n)) __builtin_unreachable(); a[i] = 0; } } shows this started with r12-155-gd8e1f1d24179690fd9c0f63c27b12e030010d9ea and went away with r13-3596-ge7310e24b1c0ca67b1bb507c1330b2bf39e59e32 so nothing really backportable.
[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676 --- Comment #4 from Jakub Jelinek --- And with --param=vrp1-mode=vrp it segfaulted even with r13-4276-gce917b0422c145779b83e005afd8433c0c86fb06 but the next revision removed that parameter, so can't go further.