[Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #6 from Xi Ruoyao --- (In reply to absoler from comment #3) > The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in > the follow-up process, this shouldn't be expected behavior. > > We question this option because we found in an older version of gcc > (10.2.0), only the O2 option is needed to produce the same bad code, so we > worry about there's a hidden un-fixed problem and it's re-triggered by this > option. > > Besides, the bad binary code introduce more load operation than the source > code (without optimization), so we thought it's necessary to check it > regardless of which optimization is disabled. (In reply to absoler from comment #5) > (In reply to Andrew Pinski from comment #2) > > The difference from the gimple level IR: > > ``` > > _14 = g_26[5][3][0]; > > _15 = (int) _14; > > _16 = _13 ^ _15; > > g_51 = _16; > > if (_13 != _15) > > ``` > > > > vs: > > ``` > > _14 = g_26[5][3][0]; > > _15 = (int) _14; > > _16 = _13 ^ _15; > > g_51 = _16; > > if (_16 != 0) > > goto ; [50.00%] > > else > > goto ; [50.00%] > > ``` > > > > > > This is expected behavior even for the x86_64 target > > The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in > the follow-up process, this shouldn't be expected behavior. No. GIMPLE pass knows nothing about register pressure, _13 is only a temporary variable, not necessarily an register. > We question this option because we found in an older version of gcc > (10.2.0), only the O2 option is needed to produce the same bad code, so we > worry about there's a hidden un-fixed problem and it's re-triggered by this > option. So are you expecting a bug must be fixed in at least two different passes and any -fno-* option shouldn't regress the code? No this won't happen.
[Bug middle-end/113205] [14 Regression] internal compiler error: in backward_pass, at tree-vect-slp.cc:5346 since r14-3220
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205 --- Comment #10 from Richard Biener --- Btw, I was hoping Richard would chime in here ...
[Bug c++/113834] [14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

Andrew Pinski changed:

What|Removed |Added
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Last reconfirmed||2024-02-09
Summary|internal compiler error: in tree_to_shwi, at tree.cc:6461 |[14 Regression] internal compiler error: in tree_to_shwi, at tree.cc:6461
CC||ppalka at gcc dot gnu.org
Target Milestone|--- |14.0
Keywords|needs-bisection |

--- Comment #4 from Andrew Pinski ---
__type_pack_element support was introduced in r14-92-g58b7dbf865b146 . So yes this is visual regression in that using the libstdc++ headers in GCC 13 will not ICE but using them in GCC 14+, there will be an ICE.
[Bug libstdc++/113835] [13/14 Regression] compiling std::vector with const size in C++20 is slow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113835

Richard Biener changed:

What|Removed |Added
Last reconfirmed||2024-02-09
Target Milestone|--- |13.3
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Known to fail||13.2.1, 14.0
Component|c++ |libstdc++
Known to work||12.2.1
Summary|compiling std::vector with const size in C++20 is slow |[13/14 Regression] compiling std::vector with const size in C++20 is slow

--- Comment #1 from Richard Biener ---
Confirmed with -std=c++20 -fsyntax-only

constant expression evaluation : 1.80 ( 85%) 0.03 ( 14%) 1.84 ( 78%) 220M ( 88%)
TOTAL : 2.13 0.22 2.36 250M

Samples: 8K of event 'cycles', Event count (approx.): 9294971478
Overhead Samples Command Shared Object Symbol
16.33% 1385 cc1plus cc1plus [.] cxx_eval_constant_expression
4.35% 369 cc1plus cc1plus [.] cxx_eval_call_expression
3.90% 331 cc1plus cc1plus [.] cxx_eval_store_expression
3.16% 268 cc1plus cc1plus [.] hash_table::find_s
1.98% 168 cc1plus cc1plus [.] tree_operand_check

GCC 12 was fast (possibly std::vector wasn't constexpr there?)
[Bug tree-optimization/113833] 435.gromacs fails verification on with -Ofast -march={cascadelake,icelake-server} and PGO after r14-7272-g57f611604e8bab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113833 --- Comment #3 from Richard Biener --- I suspect the issue would pop up with -Ofast -fno-vect-cost-model for any sub-architecture. The patch referenced just adjusts costs for doing BB vectorization (and there's reductions there as well). It might be interesting to offer more high-level knobs to tune for vectorization, say -fno-vect-bb-reduction or -fforce-in-order-bb-reduction-vectorization. A compare before/after the patch of -fopt-info-vec output might show the few cases that are affected by the patch.
[Bug target/113827] MrBayes benchmark redundant load
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113827

--- Comment #3 from Andrew Pinski ---
(In reply to Robin Dapp from comment #0)
> A hot block in the MrBayes benchmark (as used in the Phoronix testsuite) has
> a redundant scalar load when vectorized.
>
> Minimal example, compiled with -march=rv64gcv -O3
>
> int foo (float **a, float f, int n)
> {
>   for (int i = 0; i < n; i++)
>     {
>       a[i][0] /= f;
>       a[i][1] /= f;
>       a[i][2] /= f;
>       a[i][3] /= f;
>       a[i] += 4;
>     }
> }

LLVM for aarch64 with the above testcase:
```
.L3:
        ldr x2, [x0]
        mov x1, x2
        ldr q31, [x2]
        fdiv v31.4s, v31.4s, v0.4s
        str q31, [x1], 16
        str x1, [x0], 8 HERE
        cmp x3, x0
        bne .L3
```

There is a store of x1 there. I really think you messed up reducing the testcase.
[Bug middle-end/65947] Vectorizer misses conditional assignment of constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65947

Andrew Pinski changed:

What|Removed |Added
Severity|normal |enhancement
Target Milestone|--- |6.0
[Bug target/113827] MrBayes benchmark redundant load
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113827

Andrew Pinski changed:

What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED |WAITING
Last reconfirmed||2024-02-09

--- Comment #2 from Andrew Pinski ---
>a redundant scalar load

I don't see any redundant load in that loop.
```
L3:
        movq (%rdi), %rax ;; load a[i] from rdi
        vmovups (%rax), %xmm1 ;; load rax[0-3] into vector
        vdivps %xmm0, %xmm1, %xmm1 ;; divide
        vmovups %xmm1, (%rax) ;; store result back into rax[0-3]
        addq $16, %rax ;; add 4*4 to rax
        movq %rax, (%rdi) ;; store rax back into rdi
        addq $8, %rdi ;; add 8 to rdi
        cmpq %rdi, %rdx
        jne .L3 ;; compare and loop back
```

That is a[i] is different between each iterations. Maybe you reduced this code too much?
[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Peter Bergner changed:

What|Removed |Added
CC||meissner at gcc dot gnu.org

--- Comment #15 from Peter Bergner ---
(In reply to Kewen Lin from comment #11)
> In gcc, lfiwzx is guarded with TARGET_LFIWZX => TARGET_POPCNTD (ISA2.06),
> while -mvsx will guarantee TARGET_POPCNTD (ISA_2_6_MASKS_SERVER) set, so it
> considers lfiwzx is supported. IMHO the underlying philosophy is that having
> the capability of vsx the supported ISA level is at least 2.06, lfiwzx is
> supported from 2.06, so it's supported.
>
> But binutils seems not to follow it:
> {"xvadddp", XX3(60,96), XX3_MASK,PPCVSX,PPCVLE,
> {XT6, XA6, XB6}},
> {"lfiwzx", X(31,887), X_MASK, POWER7|PPCA2, 0,
> {FRT, RA0, RB}},
> Both are guarded with different masks and apparently PPCVSX doesn't enable
> POWER7.

That's because xvadddp is a VSX instruction (i.e., mentioned in the VSX section of the ISA), while lfiwzx is a floating point instruction and part of the base ISA (for Power7 and above). To me, that means the -mvsx assembler option is correct to not enable lfiwzx.

...and as Alan mentioned, even changing the assembler to have -mvsx enable lfiwzx isn't a solution, since old, already released assemblers would still be broken.

The problem seems to be that the GCC option -mvsx enables some base (i.e., non-VSX) instructions not included in the 7450, which seems dangerous to me. If the VSX support in the compiler really needs those base Power7 instructions to function correctly, then we should be emitting an error when the user does -mcpu=CPU -mvsx and CPU is something less than power7. If the VSX support doesn't really need those base Power7 instructions to operate, then we shouldn't be enabling them.

Mike, can you confirm whether our -mvsx VSX support requires those base Power7 instructions or not?
[Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838 --- Comment #5 from absoler at smail dot nju.edu.cn --- (In reply to Andrew Pinski from comment #2) > The difference from the gimple level IR: > ``` > _14 = g_26[5][3][0]; > _15 = (int) _14; > _16 = _13 ^ _15; > g_51 = _16; > if (_13 != _15) > ``` > > vs: > ``` > _14 = g_26[5][3][0]; > _15 = (int) _14; > _16 = _13 ^ _15; > g_51 = _16; > if (_16 != 0) > goto ; [50.00%] > else > goto ; [50.00%] > ``` > > > This is expected behavior even for the x86_64 target The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in the follow-up process, this shouldn't be expected behavior. We question this option because we found in an older version of gcc (10.2.0), only the O2 option is needed to produce the same bad code, so we worry about there's a hidden un-fixed problem and it's re-triggered by this option. Besides, the bad binary code introduce more load operation than the source code (without optimization), so we thought it's necessary to check it regardless of which optimization is disabled.
[Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838 --- Comment #4 from absoler at smail dot nju.edu.cn --- @(In reply to Jakub Jelinek from comment #1) > Disabling optimizations and then wondering why optimizations didn't happen > is too weird. Don't do that. Such options are intended for debugging GCC, > or perhaps working around some compiler or application bug, but performance > of generated code should be only judged without such options. The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in the follow-up process, this shouldn't be expected behavior. We question this option because we found in an older version of gcc (10.2.0), only the O2 option is needed to produce the same bad code, so we worry about there's a hidden un-fixed problem and it's re-triggered by this option. Besides, the bad binary code introduce more load operation than the source code (without optimization), so we thought it's necessary to check it regardless of which optimization is disabled.
[Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838 --- Comment #3 from absoler at smail dot nju.edu.cn --- The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in the follow-up process, this shouldn't be expected behavior. We question this option because we found in an older version of gcc (10.2.0), only the O2 option is needed to produce the same bad code, so we worry about there's a hidden un-fixed problem and it's re-triggered by this option. Besides, the bad binary code introduce more load operation than the source code (without optimization), so we thought it's necessary to check it regardless of which optimization is disabled.
[Bug c++/113834] internal compiler error: in tree_to_shwi, at tree.cc:6461
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834 --- Comment #3 from Andrew Pinski --- Reduced testcase: ``` template class tuple{}; template __type_pack_element<__i, _Elements...> (tuple<_Elements...> &__t) noexcept; tuple data; template unsigned take_impl(unsigned idx) { if constexpr (Level != -1){ return take_impl(get(data)); } return 0; } int main() { take_impl<2>(0); } ``` Note I think this is invalid code ...
[Bug c++/113830] GCC accepts invalid code when instantiating the local class inside a function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113830 --- Comment #13 from Bo Wang --- (In reply to Harald van Dijk from comment #12) > (In reply to Bo Wang from comment #11) > > I have read the working draft standard of C++20 > > (https://github.com/cplusplus/draft/tree/c%2B%2B20). > > > > Following the subsection "13.9.2 Explicit instantiation" in the section > > "13.9 Template instantiation and specialization", the statement `template > > void f();` is an explicit instantiation, which requires instantiating > > everything in the function. > > Where are you getting "everything in the function" from? It seems to say > rather the opposite in [temp.explicit]p14: > > > An explicit instantiation does not constitute a use of a default argument, > > so default argument instantiation is not done. > > Now, the example shows that this was intended to apply to default arguments > of the function itself, but the actual wording does not limit it to that, so > I actually think this is a bug in clang, by the current wording this must be > accepted? Please refer to the example in Comment 9, which has no default arguments. As for the standard, I found this in "13.9 Template instantiation and specialization" p6 of C++20, which requires access checking. > The usual access checking rules do not apply to names in a declaration of an > explicit instantiation or explicit specialization, with the exception of > names > appearing in a function body, default argument, base-clause, member- > specification, enumerator-list, or static data member or variable template > initializer. [Note: In particular, the template arguments and names used in > the > function declarator (including parameter types, return types and exception > specifications) may be private types or objects that would normally not be > accessible. —end note] Also, I don't think Clang is buggy in rejecting this code.
[Bug c++/113844] New: inline namespace lookup ambiguity for using
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113844 Bug ID: 113844 Summary: inline namespace lookup ambiguity for using Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ldalessandro at gmail dot com Target Milestone: --- namespace a { inline namespace b { inline int b{}; } } using namespace a::b; // gcc & msvc & edg reject, clang accepts I think that by https://eel.is/c++draft/basic.lookup.udir#1 this using should be okay, but gcc seems to be considering the a::b::b integer as a candidate, which feels wrong. Is there somewhere else that makes this ambiguous?
[Bug middle-end/113205] [14 Regression] internal compiler error: in backward_pass, at tree-vect-slp.cc:5346 since r14-3220
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205 --- Comment #9 from Sérgio Basto --- Thank you it worked , MLT was built successfully on Fedora Rawhide
[Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

Andrew Pinski changed:

What|Removed |Added
Target||x86_64
Component|tree-optimization |target
Status|UNCONFIRMED |RESOLVED
Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski ---
The difference from the gimple level IR:
```
_14 = g_26[5][3][0];
_15 = (int) _14;
_16 = _13 ^ _15;
g_51 = _16;
if (_13 != _15)
```

vs:
```
_14 = g_26[5][3][0];
_15 = (int) _14;
_16 = _13 ^ _15;
g_51 = _16;
if (_16 != 0)
  goto ; [50.00%]
else
  goto ; [50.00%]
```

This is expected behavior even for the x86_64 target
[Bug tree-optimization/113833] 435.gromacs fails verification on with -Ofast -march={cascadelake,icelake-server} and PGO after r14-7272-g57f611604e8bab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113833 --- Comment #2 from Andrew Pinski --- Looking at the past issues (which we "fixed"), makes me wonder about the spec verification testing for gromacs and the use of -Ofast ...
[Bug tree-optimization/113833] 435.gromacs fails verification on with -Ofast -march={cascadelake,icelake-server} and PGO after r14-7272-g57f611604e8bab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113833

Andrew Pinski changed:

What|Removed |Added
See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418

--- Comment #1 from Andrew Pinski ---
From https://www.spec.org/cpu2006/Docs/435.gromacs.html :
> The gromacs.out results shouldn't differ by more than 1.25% from the
> reference values.
[Bug c++/113834] internal compiler error: in tree_to_shwi, at tree.cc:6461
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834

Andrew Pinski changed:

What|Removed |Added
CC||pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski ---
Reducing, this looks like an ICE after an error ...
[Bug c++/100326] Crash with `#pragma GCC unroll` when calling value which can't be called in template function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100326 --- Comment #3 from Andrew Pinski --- I even tried: ``` template void f(T v) { #pragma GCC unroll v() for (int i = 0; i < 10; i++) { } } int main() { f(0); } ```
[Bug c++/100326] Crash with `#pragma GCC unroll` when calling value which can't be called in template function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100326 --- Comment #2 from Andrew Pinski --- This seems to be fixed on the trunk. I think by r14-6193-g59be79fd596ec8 .
[Bug libgomp/113843] FAIL: libgomp.c/alloc-pinned-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113843

Andrew Pinski changed:

What|Removed |Added
See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085

--- Comment #1 from Andrew Pinski ---
Most likely a similar issue as on powerpc (and aarch64), see PR 113085 for some analysis.
[Bug libgomp/113843] New: FAIL: libgomp.c/alloc-pinned-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113843 Bug ID: 113843 Summary: FAIL: libgomp.c/alloc-pinned-1.c execution test Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: danglin at gcc dot gnu.org CC: jakub at gcc dot gnu.org Target Milestone: --- Host: hppa*-*-linux* Target: hppa*-*-linux* Build: hppa*-*-linux* spawn -ignore SIGHUP /home/dave/gnu/gcc/objdir/gcc/xgcc -B/home/dave/gnu/gcc/obj dir/gcc/ /home/dave/gnu/gcc/gcc/libgomp/testsuite/libgomp.c/alloc-pinned-1.c -B/ home/dave/gnu/gcc/objdir/hppa-linux-gnu/./libgomp/ -B/home/dave/gnu/gcc/objdir/h ppa-linux-gnu/./libgomp/.libs -I/home/dave/gnu/gcc/objdir/hppa-linux-gnu/./libgo mp -I/home/dave/gnu/gcc/gcc/libgomp/testsuite/../../include -I/home/dave/gnu/gcc /gcc/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagn ostics-color=never -fopenmp -O2 -L/home/dave/gnu/gcc/objdir/hppa-linux-gnu/./lib gomp/.libs -lm -o ./alloc-pinned-1.exe PASS: libgomp.c/alloc-pinned-1.c (test for excess errors) Got flock('../lock', '--exclusive') at Thu Feb 08 13:32:33 UTC 2024 after 0 s Setting LD_LIBRARY_PATH to .:/home/dave/gnu/gcc/objdir/gcc:/home/dave/gnu/gcc/ob jdir/hppa-linux-gnu/./libgomp/.libs:/home/dave/gnu/gcc/objdir/gcc:.:/home/dave/g nu/gcc/objdir/gcc:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/./libgomp/.libs:/home /dave/gnu/gcc/objdir/gcc:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libstdc++-v3/s rc/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libssp/.libs:/home/dave/gnu/gc c/objdir/hppa-linux-gnu/libphobos/src/.libs:/home/dave/gnu/gcc/objdir/hppa-linux -gnu/libgm2/.libs:/home/dave/gnu/gcc/objdir/hppa-linux-gnu/libgomp/.libs:/home/d ave/gnu/gcc/objdir/hppa-linux-gnu/libatomic/.libs:/home/dave/gnu/gcc/objdir/./gc c:/home/dave/gnu/gcc/objdir/./prev-gcc Execution timeout is: 300 spawn [open ...] 
FAIL: libgomp.c/alloc-pinned-1.c execution test Breakpoint 3, main () at /home/dave/gnu/gcc/gcc/libgomp/testsuite/libgomp.c/alloc-pinned-1.c:96 96if (amount == 0) (gdb) p amount $3 = 0 Program then aborts. mx3210:~# cat /proc/16240/status Name: alloc-pinned-1. Umask: 0022 State: t (tracing stop) Tgid: 16240 [...] VmLck: 0 kB VmPin: 0 kB dave@mx3210:~/gnu/gcc/objdir/hppa-linux-gnu/libgomp/testsuite$ ulimit -a real-time non-blocking time (microseconds, -R) unlimited core file size (blocks, -c) unlimited data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 31949 max locked memory (kbytes, -l) 1025596 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 81920 cpu time (seconds, -t) unlimited max user processes (-u) 31949 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Similar fails: libgomp.c/alloc-pinned-2.c libgomp.c/alloc-pinned-3.c libgomp.c/alloc-pinned-4.c
[Bug libstdc++/113841] Can't swap two std::hash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841 --- Comment #1 from Viktor Ostashevskyi --- Issue is visible with -std=c++20, works fine for -std=c++17 (for both GCC12 and Clang).
[Bug libstdc++/113841] Can't swap two std::hash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841 --- Comment #2 from Viktor Ostashevskyi --- Compiler Explorer link: https://godbolt.org/z/cPqsKq6nM
[Bug jit/113842] New: Assertion failure in assemble_external_libcall due to a missing finalizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113842 Bug ID: 113842 Summary: Assertion failure in assemble_external_libcall due to a missing finalizer Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: jit Assignee: dmalcolm at gcc dot gnu.org Reporter: bouanto at zoho dot com Target Milestone: --- This happens when compiling code with try/catch, so not yet possible to trigger on master. I'll soon post the patch to fix this issue.
[Bug libstdc++/113841] New: Can't swap two std::hash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113841 Bug ID: 113841 Summary: Can't swap two std::hash Product: gcc Version: 12.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: ostash at ostash dot kiev.ua Target Milestone: --- For the following code snippet: --- #include #include class Storage; template class MyAllocator { public: using value_type = T; using pointer = T*; MyAllocator(Storage* s); template MyAllocator(MyAllocator const& other) noexcept; T* allocate( std::size_t n ); void deallocate(T* p, std::size_t n ); private: Storage* s_; }; class Foo{ public: Foo(int, int); }; void x() { using std::swap; using MyVec = std::vector>; using MyArrVec = std::array; using MyHash = std::hash*>; MyHash h1, h2; swap(h1, h2); } --- GCC 12 reports: In file included from /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/vector:64, from :2: /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/bits/stl_vector.h: In instantiation of 'constexpr std::_Vector_base<_Tp, _Alloc>::_Vector_impl::_Vector_impl() [with _Tp = int; _Alloc = MyAllocator]': /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/bits/stl_vector.h:312:7: required by substitution of 'template static std::true_type std::__do_is_implicitly_default_constructible_impl::__test(const _Tp&, decltype (__helper(()))*) [with _Tp = std::array >, 1>]' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:1249:30: required from 'struct std::__is_implicitly_default_constructible_impl >, 1> >' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:1253:12: required from 'struct std::__is_implicitly_default_constructible_safe >, 1> >' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:167:12: required from 'struct std::__and_ >, 1> >, std::__is_implicitly_default_constructible_safe >, 1> > >' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:1258:12: required from 'struct 
std::__is_implicitly_default_constructible >, 1> >' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:167:12: required from 'struct std::__and_, std::__is_implicitly_default_constructible >, 1> > >' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:178:41: required from 'struct std::__not_, std::__is_implicitly_default_constructible >, 1> > > >' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/bits/stl_pair.h:226:16: required from 'struct std::pair >, 1> >' :38:14: required from here /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/type_traits:1240:58: in 'constexpr' expansion of 'std::vector >()' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/bits/stl_vector.h:526:7: in 'constexpr' expansion of '((std::vector >*)this)->std::vector >::.std::_Vector_base >::_Vector_base()' /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/bits/stl_vector.h:139:26: error: no matching function for call to 'MyAllocator::MyAllocator()' 139 | : _Tp_alloc_type() | ^ :16:3: note: candidate: 'template MyAllocator::MyAllocator(const MyAllocator&) [with T = int]' 16 | MyAllocator(MyAllocator const& other) noexcept; | ^~~ :16:3: note: template argument deduction/substitution failed: /opt/compiler-explorer/gcc-12.1.0/include/c++/12.1.0/bits/stl_vector.h:139:26: note: candidate expects 1 argument, 0 provided 139 | : _Tp_alloc_type() | ^ :13:3: note: candidate: 'MyAllocator::MyAllocator(Storage*) [with T = int]' 13 | MyAllocator(Storage* s); | ^~~ :13:3: note: candidate expects 1 argument, 0 provided :7:7: note: candidate: 'constexpr MyAllocator::MyAllocator(const MyAllocator&)' 7 | class MyAllocator | ^~~ :7:7: note: candidate expects 1 argument, 0 provided :7:7: note: candidate: 'constexpr MyAllocator::MyAllocator(MyAllocator&&)' :7:7: note: candidate expects 1 argument, 0 provided Compiler returned: 1 Clang 15, 16, 17 also fail to compile with similar error about missing default constructor for custom allocator when they are using libstdc++ >=12 
GCC 13 is able to compile this, same as Clang with libc++.
[Bug tree-optimization/113673] [12/13/14 Regression] ICE: verify_flow_info failed: BB 5 cannot throw but has an EH edge with -Os -finstrument-functions -fnon-call-exceptions -ftrapv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673 --- Comment #5 from Andrew Pinski --- (In reply to Roger Sayle from comment #4) > The identified patch implements += the same way as |=. Presumably a version > of the test case replacing "m += *data++;" with "m |= *data++;" would be > more useful at identifying a patch that actually changed EH edges. Well + can trap on overflow with -ftrapv (and cause exceptions with -fnon-call-exceptions) but | will not/cannot trap ... So that patch is definitely the one which changes the EH edges.
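To make the distinction in the comment above concrete, here is a minimal sketch of why signed `+` has a trapping case under -ftrapv while `|` does not. The helper names `add_would_overflow` and `or_never_traps` are hypothetical and are not taken from the PR's test case; the check is written portably so that it never overflows itself.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: true when a + b would overflow int. This is the
// case in which -ftrapv makes the addition trap (and, together with
// -fnon-call-exceptions, gives the statement an EH edge).
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;  // would exceed INT_MAX
    if (b < 0) return a < INT_MIN - b;  // would go below INT_MIN
    return false;
}

// Bitwise OR has no overflow case: every pair of ints yields a valid
// int, so there is nothing for -ftrapv to trap on.
int or_never_traps(int a, int b) {
    return a | b;
}
```

A test case that replaces `+=` with `|=` therefore removes the only operation in the statement that could trap on overflow.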
[Bug libstdc++/100147] libstdc++-v3/include/bits/gslice.h:170: missing check for assignment to self ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100147

Jonathan Wakely changed:

What|Removed |Added
Target Milestone|--- |14.0
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED

--- Comment #4 from Jonathan Wakely ---
Fixed
[Bug other/89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck, clang-static-analyzer, PVS-studio) find that gcc misses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863

Bug 89863 depends on bug 100147, which changed state.

Bug 100147 Summary: libstdc++-v3/include/bits/gslice.h:170: missing check for assignment to self ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100147

What|Removed |Added
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
[Bug libstdc++/100147] libstdc++-v3/include/bits/gslice.h:170: missing check for assignment to self ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100147 --- Comment #3 from GCC Commits --- The master branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:4e5dc6d9686a34d446147b923fe838389758a512 commit r14-8890-g4e5dc6d9686a34d446147b923fe838389758a512 Author: Jonathan Wakely Date: Sun Feb 4 21:39:11 2024 + libstdc++: Add comment to gslice::operator=(const gslice&) [PR100147] There's no need to check for self-assignment here, it would just add extra code for an unlikely case. Add a comment saying so. libstdc++-v3/ChangeLog: PR libstdc++/100147 * include/bits/gslice.h (operator=): Add comment about lack of self-assignment check.
[Bug libstdc++/107466] [12 Regression] invalid -Wnarrowing error with std::subtract_with_carry_engine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107466

Jonathan Wakely changed:

What|Removed |Added
Status|REOPENED |RESOLVED
Resolution|--- |FIXED

--- Comment #13 from Jonathan Wakely ---
And fixed for real now, for 12.4, 13.3 and trunk.
[Bug tree-optimization/113673] [12/13/14 Regression] ICE: verify_flow_info failed: BB 5 cannot throw but has an EH edge with -Os -finstrument-functions -fnon-call-exceptions -ftrapv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673 --- Comment #4 from Roger Sayle --- The identified patch implements += the same way as |=. Presumably a version of the test case replacing "m += *data++;" with "m |= *data++;" would be more useful at identifying a patch that actually changed EH edges.
[Bug libstdc++/99117] [11/12/13/14 Regression] cannot accumulate std::valarray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99117 --- Comment #22 from Jonathan Wakely --- Instead of adding yet another __valarray_copy overload, we can just not use it: --- a/libstdc++-v3/include/std/valarray +++ b/libstdc++-v3/include/std/valarray @@ -840,7 +840,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // _GLIBCXX_RESOLVE_LIB_DEFECTS // 630. arrays of valarray. if (_M_size == __e.size()) - std::__valarray_copy(__e, _M_size, _Array<_Tp>(_M_data)); + { + // Copy manually instead of using __valarray_copy, because __e might + // alias _M_data and the _Array param type of __valarray_copy uses + // restrict which doesn't allow aliasing. + for (size_t __i = 0; __i < _M_size; ++__i) + _M_data[__i] = __e[__i]; + } else { if (_M_data) That would actually allow us to remove that __valarray_copy overload, because this is the only caller.
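The reasoning in the patch above (copy element by element because the source may alias the destination) can be sketched outside libstdc++ as follows. This is an illustration only: `copy_elementwise` and `self_copy_keeps_values` are hypothetical names, and the function is a stand-in for the manual loop, not the library's `__valarray_copy`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the manual loop in the patch. Neither
// pointer is restrict-qualified, so the compiler must assume dst and
// src may refer to the same storage, and the loop stays well defined
// when they do.
template <typename T>
void copy_elementwise(T* dst, const T* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

// Usage: copying an array onto itself leaves the values unchanged.
bool self_copy_keeps_values() {
    int data[3] = {1, 2, 3};
    copy_elementwise(data, data, 3);
    return data[0] == 1 && data[1] == 2 && data[2] == 3;
}
```

With a restrict-qualified parameter the same self-copy would violate the no-aliasing promise, which is the situation the comment in the patch describes.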
[Bug modula2/113836] gm2 does not dump gimple or quadruples to a file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113836 --- Comment #4 from Gaius Mulley --- Bootstrap completed and no extra failures seen in C, C++, Fortran or Modula-2.
[Bug libstdc++/113811] std::rotate does 64-bit signed division
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113811 --- Comment #3 from Jonathan Wakely --- It seems fairly easy to do: commit 12a028d76bbdf26d34d4d90a2ecdc39c6c0a4bd4 (HEAD -> master) Author: Jonathan Wakely Date: Thu Feb 8 15:40:32 2024 libstdc++: Use unsigned division in std::rotate [PR113811] Signed 64-bit division is much slower than unsigned, so cast the n and k values to unsigned before doing n %= k. We know this is safe because neither value can be negative. libstdc++-v3/ChangeLog: PR libstdc++/113811 * include/bits/stl_algo.h (__rotate): Use unsigned values for division. diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h index 9496b53f887..7a0cf6b6737 100644 --- a/libstdc++-v3/include/bits/stl_algo.h +++ b/libstdc++-v3/include/bits/stl_algo.h @@ -1251,6 +1251,12 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2) typedef typename iterator_traits<_RandomAccessIterator>::value_type _ValueType; +#if __cplusplus >= 201103L + typedef typename make_unsigned<_Distance>::type _UDistance; +#else + typedef _Distance _UDistance; +#endif + _Distance __n = __last - __first; _Distance __k = __middle - __first; @@ -1281,7 +1287,7 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2) ++__p; ++__q; } - __n %= __k; + __n = static_cast<_UDistance>(__n) % static_cast<_UDistance>(__k); if (__n == 0) return __ret; std::swap(__n, __k); @@ -1305,7 +1311,7 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2) --__q; std::iter_swap(__p, __q); } - __n %= __k; + __n = static_cast<_UDistance>(__n) % static_cast<_UDistance>(__k); if (__n == 0) return __ret; std::swap(__n, __k); Conditionally using 32-bit types would be a bit trickier, as it needs runtime branches, or making the type of __n and __k a template parameter, so we can call __rotate_with to use a smaller type than make_unsigned<_Distance> if max(n,k) < UINT_MAX.
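As a standalone sketch of the idea in the patch above, the following helper performs the modulo in the unsigned counterpart of the distance type. The name `unsigned_mod` is hypothetical; it mirrors the patched expression but is not part of libstdc++, and it assumes the same precondition the commit message relies on (neither value is negative, and the divisor is non-zero).

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper mirroring the patched line in __rotate:
// compute n % k using the unsigned counterpart of Distance.
// Precondition (assumed, as in the commit message): n >= 0 and k > 0.
template <typename Distance>
Distance unsigned_mod(Distance n, Distance k) {
    typedef typename std::make_unsigned<Distance>::type UDistance;
    assert(n >= 0 && k > 0);
    // For non-negative operands the unsigned result equals the signed one.
    return static_cast<Distance>(static_cast<UDistance>(n) %
                                 static_cast<UDistance>(k));
}
```

For example, `unsigned_mod<std::ptrdiff_t>(17, 5)` yields the same value as `17 % 5`; only the kind of division instruction the compiler may emit changes.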
[Bug libstdc++/90276] PSTL tests fail in Debug Mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276 --- Comment #15 from GCC Commits --- The releases/gcc-12 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:39d37ffbf890334b16ffb56da9fe00f0daa87f16 commit r12-10145-g39d37ffbf890334b16ffb56da9fe00f0daa87f16 Author: Jonathan Wakely Date: Wed Jan 31 10:41:49 2024 + libstdc++: Avoid reusing moved-from iterators in PSTL tests [PR90276] The reverse_invoker utility for PSTL tests uses forwarding references for all parameters, but some of those parameters get forwarded to move constructors which then leave the objects in a moved-from state. When the parameters are forwarded a second time that results in making new copies of moved-from iterators. For libstdc++ debug mode iterators, the moved-from state is singular, which means copying them will abort at runtime. The fix is to make copies of iterator arguments instead of forwarding them. The callers of reverse_invoker::operator() also forward the iterators multiple times, but that's OK because reverse_invoker accepts them by forwarding reference but then breaks the chain of forwarding and copies them as lvalues. libstdc++-v3/ChangeLog: PR libstdc++/90276 * testsuite/util/pstl/test_utils.h (reverse_invoker): Do not use perfect forwarding for iterator arguments. (cherry picked from commit 723a7c1ad29523b9ddff53c7b147bffea56fbb63)
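The double-forwarding hazard described in that commit message can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the PSTL test code: `Tracked`, `sink`, `forward_twice` and `copy_twice` are invented names, and `Tracked` merely records a moved-from state where a debug-mode iterator would abort on copy.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a debug-mode iterator: it records whether it
// has been moved from (the "singular" state mentioned above).
struct Tracked {
    bool singular = false;
    Tracked() = default;
    Tracked(const Tracked& other) : singular(other.singular) {}
    Tracked(Tracked&& other) noexcept : singular(other.singular)
    {
        other.singular = true;  // the source is left moved-from
    }
};

// Takes its argument by value, so an rvalue argument is moved into `t`.
// Returns whether the object it received was already singular.
inline bool sink(Tracked t) { return t.singular; }

// Forwards the same parameter twice: when the caller passed an rvalue,
// the second call receives a copy of a moved-from object.
template <typename T>
bool forward_twice(T&& t)
{
    sink(std::forward<T>(t));
    return sink(std::forward<T>(t));
}

// Accepts a forwarding reference but passes the parameter on as an
// lvalue, so each call copies and the original is never moved from.
template <typename T>
bool copy_twice(T&& t)
{
    sink(t);
    return sink(t);
}
```

With an rvalue argument, `forward_twice(Tracked{})` reports a singular object on the second call, while `copy_twice(Tracked{})` does not; that is the chain-breaking behaviour the fix relies on.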
[Bug libstdc++/113258] Pre-C++17 code that replaces malloc/free crashes when mixed with post-C++17 code that uses the align_val_t variants of new/delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113258 --- Comment #28 from GCC Commits --- The releases/gcc-12 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:b255ab901dd0d13ad7f0dc1a823749a5e5f62570 commit r12-10142-gb255ab901dd0d13ad7f0dc1a823749a5e5f62570 Author: Jonathan Wakely Date: Tue Jan 9 15:22:46 2024 + libstdc++: Prefer posix_memalign for aligned-new [PR113258] As described in PR libstdc++/113258 there are old versions of tcmalloc which replace malloc and related APIs, but do not replace aligned_alloc because it didn't exist at the time they were released. This means that when operator new(size_t, align_val_t) uses aligned_alloc to obtain memory, it comes from libc's aligned_alloc not from tcmalloc. But when operator delete(void*, size_t, align_val_t) uses free to deallocate the memory, that goes to tcmalloc's replacement version of free, which doesn't know how to free it. If we give preference to the older posix_memalign instead of aligned_alloc then we're more likely to use a function that will be compatible with the replacement version of free. Because posix_memalign has been around for longer, it's more likely that old third-party malloc replacements will also replace posix_memalign alongside malloc and free. libstdc++-v3/ChangeLog: PR libstdc++/113258 * libsupc++/new_opa.cc: Prefer to use posix_memalign if available. (cherry picked from commit f50f2efae9fb0965d8ccdb62cfdb698336d5a933)
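As a rough picture of the allocation path the commit prefers, here is a minimal sketch of an aligned allocate/free pair built on posix_memalign. It assumes a POSIX system; `aligned_new_sketch` and `aligned_delete_sketch` are hypothetical names, and this is not the code in new_opa.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdlib.h>  // posix_memalign (POSIX, not ISO C++)

// Hypothetical helper: obtain `size` bytes aligned to `align` bytes.
// Returns nullptr on failure instead of throwing, to keep the sketch small.
inline void* aligned_new_sketch(std::size_t size, std::size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);  // power of two
    // posix_memalign requires the alignment to be a multiple of
    // sizeof(void*), so round small alignments up.
    if (align < sizeof(void*))
        align = sizeof(void*);
    if (size == 0)
        size = 1;
    void* p = nullptr;
    if (posix_memalign(&p, align, size) != 0)
        return nullptr;
    return p;
}

// Memory from posix_memalign is released with plain free, so it goes to
// whichever free the program (or a malloc replacement) provides.
inline void aligned_delete_sketch(void* p) { std::free(p); }
```

The point of the pairing is that allocation and deallocation are both routed through functions an older malloc replacement is likely to have overridden together.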
[Bug libstdc++/107466] [12 Regression] invalid -Wnarrowing error with std::subtract_with_carry_engine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107466 --- Comment #12 from GCC Commits --- The releases/gcc-12 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:f2af87b9705d5a7e37b65bf342146ff25f025e49 commit r12-10143-gf2af87b9705d5a7e37b65bf342146ff25f025e49 Author: Jonathan Wakely Date: Thu Jan 11 15:09:12 2024 + libstdc++: Fix non-portable results from 64-bit std::subtract_with_carry_engine [PR107466] I implemented the resolution of LWG 3809 in r13-4364-ga64775a0edd469 but it was recently noted in the MSVC STL github repo that the change causes possible truncation for 64-bit seeds. Whether the truncation occurs (and to what value) depends on the width of uint_least32_t which is not portable, so the output of the PRNG for 64-bit seed values is no longer the same as in C++20, and no longer portable across platforms. That new issue was filed as LWG 4014. I proposed a new change which reduces the seed by the LCG's modulus before the conversion to uint_least32_t. This ensures that 64-bit seed values are consistently reduced by the modulus before any truncation. This removes the platform-dependent behaviour and restores the old behaviour for std::subtract_with_carry_engine specializations using a 64-bit result type (such as std::ranlux48_base). libstdc++-v3/ChangeLog: PR libstdc++/107466 * include/bits/random.tcc (subtract_with_carry_engine::seed): Implement proposed resolution of LWG 4014. * testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line number. * testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc: Check for expected result of 64-bit engine with seed that doesn't fit in 32-bits. (cherry picked from commit c224dec0e7c88e7a95633023018cdcb6ee87c65f)
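The ordering problem described above (truncate first versus reduce by the modulus first) can be shown with plain integer arithmetic. The sketch below makes two assumptions that are not stated in the message: the seeding LCG's modulus is taken to be 2147483563, and the template parameter `Least32` stands in for whatever width uint_least32_t has on a given platform. The function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed modulus of the LCG used for seeding (not quoted from the thread).
constexpr std::uint64_t lcg_modulus = 2147483563u;

// Old order: convert the seed to the platform's "least 32" type first.
// A 32-bit type drops the high bits before the LCG reduces the value.
template <typename Least32>
std::uint64_t truncate_then_reduce(std::uint64_t seed)
{
    return static_cast<Least32>(seed) % lcg_modulus;
}

// Proposed order: reduce by the modulus first, then convert. The reduced
// value is below 2^31, so the conversion can no longer lose bits.
template <typename Least32>
std::uint64_t reduce_then_convert(std::uint64_t seed)
{
    const std::uint64_t reduced = seed % lcg_modulus;
    assert(reduced < lcg_modulus);
    return static_cast<Least32>(reduced);
}
```

For a seed of 2^32 + 5, the first function gives different answers for a 32-bit and a 64-bit `Least32`, while the second gives the same answer for both, which is the portability property the commit is after.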
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 --- Comment #7 from Steve Kargl --- On Thu, Feb 08, 2024 at 08:43:08PM +, dcb314 at hotmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 > > --- Comment #6 from David Binderman --- > (In reply to Steve Kargl from comment #5) > > That's not what I meant. There is no bug1006.f90 in > > the llvm-project repo. What is the actual URL to the > > actual testcase? It should look something like > > > > https://github.com/llvm/llvm-project/tree/main/flang/test/bug1006.f90 > > bug1006.f90 is my local file name for it. > > It is just a copy of the original file > Lower/HLFIR/array-ctor-derived.f90 in the flang test suite. > > That has an URL of > https://github.com/llvm/llvm-project/tree/main/flang/test/Lower/HLFIR/array-ctor-derived.f90 > Thanks. This may allow a gfortran contributor to see how other developers handled the code. Although not with this bug report, some contain an analysis of what the Fortran standard requires with a particular piece of code.
[Bug fortran/113799] gfc_replace_expr: double free detected ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113799 anlauf at gcc dot gnu.org changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |anlauf at gcc dot gnu.org Status|NEW |ASSIGNED Priority|P3 |P4 Keywords|wrong-code |error-recovery --- Comment #8 from anlauf at gcc dot gnu.org --- Submitted: https://gcc.gnu.org/pipermail/fortran/2024-February/060208.html
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 --- Comment #6 from David Binderman --- (In reply to Steve Kargl from comment #5) > That's not what I meant. There is no bug1006.f90 in > the llvm-project repo. What is the actual URL to the > actual testcase? It should look something like > > https://github.com/llvm/llvm-project/tree/main/flang/test/bug1006.f90 bug1006.f90 is my local file name for it. It is just a copy of the original file Lower/HLFIR/array-ctor-derived.f90 in the flang test suite. That has an URL of https://github.com/llvm/llvm-project/tree/main/flang/test/Lower/HLFIR/array-ctor-derived.f90
[Bug c++/113839] misleading syntax error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113839 --- Comment #5 from Jonathan Wakely --- (In reply to Frank Heckenbach from comment #0) > While I appreciate gcc trying to be helpful, it seems it goes wrong rather > often. That doesn't match my experience. The errors that mention a specific grammar production tend to be accurate, with the odd exception like this bug (which will get fixed). I find the problem is that telling the user that a particular grammar production (like "unqualified-id" or "primary-expression") is expected isn't really helpful to the layperson who doesn't memorize the BNF-like grammar in the standard. Clang tends to do a better job in that regard, balancing accuracy with comprehensibility. If you encounter cases where a diagnostic is misleading, wrong, or just unhelpfully technical in its wording, please do report them so they can be improved.
[Bug modula2/113836] gm2 does not dump gimple or quadruples to a file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113836 Gaius Mulley changed: What|Removed |Added Attachment #57363|0 |1 is obsolete|| --- Comment #3 from Gaius Mulley --- Created attachment 57367 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57367&action=edit Proposed fix v3 Add a missing file and I've now seen it bootstrap successfully on ppc64le.
[Bug c++/113839] misleading syntax error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113839 --- Comment #4 from Jonathan Wakely --- (In reply to Marek Polacek from comment #2) > Confirmed; we should say that we expect an id there. $ clang++ s.cc s.cc:3:14: error: expected unqualified-id static int { }; ^ 1 error generated. $ edg s.cc "s.cc", line 3: error: expected an identifier static int { }; ^ 1 error detected in the compilation of "s.cc".
[Bug fortran/113840] New: [OpenACC] !$acc loop seq – bogus rejection of Fortran's EXIT/CYCLE + C/C++ break/continue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113840

            Bug ID: 113840
           Summary: [OpenACC] !$acc loop seq – bogus rejection of Fortran's
                    EXIT/CYCLE + C/C++ break/continue
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: openacc, rejects-valid
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: tschwinge at gcc dot gnu.org
  Target Milestone: ---

OpenACC seems to permit EXIT and CYCLE in "!$ACC LOOP" if there is the SEQ clause.

The following quote is from OpenACC 3.2, but it can also be found in 2.5 (a bit less explicitly) and, between the lines, in 1.0 and 2.0:

"2.9 Loop Construct" → "Restrictions"
"A loop associated with a loop construct that does not have a seq clause must be written to meet all of the following conditions:
– The loop variable must be of integer, C/C++ pointer, or C++ random-access iterator type.
– The loop variable must monotonically increase or decrease in the direction of its termination condition.
– The loop trip count must be computable in constant time when entering the loop construct."

Currently, it fails with:

test.f90:4:6:

    4 |      EXIT
      |      1
Error: EXIT statement at (1) terminating !$ACC LOOP loop

or

test.c:5:7: error: break statement used with OpenMP for loop
    5 |       break;
      |       ^

* * *

Testcases:

!$acc parallel
!$acc loop seq
do i=1, 5
     EXIT
end do
!$acc end parallel
end

void f() {
  #pragma acc parallel
  #pragma acc loop seq
  for (int i=1; i < 5; i++)
      break;
}

* * *

It seems as if the loop conditions are also relaxed, which needs to be handled / supported. (Not folding to OMP_FOR internally – or still? If not: at least PRIVATE needs to be handled and the SEQ be honored.)

* * *

Real-world testcase:
https://gitlab.dkrz.de/icon/icon-model/-/blob/release-2024.01-public/src/diagnostics/mo_tropopause.f90?ref_type=heads#L200-L213
[Bug fortran/113799] gfc_replace_expr: double free detected ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113799

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |anlauf at gcc dot gnu.org

--- Comment #7 from anlauf at gcc dot gnu.org ---
Created attachment 57366
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57366&action=edit
Patch

This patch changes reduce_unary so that, when an overflow is encountered while walking through array constructors, it continues and returns the error at the end.

This also fixes other simple examples like:

  integer, parameter :: n = huge(1)
  integer, parameter :: m(*) = [-(n + 1)]
  print *, -m
end
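The control-flow change described in that comment (keep walking on overflow, report it at the end) can be modelled on a plain vector. Everything in this sketch is hypothetical: the `Arith` enum only loosely mirrors the ARITH_* codes named in the thread, and `negate` and `reduce_all` are invented stand-ins, not gfortran functions.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical status codes, loosely modelled on the ARITH_* values named
// in the thread; these are not gfortran's definitions.
enum class Arith { ok, overflow, other_error };

// Hypothetical element operation: negate with overflow detection.
inline Arith negate(int in, int* out)
{
    assert(out != nullptr);
    if (in == INT_MIN) {  // -INT_MIN is not representable
        *out = in;        // keep the input as a placeholder result
        return Arith::overflow;
    }
    *out = -in;
    return Arith::ok;
}

// Walk every element. An overflow is remembered but does not stop the
// walk; it is reported only once all elements have been replaced.
inline Arith reduce_all(std::vector<int>& elems)
{
    Arith result = Arith::ok;
    for (int& e : elems) {
        int r = 0;
        const Arith rc = negate(e, &r);
        if (rc == Arith::overflow)
            result = rc;  // defer the error
        else if (rc != Arith::ok)
            return rc;    // any other error still ends the walk
        e = r;
    }
    return result;
}
```

In this model every element has been replaced by the time the overflow is reported, so no element is left half-processed when the caller sees the error.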
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 --- Comment #5 from Steve Kargl --- On Thu, Feb 08, 2024 at 07:38:59PM +, dcb314 at hotmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 > > --- Comment #4 from David Binderman --- > (In reply to kargl from comment #3) > > If you do post the others, is it possible to include a URL to LLVM > > repository? This will allow us to give proper credit for the code. > > https://github.com/llvm/llvm-project/ > That's not what I meant. There is no bug1006.f90 in the llvm-project repo. What is the actual URL to the actual testcase? It should look something like https://github.com/llvm/llvm-project/tree/main/flang/test/bug1006.f90
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #9 from H.J. Lu --- (In reply to Jakub Jelinek from comment #7) > (In reply to H.J. Lu from comment #5) > > (In reply to Jakub Jelinek from comment #1) > > > Ugh no, please don't. > > > This is significant ABI change. > > > First of all, zeroing even for signed _BitInt is very weird, sign > > > extension > > > for that case is more natural, but when _BitInt doesn't have any > > > unspecified > > > bits, everything that computes them will need to compute even the extra > > > bits. That is not the case in the current code. > > > > Can we compare zeroing and undefined codegen of unused bits for storing > > signed _BitInt? > > Not easily, the bitint_info::extended support isn't there yet (as no target > needed it so far). See also the discussions about it on IRC and aarch64 > _BitInt support thread (aarch64 wants to have the extra bits unspecified, > but arm 32 extended). > > > Then implement whatever appropriate in GCC and make it the de facto ABI. > > So what's wrong with > https://gitlab.com/x86-psABIs/i386-ABI/-/issues/5 > ? Has it been discussed, or is i386-ABI dead? i386 psABI is not actively maintained. > I'd probably go with 32-bit limbs for _BitInt(65) and higher instead of > 64-bit, > but under the hood that is how it will be implemented no matter what the ABI > says, > whether it is 32-bit limbs or 64-bit limbs only affects a) the alignment b) > how much is wasted in case of say _BitInt(65) or _BitInt(129) etc. and what > the sizeof is. > Even if limbs are 64-bit, the question is about alignment, ia32 has 32-bit > alignment for long long and double at least when used inside of structs, so > it would be weird to have different alignment from struct { limb l1, l2; } > and similar. Just implement whatever is appropriate in GCC. We will document it.
[Bug c++/113830] GCC accepts invalid code when instantiating the local class inside a function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113830 Harald van Dijk changed: What|Removed |Added CC||harald at gigawatt dot nl --- Comment #12 from Harald van Dijk --- (In reply to Bo Wang from comment #11) > I have read the working draft standard of C++20 > (https://github.com/cplusplus/draft/tree/c%2B%2B20). > > Following the subsection "13.9.2 Explicit instantiation" in the section > "13.9 Template instantiation and specialization", the statement `template > void f();` is an explicit instantiation, which requires instantiating > everything in the function. Where are you getting "everything in the function" from? It seems to say rather the opposite in [temp.explicit]p14: > An explicit instantiation does not constitute a use of a default argument, so > default argument instantiation is not done. Now, the example shows that this was intended to apply to default arguments of the function itself, but the actual wording does not limit it to that, so I actually think this is a bug in clang, by the current wording this must be accepted?
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 --- Comment #4 from David Binderman --- (In reply to kargl from comment #3) > If you do post the others, is it possible to include a URL to LLVM > repository? This will allow us to give proper credit for the code. https://github.com/llvm/llvm-project/
[Bug c++/113839] misleading syntax error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113839 --- Comment #3 from Frank Heckenbach --- > Except C++ parsing does not allow for that because C++ parsing requires > unlimited look ahead. While that's true in general, I think in specific cases (including most real-world cases), the look-ahead required is limited. E.g., here, I think it's clear the program is ill-formed at the ";" at the latest, perhaps even at the "{}" already. Even if gcc can't determine the cause of the error, I'd prefer if it said so rather than choosing one (IMHO unlikely) candidate for correction. (Is there actually a primary-expression that could be inserted there to make the program correct, or would this only lead to the next error?)
[Bug c++/113830] GCC accepts invalid code when instantiating the local class inside a function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113830 --- Comment #11 from Bo Wang --- (In reply to Jakub Jelinek from comment #10) > But again, T::unknown isn't used except in a template which is not > instantiated. > It can't be checked during parsing because T::unknown is dependent and could > very well be well formed if it was instantiated with a different template > argument. > So, does the standard require that all methods of local classes are > instantiated when the containing function template is instantiate (of > course, that can't be the case for methods which are templates on their own)? Thank you for pointing out the critical point. I have read the working draft standard of C++20 (https://github.com/cplusplus/draft/tree/c%2B%2B20). Following the subsection "13.9.2 Explicit instantiation" in the section "13.9 Template instantiation and specialization", the statement `template void f();` is an explicit instantiation, which requires instantiating everything in the function. So we don't need another explicit function call to instantiate it. If my understanding of the standard is correct, the compiler indeed should instantiate and reject this code.
[Bug c++/113839] misleading syntax error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113839 Marek Polacek changed: What|Removed |Added Last reconfirmed||2024-02-08 CC||mpolacek at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #2 from Marek Polacek --- Confirmed; we should say that we expect an id there.
[Bug c++/113839] misleading syntax error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113839 --- Comment #1 from Andrew Pinski --- > I'd prefer if gcc (by default, or at least optional) would limit itself to > reporting actual errors if and when they occur. Except C++ parsing does not allow for that because C++ parsing requires unlimited look ahead.
[Bug c++/113839] New: misleading syntax error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113839

            Bug ID: 113839
           Summary: misleading syntax error message
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: f.heckenb...@fh-soft.de
  Target Milestone: ---

% cat test.cpp
void f ()
{
  static int { };
}
% g++ test.cpp
test.cpp: In function 'void f()':
test.cpp:3:3: error: expected primary-expression before 'static'
    3 |   static int { };
      |   ^~

This message is clearly misleading. There is nothing missing before "static"; rather, the variable name after "int" is missing.

I seem to get a lot of such confusing messages, to the point that I tend to ignore the wording of the messages and treat them as generic "syntax error" messages, which is sad.

While I appreciate gcc trying to be helpful, it seems it goes wrong rather often. I'd prefer if gcc (by default, or at least optionally) would limit itself to reporting actual errors if and when they occur. (In this case, the program is correct up to and including "static int", so there shouldn't be any error reported on that part.)
[Bug middle-end/109967] [11/12/13/14 Regression] Wrong code at -O2 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109967 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #10 from Jakub Jelinek --- Plus we have the r14-7274 workaround on the trunk now, wouldn't that make the problem go away too?
[Bug fortran/113799] gfc_replace_expr: double free detected ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113799

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |7.5.0, 8.5.0, 9.5.0

--- Comment #6 from anlauf at gcc dot gnu.org ---
It appears that we have an inconsistency in the handling of arithmetic errors during simplification. check_result has:

  if (val == ARITH_OK || val == ARITH_OVERFLOW)
    *rp = r;
  else
    gfc_free_expr (r);

while reduce_unary has:

  for (c = gfc_constructor_first (head); c; c = gfc_constructor_next (c))
    {
      rc = reduce_unary (eval, c->expr, &r);
      if (rc != ARITH_OK)
        break;
      gfc_replace_expr (c->expr, r);
    }

With -fno-range-check, ARITH_OVERFLOW does not appear.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #8 from Jakub Jelinek --- BTW, I guess we should have some RTL optimization (possibly backend combiner pattern) to be able to optimize stuff like sall$7, y(%rip), %eax sall$7, %edi cmpl%eax, %edi to xorl %edi, y(%rip), %eax; testl $0x1ff, %edx or similar (and similarly without APX).
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #7 from Jakub Jelinek --- (In reply to H.J. Lu from comment #5) > (In reply to Jakub Jelinek from comment #1) > > Ugh no, please don't. > > This is significant ABI change. > > First of all, zeroing even for signed _BitInt is very weird, sign extension > > for that case is more natural, but when _BitInt doesn't have any unspecified > > bits, everything that computes them will need to compute even the extra > > bits. That is not the case in the current code. > > Can we compare zeroing and undefined codegen of unused bits for storing > signed _BitInt? Not easily, the bitint_info::extended support isn't there yet (as no target needed it so far). See also the discussions about it on IRC and aarch64 _BitInt support thread (aarch64 wants to have the extra bits unspecified, but arm 32 extended). > Then implement whatever appropriate in GCC and make it the de facto ABI. So what's wrong with https://gitlab.com/x86-psABIs/i386-ABI/-/issues/5 ? Has it been discussed, or is i386-ABI dead? I'd probably go with 32-bit limbs for _BitInt(65) and higher instead of 64-bit, but under the hood that is how it will be implemented no matter what the ABI says, whether it is 32-bit limbs or 64-bit limbs only affects a) the alignment b) how much is wasted in case of say _BitInt(65) or _BitInt(129) etc. and what the sizeof is. Even if limbs are 64-bit, the question is about alignment, ia32 has 32-bit alignment for long long and double at least when used inside of structs, so it would be weird to have different alignment from struct { limb l1, l2; } and similar.
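The size trade-off between 32-bit and 64-bit limbs mentioned above is simple arithmetic, sketched here with a made-up helper. `limb_storage_bytes` is hypothetical and only counts whole limbs; it ignores any extra padding an ABI might add for alignment, so it is not a statement of what any psABI specifies.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed to hold `bits` value bits when the
// value is stored as a whole number of limbs of `limb_bits` bits each.
inline std::size_t limb_storage_bytes(std::size_t bits, std::size_t limb_bits)
{
    assert(limb_bits != 0 && limb_bits % 8 == 0);
    const std::size_t limbs = (bits + limb_bits - 1) / limb_bits;  // round up
    return limbs * (limb_bits / 8);
}
```

Under this model a 65-bit value takes 12 bytes with 32-bit limbs (31 bits unused) and 16 bytes with 64-bit limbs (63 bits unused); a 129-bit value takes 20 and 24 bytes respectively.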
[Bug fortran/109358] Wrong formatting with T-descriptor during stream output
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109358 --- Comment #10 from Jerry DeLisle --- To clarify. The following is the remaining issue that is not related to stream I/O: > program tabs > implicit none > integer :: fd > open(newunit=fd, file="test.txt", form="formatted") > write(fd, "(a)") "12345678901234567890123456789" > write(fd, "(i4, t25, t2, i4.4)") 1234, 0123 > close(fd) > end program tabs Posted earlier in comment #6.
[Bug fortran/109358] Wrong formatting with T-descriptor during stream output
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109358 --- Comment #9 from Jerry DeLisle --- Created attachment 57365 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57365&action=edit Preliminary patch The attached patch fixes the stream I/O related tabbing. It passes regression testing. I discovered another side issue, not related to stream I/O, that I am still investigating. I will also work up a test case for the stream issue.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #6 from H.J. Lu --- (In reply to Jakub Jelinek from comment #4) > (In reply to H.J. Lu from comment #3) > > (In reply to Jakub Jelinek from comment #2) > > > OT, what is the state of the ia32 _BitInt ABI? I'd really like to enable > > > it > > > in GCC 14 even for ia32 (and perhaps -mx32 if you care about that case). > > > > I think we should leave ia32 alone. > > You mean never support C23 on it? Then implement whatever appropriate in GCC and make it the de facto ABI.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #5 from H.J. Lu --- (In reply to Jakub Jelinek from comment #1) > Ugh no, please don't. > This is significant ABI change. > First of all, zeroing even for signed _BitInt is very weird, sign extension > for that case is more natural, but when _BitInt doesn't have any unspecified > bits, everything that computes them will need to compute even the extra > bits. That is not the case in the current code. Can we compare zeroing and undefined codegen of unused bits for storing signed _BitInt?
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 --- Comment #3 from kargl at gcc dot gnu.org --- (In reply to David Binderman from comment #2) > (In reply to kargl from comment #1) > > (In reply to David Binderman from comment #0) > > > > > > This is the second ice from the flang test suite. > > > > If you're keep score > > https://discourse.llvm.org/t/proposal-rename-flang-new-to-flang/69462/57 > > Interesting. Thanks. > > 32 more ice to be reported. There are probably some duplicates > amongst that group. > > Hopefully I can report most or all of these before the next release. If you do post the others, is it possible to include a URL to LLVM repository? This will allow us to give proper credit for the code.
[Bug middle-end/113415] ICE: RTL check: -mstringop-strategy=byte_loop vs inline-asm goto with block copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113415 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #6 from Jakub Jelinek --- Created attachment 57364 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57364&action=edit gcc14-pr113415.patch If all we care about here is not to ICE on it, then this patch should suffice. But if there is some asm goto for which such loops could be emitted and are needed on all the asm goto edges, then more work will be needed, because duplicate_insn_chain doesn't duplicate CODE_LABELs or remap references to them in the copied sequences.
[Bug tree-optimization/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- Disabling optimizations and then wondering why optimizations didn't happen is too weird. Don't do that. Such options are intended for debugging GCC, or perhaps working around some compiler or application bug, but performance of generated code should be only judged without such options.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #4 from Jakub Jelinek --- (In reply to H.J. Lu from comment #3) > (In reply to Jakub Jelinek from comment #2) > > OT, what is the state of the ia32 _BitInt ABI? I'd really like to enable it > > in GCC 14 even for ia32 (and perhaps -mx32 if you care about that case). > > I think we should leave ia32 alone. You mean never support C23 on it?
[Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

            Bug ID: 113838
           Summary: regression of redundant load operation introduced by
                    -fno-tree-forwprop introduce
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: absoler at smail dot nju.edu.cn
  Target Milestone: ---

Hi, I have found that for the following code, with the -O2 option, gcc-10.2.0 will generate a redundant load, and gcc-13.2.0 won't. However, with an extra flag "-fno-tree-forwprop", gcc-13.2.0 will produce the same bad code.

Full code: https://godbolt.org/z/objsWGnY6

```
func_37()
...
    l_50[1] = _51;
    if (((*l_50[1]) ^= g_26[5][3][0]))
    { /* block id: 13 */
        int32_t **l_52[5] = {_50[2],_50[2],_50[2],_50[2],_50[2]};
        int i;
        (*l_52[4]) = (((void*)0 != _51) , _36[3][4]);
        return p_39;
    }
    else
    { /* block id: 16 */
        int32_t *l_53 = _54[0][1][1];
        return l_53;
    }
```

```
func_37():
  401d74: mov    0x364e(%rip),%edx        # 4053c8
func_11():
  401d7a: mov    %al,0x33d9(%rip)         # 405159
func_37():
  401d80: mov    0x33c2(%rip),%eax        # 405148
  401d86: xor    %eax,%edx
  401d88: cmp    0x363a(%rip),%eax        # 4053c8
  401d8e: mov    $0x40510c,%eax
  401d93: mov    %edx,0x33af(%rip)        # 405148
  401d99: mov    $0x4051a4,%edx
  401d9e: cmovne %rdx,%rax
```

The second load of g_26[5][3][0], i.e. "cmp 0x363a(%rip),%eax", can be optimized away. The better code generated by gcc-13.2.0 is:

```
func_37():
  401e40: mov    0x3582(%rip),%eax        # 4053c8
  401e46: mov    %edx,%ecx
  401e48: xor    %eax,%ecx
  401e4a: cmp    %eax,%edx
  401e4c: mov    $0x40510c,%edx
  401e51: mov    $0x4051a4,%eax
  401e56: cmove  %rdx,%rax
```
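The branch in the source tests the value stored by the `^=`, and that value is non-zero exactly when the two operands differ, so the condition can be phrased either way without reading either operand a second time. A small sketch of that equivalence follows; the function names are made up for illustration, and this is not a description of what any GCC pass does.

```cpp
#include <cassert>
#include <cstdint>

// Store a ^ b through `out` and branch on the stored value, as the
// source-level condition does.
inline bool test_xor_result(std::int32_t a, std::int32_t b, std::int32_t* out)
{
    assert(out != nullptr);
    *out = a ^ b;
    return *out != 0;
}

// Same store, but the condition compares the two operands directly.
// Both operands are already in locals, so nothing has to be reloaded.
inline bool test_operands(std::int32_t a, std::int32_t b, std::int32_t* out)
{
    assert(out != nullptr);
    *out = a ^ b;
    return a != b;
}
```

Both functions store the same value and return the same condition for every pair of inputs, which is why a second memory load in the generated code is redundant in either form.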
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #3 from H.J. Lu --- (In reply to Jakub Jelinek from comment #2) > OT, what is the state of the ia32 _BitInt ABI? I'd really like to enable it > in GCC 14 even for ia32 (and perhaps -mx32 if you care about that case). I think we should leave ia32 alone. x32 uses the same psABI as x86-64.
[Bug modula2/113836] gm2 does not dump gimple or quadruples to a file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113836

--- Comment #2 from Gaius Mulley ---
Created attachment 57363
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57363&action=edit
Proposed fix

Here is a proposed fix which implements:

-fdump-lang-all, -fdump-lang-quad, -fdump-lang-quad=, -fdump-lang-gimple, -fdump-lang-gimple=, -fm2-dump-filter=.

The filter must be a comma-separated list whose entries can take three forms: the full decl textual name of a procedure, [libname.]module.ident or [filename.]module.ident.

Currently it only filters on procedure names, and regexp matching is not implemented. It would be straightforward to add regexp support as a follow-up if required. There could possibly also be another option to walk the tree and dump out all dependants.

An example of its usage:

$ gm2 hello3.mod -fdump-lang-all -fm2-whole-program -fm2-dump-filter=\
m2pim.Storage.ALLOCATE,\
m2pim.SysStorage.ALLOCATE,\
Storage_DEALLOCATE,NumberIO.HexToStr

$ ls -1r
a-hello3.mod.001l.quad
a-hello3.mod.002l.quad
a-hello3.mod.003l.quad
a-hello3.mod.002l.gimple
a-hello3.mod.001l.gimple
a-hello3.mod.004l.quad

Currently undergoing full bootstrap testing.
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 --- Comment #2 from Jakub Jelinek --- OT, what is the state of the ia32 _BitInt ABI? I'd really like to enable it in GCC 14 even for ia32 (and perhaps -mx32 if you care about that case).
[Bug target/113837] Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- Ugh no, please don't. This is significant ABI change. First of all, zeroing even for signed _BitInt is very weird, sign extension for that case is more natural, but when _BitInt doesn't have any unspecified bits, everything that computes them will need to compute even the extra bits. That is not the case in the current code.
[Bug target/113837] New: Zeroing unused bits in _BitInt can improve codegen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113837 Bug ID: 113837 Summary: Zeroing unused bits in _BitInt can improve codegen Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: hjl.tools at gmail dot com CC: crazylht at gmail dot com Target Milestone: --- Target: x86-64 I opened this x86-64 psABI issue: https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/16
[Bug target/113832] [14 Regression] 6% exec time regression of 464.h264ref on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113832 Roger Sayle changed: What|Removed |Added CC||roger at nextmovesoftware dot com --- Comment #2 from Roger Sayle --- Adding myself to Cc list (in case this is confirmed to be a widening multiply issue).
[Bug modula2/113836] gm2 does not dump gimple or quadruples to a file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113836 Gaius Mulley changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-02-08 --- Comment #1 from Gaius Mulley --- Following on from a discussion on irc it was suggested that filtering the IR would be useful.
[Bug modula2/113836] New: gm2 does not dump gimple or quadruples to a file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113836 Bug ID: 113836 Summary: gm2 does not dump gimple or quadruples to a file Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: modula2 Assignee: gaius at gcc dot gnu.org Reporter: gaius at gcc dot gnu.org Target Milestone: --- During the early exploratory stage in PR113588 it would have been useful to be able to dump the modula-2 quadruples and gimple representation to file (rather than stdout).
[Bug c++/113835] New: compiling std::vector with const size in C++20 is slow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113835 Bug ID: 113835 Summary: compiling std::vector with const size in C++20 is slow Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: pobrn at protonmail dot com Target Milestone: --- Consider the following code: #include <vector> const std::size_t N = 1'000'000; std::vector x(N); int main() {} Then: $ hyperfine 'g++ -std=c++20 -O2 x.cpp' 'g++ -std=c++17 -O2 x.cpp' Benchmark 1: g++ -std=c++20 -O2 x.cpp Time (mean ± σ): 4.945 s ± 0.116 s [User: 4.676 s, System: 0.229 s] Range (min … max): 4.770 s … 5.178 s 10 runs Benchmark 2: g++ -std=c++17 -O2 x.cpp Time (mean ± σ): 491.3 ms ± 24.0 ms [User: 440.9 ms, System: 46.3 ms] Range (min … max): 465.6 ms … 538.0 ms 10 runs Summary g++ -std=c++17 -O2 x.cpp ran 10.07 ± 0.55 times faster than g++ -std=c++20 -O2 x.cpp If you remove the `const` from `N`, the runtime will be closer to C++17 levels. `-ftime-report` suggests that "constant expression evaluation" is the reason. I imagine this is related to C++20 making std::vector constexpr.
[Bug tree-optimization/113673] [12/13/14 Regression] ICE: verify_flow_info failed: BB 5 cannot throw but has an EH edge with -Os -finstrument-functions -fnon-call-exceptions -ftrapv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673 Michal Jireš changed: What|Removed |Added Last reconfirmed|2024-01-30 00:00:00 |2024-2-8 Keywords|needs-bisection | CC||mjires at gcc dot gnu.org, ||roger at nextmovesoftware dot com --- Comment #3 from Michal Jireš --- Bisected to r12-5453-ga944b5dec3adb2.
[Bug c++/113834] internal compiler error: in tree_to_shwi, at tree.cc:6461
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834 --- Comment #1 from Ivan Lazaric --- Created attachment 57362 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57362=edit Preprocessed file generated by `-freport-bug`
[Bug c++/113834] New: internal compiler error: in tree_to_shwi, at tree.cc:6461
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113834 Bug ID: 113834 Summary: internal compiler error: in tree_to_shwi, at tree.cc:6461 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: ivan.lazaric.gcc at gmail dot com Target Milestone: --- Version of g++ is 14.0.1, built of commit 3979171d8ea149912d99bdca6aeb3e7276842f02 Complete backtrace: ``` /home/ilazaric/installs/gcc-latest/include/c++/14.0.1/bits/utility.h:231:13: internal compiler error: in tree_to_shwi, at tree.cc:6461 231 | { using type = __type_pack_element<_Np, _Types...>; }; | ^~~~ 0x99d54d tree_to_shwi(tree_node const*) ../../gcc/gcc/tree.cc:6461 0x99d54d tree_to_shwi(tree_node const*) ../../gcc/gcc/tree.cc:6459 0xd07a45 finish_type_pack_element ../../gcc/gcc/cp/semantics.cc:4653 0xd07a45 finish_trait_type(cp_trait_kind, tree_node*, tree_node*, int) ../../gcc/gcc/cp/semantics.cc:12783 0xcc8fc7 tsubst(tree_node*, tree_node*, int, tree_node*) ../../gcc/gcc/cp/pt.cc:16920 0xcbb9ae tsubst_decl ../../gcc/gcc/cp/pt.cc:15518 0xce3350 instantiate_class_template(tree_node*) ../../gcc/gcc/cp/pt.cc:12494 0xd37570 complete_type(tree_node*) ../../gcc/gcc/cp/typeck.cc:138 0xd3769d complete_type_or_maybe_complain(tree_node*, tree_node*, int) ../../gcc/gcc/cp/typeck.cc:151 0xcc9c76 tsubst(tree_node*, tree_node*, int, tree_node*) ../../gcc/gcc/cp/pt.cc:16795 0xcbb9ae tsubst_decl ../../gcc/gcc/cp/pt.cc:15518 0xce3350 instantiate_class_template(tree_node*) ../../gcc/gcc/cp/pt.cc:12494 0xd37570 complete_type(tree_node*) ../../gcc/gcc/cp/typeck.cc:138 0xd3769d complete_type_or_maybe_complain(tree_node*, tree_node*, int) ../../gcc/gcc/cp/typeck.cc:151 0xcc9c76 tsubst(tree_node*, tree_node*, int, tree_node*) ../../gcc/gcc/cp/pt.cc:16795 0xcbb9ae tsubst_decl ../../gcc/gcc/cp/pt.cc:15518 0xcad863 instantiate_template(tree_node*, tree_node*, int) ../../gcc/gcc/cp/pt.cc:22089 0xcc9adb instantiate_alias_template 
../../gcc/gcc/cp/pt.cc:22180 0xcc9adb tsubst(tree_node*, tree_node*, int, tree_node*) ../../gcc/gcc/cp/pt.cc:16180 0xcc8652 tsubst(tree_node*, tree_node*, int, tree_node*) ../../gcc/gcc/cp/pt.cc:16226 ``` Attached is the preprocessed file generated by `-freport-bug`
[Bug tree-optimization/113818] ICE: verify_gimple failed: missing 'PHI' def with -Os -fnon-call-exceptions -finstrument-functions-once and _BitInt()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113818 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Created attachment 57361 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57361=edit gcc14-pr113818.patch Untested fix.
[Bug c/113825] missing warning for omitted parameter names in function definitions (c23 extension)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113825 Joseph S. Myers changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Joseph S. Myers --- This is diagnosed as expected with -pedantic (given a standard older than C23) or -Wc11-c23-compat (even in C23 mode); I see no reason for this to be different from any other extension that doesn't affect the semantics of code valid in an older standard version.
[Bug tree-optimization/113833] New: 435.gromacs fails verification on with -Ofast -march={cascadelake,icelake-server} and PGO after r14-7272-g57f611604e8bab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113833 Bug ID: 113833 Summary: 435.gromacs fails verification on with -Ofast -march={cascadelake,icelake-server} and PGO after r14-7272-g57f611604e8bab Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: jamborm at gcc dot gnu.org CC: fxue at os dot amperecomputing.com Blocks: 26163 Target Milestone: --- Host: x86_64-linux Target: x86_64-linux After r14-7272-g57f611604e8bab (Feng Xue: Do not count unused scalar use when marking STMT_VINFO_LIVE_P [PR113091]), our runs of SPEC 2006 CPU benchmark 435.gromacs on Icelake-server CPU compiled with -Ofast -march=native and PGO (with and without LTO) started failing with miscompare error: 0002: 3.07684e+02 3.03476e+02 ^ I subsequently verified the failure on an Intel CascadeLake and bisected it to the aforementioned commit. We don't see it on our AMD or Ampere testers (using -march=native). I guess the miscomparison error may be well within what is expected when using -Ofast but even in that case it would be nice to have it documented here that that is indeed expected. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 [Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
[Bug debug/113519] [14 Regression] ICE: in replace_child, at dwarf2out.cc:5704 with -g -fdebug-types-section -fsso-struct=big-endian (or little-endian if the target is big-endian)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113519 Michal Jireš changed: What|Removed |Added Keywords|needs-bisection | CC||ebotcazou at gcc dot gnu.org, ||mjires at gcc dot gnu.org Last reconfirmed|2024-01-20 00:00:00 |2024-2-8 --- Comment #4 from Michal Jireš --- Bisected to r14-7098-g5d8b60effc7268.
[Bug rtl-optimization/113390] [14 Regression] ICE: in model_update_limit_points_in_group, at haifa-sched.cc:1986 with -O2 --param=max-sched-region-insns=200 --param=max-sched-extend-regions-iters=2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113390 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-08 --- Comment #2 from Andrew Pinski --- (In reply to Michal Jireš from comment #1) > Bisected to r14-7114-g113475d03b0ab1. Funny, guess a latent bug.
[Bug rtl-optimization/113390] [14 Regression] ICE: in model_update_limit_points_in_group, at haifa-sched.cc:1986 with -O2 --param=max-sched-region-insns=200 --param=max-sched-extend-regions-iters=2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113390 Michal Jireš changed: What|Removed |Added Keywords|needs-bisection | CC||mjires at gcc dot gnu.org, ||pinskia at gcc dot gnu.org --- Comment #1 from Michal Jireš --- Bisected to r14-7114-g113475d03b0ab1.
[Bug middle-end/102580] Failure to optimize signed division to unsigned division when dividend can't be negative
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102580 --- Comment #7 from Jakub Jelinek --- (In reply to Andrew Macleod from comment #6) > My other question. so is the issue that unsigned divides are cheaper than > signed divides? The middle-end doesn't really know. On some targets unsigned divides can be cheaper than signed divides, on others it doesn't matter, it wouldn't surprise me if there were targets where signed divides are cheaper than unsigned. And then there could be targets where say unsigned divides aren't implemented in hw and signed ones are or vice versa (and the other is implemented in libgcc). On GIMPLE, adding any further casts makes the IL larger and shorter IL is what we usually consider the canonical case (sure, there can be some exceptions, but if there are, it is for stuff that is desirable on all targets, not just a subset of them). So, the idea is if we from ranges find out that a division or similar operation can be equivalently implemented as signed or unsigned because the ranges say that the operands don't have MSB set when used on the division/modulo stmt, we emit both as RTL and ask the backend what is cheaper.
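The equivalence that comment #7 relies on can be checked with a small sketch: when both operands have the MSB clear, a signed division and the same division done in unsigned arithmetic give the same result. The function names are hypothetical and the fixed-width types are chosen for illustration; this is not how expand_expr_divmod is written.

```cpp
#include <cassert>
#include <cstdint>

// The division as written in the source: signed, truncating toward zero.
std::int32_t div_signed(std::int32_t x, std::int32_t d) {
    return x / d;
}

// The same division routed through unsigned arithmetic.
// Only valid when both operands are known to be non-negative (MSB clear).
std::int32_t div_via_unsigned(std::int32_t x, std::int32_t d) {
    assert(x >= 0 && d > 0);
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) /
                                     static_cast<std::uint32_t>(d));
}

// Compare both forms on a sample of the non-negative range [0, INT32_MAX].
bool agree_on_samples(std::int32_t d) {
    assert(d > 0);
    for (std::int64_t x = 0; x <= INT32_MAX; x += 65521) {
        const auto v = static_cast<std::int32_t>(x);
        if (div_signed(v, d) != div_via_unsigned(v, d)) return false;
    }
    return true;
}
```

Once an operand can be negative the two forms diverge (for example -6 reinterpreted as unsigned and divided by 3 does not yield the bit pattern of -2), so the range information is what makes the choice between them purely a cost question for the backend.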
[Bug middle-end/102580] Failure to optimize signed division to unsigned division when dividend can't be negative
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102580 --- Comment #6 from Andrew Macleod --- (In reply to Jakub Jelinek from comment #5) > To be precise, expand_expr_divmod uses get_range_pos_neg for that during > expansion (unless we'd e.g. somehow noted it in some very late pass before > expansion that the division ought to be expanded both ways and cost > compared), and get_range_pos_neg uses the global range of SSA_NAME only. In > order to optimize #c0 we'd need to query range with the use stmt and > enabling ranger in the pass (that is possible in some pass before expansion, > but doing it during expansion (which would be useful to other spots too, say > .*_OVERFLOW expansion) would need to deal with some basic blocks already > converted into RTL and others still at GIMPLE). I'm working on a logging mechanism for ranges for GCC 15. It's possible that a side effect of this work could make some selective contextual ranges available from the global table after .. in which case we could get things like this as if ranger was running. My other question. so is the issue that unsigned divides are cheaper than signed divides? The global range of _2 is set: _2 : [irange] int [0, 715827882] Given the statements in .optimized are: [local count: 1073741824]: _2 = x_1(D) / 3; return _2; could we not actually conclude that the divide can be unsigned based on the result being positive and the divisor being positive? We have "simple" versions of fold_range() which would calculate _2 on the statement from the global value of x_1, but there aren't any simple versions of the operand calculators. It would be fairly trivial to provide one which, given the global value for _2, you could ask op1_range (stmt) and get back a value of [0, +INF] for x_1 at that statement. If that would help... They are going to be added for GCC 15 anyway...
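The upper bound 715827882 in the range quoted above is what one would expect if x_1 is known to be non-negative, since it is INT_MAX / 3 for a 32-bit int. A minimal sketch of that forward calculation follows; `IntRange` and `fold_div_range` are invented names for this note and have nothing to do with ranger's real fold_range() interface.

```cpp
#include <cassert>
#include <cstdint>

// A closed interval [lo, hi] of 32-bit signed values (hypothetical type).
struct IntRange {
    std::int32_t lo;
    std::int32_t hi;
};

// Range of x / d for x in `x` and a positive constant divisor d.
// Truncating division by a positive constant is monotonic in x,
// so the result range is just the two endpoints divided by d.
IntRange fold_div_range(IntRange x, std::int32_t d) {
    assert(d > 0 && x.lo <= x.hi);
    return IntRange{x.lo / d, x.hi / d};
}
```

Calling `fold_div_range({0, INT32_MAX}, 3)` gives [0, 715827882], matching the irange shown for _2.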
[Bug target/113832] [14 Regression] 6% exec time regression of 464.h264ref on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113832 --- Comment #1 from Andrew Pinski --- Maybe https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=2f14c0dbb789852947cb58fdf7d3162413f053fa
[Bug libstdc++/113258] Pre-C++17 code that replaces malloc/free crashes when mixed with post-C++17 code that uses the align_val_t variants of new/delete
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113258 --- Comment #27 from GCC Commits --- The releases/gcc-13 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:c5e12bbb45313df876ee2b81e418851822bed694 commit r13-8307-gc5e12bbb45313df876ee2b81e418851822bed694 Author: Jonathan Wakely Date: Tue Jan 9 15:22:46 2024 + libstdc++: Prefer posix_memalign for aligned-new [PR113258] As described in PR libstdc++/113258 there are old versions of tcmalloc which replace malloc and related APIs, but do not replace aligned_alloc because it didn't exist at the time they were released. This means that when operator new(size_t, align_val_t) uses aligned_alloc to obtain memory, it comes from libc's aligned_alloc not from tcmalloc. But when operator delete(void*, size_t, align_val_t) uses free to deallocate the memory, that goes to tcmalloc's replacement version of free, which doesn't know how to free it. If we give preference to the older posix_memalign instead of aligned_alloc then we're more likely to use a function that will be compatible with the replacement version of free. Because posix_memalign has been around for longer, it's more likely that old third-party malloc replacements will also replace posix_memalign alongside malloc and free. libstdc++-v3/ChangeLog: PR libstdc++/113258 * libsupc++/new_opa.cc: Prefer to use posix_memalign if available. (cherry picked from commit f50f2efae9fb0965d8ccdb62cfdb698336d5a933)
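As a rough illustration of the pairing the commit message describes, the sketch below obtains aligned memory with posix_memalign and releases it with free, so that a malloc replacement which interposes both sees both calls. It assumes a POSIX system; `aligned_new_sketch` and `aligned_delete_sketch` are hypothetical names, and this is not the code in libsupc++/new_opa.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign and free (POSIX)

// Allocate `size` bytes aligned to `align`, or return nullptr on failure.
void* aligned_new_sketch(std::size_t size, std::size_t align) {
    // posix_memalign needs a power-of-two alignment that is also
    // a multiple of sizeof(void*).
    assert(align != 0 && (align & (align - 1)) == 0);
    if (align < sizeof(void*)) align = sizeof(void*);
    if (size == 0) size = 1;  // always hand back a distinct, freeable block
    void* p = nullptr;
    if (posix_memalign(&p, align, size) != 0) return nullptr;
    return p;
}

// Release memory obtained from aligned_new_sketch.
void aligned_delete_sketch(void* p) {
    free(p);  // same allocator family as posix_memalign
}
```

The point of the design is only that the allocating and deallocating functions are ones an older third-party allocator is likely to have replaced together.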
[Bug libstdc++/107466] [12 Regression] invalid -Wnarrowing error with std::subtract_with_carry_engine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107466 --- Comment #11 from GCC Commits --- The releases/gcc-13 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:3bdd80d56aa07d5975f551e0026f3cf9411124bf commit r13-8306-g3bdd80d56aa07d5975f551e0026f3cf9411124bf Author: Jonathan Wakely Date: Thu Jan 11 15:09:12 2024 + libstdc++: Fix non-portable results from 64-bit std::subtract_with_carry_engine [PR107466] I implemented the resolution of LWG 3809 in r13-4364-ga64775a0edd469 but it was recently noted in the MSVC STL github repo that the change causes possible truncation for 64-bit seeds. Whether the truncation occurs (and to what value) depends on the width of uint_least32_t which is not portable, so the output of the PRNG for 64-bit seed values is no longer the same as in C++20, and no longer portable across platforms. That new issue was filed as LWG 4014. I proposed a new change which reduces the seed by the LCG's modulus before the conversion to uint_least32_t. This ensures that 64-bit seed values are consistently reduced by the modulus before any truncation. This removes the platform-dependent behaviour and restores the old behaviour for std::subtract_with_carry_engine specializations using a 64-bit result type (such as std::ranlux48_base). libstdc++-v3/ChangeLog: PR libstdc++/107466 * include/bits/random.tcc (subtract_with_carry_engine::seed): Implement proposed resolution of LWG 4014. * testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error line number. * testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc: Check for expected result of 64-bit engine with seed that doesn't fit in 32-bits. (cherry picked from commit c224dec0e7c88e7a95633023018cdcb6ee87c65f)
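The effect of reducing the seed before the narrowing conversion can be seen in a small sketch. It assumes the seeding LCG's modulus is 2147483563, which is the value I recall from the standard's specification of the seeding step, and it fixes the narrow type at exactly 32 bits for illustration; the function names are hypothetical and this is not the code in random.tcc.

```cpp
#include <cassert>
#include <cstdint>

// Assumed modulus of the LCG used while seeding (see the lead-in).
constexpr std::uint64_t lcg_modulus = 2147483563u;

// Narrow the 64-bit seed to 32 bits first, then reduce.
// High bits are lost, so the result depends on the narrow type's width.
constexpr std::uint32_t narrow_then_reduce(std::uint64_t seed) {
    const std::uint32_t truncated = static_cast<std::uint32_t>(seed);
    return static_cast<std::uint32_t>(truncated % lcg_modulus);
}

// Reduce by the modulus first, then narrow.
// The reduced value is below 2^31, so the conversion cannot truncate.
constexpr std::uint32_t reduce_then_narrow(std::uint64_t seed) {
    return static_cast<std::uint32_t>(seed % lcg_modulus);
}
```

For a seed of 2^32 the first form yields 0 while the second yields 170, and for seeds that already fit in 32 bits the two agree.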
[Bug libstdc++/90276] PSTL tests fail in Debug Mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276 --- Comment #14 from GCC Commits --- The releases/gcc-13 branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:3c04a1533b32362c7c28fc32b05623dda45a1b44 commit r13-8304-g3c04a1533b32362c7c28fc32b05623dda45a1b44 Author: Jonathan Wakely Date: Wed Jan 31 10:41:49 2024 + libstdc++: Avoid reusing moved-from iterators in PSTL tests [PR90276] The reverse_invoker utility for PSTL tests uses forwarding references for all parameters, but some of those parameters get forwarded to move constructors which then leave the objects in a moved-from state. When the parameters are forwarded a second time that results in making new copies of moved-from iterators. For libstdc++ debug mode iterators, the moved-from state is singular, which means copying them will abort at runtime. The fix is to make copies of iterator arguments instead of forwarding them. The callers of reverse_invoker::operator() also forward the iterators multiple times, but that's OK because reverse_invoker accepts them by forwarding reference but then breaks the chain of forwarding and copies them as lvalues. libstdc++-v3/ChangeLog: PR libstdc++/90276 * testsuite/util/pstl/test_utils.h (reverse_invoker): Do not use perfect forwarding for iterator arguments. (cherry picked from commit 723a7c1ad29523b9ddff53c7b147bffea56fbb63)
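The failure mode in the commit message can be reproduced in miniature. The sketch below uses a made-up `Tracked` type that records whether it was constructed from a moved-from object, standing in for a debug-mode iterator; it is not the PSTL test utility itself.

```cpp
#include <cassert>
#include <utility>

// Stand-in for an iterator whose moved-from state must not be copied.
struct Tracked {
    bool moved_from = false;
    Tracked() = default;
    Tracked(const Tracked& other) : moved_from(other.moved_from) {}
    Tracked(Tracked&& other) noexcept : moved_from(other.moved_from) {
        other.moved_from = true;  // mark the source as moved-from
    }
};

// Takes its argument by value and reports whether it came from a moved-from object.
bool built_from_moved(Tracked t) {
    return t.moved_from;
}

// Forwards the same parameter twice: the second use sees a moved-from object
// whenever the caller passed an rvalue.
template <typename It>
bool forward_twice(It&& it) {
    built_from_moved(std::forward<It>(it));
    return built_from_moved(std::forward<It>(it));
}

// Breaks the forwarding chain: `it` is used as an lvalue, so each call copies.
template <typename It>
bool copy_each_time(It&& it) {
    built_from_moved(it);
    return built_from_moved(it);
}
```

`forward_twice(Tracked{})` reports a moved-from source on its second call, while `copy_each_time(Tracked{})` does not, which mirrors the fix of copying iterator arguments instead of forwarding them.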
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 --- Comment #2 from David Binderman --- (In reply to kargl from comment #1) > (In reply to David Binderman from comment #0) > > > > This is the second ice from the flang test suite. > > If you're keeping score > https://discourse.llvm.org/t/proposal-rename-flang-new-to-flang/69462/57 Interesting. Thanks. 32 more ICEs to be reported. There are probably some duplicates amongst that group. Hopefully I can report most or all of these before the next release.
[Bug target/113832] [14 Regression] 6% exec time regression of 464.h264ref on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113832 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org Target Milestone|--- |14.0
[Bug target/113832] New: [14 Regression] 6% exec time regression of 464.h264ref on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113832 Bug ID: 113832 Summary: [14 Regression] 6% exec time regression of 464.h264ref on aarch64 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: needs-bisection Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: fkastl at suse dot cz Target Milestone: --- Host: aarch64-gnu-linux Target: aarch64-gnu-linux As seen here https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=586.220.0 between commits g:cc7aebff74d89675 g:314cbfe2980b32f5 there is a 6% slowdown of the 464.h264ref 2006 SPEC benchmark. Compiler options are: -Ofast -march=native -flto PGO The CPU is Ampere Altra - Neoverse N1.
[Bug c++/113830] GCC accepts invalid code when instantiating the local class inside a function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113830 --- Comment #10 from Jakub Jelinek --- But again, T::unknown isn't used except in a template which is not instantiated. It can't be checked during parsing because T::unknown is dependent and could very well be well formed if it was instantiated with a different template argument. So, does the standard require that all methods of local classes are instantiated when the containing function template is instantiated (of course, that can't be the case for methods which are templates on their own)?
[Bug fortran/113823] ice in gfc_get_element_type, at fortran/trans-types.cc:1286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113823 kargl at gcc dot gnu.org changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-08 CC||kargl at gcc dot gnu.org --- Comment #1 from kargl at gcc dot gnu.org --- (In reply to David Binderman from comment #0) > > This is the second ice from the flang test suite. If you're keeping score https://discourse.llvm.org/t/proposal-rename-flang-new-to-flang/69462/57 Confirmed.
[Bug testsuite/113085] New test case libgomp.c/alloc-pinned-1.c from r14-6499-g348874f0baac0f fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113085 --- Comment #7 from seurer at gcc dot gnu.org --- I posted the LE stuff already but here it is again: spawn [open ...]^M unsufficient lockable memory; please increase ulimit FAIL: libgomp.c/alloc-pinned-1.c execution test seurer@ltcden2-lp1:~/gcc/git/build/gcc-test$ ulimit -a ... max locked memory (kbytes, -l) 64 ... seurer@ltcden2-lp1:~/gcc/git/build/gcc-test$ getconf PAGESIZE 65536 This is a RHEL 8.9 machine and as far as I know it is using the default ulimit settings. On the BE machine: seurer@nilram:~/gcc/git/build/gcc-test$ ulimit -a real-time non-blocking time (microseconds, -R) unlimited ... max locked memory (kbytes, -l) 529679232 ... seurer@nilram:~/gcc/git/build/gcc-test$ getconf PAGESIZE 65536 There were no messages. Running it in gdb I get: (gdb) where #0 0x0fce3340 in ?? () from /lib32/libc.so.6 #1 0x0fc851e4 in raise () from /lib32/libc.so.6 #2 0x0fc6a128 in abort () from /lib32/libc.so.6 #3 0x1ae4 in set_pin_limit (size=size@entry=131072) at /home/seurer/gcc/git/gcc-test/libgomp/testsuite/libgomp.c/alloc-pinned-4.c:44 #4 0x1754 in main () at /home/seurer/gcc/git/gcc-test/libgomp/testsuite/libgomp.c/alloc-pinned-4.c:106 if (getrlimit (RLIMIT_MEMLOCK, )) abort (); // line 44 in alloc-pinned-4.c This is a Debian Trixie machine and it too is using whatever the defaults are.
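The numbers reported for the LE machine (a 64 kbyte locked-memory limit and a 65536-byte page size) mean that exactly one page can be locked. A minimal sketch of that arithmetic, together with one way to query the same values from a program, is given below. It assumes a POSIX system; the function names are hypothetical and this is not the code from the libgomp test.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/resource.h>  // getrlimit, RLIMIT_MEMLOCK
#include <unistd.h>        // sysconf, _SC_PAGESIZE

// How many whole pages fit under a locked-memory limit given in bytes.
std::uint64_t whole_pages_lockable(std::uint64_t limit_bytes,
                                   std::uint64_t page_bytes) {
    assert(page_bytes != 0);
    return limit_bytes / page_bytes;
}

// Query the soft RLIMIT_MEMLOCK and the page size of the running system.
// Returns false if either query fails; an unlimited limit is reported
// as UINT64_MAX pages.
bool query_lockable_pages(std::uint64_t& pages_out) {
    struct rlimit limit {};
    if (getrlimit(RLIMIT_MEMLOCK, &limit) != 0) return false;
    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return false;
    if (limit.rlim_cur == RLIM_INFINITY) {
        pages_out = UINT64_MAX;
        return true;
    }
    pages_out = whole_pages_lockable(static_cast<std::uint64_t>(limit.rlim_cur),
                                     static_cast<std::uint64_t>(page));
    return true;
}
```

With a 4096-byte page the same 64 kbyte limit would cover 16 pages, so the default limit is far tighter on 64 KiB-page systems.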