[Bug c++/100240] Compiler crashes with segmentation fault on a chrono library using nvcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100240 --- Comment #4 from Peter Taraba --- Actually, this works: nvcc -o dt ./DeeperThought/*.cu -save-temps But the files it creates, even when zipped, are above 1MB (which is too large to attach).
[Bug c++/99683] Deduction failure when using CTAD of CNTTP inside a deduction guide
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99683 --- Comment #3 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:bcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 commit r12-100-gbcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 Author: Patrick Palka Date: Sat Apr 24 00:14:29 2021 -0400 c++: do_class_deduction and dependent init [PR93383] Here we're crashing during CTAD with a dependent initializer (performed from convert_template_argument) because one of the initializer's elements has an empty TREE_TYPE, which ends up making resolve_args unhappy. Besides the case where we're initializing one template placeholder from another, which is already specifically handled earlier in do_class_deduction, it seems we can't in general correctly resolve a template placeholder using a dependent initializer, so this patch makes the function just punt until instantiation time instead. gcc/cp/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * pt.c (do_class_deduction): Punt if the initializer is type-dependent. gcc/testsuite/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive. * g++.dg/cpp2a/nontype-class45.C: New test. * g++.dg/cpp2a/nontype-class46.C: New test. * g++.dg/cpp2a/nontype-class47.C: New test. * g++.dg/cpp2a/nontype-class48.C: New test.
[Bug c++/99200] __PRETTY_FUNCTION__ used as template parameter causes internal compiler error (segmentation fault)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99200 --- Comment #7 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:bcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 commit r12-100-gbcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 Author: Patrick Palka Date: Sat Apr 24 00:14:29 2021 -0400 c++: do_class_deduction and dependent init [PR93383] Here we're crashing during CTAD with a dependent initializer (performed from convert_template_argument) because one of the initializer's elements has an empty TREE_TYPE, which ends up making resolve_args unhappy. Besides the case where we're initializing one template placeholder from another, which is already specifically handled earlier in do_class_deduction, it seems we can't in general correctly resolve a template placeholder using a dependent initializer, so this patch makes the function just punt until instantiation time instead. gcc/cp/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * pt.c (do_class_deduction): Punt if the initializer is type-dependent. gcc/testsuite/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive. * g++.dg/cpp2a/nontype-class45.C: New test. * g++.dg/cpp2a/nontype-class46.C: New test. * g++.dg/cpp2a/nontype-class47.C: New test. * g++.dg/cpp2a/nontype-class48.C: New test.
[Bug c++/93383] ICE on accessing field of a structure which is non-type template parameter, -std=c++2a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93383 --- Comment #13 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:bcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 commit r12-100-gbcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 Author: Patrick Palka Date: Sat Apr 24 00:14:29 2021 -0400 c++: do_class_deduction and dependent init [PR93383] Here we're crashing during CTAD with a dependent initializer (performed from convert_template_argument) because one of the initializer's elements has an empty TREE_TYPE, which ends up making resolve_args unhappy. Besides the case where we're initializing one template placeholder from another, which is already specifically handled earlier in do_class_deduction, it seems we can't in general correctly resolve a template placeholder using a dependent initializer, so this patch makes the function just punt until instantiation time instead. gcc/cp/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * pt.c (do_class_deduction): Punt if the initializer is type-dependent. gcc/testsuite/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive. * g++.dg/cpp2a/nontype-class45.C: New test. * g++.dg/cpp2a/nontype-class46.C: New test. * g++.dg/cpp2a/nontype-class47.C: New test. * g++.dg/cpp2a/nontype-class48.C: New test.
[Bug c++/95291] ICE in resolve_args at gcc/cp/call.c:4482
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95291 --- Comment #8 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:bcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 commit r12-100-gbcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 Author: Patrick Palka Date: Sat Apr 24 00:14:29 2021 -0400 c++: do_class_deduction and dependent init [PR93383] Here we're crashing during CTAD with a dependent initializer (performed from convert_template_argument) because one of the initializer's elements has an empty TREE_TYPE, which ends up making resolve_args unhappy. Besides the case where we're initializing one template placeholder from another, which is already specifically handled earlier in do_class_deduction, it seems we can't in general correctly resolve a template placeholder using a dependent initializer, so this patch makes the function just punt until instantiation time instead. gcc/cp/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * pt.c (do_class_deduction): Punt if the initializer is type-dependent. gcc/testsuite/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive. * g++.dg/cpp2a/nontype-class45.C: New test. * g++.dg/cpp2a/nontype-class46.C: New test. * g++.dg/cpp2a/nontype-class47.C: New test. * g++.dg/cpp2a/nontype-class48.C: New test.
[Bug c++/89565] [C++2a] ICE on template instantiating user defined non-type template from template value member
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89565 --- Comment #5 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:bcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 commit r12-100-gbcd77b7b9f35bd5b559ed593c3b3e346c1e6f364 Author: Patrick Palka Date: Sat Apr 24 00:14:29 2021 -0400 c++: do_class_deduction and dependent init [PR93383] Here we're crashing during CTAD with a dependent initializer (performed from convert_template_argument) because one of the initializer's elements has an empty TREE_TYPE, which ends up making resolve_args unhappy. Besides the case where we're initializing one template placeholder from another, which is already specifically handled earlier in do_class_deduction, it seems we can't in general correctly resolve a template placeholder using a dependent initializer, so this patch makes the function just punt until instantiation time instead. gcc/cp/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * pt.c (do_class_deduction): Punt if the initializer is type-dependent. gcc/testsuite/ChangeLog: PR c++/89565 PR c++/93383 PR c++/95291 PR c++/99200 PR c++/99683 * g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive. * g++.dg/cpp2a/nontype-class45.C: New test. * g++.dg/cpp2a/nontype-class46.C: New test. * g++.dg/cpp2a/nontype-class47.C: New test. * g++.dg/cpp2a/nontype-class48.C: New test.
[Bug c++/87709] c++17 class template argument deduction not working in a very specific case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87709 --- Comment #10 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:5f1a2cb9c2dc09eed53da5d5787d14bec700b10b commit r12-99-g5f1a2cb9c2dc09eed53da5d5787d14bec700b10b Author: Patrick Palka Date: Sat Apr 24 00:01:42 2021 -0400 c++: Hard error with tentative parse and CTAD [PR87709] When parsing e.g. the operand of sizeof, where both types and expressions are accepted, if during the tentative type parse we encounter an unexpected template placeholder, we must simulate an error rather than issue a real error because the expression parse can still succeed. gcc/cp/ChangeLog: PR c++/87709 * parser.c (cp_parser_type_id_1): If we see a template placeholder, first try simulating an error before issuing a real error. gcc/testsuite/ChangeLog: PR c++/87709 * g++.dg/cpp1z/class-deduction86.C: New test.
[Bug c++/93413] Destructor definition not found during constant evaluation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93413 Luke Dalessandro changed: What|Removed |Added CC||ldalessandro at gmail dot com --- Comment #2 from Luke Dalessandro --- I just ran into this today. The referenced "dup" seems to have been resolved last June, but this is still failing. Also probably the same as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99495. Is there any chance someone could take a fresh look at this?
[Bug c++/99495] constexpr virtual destructor is used before its definition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99495 Peter Dimov changed: What|Removed |Added CC||pdimov at gmail dot com --- Comment #1 from Peter Dimov --- Simplified:
```
struct Base {
    constexpr virtual ~Base() = default;
};
constexpr Base b;
```
https://godbolt.org/z/qGn1nx9ET
[Bug c++/100240] Compiler crashes with segmentation fault on a chrono library using nvcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100240 --- Comment #3 from Peter Taraba --- Unfortunately, this is using nvcc, which probably calls gcc internally, so adding the option -save-temps "to the complete compilation command" is not going to provide what you want. Also, I have no clue what "spam" you are referring to. The provided URLs are legitimate websites.
[Bug libstdc++/100243] New: [10 Regression] invalid use of incomplete type 'std::__detail::__iter_traits >' {aka 'struct std::indirectly_readable_traits'}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100243 Bug ID: 100243 Summary: [10 Regression] invalid use of incomplete type 'std::__detail::__iter_traits >' {aka 'struct std::indirectly_readable_traits'} Product: gcc Version: 10.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: nok.raven at gmail dot com Target Milestone: --- // g++-10 -std=c++2a #include #include int main() { boost::unordered_multiset a, b; b.insert(std::make_move_iterator(a.begin()), std::make_move_iterator(b.begin())); } ``` In file included from /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/stl_iterator_base_types.h:71, from /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/stl_algobase.h:65, from /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/array:40, from /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/tuple:39, from /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/functional:54, from /opt/compiler-explorer/libs/boost_1_75_0/boost/container_hash/hash.hpp:20, from /opt/compiler-explorer/libs/boost_1_75_0/boost/functional/hash.hpp:6, from /opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/unordered_set.hpp:18, from :1: /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h: In substitution of 'template using __iter_value_t = typename std::__detail::__iter_traits_impl<_Tp, std::indirectly_readable_traits<_Iter> >::type::value_type [with _Tp = boost::unordered::iterator_detail::c_iterator > >]': /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h:259:11: required by substitution of 'template using iter_value_t = std::__detail::__iter_value_t::type>::type> [with _Tp = boost::unordered::iterator_detail::c_iterator > >]' /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/stl_iterator.h:1301:13: required from 'class std::move_iterator > > >' :7:47: required from here 
/opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h:254:13: error: ambiguous template instantiation for 'struct std::indirectly_readable_traits > > >' 254 | using __iter_value_t = typename | ^~ /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h:242:12: note: candidates are: 'template requires requires{typename _Tp::value_type;} struct std::indirectly_readable_traits<_Iter> [with _Tp = boost::unordered::iterator_detail::c_iterator > >]' 242 | struct indirectly_readable_traits<_Tp> |^~~ /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h:247:12: note: 'template requires requires{typename _Tp::element_type;} struct std::indirectly_readable_traits<_Iter> [with _Tp = boost::unordered::iterator_detail::c_iterator > >]' 247 | struct indirectly_readable_traits<_Tp> |^~~ /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h:254:13: error: invalid use of incomplete type 'std::__detail::__iter_traits > >, std::indirectly_readable_traits > > > >' {aka 'struct std::indirectly_readable_traits > > >'} 254 | using __iter_value_t = typename | ^~ /opt/compiler-explorer/gcc-10.3.0/include/c++/10.3.0/bits/iterator_concepts.h:225:29: note: declaration of 'std::__detail::__iter_traits > >, std::indirectly_readable_traits > > > >' {aka 'struct std::indirectly_readable_traits > > >'} 225 | template struct indirectly_readable_traits { }; | ^~ In file included from /opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/detail/set.hpp:6, from /opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/unordered_set.hpp:20, from :1: /opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/detail/implementation.hpp: In instantiation of 'struct boost::unordered::detail::is_forward > > > >': /opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/detail/implementation.hpp:273:14: required from 'struct boost::unordered::detail::disable_if_forward > > >, void*>' 
/opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/detail/implementation.hpp:4364:14: required by substitution of 'template void boost::unordered::detail::table >, std::__cxx11::basic_string, boost::hash >, std::equal_to > > >::insert_range_equiv(I, I, typename boost::unordered::detail::disable_if_forward::type) [with I = std::move_iterator > > >]' /opt/compiler-explorer/libs/boost_1_75_0/boost/unordered/unordered_set.hpp:1714:32: required from 'void
[Bug fortran/97571] long parsing phase for simple array constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97571 Jerry DeLisle changed: What|Removed |Added CC||jvdelisle at gcc dot gnu.org --- Comment #8 from Jerry DeLisle ---
Is it technically possible to do this differently in the compiler? yes
Are there resources to actually do so? no
Is putting resources into it worth it (cost/benefit)? no
Is there an easy workaround? yes

Solution: do the workaround.

   subroutine bpr_init
      implicit none
      integer :: i
      real :: tacos2(0:35250)
      do i = 0, 35250
         tacos2(i) = acos((i + 64000.0)/10.0)
      end do
   end subroutine bpr_init

[aside: I realize the example is a simplified/contrived one. Regardless, using hard-coded, unnamed real constants, etc., is horrible software practice.]
[Bug jit/100242] libgccjit.so: error: in expmed_mode_index, at expmed.h:249
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100242 --- Comment #2 from Antoni --- Created attachment 50666 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50666&action=edit Third part of the reproducer
[Bug jit/100242] libgccjit.so: error: in expmed_mode_index, at expmed.h:249
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100242 --- Comment #1 from Antoni --- Created attachment 50665 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50665&action=edit Second part of the reproducer
[Bug jit/100242] New: libgccjit.so: error: in expmed_mode_index, at expmed.h:249
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100242
Bug ID: 100242
Summary: libgccjit.so: error: in expmed_mode_index, at expmed.h:249
Product: gcc
Version: 10.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: jit
Assignee: dmalcolm at gcc dot gnu.org
Reporter: bouanto at zoho dot com
Target Milestone: ---

Created attachment 50664 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50664&action=edit First part of the compressed reproducer

Hi. I get the following error for the attached reproducer:

during RTL pass: expand
libgccjit.so: error: in expmed_mode_index, at expmed.h:249
0x7f0da2e61a35 expmed_mode_index ../../../gcc/gcc/expmed.h:249
0x7f0da2e61aa4 expmed_op_cost_ptr ../../../gcc/gcc/expmed.h:271
0x7f0da2e620dc sdiv_cost_ptr ../../../gcc/gcc/expmed.h:540
0x7f0da2e62129 sdiv_cost ../../../gcc/gcc/expmed.h:558
0x7f0da2e73c12 expand_divmod(int, tree_code, machine_mode, rtx_def*, rtx_def*, rtx_def*, int) ../../../gcc/gcc/expmed.c:4335
0x7f0da2ea1423 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) ../../../gcc/gcc/expr.c:9240
0x7f0da2cd1a1e expand_gimple_stmt_1 ../../../gcc/gcc/cfgexpand.c:3796
0x7f0da2cd1c30 expand_gimple_stmt ../../../gcc/gcc/cfgexpand.c:3857
0x7f0da2cd90a9 expand_gimple_basic_block ../../../gcc/gcc/cfgexpand.c:5898
0x7f0da2cdade8 execute ../../../gcc/gcc/cfgexpand.c:6582

The reproducer was so big that I needed to compress it and split it into 3 files, so you'll have to cat the 3 files together and uncompress the result. (If you could also explain how to find out exactly where the issue is in the reproducer, in order to make it smaller and easier to debug, I'd appreciate it.) Thanks in advance for fixing this issue.
[Bug libgcc/98952] powerpc*: __trampoline_setup inverted test for trampoline size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98952 --- Comment #4 from Segher Boessenkool --- Fixed on trunk. Needs backports to 11 and whatever else is still an open branch when the backports are done :-)
[Bug c++/100240] Compiler crashes with segmentation fault on a chrono library using nvcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100240 Andrew Pinski changed:
What |Removed |Added
URL |https://www.frisky.world/2021/04/lets-talk-about-biodegradable-food.html |
Ever confirmed |0 |1
Status |UNCONFIRMED |WAITING
Last reconfirmed | |2021-04-23
--- Comment #2 from Andrew Pinski --- Can you attach the preprocessed source as requested on https://gcc.gnu.org/bugs/ ? Also, what is up with the spam in the URL field?
[Bug target/100241] internal compiler error: in curr_insn_transform, at lra-constraints.c:4133
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100241 --- Comment #2 from Marek Polacek --- This is on aarch64: Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-redhat-linux/11/lto-wrapper Target: aarch64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --without-isl --enable-gnu-indirect-function --build=aarch64-redhat-linux Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 11.0.1 20210324 (Red Hat 11.0.1-0) (GCC)
[Bug fortran/97571] long parsing phase for simple array constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97571 kargl at gcc dot gnu.org changed: What|Removed |Added CC||kargl at gcc dot gnu.org --- Comment #7 from kargl at gcc dot gnu.org --- (In reply to Mark J Olah from comment #5)
> A quick test shows I can compute ACOS of 10^8 elements in less than a second
> on any reasonable hardware. We are talking about only 32k elements here,
> which should be trivial.

Without seeing the code for your quick test, I'll proffer that you were doing floating point math with hardware FPU support. gfortran's internal representation of an integer is a GMP mpz_t type. For a real type, gfortran uses an MPFR mpfr_t type. Your array constructor [(i, i=0, 10)] creates an array of 11 gfc_expr nodes, which contain GMP types. Each of these is then converted to an mpfr_t type and divided by a gfc_expr node containing an mpfr_t type for 10.0, so that you now have an array of 11 gfc_expr nodes with mpfr_t entities. Each of these mpfr_t entities is then used as an actual argument to mpfr_acos() from the MPFR library. IOW, gfortran is doing floating point arithmetic with a software implementation, which guarantees correctly rounded values (in round-to-nearest mode). Yes, it is slow. But, at least, you get the correct result.
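For contrast with the compile-time MPFR evaluation described above, here is a minimal C++ sketch of the hardware-FPU path the quoted "quick test" presumably took. It is not gfortran code: the helper name `acos_table` and the choice of `double` are assumptions made for illustration, and the results are whatever the platform's math library returns rather than MPFR's correctly rounded values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of gfortran): fills a table with
// acos(i / 10.0) for i = 0..n using hardware double arithmetic at run time.
// Assumes n >= 0. For n > 10 the argument exceeds 1 and std::acos yields NaN.
std::vector<double> acos_table(int n) {
    std::vector<double> table;
    table.reserve(static_cast<std::size_t>(n) + 1);
    for (int i = 0; i <= n; ++i) {
        // One FPU division and one libm call per element; no
        // arbitrary-precision arithmetic is involved.
        table.push_back(std::acos(i / 10.0));
    }
    return table;
}
```

Calling `acos_table(10)` mirrors the 11-element constructor discussed in the comment. Moving the computation into a run-time loop like this is the same trade-off as the suggested Fortran workaround: the compiler no longer folds each element in software.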
[Bug target/100241] internal compiler error: in curr_insn_transform, at lra-constraints.c:4133
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100241 Marek Polacek changed: What|Removed |Added Keywords||needs-bisection --- Comment #1 from Marek Polacek --- I'm reducing this again; I think the original code was valid.
[Bug target/100241] New: internal compiler error: in curr_insn_transform, at lra-constraints.c:4133
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100241 Bug ID: 100241 Summary: internal compiler error: in curr_insn_transform, at lra-constraints.c:4133 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: mpolacek at gcc dot gnu.org Target Milestone: --- Currently I only have one test which contains UB: # cat vp9_blockd.c extern char num_4x4_blocks_high_lookup, num_4x4_blocks_wide_lookup; typedef struct { char tx_size; } MODE_INFO; struct macroblockd_plane { int subsampling_y; }; typedef struct { struct macroblockd_plane plane[3]; MODE_INFO *mi; int mb_to_right_edge; } MACROBLOCKD; typedef void foreach_transformed_block_visitor(); int vp9_foreach_transformed_block_in_plane_i; foreach_transformed_block_visitor vp9_foreach_transformed_block_in_plane_visit; #pragma GCC visibility push(internal) void vp9_foreach_transformed_block_in_plane(MACROBLOCKD *xd) { struct macroblockd_plane pd = xd->plane[0]; MODE_INFO mi = xd->mi[0]; char tx_size = mi.tx_size, plane_bsize = pd.subsampling_y; int num_4x4_w = num_4x4_blocks_wide_lookup, num_4x4_h = num_4x4_blocks_high_lookup, r, c, max_blocks_wide = num_4x4_w + xd->mb_to_right_edge, max_blocks_high = num_4x4_h, extra_step = max_blocks_wide >> 1; for (r = 0; r < max_blocks_high; r += tx_size) { for (c = 0; c < max_blocks_wide; c += 1 << tx_size) vp9_foreach_transformed_block_in_plane_visit(plane_bsize); vp9_foreach_transformed_block_in_plane_i += extra_step; } } void vp9_encode_sby_pass1(MACROBLOCKD *x) { vp9_foreach_transformed_block_in_plane(x); } #pragma GCC visibility pop struct vpx_codec_iface { int i; int get_glob_hdrs; void (*fp)(); }; void vp9_first_pass (); void vp9_get_compressed_data() { vp9_first_pass(); } void encoder_encode() { vp9_get_compressed_data(); } struct vpx_codec_iface vpx_codec_vp9_cx_algo = {1, 0, encoder_encode }; void first_pass_worker_hook() { vp9_first_pass_encode_tile_mb_row(); } # cat vp9_firstpass.c typedef 
long int64_t; void vp9_encode_sby_pass1(); typedef struct { int64_t coded_error; int64_t sr_coded_error; int64_t frame_noise_energy; int64_t intra_error; } FIRSTPASS_DATA; typedef struct { FIRSTPASS_DATA fp_data; } TileDataEnc; void vp9_first_pass_encode_tile_mb_row(int td, FIRSTPASS_DATA *fp_acc_data, TileDataEnc *tile_data) { vp9_encode_sby_pass1(td); FIRSTPASS_DATA __trans_tmp_1 = *fp_acc_data; TileDataEnc *this_tile = tile_data; this_tile->fp_data.coded_error += this_tile->fp_data.sr_coded_error += __trans_tmp_1.sr_coded_error; this_tile->fp_data.frame_noise_energy += __trans_tmp_1.frame_noise_energy; this_tile->fp_data.intra_error += __trans_tmp_1.intra_error; } void launch_enc_workers(); void first_pass_worker_hook(); void vp9_encode_fp_row_mt() { launch_enc_workers(first_pass_worker_hook); } void vp9_first_pass() { vp9_encode_fp_row_mt(); } # gcc -flto=auto -ffat-lto-objects -march=armv8-a -fPIC -O3 -Wall -W vp9_blockd.c vp9_firstpass.c -shared vp9_blockd.c: In function ‘first_pass_worker_hook’: vp9_blockd.c:50:3: warning: implicit declaration of function ‘vp9_first_pass_encode_tile_mb_row’ [-Wimplicit-function-declaration] 50 | vp9_first_pass_encode_tile_mb_row(); | ^ vp9_firstpass.c: In function ‘vp9_first_pass_encode_tile_mb_row’: vp9_firstpass.c:21:1: error: unable to generate reloads for: 21 | } | ^ (insn 65 68 69 8 (set (reg:V2DI 138) (vec_concat:V2DI (mem:DI (plus:DI (reg/v/f:DI 153 [orig:122 fp_acc_data ] [122]) (const_int 16 [0x10])) [4 fp_acc_data_12(D)->frame_noise_energy+0 S8 A64]) (mem:DI (plus:DI (reg/v/f:DI 153 [orig:122 fp_acc_data ] [122]) (const_int 24 [0x18])) [4 fp_acc_data_12(D)->intra_error+0 S8 A64]))) "vp9_firstpass.c":17:34 2473 {load_pair_lanesdi} (expr_list:REG_DEAD (reg/v/f:DI 122 [ fp_acc_data ]) (nil))) during RTL pass: reload vp9_firstpass.c:21:1: internal compiler error: in curr_insn_transform, at lra-constraints.c:4133 0x14881df diagnostic_impl(rich_location*, diagnostic_metadata const*, int, char const*, std::__va_list*, 
diagnostic_t) ???:0 0xc8b7db internal_error(char const*, ...) ???:0 0xc8b8cf fancy_abort(char const*, int, char const*) ???:0 0x783fab _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) ???:0 0xe152a7 curr_insn_transform(bool) ???:0 0xe102d7 lra_constraints(bool) ???:0 0x127407b lra(_IO_FILE*) ???:0 0x1267e3f (anonymous namespace)::pass_reload::execute(function*) ???:0 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See
[Bug libgcc/98952] powerpc*: __trampoline_setup inverted test for trampoline size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98952 --- Comment #3 from CVS Commits --- The master branch has been updated by Michael Meissner : https://gcc.gnu.org/g:9a30a3f06b908e4e781324c2e813cd1db87119df commit r12-97-g9a30a3f06b908e4e781324c2e813cd1db87119df Author: Michael Meissner Date: Fri Apr 23 18:16:03 2021 -0400 Fix logic error in 32-bit trampolines. The test in the PowerPC 32-bit trampoline support is backwards. It aborts if the trampoline size is greater than the expected size. It should abort when the trampoline size is less than the expected size. I fixed the test so the operands are reversed. I then folded the load immediate into the compare instruction. I verified this by creating a 32-bit trampoline program and manually changing the size of the trampoline to be 48 instead of 40. The program aborted with the larger size. I updated this code and ran the test again and it passed. I added a test case that runs on PowerPC 32-bit Linux systems and it calls the __trampoline_setup function with a larger buffer size than the compiler uses. The test is not run on 64-bit systems, since the function __trampoline_setup is not called. I also limited the test to just Linux systems, in case trampolines are handled differently in other systems. libgcc/ 2021-04-23 Michael Meissner PR target/98952 * config/rs6000/tramp.S (__trampoline_setup, elfv1 #ifdef): Fix trampoline size comparison in 32-bit by reversing test and combining load immediate with compare. (__trampoline_setup, elfv2 #ifdef): Fix trampoline size comparison in 32-bit by reversing test and combining load immediate with compare. gcc/testsuite/ 2021-04-23 Michael Meissner PR target/98952 * gcc.target/powerpc/pr98952.c: New test.
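The logic error described in this commit can be sketched in C++ as a pair of predicates. This is only an illustration of the comparison direction, not the assembly in tramp.S: the function names are hypothetical, and the constant 40 is an assumption taken from the sizes mentioned in the commit's manual test (48 instead of 40).

```cpp
#include <cstddef>

// Assumed for illustration: the trampoline size the compiler uses, in bytes.
constexpr std::size_t kExpectedTrampolineSize = 40;

// Inverted test (the bug): a buffer larger than expected is rejected,
// even though it has more than enough room.
bool size_ok_inverted(std::size_t provided) {
    return provided <= kExpectedTrampolineSize;
}

// Corrected test: only a buffer smaller than expected is rejected.
bool size_ok_fixed(std::size_t provided) {
    return provided >= kExpectedTrampolineSize;
}
```

With these definitions, a 48-byte buffer fails `size_ok_inverted` but passes `size_ok_fixed`, which matches the behaviour the commit message reports before and after the change.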
[Bug c++/100240] Compiler crashes with segmentation fault on a chrono library using nvcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100240 Peter Taraba changed: What|Removed |Added CC||taraba.peter at mail dot com --- Comment #1 from Peter Taraba --- It used to work on Ubuntu 19.04 & 20.04 just fine: [https://www.frisky.world/2019/04/testing-ubuntu-1904.html]
[Bug c++/100240] New: Compiler crashes with segmentation fault on a chrono library using nvcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100240 Bug ID: 100240 Summary: Compiler crashes with segmentation fault on a chrono library using nvcc Product: gcc Version: 10.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: taraba.peter at mail dot com Target Milestone: --- Repro steps: 1. Install Ubuntu 21.04 (with third party libraries install enabled) 2. sudo apt install nvidia-cuda-toolkit 3. git clone https://github.com/pepe78/DeeperThought 4. cd DeeperThought 5. ./compile.sh [ https://www.frisky.world/2021/04/testing-ubuntu-2104.html ] Output from compilation: pepe@pepe-MS-7C90:~/code/DeeperThought$ ./compile.sh /usr/include/c++/10/chrono: In substitution of ‘template template using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’: /usr/include/c++/10/chrono:473:154: required from here /usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault 428 | _S_gcd(intmax_t __m, intmax_t __n) noexcept | ^~ Please submit a full bug report, with preprocessed source if appropriate. See for instructions. pepe@pepe-MS-7C90:~/code/DeeperThought$ cat compile.sh nvcc -o dt ./DeeperThought/*.cu Version of gcc: pepe@pepe-MS-7C90:~/code/DeeperThought$ gcc -v Using built-in specs. 
COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Ubuntu 10.3.0-1ubuntu1' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1) pepe@pepe-MS-7C90:~/code/DeeperThought$
[Bug libstdc++/100017] error: 'fenv_t' has not been declared in '::' x86_64-w64-mingw32 host cross toolchain fails to build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017 --- Comment #11 from cqwrteur --- (In reply to Jonathan Wakely from comment #8) > I can only fix the case where the target (in the build tree) is > found first and then its #include_next finds the host (installed on > the host). > > But that seems to be the case that's breaking the canadian cross build. yeah. looks like this issue is very similar to stdint.h. I just removed fenv.h in libstdc++'s build and it works.
[Bug rtl-optimization/100239] [10/11/12 Regression] ICE: in expand_expr_real_2, at expr.c:9865 with __builtin_shuffle()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100239
Jakub Jelinek changed:
What             |Removed                       |Added
-----------------+------------------------------+------------------------
Assignee         |unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Last reconfirmed |                              |2021-04-23
CC               |                              |jakub at gcc dot gnu.org
Priority         |P3                            |P2
Target Milestone |---                           |10.4
Status           |UNCONFIRMED                   |ASSIGNED
Ever confirmed   |0                             |1
--- Comment #1 from Jakub Jelinek ---
Started with my r10-1201-g2e83f583c27ef7a9d3b0fb0b5ed372439d6222a8
I'll have a look.
[Bug libstdc++/100017] error: 'fenv_t' has not been declared in '::' x86_64-w64-mingw32 host cross toolchain fails to build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017 --- Comment #10 from cqwrteur --- (In reply to Jonathan Wakely from comment #8) > I can only fix the case where the target (in the build tree) is > found first and then its #include_next finds the host (installed on > the host). > > But that seems to be the case that's breaking the canadian cross build. The same issue happens with x86_64-linux-gnu build, x86_64-w64-mingw32 host, x86_64-ubuntu-linux-gnu target /home/cqwrteur/myhome/glibc231/mingw_toolchain/mingw-host-linux/gcc/x86_64-ubuntu-linux-gnu/libstdc++-v3/include/cfenv:77:11: error: 'feupdateenv' has not been declared in '::' 77 | using ::feupdateenv; Looks like it has something to do with fenv.h
[Bug rtl-optimization/100239] New: [10/11/12 Regression] ICE: in expand_expr_real_2, at expr.c:9865 with __builtin_shuffle()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100239 Bug ID: 100239 Summary: [10/11/12 Regression] ICE: in expand_expr_real_2, at expr.c:9865 with __builtin_shuffle() Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Created attachment 50663 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50663=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc testcase.c during RTL pass: expand testcase.c: In function 'foo': testcase.c:9:7: internal compiler error: in expand_expr_real_2, at expr.c:9865 9 | c & __builtin_shuffle (v != v, 0 < (V){}, (V){719} >>5); | ^~~ 0x671830 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier) /repo/gcc-trunk/gcc/expr.c:9865 0xb94d3f expand_gimple_stmt_1 /repo/gcc-trunk/gcc/cfgexpand.c:3947 0xb94d3f expand_gimple_stmt /repo/gcc-trunk/gcc/cfgexpand.c:4008 0xb9a5d1 expand_gimple_basic_block /repo/gcc-trunk/gcc/cfgexpand.c:6045 0xb9c246 execute /repo/gcc-trunk/gcc/cfgexpand.c:6729 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-96-20210423095621-g886b6c1e8af-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-96-20210423095621-g886b6c1e8af-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.0.0 20210423 (experimental) (GCC)
[Bug libstdc++/100017] error: 'fenv_t' has not been declared in '::' x86_64-w64-mingw32 host cross toolchain fails to build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017 --- Comment #9 from cqwrteur --- (In reply to Jonathan Wakely from comment #8) > I can only fix the case where the target (in the build tree) is > found first and then its #include_next finds the host (installed on > the host). > > But that seems to be the case that's breaking the canadian cross build. it looks like the issue still exists in GCC 12.0.0
[Bug libstdc++/100238] New: [11/12] Link failure in debug libstdc++ on MinGW due to atomicitiy.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100238 Bug ID: 100238 Summary: [11/12] Link failure in debug libstdc++ on MinGW due to atomicitiy.cc Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: markus.boeck02 at gmail dot com Target Milestone: --- Building a shared libstdc++ with --enable-libstdcxx-debug for MinGW currently fails to link on both trunk and the upcoming GCC 11 branch. The precise error as shown in the terminal is: /mnt/c/GCC-Build/NewestLinux/bin/x86_64-w64-mingw32-ld: ../../src/debug/c++98/.libs/libc++98convenience.a(atomicity.o): in function `__gnu_cxx::__exchange_and_add(int volatile*, int)': /mnt/c/GCC-Build-Array/gcc/build-target-x86_64/x86_64-w64-mingw32/libstdc++-v3/src/debug/c++98/atomicity.cc:36: multiple definition of `__gnu_cxx::__exchange_and_add(int volatile*, int)'; ../../libsupc++/.libs/libsupc++convenience.a(atomicity.o):C:/GCC-Build-Array/gcc/build-target-x86_64/x86_64-w64-mingw32/libstdc++-v3/libsupc++/atomicity.cc:36: first defined here /mnt/c/GCC-Build/NewestLinux/bin/x86_64-w64-mingw32-ld: ../../src/debug/c++98/.libs/libc++98convenience.a(atomicity.o): in function `__gnu_cxx::__atomic_add(int volatile*, int)': /mnt/c/GCC-Build-Array/gcc/build-target-x86_64/x86_64-w64-mingw32/libstdc++-v3/src/debug/c++98/atomicity.cc:41: multiple definition of `__gnu_cxx::__atomic_add(int volatile*, int)'; ../../libsupc++/.libs/libsupc++convenience.a(atomicity.o):C:/GCC-Build-Array/gcc/build-target-x86_64/x86_64-w64-mingw32/libstdc++-v3/libsupc++/atomicity.cc:41: first defined here Configure option used was: ../configure --target=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --disable-bootstrap --enable-libstdcxx-debug --with-tune=znver1 --prefix=/mnt/c/GCC --with-sysroot=/mnt/c/GCC-Build/NewestLinux --with-build-sysroot=/mnt/c/GCC-Build/NewestLinux --disable-libstdcxx-pch --disable-multilib --enable-libgomp --with-cross-host 
--with-libiconv-prefix=/mnt/c/GCC-Build/NewestLinux/Libraries --disable-libstdcxx-verbose --enable-languages=c,c++,fortran,lto,objc,obj-c++,d --disable-nls --disable-win32-registry --enable-shared --with-gnu-as --with-gnu-ld --enable-threads=posix --program-suffix=-11 --enable-version-specific-runtime-libs --with-gcc-major-version-only --enable-__cxa_atexit --enable-plugin --program-prefix= --enable-checking=release Last tested on revision 886b6c1e8af502b69e3f318b9830b73b88215878 of the master branch
[Bug fortran/100227] [8/9/10/11/12 Regression] write with implicit loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100227
anlauf at gcc dot gnu.org changed:
What |Removed |Added
-----+--------+-------------------------
CC   |        |anlauf at gcc dot gnu.org
--- Comment #3 from anlauf at gcc dot gnu.org ---
Reduced testcase:
program p
  implicit none
  integer, parameter :: nbmode = 3
  integer :: k
  real:: mass(nbmode*2)
  do k = 1, nbmode*2
     mass(k) = k
  end do
  print *, (mass(k+k), k=1,nbmode)
end program
Also valgrind shows that bad things happen with -ffrontend-optimize ...
[Bug c++/92145] -Wdeprecated-copy false-positive when inheriting base assignment operators
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92145
Jason Merrill changed:
What     |Removed                       |Added
---------+------------------------------+------------------------
Assignee |unassigned at gcc dot gnu.org |jason at gcc dot gnu.org
Status   |NEW                           |ASSIGNED
CC       |                              |jason at gcc dot gnu.org
[Bug fortran/97571] long parsing phase for simple array constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97571 --- Comment #6 from anlauf at gcc dot gnu.org --- (In reply to Mark J Olah from comment #5) > Whatever is happening inside the AST evaluation in this case is not only > extraordinarily inefficient, but also apparently exponential with the size > of the fixed array. This can be demonstrated by adjusting N in the test > case I added to bug #100235. You are asking for the evaluation of a constant expression, and the compiler does it. > A quick test shows I can compute ACOS of 10^8 elements in less than a second > on any reasonable hardware. We are talking about only 32k elements here, > which should be trivial. If you want run-time evaluation of acos, you get it when you write your code this way. If not, e.g. change it so that it looks to the compiler that one of the bounds is non-constant. The code in comment#0 e.g. could read: subroutine bpr_init implicit none integer :: i real :: tacos2( 0:35250) integer :: n = 99250 tacos2 = acos( (/ (i, i=64000,n) /) / 10.0) end subroutine bpr_init You'll then get even more compact and probably efficient code than in older versions of gfortran, because it will no longer generate the unused data array as explained by Steve. > In any case is there a flag I can add to the compiler to disable this new > behavior? Currently no. The gfortran change was done to allow the r.h.s. expression to evaluate at compile time so that it can be used e.g. in parameter statements, which is required by the standard. One could envision checking the l.h.s. and decide to not do the simplification of the r.h.s. in case of assignments. Somebody would need to implement that. To me it looks like: doctor, it hurts when I do ... As a workaround, see my suggestion above.
[Bug fortran/97571] long parsing phase for simple array constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97571 --- Comment #5 from Mark J Olah --- I reported this in bug #100235, where it was suggested this would be treated as WONTFIX. It seems this bug is reporting on the same issue arising from rttov: https://nwp-saf.eumetsat.int/site/software/rttov/ Whatever is happening inside the AST evaluation in this case is not only extraordinarily inefficient, but also apparently exponential with the size of the fixed array. This can be demonstrated by adjusting N in the test case I added to bug #100235. IMHO, 2000sec is not a reasonable compile time for *any* code, especially something this simple (and arguably correct code that is used in production packages). A quick test shows I can compute ACOS of 10^8 elements in less than a second on any reasonable hardware. We are talking about only 32k elements here, which should be trivial. In any case is there a flag I can add to the compiler to disable this new behavior?
[Bug libstdc++/100237] Unnecessary std::move in ranges::min, ranges::max and ranges::minmax
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100237
Patrick Palka changed:
What             |Removed     |Added
-----------------+------------+-------------------------
Last reconfirmed |            |2021-04-23
Status           |UNCONFIRMED |NEW
Ever confirmed   |0           |1
CC               |            |ppalka at gcc dot gnu.org
--- Comment #1 from Patrick Palka ---
Good point, confirmed. It seems clear from the definition of the indirect_strict_weak_order concept that these algorithms are expected to invoke the predicate as an lvalue, so it's wrong to std::move it.
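The value-category point can be shown in isolation. The following is a minimal sketch, not the libstdc++ code: `LvalueOnlyLess` and `min_by` are hypothetical names made up for this example, and C++17 is assumed. A comparator whose call operator is `&`-qualified can be invoked as an lvalue but not as an rvalue, so passing it through std::move turns an otherwise valid call into an error.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical comparator: the call operator is only usable on lvalues.
struct LvalueOnlyLess {
    constexpr bool operator()(int a, int b) & { return a < b; }
};

// Invoking the comparator as an lvalue is fine.
static_assert(std::is_invocable_r_v<bool, LvalueOnlyLess&, int, int>);
// Invoking it as an rvalue (what std::move(comp) produces) is not.
static_assert(!std::is_invocable_v<LvalueOnlyLess, int, int>);
static_assert(!std::is_invocable_v<LvalueOnlyLess&&, int, int>);

// Hypothetical min-like helper that calls the comparator as an lvalue.
template <typename T, typename Comp>
constexpr const T& min_by(const T& a, const T& b, Comp comp) {
    return comp(b, a) ? b : a;  // comp is a named object, hence an lvalue
}

// Small self-check using the helper above.
inline void self_check() {
    assert(min_by(0, 1, LvalueOnlyLess{}) == 0);
    assert(min_by(3, 2, LvalueOnlyLess{}) == 2);
}
```

Replacing `comp(b, a)` with `std::move(comp)(b, a)` in `min_by` would make the call ill-formed for this comparator, which matches the failure mode described in the report.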
[Bug fortran/97571] long parsing phase for simple array constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97571
Dominique d'Humieres changed:
What |Removed |Added
-----+--------+---------------------
CC   |        |molah at ucar dot edu
--- Comment #4 from Dominique d'Humieres ---
*** Bug 100235 has been marked as a duplicate of this bug. ***
[Bug fortran/100235] 10.3.0 Performance regressions for compile-time math intrinsics computation on arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100235
Dominique d'Humieres changed:
What       |Removed     |Added
-----------+------------+----------
Status     |UNCONFIRMED |RESOLVED
Resolution |---         |DUPLICATE
--- Comment #2 from Dominique d'Humieres ---
Dup.
*** This bug has been marked as a duplicate of bug 97571 ***
[Bug fortran/100235] 10.3.0 Performance regressions for compile-time math intrinsics computation on arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100235
kargl at gcc dot gnu.org changed:
What     |Removed |Added
---------+--------+------------------------
CC       |        |kargl at gcc dot gnu.org
Priority |P3      |P4
--- Comment #1 from kargl at gcc dot gnu.org ---
(In reply to Mark J Olah from comment #0)
> I am observing a significant compile time increase for gfortran-10.3.0, when
> computing math intrinsic operations over arrays constructed at compile-time.
>
> === Test case =
>
> #define N 8000
> PROGRAM TEST
>
> REAL :: V(0:N)
> V = ACOS( (/ (I, I=0,N) /)/10.0 )
>
> END PROGRAM
>
Using a sane value for N such as 4 and adding -fdump-tree-original to the command line reveals
gfortran 10.2.0
% gfortran -fdump-tree-original -o z a.F90
% more a.F90.004t.original
test ()
{
  integer(kind=4) i;
  real(kind=4) v[5];
  {
    static real(kind=4) A.0[5] = {0.0, 9.9974737875163555145263671875e-6, 1.9994947575032711029052734375e-5, 2.99924213625490665435791015625e-5, 3.998989515006542205810546875e-5};
    {
      integer(kind=8) S.1;
      S.1 = 0;
      while (1)
        {
          if (S.1 > 4) goto L.1;
          v[S.1] = __builtin_acosf (A.0[S.1]);
          S.1 = S.1 + 1;
        }
      L.1:;
    }
  }
}
For top-of-tree gfortran, one finds
% gfcx -o z -fdump-tree-original a.F90
% more z-a.F90.005t.original
__attribute__((fn spec (". ")))
void test ()
{
  integer(kind=4) i;
  real(kind=4) v[11];
  {
    static real(kind=4) A.0[11] = {1.57079637050628662109375e+0, 1.57078635692596435546875e+0, 1.57077634334564208984375e+0, 1.57076632976531982421875e+0, 1.57075631618499755859375e+0, 1.57074630260467529296875e+0, 1.57073628902435302734375e+0, 1.57072627544403076171875e+0, 1.570716381072998046875e+0, 1.57070636749267578125e+0, 1.570696353912353515625e+0};
    (void) __builtin_memcpy ((void *) , (void *) , 44);
  }
}
So, what we have learned is that for 10.2.x and older, gfortran does not evaluate ACOS(). In newer versions of gfortran, the compiler will perform the evaluation of ACOS() at compile time, and this is done with MPFR. So, yeah, it's slower because the compiler is doing more work. Bug should likely be closed with WONTFIX.
Someone else can make that decision.
[Bug libstdc++/100237] New: Unnecessary std::move in ranges::min, ranges::max and ranges::minmax
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100237 Bug ID: 100237 Summary: Unnecessary std::move in ranges::min, ranges::max and ranges::minmax Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: hewillk at gmail dot com Target Milestone: --- Hey, in ranges_algo.h#L3121: constexpr const _Tp& operator()(const _Tp& __a, const _Tp& __b, _Comp __comp = {}, _Proj __proj = {}) const { if (std::__invoke(std::move(__comp), std::__invoke(__proj, __b), std::__invoke(__proj, __a))) return __b; else return __a; } Although it is unclear why there is a std::move in ranges::min, ranges::max, and ranges::minmax, it is obviously unnecessary and causes the following valid code to be rejected: #include struct Comp { constexpr bool operator()(int, int) & { return true; }; } comp; static_assert(std::min(0, 1, comp)); static_assert(std::ranges::min(0, 1, comp)); static_assert(std::max(0, 1, comp)); static_assert(std::ranges::max(0, 1, comp)); static_assert(std::minmax(0, 1, comp).first); static_assert(std::ranges::minmax(0, 1, comp).min); https://godbolt.org/z/MbG8zbcGY
[Bug target/100236] arm: UB in arm_compute_save_core_reg_mask (shift exponent 4294967295 is too large for 32-bit type 'int')
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100236 --- Comment #1 from Alex Coplan --- GCC compiled with UBSan here. I should have mentioned it needs -march=armv8.1-m.main.
[Bug target/100236] New: arm: UB in arm_compute_save_core_reg_mask (shift exponent 4294967295 is too large for 32-bit type 'int')
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100236 Bug ID: 100236 Summary: arm: UB in arm_compute_save_core_reg_mask (shift exponent 4294967295 is too large for 32-bit type 'int') Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acoplan at gcc dot gnu.org Target Milestone: --- $ cat test.c void a() { void b() {} b(); } $ ./arm-eabi-gcc -c -fpic test.c /data_sdb/toolchain/src/gcc/gcc/config/arm/arm.c:21008:27: runtime error: shift exponent 4294967295 is too large for 32-bit type 'int' #0 0x2a07eee in arm_compute_save_core_reg_mask /data_sdb/toolchain/src/gcc/gcc/config/arm/arm.c:21008 #1 0x2a07eee in arm_compute_frame_layout /data_sdb/toolchain/src/gcc/gcc/config/arm/arm.c:22629 #2 0x1a9b56e in set_initial_elim_offsets /data_sdb/toolchain/src/gcc/gcc/reload1.c:3766 #3 0x1abe973 in calculate_elim_costs_all_insns() /data_sdb/toolchain/src/gcc/gcc/reload1.c:1559 #4 0x158e870 in ira_costs() /data_sdb/toolchain/src/gcc/gcc/ira-costs.c:2296 #5 0x157369e in ira_build() /data_sdb/toolchain/src/gcc/gcc/ira-build.c:3426 #6 0x155714d in ira /data_sdb/toolchain/src/gcc/gcc/ira.c:5655 #7 0x155714d in execute /data_sdb/toolchain/src/gcc/gcc/ira.c:5978 #8 0x192438e in execute_one_pass(opt_pass*) /data_sdb/toolchain/src/gcc/gcc/passes.c:2567 #9 0x1926e3a in execute_pass_list_1 /data_sdb/toolchain/src/gcc/gcc/passes.c:2656 #10 0x1926df8 in execute_pass_list_1 /data_sdb/toolchain/src/gcc/gcc/passes.c:2657 #11 0x1926e95 in execute_pass_list(function*, opt_pass*) /data_sdb/toolchain/src/gcc/gcc/passes.c:2667 #12 0xc22f30 in cgraph_node::expand() /data_sdb/toolchain/src/gcc/gcc/cgraphunit.c:1830 #13 0xc23e50 in cgraph_order_sort::process() /data_sdb/toolchain/src/gcc/gcc/cgraphunit.c:2069 #14 0xc2979a in output_in_order /data_sdb/toolchain/src/gcc/gcc/cgraphunit.c:2137 #15 0xc2979a in symbol_table::compile() /data_sdb/toolchain/src/gcc/gcc/cgraphunit.c:2355 #16 0xc3433a in 
symbol_table::finalize_compilation_unit() /data_sdb/toolchain/src/gcc/gcc/cgraphunit.c:2539 #17 0x1cc8e7f in compile_file /data_sdb/toolchain/src/gcc/gcc/toplev.c:482 #18 0x1ccf7bf in do_compile /data_sdb/toolchain/src/gcc/gcc/toplev.c:2201 #19 0x1ccf7bf in toplev::main(int, char**) /data_sdb/toolchain/src/gcc/gcc/toplev.c:2340 #20 0x432625c in main /data_sdb/toolchain/src/gcc/gcc/main.c:39 #21 0x76740bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6) #22 0x645e69 in _start (/data_sdb/toolchain/build-arm-eabi-armv8.1-m.main+mve/install/libexec/gcc/arm-eabi/11.0.1/cc1+0x645e69)
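The shift count in the sanitizer message, 4294967295, is the value of `(unsigned) -1`; shifting a 32-bit `int` by that amount is undefined behaviour. The following is a generic sketch of guarding such a shift. It is not taken from arm.c and makes no claim about the cause or the eventual fix: the helper name `register_bit` and the 32-bit mask width are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: returns the mask bit for a register number, or
// nothing when the number does not fit in a 32-bit mask (for example an
// "invalid register" sentinel such as 4294967295).
inline std::optional<std::uint32_t> register_bit(unsigned regno) {
    constexpr unsigned kMaskBits = 32;
    if (regno >= kMaskBits)
        return std::nullopt;            // shifting here would be undefined
    return std::uint32_t{1} << regno;   // unsigned shift, count in range
}

// Small self-check using the helper above.
inline void self_check() {
    assert(register_bit(3).value() == 8u);
    assert(!register_bit(4294967295u).has_value());
}
```

Using an unsigned operand and checking the count before shifting keeps the expression well defined for every input.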
[Bug target/100041] ICE in curr_insn_transform, at lra-constraints.c:4022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100041 --- Comment #22 from CVS Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:716bb02b40ecef5564abb5ba45a594323123a104 commit r12-94-g716bb02b40ecef5564abb5ba45a594323123a104 Author: Uros Bizjak Date: Fri Apr 23 18:45:14 2021 +0200 i386: Reject -m96bit-long-double for 64bit targets [PR100041] 64bit targets default to 128bit long double, so -m96bit-long-double should not be used. Together with -m128bit-long-double, this option was intended to be an optimization for 32bit targets only. Error out when -m96bit-long-double is used with 64bit targets. 2021-04-23 Uroš Bizjak gcc/ PR target/100041 * config/i386/i386-options.c (ix86_option_override_internal): Error out when -m96bit-long-double is used with 64bit targets. * config/i386/i386.md (*pushxf_rounded): Remove pattern. gcc/testsuite/ PR target/100041 * gcc.target/i386/pr79514.c (dg-error): Expect error for 64bit targets.
[Bug fortran/100235] New: 10.3.0 Performance regressions for compile-time math intrinsics computation on arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100235 Bug ID: 100235 Summary: 10.3.0 Performance regressions for compile-time math intrinsics computation on arrays Product: gcc Version: 10.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: molah at ucar dot edu Target Milestone: --- I am observing a significant compile time increase for gfortran-10.3.0, when computing math intrinsic operations over arrays constructed at compile-time. === Test case = #define N 8000 PROGRAM TEST REAL :: V(0:N) V = ACOS( (/ (I, I=0,N) /)/10.0 ) END PROGRAM == Performance: gfortran-9.3.0: 0.05s gfortran-10.2.0: 0.05s gfortran-10.3.0: 27.0s !! The runtime is increasing exponentially, so that N=1600 takes a few minutes for 10.3.0. The optimization flags appear to have no effect on the runtime.
[Bug target/99540] ICE: Segmentation fault in aarch64_add_offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99540
rsandifo at gcc dot gnu.org changed:
What       |Removed                                                     |Added
-----------+------------------------------------------------------------+---------------------------------------------
Resolution |---                                                         |FIXED
Status     |ASSIGNED                                                    |RESOLVED
Summary    |[8/9 Backport] ICE: Segmentation fault in aarch64_add_offset |ICE: Segmentation fault in aarch64_add_offset
--- Comment #14 from rsandifo at gcc dot gnu.org ---
My bad. The expand_mult call didn't exist before GCC 10, so all is well.
[Bug target/99249] SVE: ICE in aarch64_expand_sve_const_vector (during RTL pass: early_remat)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99249
rsandifo at gcc dot gnu.org changed:
What       |Removed                                                                                  |Added
-----------+-----------------------------------------------------------------------------------------+--------------------------------------------------------------------------
Resolution |---                                                                                      |FIXED
Status     |ASSIGNED                                                                                 |RESOLVED
Summary    |[8/9 Backport] SVE: ICE in aarch64_expand_sve_const_vector (during RTL pass: early_remat) |SVE: ICE in aarch64_expand_sve_const_vector (during RTL pass: early_remat)
--- Comment #6 from rsandifo at gcc dot gnu.org ---
I was going to backport this further than GCC 10, but it turns out that the code changed too much in the GCC 9->GCC 10 timeframe for it to apply cleanly. I'm also not aware of any way of triggering the bug before GCC 10.
[Bug target/98119] [10 Regression] SVE: Wrong code with -O1 -ftree-vectorize -msve-vector-bits=512 -mtune=thunderx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98119
rsandifo at gcc dot gnu.org changed:
What       |Removed  |Added
-----------+---------+---------
Resolution |---      |FIXED
Status     |ASSIGNED |RESOLVED
--- Comment #8 from rsandifo at gcc dot gnu.org ---
Fixed.
[Bug target/99929] [8/9/10 Backport] SVE: Wrong code at -O2 -ftree-vectorize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99929 --- Comment #5 from CVS Commits --- The releases/gcc-10 branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:2c3a699b91dac954271c9fd96416128fc39cb3f9 commit r10-9760-g2c3a699b91dac954271c9fd96416128fc39cb3f9 Author: Richard Sandiford Date: Fri Apr 23 17:17:12 2021 +0100 Check for matching CONST_VECTOR encodings [PR99929] PR99929 is one of those “how did we get away with this for so long” bugs: the equality routines weren't checking whether two variable-length CONST_VECTORs had the same encoding. This meant that: { 1, 0, 0, 0, 0, 0, ... } would appear to be equal to: { 1, 0, 1, 0, 1, 0, ... } since both are represented using the elements { 1, 0 }. gcc/ PR rtl-optimization/99929 * rtl.h (same_vector_encodings_p): New function. * cse.c (exp_equiv_p): Check that CONST_VECTORs have the same encoding. * cselib.c (rtx_equal_for_cselib_1): Likewise. * jump.c (rtx_renumbered_equal_p): Likewise. * lra-constraints.c (operands_match_p): Likewise. * reload.c (operands_match_p): Likewise. * rtl.c (rtx_equal_p_cb, rtx_equal_p): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/pr99929_1.c: New file. * gcc.target/aarch64/sve/pr99929_2.c: Likewise. (cherry picked from commit a87d3f964df31d4fbceb822c6d293e85c117d992)
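The effect described in the commit message can be reproduced with a toy model. This is a simplified sketch and not GCC's RTL code: the type `EncodedVector`, its fields and both comparison helpers are hypothetical, and it assumes an encoding in which a vector is stored as a number of interleaved patterns, each given by one or two leading elements, with the last stored element of a pattern repeating.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a variable-length constant vector (hypothetical type).
struct EncodedVector {
    std::size_t npatterns;          // number of interleaved patterns
    std::size_t nelts_per_pattern;  // stored elements per pattern (1 or 2 here)
    std::vector<int> encoded;       // npatterns * nelts_per_pattern values

    // Element i of the conceptually unbounded vector.
    int element(std::size_t i) const {
        std::size_t pattern = i % npatterns;
        std::size_t index = i / npatterns;
        if (index >= nelts_per_pattern)
            index = nelts_per_pattern - 1;  // last stored element repeats
        return encoded[index * npatterns + pattern];
    }
};

// Buggy comparison: looks only at the stored elements.
inline bool equal_ignoring_encoding(const EncodedVector& a, const EncodedVector& b) {
    return a.encoded == b.encoded;
}

// Comparison that also requires the same encoding parameters.
inline bool equal_with_encoding(const EncodedVector& a, const EncodedVector& b) {
    return a.npatterns == b.npatterns &&
           a.nelts_per_pattern == b.nelts_per_pattern &&
           a.encoded == b.encoded;
}

// Small self-check: the two vectors from the commit message.
inline void self_check() {
    EncodedVector one_then_zeros{1, 2, {1, 0}};  // { 1, 0, 0, 0, ... }
    EncodedVector alternating{2, 1, {1, 0}};     // { 1, 0, 1, 0, ... }
    assert(one_then_zeros.element(2) == 0);
    assert(alternating.element(2) == 1);
    assert(equal_ignoring_encoding(one_then_zeros, alternating));
    assert(!equal_with_encoding(one_then_zeros, alternating));
}
```

Both vectors store the same two elements { 1, 0 }, so a comparison that ignores the encoding parameters treats them as equal even though they differ from the third element onward.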
[Bug target/98119] [10 Regression] SVE: Wrong code with -O1 -ftree-vectorize -msve-vector-bits=512 -mtune=thunderx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98119 --- Comment #7 from CVS Commits --- The releases/gcc-10 branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:367aa5ee879c6bbfc4bf7ae94c680f0614581661 commit r10-9759-g367aa5ee879c6bbfc4bf7ae94c680f0614581661 Author: Richard Sandiford Date: Fri Apr 23 17:17:11 2021 +0100 aarch64: Fix target alignment for SVE [PR98119] The vectoriser supports peeling for alignment using predication: we move back to the previous aligned boundary and make the skipped elements inactive in the first loop iteration. As it happens, the costs for existing CPUs give an equal cost to aligned and unaligned accesses, so this feature is rarely used. However, the PR shows that when the feature was forced on, we were still trying to align to a full-vector boundary even when using partial vectors. gcc/ PR target/98119 * config/aarch64/aarch64.c (aarch64_vectorize_preferred_vector_alignment): Query the size of the provided SVE vector; do not assume that all SVE vectors have the same size. gcc/testsuite/ PR target/98119 * gcc.target/aarch64/sve/pr98119.c: New test. (cherry picked from commit 1393938e4c7dab9306cdce5a73d93b242fc246ec)
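Peeling for alignment using predication, as described in the commit message, can be emulated in scalar code. The sketch below is only an illustration and is not how the vectoriser is implemented: the function `masked_sum` is hypothetical, the lane count of 4 is arbitrary, and index alignment stands in for address alignment.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // assumed vector width, for illustration only

// Sums data[begin, end) by stepping in whole "vectors" of kLanes elements.
// The first vector starts at the previous aligned boundary; lanes that fall
// before begin (or at or after end) are treated as inactive.
inline long long masked_sum(const int* data, std::size_t begin, std::size_t end) {
    long long total = 0;
    std::size_t base = begin - begin % kLanes;  // move back to aligned boundary
    for (; base < end; base += kLanes) {
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            std::size_t i = base + lane;
            bool active = i >= begin && i < end;  // per-lane predicate
            if (active)
                total += data[i];  // inactive lanes are never read
        }
    }
    return total;
}

// Small self-check using the helper above.
inline void self_check() {
    const int data[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(masked_sum(data, 3, 9) == 39);   // 4+5+6+7+8+9
    assert(masked_sum(data, 0, 10) == 55);
}
```

Every iteration after the first then starts on an aligned boundary, at the cost of a partly inactive first iteration.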
[Bug c++/98056] ICE tree check: expected record_type or union_type or qual_union_type, have array_type in build_special_member_call, at cp/call.c:9862 since r11-2183-g0f66b8486cea8668
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056
Jakub Jelinek changed:
What |Removed |Added
-----+--------+------------------------
CC   |        |jakub at gcc dot gnu.org
--- Comment #8 from Jakub Jelinek ---
*** Bug 100234 has been marked as a duplicate of this bug. ***
[Bug c++/100234] [11/12 Regression] Coroutines ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100234
Jakub Jelinek changed:
What       |Removed     |Added
-----------+------------+----------
Resolution |---         |DUPLICATE
Status     |UNCONFIRMED |RESOLVED
--- Comment #1 from Jakub Jelinek ---
Dup.
*** This bug has been marked as a duplicate of bug 98056 ***
[Bug target/99932] OpenACC/nvptx offloading execution regressions starting with CUDA 11.2-era Nvidia Driver 460.27.04
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99932 --- Comment #4 from Tom de Vries --- Created attachment 50662 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50662=edit Updated cuda reproducer Slimmed down further, eliminated gang/worker reduction parts.
[Bug c++/100234] [11/12 Regression] Coroutines ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100234
Jakub Jelinek changed:
What             |Removed |Added
-----------------+--------+------------------------
Target Milestone |---     |11.2
CC               |        |iains at gcc dot gnu.org
[Bug c++/100234] New: [11/12 Regression] Coroutines ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100234 Bug ID: 100234 Summary: [11/12 Regression] Coroutines ICE Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: jakub at gcc dot gnu.org Target Milestone: --- Starting with r11-2183-g0f66b8486cea8668020e4bd48f261b760cb579be the following testcase ICEs with -std=c++20 -fcoroutines: rh1952671.C: In member function ‘K L::bar()’: rh1952671.C:44:41: internal compiler error: tree check: expected record_type or union_type or qual_union_type, have array_type in build_special_member_call, at cp/call.c:9758 44 | K bar () { co_await foo (C(), {C()}); } | ^ 0x1a5f2ac tree_check_failed(tree_node const*, char const*, int, char const*, ...) ../../gcc/tree.c:9687 0x958126 tree_check3(tree_node*, char const*, int, char const*, tree_code, tree_code, tree_code) ../../gcc/tree.h:3343 0x94b8b3 build_special_member_call(tree_node*, tree_node*, vec**, tree_node*, int, int) ../../gcc/cp/call.c:9758 0x9d8490 flatten_await_stmt ../../gcc/cp/coroutines.cc:2859 namespace std { template class initializer_list { int *a; decltype(sizeof 0) b; }; } struct C { C(); }; struct D { D(std::initializer_list); }; namespace std { template struct coroutine_traits : e {}; template struct coroutine_handle { operator coroutine_handle<>(); }; } struct F { void await_ready(); void await_suspend(std::coroutine_handle<>); void await_resume(); }; struct M { void await_ready() noexcept; template void await_suspend(h) noexcept; void await_resume() noexcept; }; struct I { F initial_suspend(); auto final_suspend() noexcept { return M{}; } }; struct K { struct J : public I { void unhandled_exception(); //K get_return_object() { return K{}; } }; using promise_type = J; }; struct Q { Q(int); void await_ready(); void await_resume(); void await_suspend(std::coroutine_handle<>); }; Q foo (C, D); struct L { K bar () { co_await foo (C(), {C()}); } }; while before that commit it has been 
rejected with rh1952671.C: In member function ‘K L::bar()’: rh1952671.C:44:5: error: no member named ‘get_return_object’ in ‘K::promise_type’ {aka ‘K::J’} 44 | K bar () { co_await foo (C(), {C()}); } | ^~~
[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #22 from CVS Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:d2324a5ab3ff097864ae6828cb1db4dd013c70d1 commit r12-91-gd2324a5ab3ff097864ae6828cb1db4dd013c70d1 Author: Uros Bizjak Date: Fri Apr 23 17:29:29 2021 +0200 i386: Fix atomic FP peepholes [PR100182] 64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit targets, so there is no need for additional atomic moves to a temporary register. Introduced load peephole2 patterns assume that there won't be any additional loads from the load location outside the peepholed sequence and wrongly removed the source location initialization. OTOH, introduced store peephole2 patterns assume there won't be any additional loads from the stored location outside the peepholed sequence and wrongly removed the destination location initialization. Note that we can't use plain x87 FST instruction to initialize destination location because FST converts the value to the double-precision format, changing bits during move. The patch restores removed initializations in load and store patterns. Additionally, plain x87 FST in store peephole2 patterns is prevented by limiting the store operand source to SSE registers. 2021-04-23 Uroš Bizjak gcc/ PR target/100182 * config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2): Copy operand 3 to operand 4. Use sse_reg_operand as operand 3 predicate. (FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto. (LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto. (LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto. (FILD_ATOMIC/FIST_ATOMIC FP store peephole2): Copy operand 1 to operand 0. (FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto. (LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto. (LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto. gcc/testsuite/ PR target/100182 * gcc.target/i386/pr100182.c: New test. 
* gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not. * gcc.target/i386/pr71245-2.c (dg-final): Ditto.
[Bug target/100232] [OpenMP][nvptx] Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232 --- Comment #2 from Tobias Burnus --- (In reply to Tom de Vries from comment #1) > Can you try the patch for PR81778 ? > It's possible you're looking at a duplicate. Unfortunately, it does not seem to make a difference - it still fails
[Bug target/100232] [OpenMP][nvptx] Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232 --- Comment #1 from Tom de Vries --- Can you try the patch for PR81778 ? It's possible you're looking at a duplicate.
[Bug libstdc++/100233] New: [10/11/12] std::views::elements only accepts types that are defined on std::get
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100233 Bug ID: 100233 Summary: [10/11/12] std::views::elements only accepts types that are defined on std::get Product: gcc Version: 10.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: gcc-bugs at marehr dot dialup.fu-berlin.de Target Milestone: --- Hi gcc-team, the following code will not work on custom tuples that don't add a get overload within the std namespace. These tuples work in structured bindings, but not within std::views::elements. ```cpp #include #include namespace my_namespace { template struct S { int x{}; }; } // namespace my_namespace namespace std { template typename t, typename ...types> requires std::is_base_of_v, t> struct tuple_size> : public std::integral_constant {}; template typename t, typename ...types> requires (i < sizeof...(types)) && std::is_base_of_v, t> struct tuple_element> { using type = int; }; } // namespace std namespace my_namespace { template int & get(my_namespace::S & e) noexcept { return e.x; } } // my_namespace #if !DO_FAIL namespace std { using my_namespace::get; } // namespace std #endif #include int main() { // does work with / without defining get within std using tripplet_t = my_namespace::S; tripplet_t tuple{}; auto & [a, b, c] = tuple; // only works when defining within std std::vector vec(10); // std::views::elements<0>(vec); using elements_view_t = std::ranges::elements_view, 0>; } ``` https://godbolt.org/z/n315fednc ``` > g++-10 --std=c++2a -DDO_FAIL=1 :56:93: error: template constraint failure for 'template requires (input_range<_Vp>) && ((view<_Vp>) && (__has_tuple_element)()))>::type>::type, std::indirectly_readable_traits)()))>::type>::type> >::__iter_traits)()))>::type>::type, std::indirectly_readable_traits)()))>::type>::type> >::value_type, _Nm>) && (__has_tuple_element)()))&>)())>::type, _Nm>) && (__returnable_element)()))&>)()), _Nm>)) class std::ranges::elements_view' 56 | using 
elements_view_t = std::ranges::elements_view, 0>; | ^ :56:93: note: constraints not satisfied In file included from :44: /opt/compiler-explorer/gcc-trunk-20210423/include/c++/12.0.0/ranges: In substitution of 'template requires (input_range<_Vp>) && ((view<_Vp>) && (__has_tuple_element)()))>::type>::type, std::indirectly_readable_traits)()))>::type>::type> >::__iter_traits)()))>::type>::type, std::indirectly_readable_traits)()))>::type>::type> >::value_type, _Nm>) && (__has_tuple_element)()))&>)())>::type, _Nm>) && (__returnable_element)()))&>)()), _Nm>)) class std::ranges::elements_view [with _Vp = std::ranges::ref_view > >; long unsigned int _Nm = 0]': :56:93: required from here /opt/compiler-explorer/gcc-trunk-20210423/include/c++/12.0.0/ranges:3306:13: required for the satisfaction of '__has_tuple_element, _Nm>' [with _Vp = std::ranges::ref_view, std::allocator > > >; _Nm = 0] /opt/compiler-explorer/gcc-trunk-20210423/include/c++/12.0.0/ranges:3306:35: in requirements with '_Tp __t' [with _Nm = 0; _Tp = my_namespace::S] /opt/compiler-explorer/gcc-trunk-20210423/include/c++/12.0.0/ranges:3311:24: note: the required expression 'get<_Nm>(__t)' is invalid 3311 | { std::get<_Nm>(__t) } | ~^ ``` The standard defines the `get-element` call as ```cpp if constexpr (is_reference_v>) { return get(*i); } else { using E = remove_cv_t>>; return static_cast(get(*i)); } ``` https://eel.is/c++draft/range.elements#iterator-3 With some good-will you could say that `get` should be called unqualified :) I know that it isn't stated explicitly, but with how the whole ADL thing with range adaptors work, it is unexpected that it does not work here too. Thank you!
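The difference between a qualified `std::get` and an unqualified, ADL-found `get` can be sketched in a few lines. This is only an illustration in C++17, not the libstdc++ implementation: `Triple`, `first_element`, `has_adl_get` and `has_std_get` are hypothetical names, and the type is a simplified stand-in for the reporter's `S`.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace my_namespace {
// Hypothetical tuple-like type; its get lives next to it, not in namespace std.
struct Triple { int x = 0; };

template <std::size_t I>
int& get(Triple& t) noexcept { return t.x; }
}  // namespace my_namespace

// Unqualified call: the using-declaration lets get<0>(t) parse as a template
// call before C++20, and ADL then also considers my_namespace::get.
int& first_element(my_namespace::Triple& t) {
    using std::get;
    return get<0>(t);
}

namespace detail {
using std::get;

// True if an unqualified get<I>(t) is well-formed (std::get or ADL).
template <typename T, std::size_t I, typename = void>
struct has_adl_get : std::false_type {};
template <typename T, std::size_t I>
struct has_adl_get<T, I, std::void_t<decltype(get<I>(std::declval<T&>()))>>
    : std::true_type {};

// True only if the qualified std::get<I>(t) is well-formed.
template <typename T, std::size_t I, typename = void>
struct has_std_get : std::false_type {};
template <typename T, std::size_t I>
struct has_std_get<T, I, std::void_t<decltype(std::get<I>(std::declval<T&>()))>>
    : std::true_type {};
}  // namespace detail

void demo_get_lookup() {
    my_namespace::Triple t;
    first_element(t) = 5;
    assert(t.x == 5);
    assert((detail::has_adl_get<my_namespace::Triple, 0>::value));
    assert((!detail::has_std_get<my_namespace::Triple, 0>::value));
}
```

A constraint written like `has_std_get` rejects `Triple`, which mirrors the `{ std::get<_Nm>(__t) }` requirement quoted in the diagnostic; one written like `has_adl_get` accepts it.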
[Bug sanitizer/95693] [8/9 Regression] Incorrect error from undefined behavior sanitizer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95693 --- Comment #13 from Tibor Billes --- Thank you all for fixing it!
[Bug target/100232] New: [OpenMP][nvptx] Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100232

Bug ID: 100232
Summary: [OpenMP][nvptx] Reduction fails with optimization and 'loop'/'for simd' but not with 'for'
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: openmp, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: vries at gcc dot gnu.org
Target Milestone: ---
Target: nvptx-none

Created attachment 50661
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50661&action=edit
Testcase: gcc -fopenmp -O1 (fails, -O0 works) - to be run with nvptx offloading

(Based on https://github.com/SOLLVE/sollve_vv/ 's tests/5.0/loop/test_loop_reduction_{and,or}_device.c )

The code works with nvptx offloading with -O0 but fails with -O1 and higher. (It also works on AMD GCN or with host fallback.)

A reduction of result &&= 1 will yield 0 instead of the expected 1.

I note that it works with 'for' but fails with 'loop' and 'for simd'; hence, I think it might be related to SIMT (→ some other PRs about SIMT).
[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #41 from Iain Sandoe ---
(In reply to Richard Biener from comment #40)
> (In reply to Richard Biener from comment #39)
> > (In reply to Iain Sandoe from comment #38)
> > > (In reply to Richard Biener from comment #37)
> > > > Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0
> > > > doesn't seem
> > > > to reproduce the issue with the reduced testcase (I see no call to
> > > > ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).

> > code. But then ___UTF_8_put isn't interposable so I wonder why the linker
> > even has to resolve anything. Adding -fPIC OTOH should definitely make the
> > symbol interposable but the same code is still generated ...

Darwin x86_64 is always PIC (-fPIC is a NOP, and is added if no other PIC mode is given). User-mode code is invalid without it.

> > breaking on darwin_binds_local_p I see ___UTF_8_put is considered binding
> > local even with -fPIC. So GCC thinks there will be no linker stub involved.

which is the immediate bug here...

> flag_shlib != 0 || force_overridable

I want to check on the indirection rules to be sure [they are not exactly the same as Linux] (and that flag_shlib is set appropriately).

The other possible bug might be irrelevant (missing information to IPA about the lazy resolver) - but I still need to think about the various cases.
[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230 --- Comment #3 from Alex Coplan --- Fixed on trunk so far.
[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230

--- Comment #2 from CVS Commits ---
The master branch has been updated by Alex Coplan:

https://gcc.gnu.org/g:5d87c2251c441f056e0a44f928ffcb8a8a679b6b

commit r12-90-g5d87c2251c441f056e0a44f928ffcb8a8a679b6b
Author: Alex Coplan
Date: Fri Apr 23 14:09:15 2021 +0100

    early-remat.c: Fix new/delete mismatch [PR100230]

    This simple patch fixes a mismatched operator new/delete in early-remat.c
    which triggers ASan errors on (at least) AArch64 when compiling SVE code.

    gcc/ChangeLog:

            PR rtl-optimization/100230
            * early-remat.c (early_remat::sort_candidates): Use delete[]
            instead of delete for array allocated with new[].
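For readers unfamiliar with this class of bug, the rule the commit restores can be shown in a few lines. The sketch below is not the code from early-remat.c; `largest_manual` and `largest_owned` are hypothetical helpers that only illustrate pairing `new[]` with `delete[]`, and the `std::unique_ptr<T[]>` alternative that makes the pairing automatic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: copy n ints into a scratch array, sort, return the largest.
int largest_manual(const int* src, std::size_t n) {
    assert(n > 0);
    int* tmp = new int[n];        // array form of new ...
    std::copy(src, src + n, tmp);
    std::sort(tmp, tmp + n);
    int result = tmp[n - 1];
    delete[] tmp;                 // ... must be released with the array form of delete
    return result;
}

// Same logic, but unique_ptr<int[]> calls delete[] for us.
int largest_owned(const int* src, std::size_t n) {
    assert(n > 0);
    std::unique_ptr<int[]> tmp(new int[n]);
    std::copy(src, src + n, tmp.get());
    std::sort(tmp.get(), tmp.get() + n);
    return tmp[n - 1];
}
```

Writing `delete tmp;` in the first function is the mismatch that ASan reports as alloc-dealloc-mismatch.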
[Bug libstdc++/99402] [10 Regression] std::copy creates _GLIBCXX_DEBUG false positive for attempt to subscript a dereferenceable (start-of-sequence) iterator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99402 Jonathan Wakely changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #14 from Jonathan Wakely --- Fixed for 10.4 then. Thanks, François.
[Bug libstdc++/100180] experimental/net/internet/address/v6/members.cc fails on arm-eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100180 Jonathan Wakely changed: What|Removed |Added Last reconfirmed||2021-04-23 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #7 from Jonathan Wakely --- I hope it's fixed on trunk now. The r12-89 commit can be backported, but not until after the gcc-11 release.
[Bug tree-optimization/98736] [10 Regression] Wrong partition order generated in loop distribution pass since r10-619-g5879ab5fafedc8f6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98736 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Known to fail||10.3.0 Known to work||10.3.1 --- Comment #11 from Richard Biener --- I've convinced myself that simply picking those changes is safe. Thus fixed.
[Bug tree-optimization/98736] [10 Regression] Wrong partition order generated in loop distribution pass since r10-619-g5879ab5fafedc8f6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98736

--- Comment #10 from CVS Commits ---
The releases/gcc-10 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:6c6a1173cccfd9c466e43bafa7ef7940b93d1495

commit r10-9758-g6c6a1173cccfd9c466e43bafa7ef7940b93d1495
Author: Bin Cheng
Date: Wed Apr 7 10:24:32 2021 +0800

    tree-optimization/98736 - use programming order preserved RPO in ldist

    Tree loop distribution uses RPO to build the reduced dependence graph, so
    it is important that RPO preserves the original programming order. Though
    it usually does so, when distributing a loop nest, the exit BB can be
    placed before some loop BBs yet after the loop header. This patch fixes
    the issue by calling rev_post_order_and_mark_dfs_back_seme.

    gcc/ChangeLog:

            PR tree-optimization/98736
            * tree-loop-distribution.c
            (loop_distribution::bb_top_order_init): Compute RPO with
            programming order preserved by calling function
            rev_post_order_and_mark_dfs_back_seme.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/98736
            * gcc.c-torture/execute/pr98736.c: New test.

    (cherry picked from commit e0bdccac582c01c928a05f26edcd8f5ac24669eb)
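As background for the commit message, here is a minimal sketch of a plain depth-first reverse post-order (RPO) over a small control-flow graph. It is not GCC's rev_post_order_and_mark_dfs_back_seme; `Graph`, `dfs_post` and `reverse_post_order` are hypothetical names, and blocks are simply numbered in program order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // succs[b] = successors of block b

// Append blocks reachable from b to post in DFS post-order.
static void dfs_post(const Graph& succs, int b,
                     std::vector<bool>& seen, std::vector<int>& post) {
    seen[b] = true;
    for (int s : succs[b])
        if (!seen[s]) dfs_post(succs, s, seen, post);
    post.push_back(b);
}

// Reverse post-order of the blocks reachable from entry.
std::vector<int> reverse_post_order(const Graph& succs, int entry) {
    assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
    std::vector<bool> seen(succs.size(), false);
    std::vector<int> post;
    dfs_post(succs, entry, seen, post);
    return std::vector<int>(post.rbegin(), post.rend());
}
```

For the diamond 0 -> {1, 2}, 1 -> 3, 2 -> 3 this yields 0, 2, 1, 3: still a valid topological order, but block 2 now precedes block 1. That is the general sense in which a plain RPO need not follow program order, which is the property the patch is concerned with.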
[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #40 from Richard Biener ---
(In reply to Richard Biener from comment #39)
> (In reply to Iain Sandoe from comment #38)
> > (In reply to Richard Biener from comment #37)
> > > Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't
> > > seem
> > > to reproduce the issue with the reduced testcase (I see no call to
> > > ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).
> >
> > I think my interestingness test isn't strict enough - the creduced code
> > resulting doesn't have an extern for ___UTF_8_put and only seems to not
> > inline that fn because the interface has been mangled. [ so that the fn is
> > legitimately binds_localP as the pasted case ].
> >
> > if you still have the build around, out of curiosity, does it fail on the
> > original .i file attached here?
> >
> > and with -fno-trapping-math -fno-math-errno -fschedule-insns2
> > -fomit-frame-pointer
> >
> > ( I only need O2 to get a fail ).
>
> Yes, with -O2 -fno-trapping-math -fno-math-errno -fschedule-insns2
> -fomit-frame-pointer it produces the problematical
>
>         .align 4,0x90
> L945:
>         movl    0(%rbp,%r10,4), %esi
>         call    UTF_8_put
>         movq    %r10, %rax
>         addq    $1, %r10
>         cmpq    %rax, %r12
>         jne     L945
>
> code. But then ___UTF_8_put isn't interposable so I wonder why the linker
> even has to resolve anything. Adding -fPIC OTOH should definitely make the
> symbol interposable but the same code is still generated ...
>
> Note the 'extern' declaration shouldn't change anything, only that we
> see a definition is relevant.
>
> breaking on darwin_binds_local_p I see ___UTF_8_put is considered binding
> local even with -fPIC. So GCC thinks there will be no linker stub involved.
>
> Note 'shlib' is passed as false to default_binds_local_p_3 computed as
>
> 3140         on earlier system versions, and with a TODO to complete. */
> 3141      bool force_overridable = TARGET_KEXTABI && DARWIN_VTABLE_P (decl);
> 3142      return default_binds_local_p_3 (decl, force_overridable /* shlib
> */,
> 3143                                      false /* weak dominate */,
>
> and default_binds_local_p_3 would do
>
>   /* If PIC, then assume that any global name can be overridden by
>      symbols resolved from other modules. */
>   if (shlib)
>     return false;
>
> ix86_binds_local_p simply passes flag_shlib != 0 as this argument.

So it looks like darwin should pass flag_shlib != 0 || force_overridable instead?
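The effect of the suggestion in this comment can be sketched with a toy model. Everything below is a heavily simplified assumption for illustration: `Decl`, `binds_local_model`, `darwin_current_model` and `darwin_proposed_model` are hypothetical names, and the real default_binds_local_p_3 checks many more conditions than the single quoted `if (shlib)` test reproduced here.

```cpp
#include <cassert>

// Toy stand-in for a declaration; not GCC's tree type.
struct Decl {
    bool is_global;       // visible outside the translation unit
    bool has_definition;  // defined in this translation unit
};

// Simplified model of the quoted check: with shlib set, a global name is
// assumed overridable by other modules, so it does not bind locally.
bool binds_local_model(const Decl& d, bool shlib) {
    if (!d.is_global) return true;
    if (shlib) return false;
    return d.has_definition;
}

// What the quoted Darwin code passes today: only force_overridable.
bool darwin_current_model(const Decl& d, bool flag_shlib, bool force_overridable) {
    (void)flag_shlib;  // ignored, which is the point of the comment
    return binds_local_model(d, force_overridable);
}

// What the comment proposes: flag_shlib != 0 || force_overridable.
bool darwin_proposed_model(const Decl& d, bool flag_shlib, bool force_overridable) {
    return binds_local_model(d, flag_shlib || force_overridable);
}

void demo_binds_local() {
    Decl put{true, true};  // a defined global function, like ___UTF_8_put
    assert(darwin_current_model(put, /*flag_shlib=*/true, /*force_overridable=*/false));
    assert(!darwin_proposed_model(put, /*flag_shlib=*/true, /*force_overridable=*/false));
}
```

In this model a defined global symbol is treated as binding locally under the current argument, and as overridable under the proposed one, whenever flag_shlib is set.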
[Bug libstdc++/100180] experimental/net/internet/address/v6/members.cc fails on arm-eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100180 --- Comment #6 from CVS Commits --- The master branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:0e1e7b77904f1fe2a6dbfe84bb4fc026584ba480 commit r12-89-g0e1e7b77904f1fe2a6dbfe84bb4fc026584ba480 Author: Jonathan Wakely Date: Fri Apr 23 13:38:05 2021 +0100 libstdc++: Allow net::io_context to compile without [PR 100180] This adds dummy placeholders to net::io_context so that it can still be compiled on targets without . libstdc++-v3/ChangeLog: PR libstdc++/100180 * include/experimental/io_context (io_context): Define dummy_pollfd type so that most member functions still compile without and struct pollfd.
[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #39 from Richard Biener ---
(In reply to Iain Sandoe from comment #38)
> (In reply to Richard Biener from comment #37)
> > Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't
> > seem
> > to reproduce the issue with the reduced testcase (I see no call to
> > ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).
>
> I think my interestingness test isn't strict enough - the creduced code
> resulting doesn't have an extern for ___UTF_8_put and only seems to not
> inline that fn because the interface has been mangled. [ so that the fn is
> legitimately binds_localP as the pasted case ].
>
> if you still have the build around, out of curiosity, does it fail on the
> original .i file attached here?
>
> and with -fno-trapping-math -fno-math-errno -fschedule-insns2
> -fomit-frame-pointer
>
> ( I only need O2 to get a fail ).

Yes, with -O2 -fno-trapping-math -fno-math-errno -fschedule-insns2 -fomit-frame-pointer it produces the problematical

        .align 4,0x90
L945:
        movl    0(%rbp,%r10,4), %esi
        call    UTF_8_put
        movq    %r10, %rax
        addq    $1, %r10
        cmpq    %rax, %r12
        jne     L945

code. But then ___UTF_8_put isn't interposable so I wonder why the linker even has to resolve anything. Adding -fPIC OTOH should definitely make the symbol interposable but the same code is still generated ...

Note the 'extern' declaration shouldn't change anything, only that we see a definition is relevant.

breaking on darwin_binds_local_p I see ___UTF_8_put is considered binding local even with -fPIC. So GCC thinks there will be no linker stub involved.

Note 'shlib' is passed as false to default_binds_local_p_3 computed as

3140         on earlier system versions, and with a TODO to complete. */
3141      bool force_overridable = TARGET_KEXTABI && DARWIN_VTABLE_P (decl);
3142      return default_binds_local_p_3 (decl, force_overridable /* shlib */,
3143                                      false /* weak dominate */,

and default_binds_local_p_3 would do

  /* If PIC, then assume that any global name can be overridden by
     symbols resolved from other modules. */
  if (shlib)
    return false;

ix86_binds_local_p simply passes flag_shlib != 0 as this argument.
[Bug c++/100231] New: [C++17] Variable template specialization inside a class gives compilation error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100231

Bug ID: 100231
Summary: [C++17] Variable template specialization inside a class gives compilation error
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: krzyk240 at gmail dot com
Target Milestone: ---

On the following code:
```
template
struct X {};

class Foo {
    template
    static constexpr inline bool bar = false;
    template
    static constexpr inline bool bar> = true;
};
```
GCC gives error:

:8:34: error: explicit template argument list not allowed
    8 | static constexpr inline bool bar> = true;
      | ^

But Clang, ICC and MSVC compile it correctly.

Defining variable template bar outside of Foo class produces no compile errors.

Compilation command: g++ example.cpp -std=c++17
Live example: https://godbolt.org/z/54hqYxe4P
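The template argument lists in the snippet above did not survive archiving. The sketch below is therefore an assumption about the general shape of the testcase (a primary variable template plus a partial specialization for `X<T>`), written in the form the report says does compile: with `bar` defined outside the class. The parameter name `T`, the namespace `foo_detail` and the member `is_x` are hypothetical.

```cpp
#include <cassert>

template <class T>
struct X {};

namespace foo_detail {
// Primary variable template and its partial specialization at namespace scope.
template <class T>
inline constexpr bool bar = false;

template <class T>
inline constexpr bool bar<X<T>> = true;
}  // namespace foo_detail

class Foo {
public:
    // The class re-exports the namespace-scope template as a static member.
    template <class T>
    static constexpr bool is_x = foo_detail::bar<T>;
};

static_assert(!Foo::is_x<int>);
static_assert(Foo::is_x<X<int>>);

void demo_bar() {
    assert(Foo::is_x<X<int>>);
    assert(!Foo::is_x<int>);
}
```

Keeping the specialization at namespace scope sidesteps the in-class partial specialization that triggers the reported error, at the cost of exposing a helper namespace.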
[Bug target/99932] OpenACC/nvptx offloading execution regressions starting with CUDA 11.2-era Nvidia Driver 460.27.04
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99932 --- Comment #3 from Tom de Vries --- Created attachment 50660 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50660=edit Cuda reproducer
[Bug c++/98767] Function signature lost in concept diagnostic message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98767 --- Comment #2 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:87fc34a461cf362947a430d8a241f653fd83bc7b commit r12-86-g87fc34a461cf362947a430d8a241f653fd83bc7b Author: Patrick Palka Date: Fri Apr 23 08:47:02 2021 -0400 c++: Fix pretty printing pointer to function type [PR98767] When pretty printing a pointer to function type, pp_cxx_parameter_declaration_clause ends up always outputting an empty function parameter list because the loop that outputs the list iterates over 'args' instead of 'types', and 'args' is empty when a FUNCTION_TYPE is passed to this routine (as opposed to a FUNCTION_DECL). This patch fixes this by making the loop iterate over 'types' instead. This patch also moves the retrofitted chain-of-PARM_DECLs printing from here to pp_cxx_requires_expr, the only caller that uses it. Doing so lets us easily output the trailing '...' in the parameter list of a variadic function, which this patch also implements. gcc/cp/ChangeLog: PR c++/98767 * cxx-pretty-print.c (pp_cxx_parameter_declaration_clause): Adjust parameter list loop to iterate over 'types' instead of 'args'. Output the trailing '...' for a variadic function. Remove PARM_DECL support. (pp_cxx_requires_expr): Pretty print the parameter list directly instead of going through pp_cxx_parameter_declaration_clause. gcc/testsuite/ChangeLog: PR c++/98767 * g++.dg/concepts/diagnostic17.C: New test.
[Bug c++/99362] [10 Regression] invalid unused result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99362 Jakub Jelinek changed: What|Removed |Added CC||georg.schwab at emocean dot io --- Comment #10 from Jakub Jelinek --- *** Bug 100210 has been marked as a duplicate of this bug. ***
[Bug c++/100210] [[nodiscard]] constructor causes warning on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100210 Jakub Jelinek changed: What|Removed |Added Resolution|FIXED |DUPLICATE CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Closing as dup. *** This bug has been marked as a duplicate of bug 99362 ***
[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #38 from Iain Sandoe ---
(In reply to Richard Biener from comment #37)
> Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't seem
> to reproduce the issue with the reduced testcase (I see no call to
> ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).

I think my interestingness test isn't strict enough - the creduced code resulting doesn't have an extern for ___UTF_8_put and only seems to not inline that fn because the interface has been mangled. [ so that the fn is legitimately binds_localP as the pasted case ].

if you still have the build around, out of curiosity, does it fail on the original .i file attached here?

and with -fno-trapping-math -fno-math-errno -fschedule-insns2 -fomit-frame-pointer

( I only need O2 to get a fail ).
[Bug libstdc++/99402] [10 Regression] std::copy creates _GLIBCXX_DEBUG false positive for attempt to subscript a dereferenceable (start-of-sequence) iterator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99402 --- Comment #13 from François Dumont --- Fixed on gcc-10 branch by this commit https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ab83ce42ea0b2fbc09d51b7bd5e69905dcaa2041.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #8 from Ilya Leoshkevich ---
Yeah, inline asm seems to be problematic:

    /home/iii/gcc/build/gcc/xgcc -B/home/iii/gcc/build/gcc/ /home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c -fdiagnostics-plain-output -O2 -march=z14 -mzarch -S -o long-double-asm-hardreg.s

with the patch from comment 2 produces:

    foo:
    .LFB0:
            .cfi_startproc
            larl    %r5,.L4
            vl      %v0,.L5-.L4(%r5),3
    #APP
    # 10 "/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c" 1
            # %v0
    # 0 "" 2
    #NO_APP
            br      %r14

`vl %v0,.L5-.L4(%r5),3` loads 1.0L into %v0[0:128]. However, it should be loaded into %v0[0:64] . %v2[0:64].

With the patch from comment 3 I get:

    foo:
    .LFB0:
            .cfi_startproc
            larl    %r5,.L4
            ld      %f0,.L5-.L4(%r5)
            ld      %f2,.L5-.L4+8(%r5)
    #APP
    # 10 "/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c" 1
            # %f0
    # 0 "" 2
    #NO_APP
            br      %r14

which is correct, but in the general case the exact reg that the user requested is not honored.
[Bug c/69558] [8 Regression] glib2 warning pragmas stopped working
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69558 Jakub Jelinek changed: What|Removed |Added Priority|P1 |P2 --- Comment #31 from Jakub Jelinek --- 5 years old bug can't be P1.
[Bug c++/98297] [8/9/10/11 Regression] ICE in cp_parser_elaborated_type_specifier, at cp/parser.c:19653
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98297 --- Comment #6 from Jakub Jelinek --- Ah, tracked already in PR98358.
[Bug target/99748] MVE: Wrong code at -O0 with float to integer conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99748 Alex Coplan changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Alex Coplan --- Fixed for 10.4, so fixed everywhere.
[Bug target/99748] MVE: Wrong code at -O0 with float to integer conversion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99748

--- Comment #6 from CVS Commits ---
The releases/gcc-10 branch has been updated by Alex Coplan:

https://gcc.gnu.org/g:283367662c25057fd7c9c98257cca858f85b75fc

commit r10-9755-g283367662c25057fd7c9c98257cca858f85b75fc
Author: Alex Coplan
Date: Tue Apr 6 09:06:27 2021 +0100

    arm: Fix PCS for SFmode -> SImode libcalls [PR99748]

    This patch fixes PR99748 which shows us trying to pass the argument to
    __aeabi_f2iz in the VFP register s0 when the library function is
    expecting to use the GPR r0. It also fixes the __aeabi_f2uiz case which
    was broken in the same way.

    For the testcase in the PR, here is the code we generate before the
    patch (with -mfloat-abi=hard -march=armv8.1-m.main+mve -O0):

    main:
            push    {r7, lr}
            sub     sp, sp, #8
            add     r7, sp, #0
            mov     r3, #1065353216
            str     r3, [r7, #4]    @ float
            vldr.32 s0, [r7, #4]
            bl      __aeabi_f2iz
            mov     r3, r0
            cmp     r3, #1
            [...]

    This becomes:

    main:
            push    {r7, lr}
            sub     sp, sp, #8
            add     r7, sp, #0
            mov     r3, #1065353216
            str     r3, [r7, #4]    @ float
            ldr     r0, [r7, #4]    @ float
            bl      __aeabi_f2iz
            mov     r3, r0
            cmp     r3, #1
            [...]

    after the patch. We see a similar change for the same testcase with a
    cast to unsigned instead of int.

    gcc/ChangeLog:

            PR target/99748
            * config/arm/arm.c (arm_libcall_uses_aapcs_base): Also use base
            PCS for [su]fix_optab.

    (cherry picked from commit 16ea7f57891d3fe885ee55b2917208695e184714)
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #7 from Jakub Jelinek ---
That said, I'm afraid I don't really understand what goes wrong with the patch I've attached. Trying something like:

    long double
    foo (void)
    {
      register long double f0 asm ("f0");
      f0 = 1.0L;
      f0 += 127.L;
      f0 *= 32.L;
      return f0;
    }

with -O0 -march=z14 -mlong-double-128, so that it is not all folded immediately, shows that in the end the computations are done in vector registers. Another thing to try is to intermix that with inline asm expecting those in "+f", so that intermediate results are pushed to the floating point register pair.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #6 from Jakub Jelinek ---
(In reply to Ilya Leoshkevich from comment #5)
> That would be an ideal solution, but I wonder how to implement it? Suppose
> we find a way to convince expand to pick FPRX2mode for such a long double.
> What if the following comes up?
>
>     register long double x asm ("v0"); /* FPRX2mode */
>     long double y; /* TFmode */
>     x += y; /* convert? */
>
> Would it be feasible to also teach expand to do the mode conversions?

It is certainly doable, but perhaps with extra target hooks or something similar. Types have their TYPE_MODE and decls have DECL_MODE, though the question is what breaks if TYPE_MODE != DECL_MODE; at least the comment in tree.h says that they can only differ for FIELD_DECLs. Anyway, in GIMPLE register vars are non-SSA, so apart from inline asm one needs separate loads and stores to them, so if we could expand those as having FPRX2 hard reg and loads from it convert to TFmode and stores into it convert from TFmode, ...

> One other alternative might be to detect `register long double asm("fN")`
> declarations and go back to using floating point register pairs for
> functions that contain them.

But this might actually be the best short-term solution (for GCC 11.x).
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

--- Comment #5 from Ilya Leoshkevich ---
That would be an ideal solution, but I wonder how to implement it? Suppose we find a way to convince expand to pick FPRX2mode for such a long double. What if the following comes up?

    register long double x asm ("v0"); /* FPRX2mode */
    long double y; /* TFmode */
    x += y; /* convert? */

Would it be feasible to also teach expand to do the mode conversions?

One other alternative might be to detect `register long double asm("fN")` declarations and go back to using floating point register pairs for functions that contain them.
[Bug target/100152] [10/11/12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152 --- Comment #37 from Richard Biener --- Oh, and FYI a cc1 cross from x86_64 to x86_64-apple-darwin19.6.0 doesn't seem to reproduce the issue with the reduced testcase (I see no call to ___UTF_8_put remaining with -O3 -fPIC -fno-strict-aliasing -fwrapv).
[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225 Alex Coplan changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW CC||acoplan at gcc dot gnu.org Last reconfirmed||2021-04-23 --- Comment #3 from Alex Coplan --- Confirmed.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P2
[Bug tree-optimization/100222] Redundant mark_irreducible_loops () in predicate.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100222 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Richard Biener --- Fixed.
[Bug tree-optimization/100222] Redundant mark_irreducible_loops () in predicate.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100222

--- Comment #2 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:500305a92ef85e6b87ad428a35221c62f4037b93

commit r12-82-g500305a92ef85e6b87ad428a35221c62f4037b93
Author: Richard Biener
Date: Fri Apr 23 11:16:52 2021 +0200

    tree-optimization/100222 - remove redundant mark_irreducible_loops calls

    loop_optimizer_init (LOOPS_NORMAL) already performs this (quite
    expensive) marking.

    2021-04-23 Richard Biener

            PR tree-optimization/100222
            * predict.c (pass_profile::execute): Remove redundant call to
            mark_irreducible_loops.
            (report_predictor_hitrates): Likewise.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #4 from Jakub Jelinek --- That seems like quite an undesirable API change. Can't the backend, when it sees long double register vars for the fN registers, change the mode from TFmode to that new FPRX2mode, so that old code keeps working?
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

Ilya Leoshkevich changed:

What|Removed |Added
CC||iii at linux dot ibm.com

--- Comment #3 from Ilya Leoshkevich ---
The main problem here is that `register long double f0 asm ("f0")` does not make sense on z14 anymore. long doubles are stored in vector registers now, not in floating-point register pairs. If we skip the hard reg, the code will end up having the following semantics:

    vr0[0:128] = 1.0L;
    asm("/* expect the value in vr0[0:64] . vr2[0:64] */");

and fail at run time. So I think it's better to use the "best effort" approach and force it into a pseudo, even if this would mean that the user-specified register is not honored:

--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16814,6 +16814,12 @@ s390_md_asm_adjust (vec , vec ,
           gcc_assert (allows_reg);
           /* Copy input value from a vector register into a FPR pair. */
           rtx fprx2 = gen_reg_rtx (FPRX2mode);
+          if (REG_P (inputs[i]) && HARD_REGISTER_P (inputs[i]))
+            {
+              rtx orig_input = inputs[i];
+              inputs[i] = gen_reg_rtx (TFmode);
+              emit_move_insn (inputs[i], orig_input);
+            }
           emit_insn (gen_tf_to_fprx2 (fprx2, inputs[i]));
           inputs[i] = fprx2;
           input_modes[i] = FPRX2mode;

I need to check whether we can keep the output logic as is.

Ideally the code should be adapted and use the __LONG_DOUBLE_VX__ macro like this:

    #ifdef __LONG_DOUBLE_VX__
    register long double f0 asm ("v0");
    #else
    register long double f0 asm ("f0");
    #endif
    f0 = 1.0L;
    #ifdef __LONG_DOUBLE_VX__
    asm("" : : "v" (f0));
    #else
    asm("" : : "f" (f0));
    #endif

Maybe a warning recommending to do this should be printed.
[Bug fortran/100227] [8/9/10/11/12 Regression] write with implicit loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100227 Dominique d'Humieres changed: What|Removed |Added CC||tkoenig at gcc dot gnu.org Known to fail||11.0, 12.0 Status|UNCONFIRMED |NEW Last reconfirmed||2021-04-23 Ever confirmed|0 |1 --- Comment #2 from Dominique d'Humieres --- Workaround: use -fno-frontend-optimize.
[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225 --- Comment #2 from Martin Liška --- Ah, you are right, sorry.
[Bug target/99488] dwz: /usr/lib/gcc/mips64el-linux-gnuabi64/11/go1: Found two copies of .debug_line_str section
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99488 --- Comment #12 from YunQiang Su --- This problem disappears if we build gcc 11 with binutils 2.36.
[Bug rtl-optimization/100225] [8/9/10/11/12 Regression] ICE in add_cross_iteration_register_deps, at ddg.c:291
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100225 Alexander Monakov changed: What|Removed |Added Blocks|85099 | CC||amonakov at gcc dot gnu.org, ||zhroma at gcc dot gnu.org --- Comment #1 from Alexander Monakov --- Hi Martin, this is a modulo-scheduling bug; I think you added "Blocks: sel-sched" by mistake — removing, and Cc'ing Roman. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099 [Bug 85099] [meta-bug] selective scheduling issues
[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230 --- Comment #1 from Alex Coplan --- Testing a fix.
[Bug rtl-optimization/100230] ASan: alloc-dealloc-mismatch in early-remat.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230 Alex Coplan changed: What|Removed |Added Last reconfirmed||2021-04-23 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |acoplan at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED
[Bug rtl-optimization/100230] New: ASan: alloc-dealloc-mismatch in early-remat.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100230

Bug ID: 100230
Summary: ASan: alloc-dealloc-mismatch in early-remat.c
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: acoplan at gcc dot gnu.org
Target Milestone: ---

Bootstrapping on aarch64 --with-build-config=bootstrap-asan and running the testsuite shows the following issue:

$ cat test.c
int a, b;
void c() { while (b) a += b++; }
$ gcc/xgcc -B gcc -c test.c -march=armv8.2-a+sve -O2 -ftree-vectorize
==22323==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x92f0d900
    #0 0x75ed5c in operator delete(void*, unsigned long) /home/alecop01/toolchain/src/gcc/libsanitizer/asan/asan_new_delete.cpp:172
    #1 0x33b033c in sort_candidates /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:1062
    #2 0x33b033c in run /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2567
    #3 0x33b033c in execute /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2629
    #4 0x151ebd4 in execute_one_pass(opt_pass*) /home/alecop01/toolchain/src/gcc/gcc/passes.c:2567
    #5 0x15201a0 in execute_pass_list_1 /home/alecop01/toolchain/src/gcc/gcc/passes.c:2656
    #6 0x15201c4 in execute_pass_list_1 /home/alecop01/toolchain/src/gcc/gcc/passes.c:2657
    #7 0x1520270 in execute_pass_list(function*, opt_pass*) /home/alecop01/toolchain/src/gcc/gcc/passes.c:2667
    #8 0xbb7c34 in cgraph_node::expand() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1830
    #9 0xbb7c34 in cgraph_node::expand() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1783
    #10 0xbba6d4 in expand_all_functions /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1994
    #11 0xbba6d4 in symbol_table::compile() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2358
    #12 0xbc18a8 in symbol_table::compile() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2271
    #13 0xbc18a8 in symbol_table::finalize_compilation_unit() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2539
    #14 0x1793f44 in compile_file /home/alecop01/toolchain/src/gcc/gcc/toplev.c:482
    #15 0x6d4ffc in do_compile /home/alecop01/toolchain/src/gcc/gcc/toplev.c:2201
    #16 0x6d4ffc in toplev::main(int, char**) /home/alecop01/toolchain/src/gcc/gcc/toplev.c:2340
    #17 0x6df804 in main /home/alecop01/toolchain/src/gcc/gcc/main.c:39
    #18 0x973276dc in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x206dc)
    #19 0x6e271c (/data/alecop01/builds/gcc11-bstrap-asan/gcc/cc1+0x6e271c)

0x92f0d900 is located 0 bytes inside of 28-byte region [0x92f0d900,0x92f0d91c)
allocated by thread T0 here:
    #0 0x75e16c in operator new[](unsigned long) /home/alecop01/toolchain/src/gcc/libsanitizer/asan/asan_new_delete.cpp:102
    #1 0x33b027c in sort_candidates /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:1056
    #2 0x33b027c in run /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2567
    #3 0x33b027c in execute /home/alecop01/toolchain/src/gcc/gcc/early-remat.c:2629
    #4 0x151ebd4 in execute_one_pass(opt_pass*) /home/alecop01/toolchain/src/gcc/gcc/passes.c:2567
    #5 0x15201a0 in execute_pass_list_1 /home/alecop01/toolchain/src/gcc/gcc/passes.c:2656
    #6 0x15201c4 in execute_pass_list_1 /home/alecop01/toolchain/src/gcc/gcc/passes.c:2657
    #7 0x1520270 in execute_pass_list(function*, opt_pass*) /home/alecop01/toolchain/src/gcc/gcc/passes.c:2667
    #8 0xbb7c34 in cgraph_node::expand() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1830
    #9 0xbb7c34 in cgraph_node::expand() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1783
    #10 0xbba6d4 in expand_all_functions /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:1994
    #11 0xbba6d4 in symbol_table::compile() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2358
    #12 0xbc18a8 in symbol_table::compile() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2271
    #13 0xbc18a8 in symbol_table::finalize_compilation_unit() /home/alecop01/toolchain/src/gcc/gcc/cgraphunit.c:2539
    #14 0x1793f44 in compile_file /home/alecop01/toolchain/src/gcc/gcc/toplev.c:482
    #15 0x6d4ffc in do_compile /home/alecop01/toolchain/src/gcc/gcc/toplev.c:2201
    #16 0x6d4ffc in toplev::main(int, char**) /home/alecop01/toolchain/src/gcc/gcc/toplev.c:2340
    #17 0x6df804 in main /home/alecop01/toolchain/src/gcc/gcc/main.c:39
    #18 0x973276dc in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x206dc)
    #19 0x6e271c (/data/alecop01/builds/gcc11-bstrap-asan/gcc/cc1+0x6e271c)

The fix looks obvious.
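For readers unfamiliar with this class of ASan report, the following is a minimal sketch of the pattern being flagged: storage obtained with operator new[] has to be released with operator delete[], not with plain delete. This is not the code from early-remat.c. The function name sorted_copy and its parameters are hypothetical, chosen only for illustration, and std::unique_ptr<int[]> is just one possible way to keep the allocation and deallocation forms paired.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper (not from GCC): sorts a copy of `input` using a
// temporary heap array and returns the result as a vector.
std::vector<int> sorted_copy(const int *input, std::size_t n) {
    // Mismatched version, which ASan reports as alloc-dealloc-mismatch:
    //   int *tmp = new int[n];
    //   ...
    //   delete tmp;        // wrong: the array form delete[] is required
    //
    // Matching version: unique_ptr<int[]> calls delete[] when it goes
    // out of scope, so the two forms cannot drift apart.
    std::unique_ptr<int[]> tmp(new int[n]);
    std::copy(input, input + n, tmp.get());
    std::sort(tmp.get(), tmp.get() + n);
    return std::vector<int>(tmp.get(), tmp.get() + n);
}
```

Replacing the plain delete with delete[] would also resolve a mismatch of this kind; the smart pointer is shown only because it removes the opportunity for the two forms to disagree.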
[Bug target/100216] arm: UB in arm_canonicalize_comparison (shift exponent 127 is too large for 64-bit type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100216

Richard Earnshaw changed:

What             |Removed     |Added
Ever confirmed   |0           |1
Last reconfirmed |            |2021-04-23
Status           |UNCONFIRMED |NEW

--- Comment #2 from Richard Earnshaw ---
Confirmed by visual inspection. Clearly this code was written at a time when the largest integral mode on Arm was DImode. It won't work for wider modes and it won't do anything for non-integral modes. Needs an overhaul.
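To illustrate the kind of undefined behaviour named in the bug title, here is a small sketch of computing the largest signed value for a given bit width in a 64-bit host integer. Shifting a 64-bit value by 64 or more is undefined in C++, so a width wider than the host type has to be rejected before the shift. The helper max_signed_value is a hypothetical name and is not taken from arm_canonicalize_comparison; the assumption that the shift count is derived from the mode's bit width is my reading of the title, not something the comment states.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not GCC code): returns the largest signed value
// representable in `bits` bits, or std::nullopt when that value does not
// fit in a 64-bit host integer.
std::optional<std::uint64_t> max_signed_value(unsigned bits) {
    // Guard first: for bits > 64 the shift count below would be >= 64,
    // which is undefined behaviour for a 64-bit operand.
    if (bits == 0 || bits > 64)
        return std::nullopt;
    return (std::uint64_t{1} << (bits - 1)) - 1;
}
```

A caller handling a 128-bit quantity would get std::nullopt here and would need a wider representation, which is in line with the comment that the code needs an overhaul rather than a one-line guard.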
[Bug c++/98297] [8/9/10/11 Regression] ICE in cp_parser_elaborated_type_specifier, at cp/parser.c:19653
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98297

Jakub Jelinek changed:

What |Removed |Added
CC   |        |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek ---
Note that the testcase FAILs on the 8 branch: the emitted error is different.

$ gcc-9/obj28/gcc/cc1plus -quiet -std=c++11 /tmp/pr98297.C -o /tmp/pr98297.s
/tmp/pr98297.C:5:1: warning: ‘b’ attribute directive ignored [-Wattributes]
    5 | a ; // { dg-error "does not declare anything" }
      | ^~~
/tmp/pr98297.C:5:1: error: declaration does not declare anything [-fpermissive]
$ gcc-8/obj32/gcc/cc1plus -quiet -std=c++11 /tmp/pr98297.C -o /tmp/pr98297.s
/tmp/pr98297.C:5:1: warning: ‘b’ attribute directive ignored [-Wattributes]
 a ; // { dg-error "does not declare anything" }
 ^~~
/tmp/pr98297.C:5:1: error: name of class shadows template template parameter ‘a’
$ gcc-8/obj30/gcc/cc1plus -quiet -std=c++11 /tmp/pr98297.C -o /tmp/pr98297.s
/tmp/pr98297.C:5:1: internal compiler error: Segmentation fault
 a ; // { dg-error "does not declare anything" }
 ^~~

gcc-8/obj30 is a 5-month-old snapshot, which ICEs as expected, but the middle error is different from what the test expects.