[Bug libstdc++/105720] New: std::views::split_view wrong behaviour in case of partial match
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105720

Bug ID: 105720
Summary: std::views::split_view wrong behaviour in case of partial match
Product: gcc
Version: 10.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: andij.cr at gmail dot com
Target Milestone: ---

Compiled with:

    g++-10 -std=c++20 split_view_wrong.cpp -lfmt

Godbolt link: https://gcc.godbolt.org/z/47TxWovd4
(fmtlib is used for exposition only.)

    #include <fmt/format.h>
    #include <fmt/ranges.h>
    #include <ranges>
    #include <string_view>

    auto words_no_bug = std::string_view{"Hello-_-C++-_-20-_-!-_-"};
    auto words_bug = std::string_view{"Hello--_-C++-_-20-_-!-_-"};
    auto delim = std::string_view{"-_-"};

    // needed because split_view is lazy in gcc 10.3
    auto range_to_str = [](auto &&r) { return fmt::format("{}", fmt::join(r, "")); };

    int main() {
      fmt::print("no bug: '{}' tokens: {}\n", words_no_bug,
                 words_no_bug | std::views::split(delim) |
                     std::views::transform(range_to_str));
      fmt::print("bug: '{}' tokens: {}\n", words_bug,
                 words_bug | std::views::split(delim) |
                     std::views::transform(range_to_str));
    }

This code applies split to tokenize a text. Compiled with gcc-10.3, it wrongly produces:

    no bug: 'Hello-_-C++-_-20-_-!-_-' tokens: ["Hello", "C++", "20", "!"]
    bug: 'Hello--_-C++-_-20-_-!-_-' tokens: ["Hello-", "20", "!"]

while compiled with gcc-11.3 it correctly produces:

    no bug: 'Hello-_-C++-_-20-_-!-_-' tokens: ["Hello", "C++", "20", "!"]
    bug: 'Hello--_-C++-_-20-_-!-_-' tokens: ["Hello-", "C++", "20", "!"]

Notice how the substring "--_-C++", instead of being split into ["-", "C++"], is split as ["-"], skipping the "C++" token. Here the first '-' is only a partial match of the delimiter, and the full match starts one character later.

It is fixed from gcc-11 onwards, but I couldn't find a mention of it in the release notes.
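For comparison, here is a minimal sketch of the expected tokenization without ranges or fmtlib. It assumes a plain left-to-right scan with std::string_view::find is an acceptable reference. The name reference_split is hypothetical and does not appear in the report. A trailing empty token after the final delimiter is dropped, to match the output quoted above.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical reference splitter (not from the report): scans `text` left to
// right and cuts at every full occurrence of `delim`. A partial match such as
// the first '-' in "--_-" stays part of the preceding token.
std::vector<std::string_view> reference_split(std::string_view text,
                                              std::string_view delim) {
    std::vector<std::string_view> tokens;
    if (delim.empty()) {
        // Assumption for this sketch: an empty delimiter yields the whole text.
        if (!text.empty()) tokens.push_back(text);
        return tokens;
    }
    std::size_t start = 0;
    while (start < text.size()) {
        const std::size_t pos = text.find(delim, start);
        if (pos == std::string_view::npos) {
            tokens.push_back(text.substr(start));  // no delimiter left
            break;
        }
        tokens.push_back(text.substr(start, pos - start));
        start = pos + delim.size();  // continue after the delimiter
    }
    return tokens;
}
```

Calling reference_split on the words_bug string with the "-_-" delimiter yields the four tokens that gcc-11.3 reports, including "C++".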
[Bug tree-optimization/104959] New: nested lambda capture pack by ref will load from nullptr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104959

Bug ID: 104959
Summary: nested lambda capture pack by ref will load from nullptr
Product: gcc
Version: 10.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andij.cr at gmail dot com
Target Milestone: ---

Testcase:

    #include <cassert>

    // outer template parameter list assumed to be <int N>
    template <int N>
    auto line = []<typename... Ts>(Ts &&...args) {
      if constexpr (sizeof...(Ts) != 0) {
        ([&] { assert(&args != nullptr); }(), ...);
      }
    };

    int main() { line<10>(false); }

Compiling and executing this with

    g++ 10.3 -std=c++20 -O1 -fsanitize=undefined

will trigger the assertion. This code is a reduction of more complex code, where the bug caused a crash. Compiling with -O0 or with GCC 11 will not trigger the assertion.

Each of the template, the lambda and the if constexpr (sizeof...) seems to be necessary to trigger the bug. The assert needs to be there to trigger the load of args; using a different method (e.g. using args in an expression) will also trigger -Wuninitialized.

Compiler explorer link: https://gcc.godbolt.org/z/W7EMTP4W8
Note that in the assembly __assert_fail is called directly.

This seems similar to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68177
and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97938
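The property the assertion relies on can be stated without the nested lambda: a reference parameter always binds to an object, so the address of every pack element is non-null. This sketch checks that directly with a comma fold. The helper name count_null_addresses is hypothetical, and the code is not claimed to reproduce or to work around the bug.

```cpp
#include <memory>

// Hypothetical helper (not from the report): returns how many pack elements
// have a null address. For a conforming compiler this is always 0.
template <typename... Ts>
int count_null_addresses(Ts &&...args) {
    int nulls = 0;
    // Comma fold: one increment expression per pack element, none if empty.
    ((nulls += ((std::addressof(args) == nullptr) ? 1 : 0)), ...);
    return nulls;
}
```

With the call from the testcase, count_null_addresses(false) is expected to return 0 at any optimization level.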
[Bug tree-optimization/104275] New: Os does not apply return value optimization while O2 and O3 does
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104275

Bug ID: 104275
Summary: Os does not apply return value optimization while O2 and O3 does
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: andij.cr at gmail dot com
Target Milestone: ---

Tested from gcc 8 to gcc 11.

An identity function (mark) interposed in a call stack that ends in a complex type is reasonably elided at O2 and O3, but at Os it produces somewhat strange assembly. Tested on arm32 and x86_64.

For a less artificial example, where the problem still appears: https://gcc.godbolt.org/z/GbKrGKa6f

Code: https://godbolt.org/z/v95jEvvzc

    // condensed result of a constexpr transformation.
    // in this form, it would be nice if it was transparent to the value
    template <typename Ts>
    auto mark(Ts&& head) noexcept -> decltype(auto) {
      return static_cast<Ts&&>(head);
    }

    #include <vector>

    // generic producer of a complex type
    auto generate() -> std::vector<int>;  // element type assumed

    // here is a stack of functions using mark
    namespace {
    // in an anonymous namespace to nudge the compiler to inline them
    auto user_base() { return mark(generate()); }
    auto user_mark() { return mark(user_base()); }
    auto user_mark2() { return mark(user_mark()); }
    auto user_mark3() { return mark(user_mark2()); }
    }  // namespace

    // this function has a normal assembly at O2 and O3
    // but a silly one at Os
    auto user_mark4() { return mark(user_mark3()); }

Compiled with -std=c++17 -O2:

    user_mark4():
            push    r12
            mov     r12, rdi
            sub     rsp, 32
            mov     rdi, rsp
            call    generate()
            mov     rax, QWORD PTR [rsp]
            mov     QWORD PTR [r12], rax
            mov     rax, QWORD PTR [rsp+8]
            mov     QWORD PTR [r12+8], rax
            mov     rax, QWORD PTR [rsp+16]
            mov     QWORD PTR [r12+16], rax
            add     rsp, 32
            mov     rax, r12
            pop     r12
            ret

Compiled with -std=c++17 -Os:

    user_mark4():
            push    r13
            push    r12
            mov     r12, rdi
            push    rbp
            push    rbx
            sub     rsp, 40
            lea     rdi, [rsp+8]
            call    generate()
            lea     rdi, [rsp+8]
            mov     r13, QWORD PTR [rsp+8]
            mov     rbp, QWORD PTR [rsp+16]
            mov     QWORD PTR [rsp+8], 0
            mov     rbx, QWORD PTR [rsp+24]
            mov     QWORD PTR [rsp+16], 0
            mov     QWORD PTR [rsp+24], 0
            call    std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
            lea     rdi, [rsp+8]
            mov     QWORD PTR [rsp+24], 0
            mov     QWORD PTR [rsp+16], 0
            mov     QWORD PTR [rsp+8], 0
            call    std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
            lea     rdi, [rsp+8]
            mov     QWORD PTR [rsp+24], 0
            mov     QWORD PTR [rsp+16], 0
            mov     QWORD PTR [rsp+8], 0
            call    std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
            lea     rdi, [rsp+8]
            mov     QWORD PTR [rsp+24], 0
            mov     QWORD PTR [rsp+16], 0
            mov     QWORD PTR [rsp+8], 0
            call    std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
            mov     QWORD PTR [r12], r13
            lea     rdi, [rsp+8]
            mov     QWORD PTR [r12+8], rbp
            mov     QWORD PTR [r12+16], rbx
            mov     QWORD PTR [rsp+24], 0
            mov     QWORD PTR [rsp+16], 0
            mov     QWORD PTR [rsp+8], 0
            call    std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
            add     rsp, 40
            mov     rax, r12
            pop     rbx
            pop     rbp
            pop     r12
            pop     r13
            ret

At Os the vector is moved once per mark level, and each moved-from (already zeroed) temporary still gets its destructor called.
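To make the "identity" claim concrete, the sketch below checks that a forwarding function of this shape preserves the value category of its argument, and that a stack of such calls only moves the vector. It assumes mark forwards with static_cast<T&&>, that the element type is int, and C++17. The helper pass_through is hypothetical.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Assumed shape of the report's mark: a perfect-forwarding identity.
template <typename T>
auto mark(T &&head) noexcept -> decltype(auto) {
    return static_cast<T &&>(head);
}

// An rvalue argument comes back as an rvalue reference...
static_assert(std::is_same_v<decltype(mark(std::declval<std::vector<int>>())),
                             std::vector<int> &&>);
// ...and an lvalue argument comes back as an lvalue reference.
static_assert(std::is_same_v<decltype(mark(std::declval<std::vector<int> &>())),
                             std::vector<int> &>);

// Hypothetical helper: two stacked marks, returning by value. The result is
// move-constructed, so the heap buffer of the argument is reused.
std::vector<int> pass_through(std::vector<int> v) {
    return mark(mark(std::move(v)));
}
```

Since every level only moves, the sequence is in principle reducible to three pointer copies, which is what the O2 output does.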
[Bug c++/93513] New: internal compiler error internal compiler error: unexpected expression ‘(char)(e)’ of kind cast_expr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93513

Bug ID: 93513
Summary: internal compiler error internal compiler error: unexpected expression ‘(char)(e)’ of kind cast_expr
Product: gcc
Version: 9.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: andij.cr at gmail dot com
Target Milestone: ---

Compiling this C++ code

    enum class error {};
    template <typename F>
    void afunction(F) {
      error{char(0)};
    }

with g++ 9.2 and -std=c++17 or -std=c++20 gives:

    internal compiler error: unexpected expression ‘(char)(0)’ of kind cast_expr
        4 |   error{char(0)};
          |   ^

In contrast, with -std=c++14:

    error: cannot convert ‘char’ to ‘error’ in initialization
        4 |   error{char(0)};
          |   ^~~
          |   |
          |   char

Checking with compiler explorer, it seems that gcc 8.3 does not generate this error: https://gcc.godbolt.org/z/yZ9ckH
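The difference between the two standards modes is expected: since C++17, an enumeration with a fixed underlying type can be direct-list-initialized from a single value if the conversion is not narrowing, and a scoped enum defaults to an underlying type of int. The sketch below shows the same initializer outside a template. It assumes -std=c++17 or later, and the function name zero_error is hypothetical.

```cpp
// Same scoped enum as in the report; the underlying type defaults to int.
enum class error {};

// Hypothetical non-template function: char -> int is not a narrowing
// conversion, so this direct-list-initialization is well-formed in C++17.
inline error zero_error() {
    return error{char(0)};
}
```

So in C++17 mode the template in the report should compile as well, rather than hit an internal compiler error.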
[Bug middle-end/91674] New: [ARM/thumb] redundant memcpy does not get optimized away on thumb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91674

Bug ID: 91674
Summary: [ARM/thumb] redundant memcpy does not get optimized away on thumb
Product: gcc
Version: 8.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: andij.cr at gmail dot com
Target Milestone: ---

Consider this C++ function:

    #include <array>
    #include <cstdint>
    #include <cstring>

    auto to_bytes(uint32_t arg) {
      std::array<uint8_t, sizeof(arg)> out{};  // element type assumed
      std::memcpy(out.data(), &arg, sizeof(arg));
      return out;
    }

On a little endian arch this function could be a no-op.

Compiled with g++ -Os we get:

    to_bytes(unsigned int):
            mov     eax, edi
            ret

On arm this somewhat works. Compiled with arm-none-eabi-g++ -Os:

    to_bytes(unsigned int):
            sub     sp, sp, #8
            add     sp, sp, #8
            bx      lr

Notice the redundant sub followed by an add.

But if thumb is forced, the full optimization is not performed. Compiled with arm-none-eabi-g++ -Os -march=armv7-m -mtune=cortex-m3:

    to_bytes(unsigned int):
            mov     r3, r0
            movs    r0, #0
            uxtb    r2, r3
            bfi     r0, r2, #0, #8
            ubfx    r2, r3, #8, #8
            bfi     r0, r2, #8, #8
            ubfx    r2, r3, #16, #8
            bfi     r0, r2, #16, #8
            lsrs    r3, r3, #24
            sub     sp, sp, #8
            bfi     r0, r3, #24, #8
            add     sp, sp, #8
            bx      lr

In contrast, cross compiling with clang7 produces the desired optimization. Compiled with clang++7 --target=arm-none-eabi -march=armv7-m -mtune=cortex-m3:

    to_bytes(unsigned int):
            bx      lr

Notice also how there is no redundant stack pointer manipulation.
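As a sanity check on why the function can be a no-op, the sketch below pairs to_bytes with an inverse and round-trips a value. The memcpy only reinterprets the four bytes of the argument, so the result holds the same bits that arrive in the argument register. The element type std::uint8_t and the helper from_bytes are assumptions for illustration.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Copies the object representation of arg into a byte array.
// Element type std::uint8_t is an assumption of this sketch.
auto to_bytes(std::uint32_t arg) {
    std::array<std::uint8_t, sizeof(arg)> out{};
    std::memcpy(out.data(), &arg, sizeof(arg));
    return out;
}

// Hypothetical inverse: copies the bytes back into an integer.
std::uint32_t from_bytes(
    const std::array<std::uint8_t, sizeof(std::uint32_t)> &bytes) {
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof(value));
    return value;
}
```

The round trip from_bytes(to_bytes(x)) returns x on any endianness, because neither direction reorders bytes.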