[Bug libstdc++/105720] New: std::views::split_view wrong behaviour in case of partial match

2022-05-24 Thread andij.cr at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105720

Bug ID: 105720
   Summary: std::views::split_view wrong behaviour in case of
partial match
   Product: gcc
   Version: 10.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andij.cr at gmail dot com
  Target Milestone: ---

compiled with 
g++-10 -std=c++20 split_view_wrong.cpp -lfmt

godbolt link https://gcc.godbolt.org/z/47TxWovd4

fmtlib used for exposition only

#include <fmt/format.h>
#include <fmt/ranges.h>
#include <ranges>
#include <string_view>

auto words_no_bug = std::string_view{"Hello-_-C++-_-20-_-!-_-"};
auto words_bug = std::string_view{"Hello--_-C++-_-20-_-!-_-"};
auto delim = std::string_view{"-_-"};

// needed because split_view is lazy in gcc 10.3
auto range_to_str = [](auto &&r) {
  return fmt::format("{}", fmt::join(r, ""));
};

int main() {
  fmt::print("no bug: '{}' tokens: {}\n", words_no_bug,
 words_no_bug | std::views::split(delim) |
 std::views::transform(range_to_str));

  fmt::print("bug: '{}' tokens: {}\n", words_bug,
 words_bug | std::views::split(delim) |
 std::views::transform(range_to_str));
}

this code applies split to tokenize a text

compiled with gcc-10.3 it wrongly produces

no bug: 'Hello-_-C++-_-20-_-!-_-' tokens: ["Hello", "C++", "20", "!"]
bug: 'Hello--_-C++-_-20-_-!-_-' tokens: ["Hello-", "20", "!"]

while compiled with gcc-11.3 it correctly produces

no bug: 'Hello-_-C++-_-20-_-!-_-' tokens: ["Hello", "C++", "20", "!"]
bug: 'Hello--_-C++-_-20-_-!-_-' tokens: ["Hello-", "C++", "20", "!"]


notice how the substring "--_-C++", instead of being split into ["-", "C++"],
is split as ["-"], skipping the "C++" token.

it's fixed as of gcc-11, but i couldn't find any mention of it in the
release notes
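
for comparison, here is a minimal eager reference splitter, written as a sketch and not taken from libstdc++. the name split_reference is made up for illustration. it drops an empty trailing token so that its result lines up with the token lists printed above, and it assumes that delimiter matches are searched left to right without overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// hypothetical helper: eager split on a multi-character delimiter
inline std::vector<std::string_view> split_reference(std::string_view text,
                                                     std::string_view delim) {
    std::vector<std::string_view> tokens;
    if (delim.empty()) {
        // an empty delimiter would never advance, so return the text unsplit
        tokens.push_back(text);
        return tokens;
    }
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = text.find(delim, start);
        if (pos == std::string_view::npos) {
            // keep the remainder only if it is non-empty
            if (start < text.size()) tokens.push_back(text.substr(start));
            break;
        }
        // a partial match such as the first '-' in "--_-" stays in the token
        tokens.push_back(text.substr(start, pos - start));
        start = pos + delim.size();
    }
    return tokens;
}
```

with the two inputs from the report, this yields the same tokens as the gcc-11.3 output.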

[Bug tree-optimization/104959] New: nested lambda capture pack by ref will load from nullptr

2022-03-16 Thread andij.cr at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104959

Bug ID: 104959
   Summary: nested lambda capture pack by ref will load from
nullptr
   Product: gcc
   Version: 10.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andij.cr at gmail dot com
  Target Milestone: ---

testcase:

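#include <cassert>

// template parameter list assumed (a non-type parameter, as line<10> requires)
template <int>
auto line = []<typename... Ts>(Ts &&...args) {
  if constexpr (sizeof...(Ts) != 0) {
    ([&] { assert(&args != nullptr); }(), ...);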
  }
};

int main() { line<10>(false); }

compiling and executing this with 

g++ 10.3 -std=c++20 -O1 -fsanitize=undefined

will trigger the assertion. 
this code is reduced from a more complex program, where the bug caused a crash.
compiling with -O0 or with GCC 11 will not trigger the assertion.


each of the template, the lambda and the if constexpr (sizeof...) seems to be
necessary to trigger the bug.
the assert is needed to force the load of args.
forcing the load in a different way (e.g. using args in an expression)
will also trigger -Wuninitialized.

compiler explorer link:
https://gcc.godbolt.org/z/W7EMTP4W8

note that in the assembly __assert_fail is called directly 

this seems similar to 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68177
and 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97938
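
as a sketch of the behaviour the testcase expects: a nested lambda expanded in a fold expression captures each pack element by reference, so every captured address must be non-null. the helper name count_valid_references is made up for illustration. this block does not reproduce the bug, it only states the expected result.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// hypothetical helper: counts the pack elements whose captured reference
// has a non-null address; on a correct compiler that is every element
template <typename... Ts>
std::size_t count_valid_references(Ts &&...args) {
    std::size_t count = 0;
    // one nested lambda per pack element, each capturing by reference
    ([&] { if (std::addressof(args) != nullptr) ++count; }(), ...);
    return count;
}
```

the returned count should equal sizeof...(Ts) at every optimization level.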

[Bug tree-optimization/104275] New: Os does not apply return value optimization while O2 and O3 does

2022-01-28 Thread andij.cr at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104275

Bug ID: 104275
   Summary: Os does not apply return value optimization while O2
and O3 does
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andij.cr at gmail dot com
  Target Milestone: ---

tested from gcc 8 to gcc 11

an identity function (mark) interposed in a call stack that ends in a complex
type is reasonably elided in O2 and O3, but at Os it creates a somewhat strange
assembly.
tested on arm32 and x86_64. 
for a less artificial example, where the problem still appears:
https://gcc.godbolt.org/z/GbKrGKa6f

code:

https://godbolt.org/z/v95jEvvzc

// condensed result of a constexpr transformation.
// in this form, it would be nice if it was transparent to the value
template <typename Ts>
auto mark(Ts&& head) noexcept -> decltype(auto) {
    return static_cast<Ts&&>(head);
}

#include <vector>
// generic producer of a complex type
auto generate() -> std::vector<int>;  // element type assumed to be int


// here is a stack of functions using mark
namespace {
// in an anonymous namespace to nudge the compiler to inline them
auto user_base() { return mark(generate()); }
auto user_mark() { return mark(user_base()); }
auto user_mark2() { return mark(user_mark()); }
auto user_mark3() { return mark(user_mark2()); }
}  // namespace

// this function has a normal assembly at O2 and O3
// but a silly one at Os
auto user_mark4() { return mark(user_mark3()); }


compiled with 
-std=c++17 -O2

user_mark4():
push r12
mov r12, rdi
sub rsp, 32
mov rdi, rsp
call generate()
mov rax, QWORD PTR [rsp]
mov QWORD PTR [r12], rax
mov rax, QWORD PTR [rsp+8]
mov QWORD PTR [r12+8], rax
mov rax, QWORD PTR [rsp+16]
mov QWORD PTR [r12+16], rax
add rsp, 32
mov rax, r12
pop r12
ret

compiled with
-std=c++17 -Os 
user_mark4():
push r13
push r12
mov r12, rdi
push rbp
push rbx
sub rsp, 40
lea rdi, [rsp+8]
call generate()
lea rdi, [rsp+8]
mov r13, QWORD PTR [rsp+8]
mov rbp, QWORD PTR [rsp+16]
mov QWORD PTR [rsp+8], 0
mov rbx, QWORD PTR [rsp+24]
mov QWORD PTR [rsp+16], 0
mov QWORD PTR [rsp+24], 0
call std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
lea rdi, [rsp+8]
mov QWORD PTR [rsp+24], 0
mov QWORD PTR [rsp+16], 0
mov QWORD PTR [rsp+8], 0
call std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
lea rdi, [rsp+8]
mov QWORD PTR [rsp+24], 0
mov QWORD PTR [rsp+16], 0
mov QWORD PTR [rsp+8], 0
call std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
lea rdi, [rsp+8]
mov QWORD PTR [rsp+24], 0
mov QWORD PTR [rsp+16], 0
mov QWORD PTR [rsp+8], 0
call std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
mov QWORD PTR [r12], r13
lea rdi, [rsp+8]
mov QWORD PTR [r12+8], rbp
mov QWORD PTR [r12+16], rbx
mov QWORD PTR [rsp+24], 0
mov QWORD PTR [rsp+16], 0
mov QWORD PTR [rsp+8], 0
call std::_Vector_base<int, std::allocator<int> >::~_Vector_base() [base object destructor]
add rsp, 40
mov rax, r12
pop rbx
pop rbp
pop r12
pop r13
ret
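
to make the language-level picture explicit: since mark returns an rvalue reference, every layer of the stack has to move-construct its return value, so the five layers perform five moves. that matches the five destructor calls in the -Os listing. the sketch below counts the moves with a stand-in type. Tracker and moves_through_chain are made-up names, and the layers are ordinary inline functions instead of living in an anonymous namespace.

```cpp
#include <cassert>

// hypothetical stand-in for the complex type: counts move constructions
struct Tracker {
    static inline int moves = 0;
    Tracker() = default;
    Tracker(Tracker &&) noexcept { ++moves; }
};

// same identity function as in the report
template <typename Ts>
auto mark(Ts &&head) noexcept -> decltype(auto) {
    return static_cast<Ts &&>(head);
}

inline Tracker generate() { return Tracker{}; }

// the same five-layer stack as in the report
inline auto user_base() { return mark(generate()); }
inline auto user_mark() { return mark(user_base()); }
inline auto user_mark2() { return mark(user_mark()); }
inline auto user_mark3() { return mark(user_mark2()); }
inline auto user_mark4() { return mark(user_mark3()); }

// returns how many move constructions one call of user_mark4 performs
inline int moves_through_chain() {
    Tracker::moves = 0;
    Tracker result = user_mark4();  // initialized from a prvalue, no extra move
    (void)result;
    return Tracker::moves;
}
```

for std::vector the optimizer can still see that the moved-from temporaries are empty and drop their destructors, which is what the O2 output shows.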

[Bug c++/93513] New: internal compiler error internal compiler error: unexpected expression ‘(char)(e)’ of kind cast_expr

2020-01-30 Thread andij.cr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93513

Bug ID: 93513
   Summary: internal compiler error internal compiler error:
unexpected expression ‘(char)(e)’ of kind cast_expr
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andij.cr at gmail dot com
  Target Milestone: ---

compiling this c++ code

enum class error {};
template <typename F>
void afunction(F) {
  error{char(0)};
}


with g++ 9.2
with std=c++17 or std=c++20

will give 
internal compiler error: unexpected expression ‘(char)(0)’ of kind cast_expr
4 |   error{char(0)};
  |^

in contrast, with std=c++14:
error: cannot convert ‘char’ to ‘error’ in initialization
4 |   error{char(0)};
  | ^~~
  | |
  | char


checking with compiler explorer, it seems that gcc 8.3 does not generate this
error: https://gcc.godbolt.org/z/yZ9ckH
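
for context on the difference between the standards: since c++17 a scoped enum can be direct-list-initialized from a value of an integral type when the conversion to the underlying type is not narrowing, so error{char(0)} is valid there while c++14 rejects it. a minimal non-template sketch follows. the function name make_error is made up for illustration, and whether this form also hits the ICE on g++ 9.2 was not checked.

```cpp
#include <cassert>

enum class error {};

// hypothetical helper: char to int (the underlying type) is not narrowing,
// so this list-initialization is well-formed in c++17 and later
constexpr error make_error(char value) { return error{value}; }

static_assert(static_cast<int>(make_error(char(0))) == 0,
              "list-initialization keeps the value");
```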

[Bug middle-end/91674] New: [ARM/thumb] redundant memcpy does not get optimized away on thumb

2019-09-05 Thread andij.cr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91674

Bug ID: 91674
   Summary: [ARM/thumb] redundant memcpy does not get optimized
away on thumb
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andij.cr at gmail dot com
  Target Milestone: ---

consider this c++ function

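#include <array>
#include <cstdint>
#include <cstring>
auto to_bytes(uint32_t arg){
    // element type and size assumed: 4 bytes, matching the 32-bit return register
    std::array<uint8_t, sizeof(arg)> out{};
    std::memcpy(out.data(), &arg, sizeof(arg));
    return out;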
}

on a little endian arch this function could be a no-op.
compiled with g++ -Os we get:
to_bytes(unsigned int):
mov eax, edi
ret 

on arm this somewhat works:
compiled with arm-none-eabi-g++ -Os
to_bytes(unsigned int):
sub sp, sp, #8
add sp, sp, #8
bx  lr

notice the redundant sub followed by an add

but if thumb is forced, the full optimization is not performed
compiled with arm-none-eabi-g++ -Os -march=armv7-m -mtune=cortex-m3
to_bytes(unsigned int):
mov r3, r0
movs r0, #0
uxtb r2, r3
bfi r0, r2, #0, #8
ubfx r2, r3, #8, #8
bfi r0, r2, #8, #8
ubfx r2, r3, #16, #8
bfi r0, r2, #16, #8
lsrs r3, r3, #24
sub sp, sp, #8
bfi r0, r3, #24, #8
add sp, sp, #8
bx  lr

in contrast, cross compiling with clang7 produces the desired optimization:
compiled with clang++7 --target=arm-none-eabi -march=armv7-m -mtune=cortex-m3
to_bytes(unsigned int):
bx  lr

notice also how there is no redundant stack pointer manipulation
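
the reason the function can collapse to nothing is that memcpy only copies the object representation, so the four output bytes are exactly the bytes of the argument. a small sketch of that round trip follows. from_bytes is a made-up inverse added for illustration, and the array element type is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// assumed shape of the array type used in the report
using bytes4 = std::array<std::uint8_t, sizeof(std::uint32_t)>;

inline bytes4 to_bytes(std::uint32_t arg) {
    bytes4 out{};
    std::memcpy(out.data(), &arg, sizeof(arg));
    return out;
}

// hypothetical inverse: copies the same four bytes back into an integer
inline std::uint32_t from_bytes(const bytes4 &bytes) {
    std::uint32_t value = 0;
    std::memcpy(&value, bytes.data(), sizeof(value));
    return value;
}
```

the round trip returns the original value on any endianness, since no byte is reordered in either direction.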