[Bug c++/100611] coroutines: destructor called too many times for coroutine lambda stored object
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100611 --- Comment #9 from Avi Kivity --- At least, on 13.2.1. Maybe a backport is required.
[Bug c++/100611] coroutines: destructor called too many times for coroutine lambda stored object
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100611 --- Comment #8 from Avi Kivity --- Congratulations on getting the account! The bug is fixed though.
[Bug c++/66034] Enhancement request: fiber-local storage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66034 --- Comment #3 from Avi Kivity --- Yes, this is relevant to user-level threads, not coroutines.
[Bug c/103960] Clang's -Wunknown-attributes is more useful than -Wattributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103960 --- Comment #6 from Avi Kivity --- Ah! thanks.
[Bug c/103960] Clang's -Wunknown-attributes is more useful than -Wattributes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103960 Avi Kivity changed: What|Removed |Added CC||avi at scylladb dot com --- Comment #4 from Avi Kivity --- It would be nice to have -fignore-attribute-namespace=clang, so we can ignore [[clang::very_useful_attribute(blah)]] on gcc and [[gnu::another_good_thing(foo)]] on clang, instead of having to wrap everything with macros. (complicated a little by clang recognizing some attributes in the gnu namespace).
[Bug c++/109227] coroutines: ICE in tree check: expected record_type or union_type or qual_union_type, have array_type in build_special_member_call, at cp/call.cc:11067
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109227 --- Comment #1 from Avi Kivity --- Did you forget to attach bad.cc?
[Bug c++/98056] coroutines: ICE tree check: expected record_type or union_type or qual_union_type, have array_type since r11-2183-g0f66b8486cea8668
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056 --- Comment #22 from Avi Kivity --- This is wonderful, thank you.
[Bug tree-optimization/109053] New: [missed optimization] value-range tracking fails in simple case with __builtin_unreachable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109053 Bug ID: 109053 Summary: [missed optimization] value-range tracking fails in simple case with __builtin_unreachable Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- I'm trying to use __builtin_unreachable() to inject assumptions about values of variables into this code. In this case, the fact that a reference count must be one or greater. Consider the code: struct refcounted { int* p; refcounted() : p(new int(1)) {} ~refcounted() { assume_stuff(); if (!--*p) { delete p; } } refcounted(const refcounted& x) : p(x.p) { assume_stuff(); ++*p; assume_stuff(); } refcounted& operator=(const refcounted& x) { assume_stuff(); x.assume_stuff(); if (this != ) { ++*x.p; if (!--*p) { delete p; } p = x.p; } assume_stuff(); x.assume_stuff(); return *this; } void assume_stuff() const { if (*p <= 0) { __builtin_unreachable(); } } }; refcounted assign(refcounted& a, refcounted& b) { auto x = a; a = b; return x; } In the assign() function, although we assign to `a`, we also return it (as `x`), so there's never a reason to call operator delete. Yet the code does. assign(refcounted&, refcounted&): mov%rdi,%rax mov(%rsi),%rdi mov%rdi,(%rax) addl $0x1,(%rdi) ; gcc now knows that (%rdi) is 2 or greater cmp%rdx,%rsi je 68 push %rbp mov%rdx,%rbp push %rbx mov%rsi,%rbx sub$0x18,%rsp mov(%rdx),%rcx addl $0x1,(%rcx) mov(%rdi),%edx sub$0x1,%edx ; gcc now knows that (%rdi) is 1 or greater je 40 ; so how can it be zero? ; if gcc tracked the ranges correctly, it would have eliminated the branch and made assign() a leaf function mov%edx,(%rdi) mov%rcx,(%rbx) add$0x18,%rsp pop%rbx pop%rbp ret cs nopw 0x0(%rax,%rax,1) mov$0x4,%esi mov%rax,0x8(%rsp) call 4f R_X86_64_PLT32 operator delete(void*, unsigned long)-0x4 mov0x0(%rbp),%rcx mov0x8(%rsp),%rax mov%rcx,(%rbx) add$0x18,%rsp pop%rbx pop%rbp ret nopw 0x0(%rax,%rax,1) ret Also on: https://godbolt.org/z/Tnehj86hc
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #17 from Avi Kivity --- This was apparently fixed by 58a7b1e354530d8dfe7d8fb859c8b8b5a9140f1f. At least, I no longer observe copying of __old4 in gimple. Instead, I see addresses taken, and if I squint I can see that object properly destroyed later on. I can't test on my full-scale program due to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056. I note that one of the duplicates is titled "When constructing object, calling function and performing co_await in same statement, temporary is erroneously moved trivially", which is exactly what this PR is about, so it can be safely marked as a duplicate as well. Please consider backporting https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99576, this is a subtle code generation bug.
[Bug middle-end/105204] -Wuse-after-free=1 inconsistency with conditional free
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105204 --- Comment #3 from Avi Kivity --- Does one have to declare all preconditions? I happen to know that shared_ptr with a reference count of zero is impossible, and I can understand the compiler not being able to infer it, but just because it's unable to infer it doesn't mean my code is wrong. Maybe the warning should be renamed -Wmaybe-use-after-free to indicate the compiler just thinks there may be a bug, but also acknowledges it doesn't know everything.
[Bug rtl-optimization/107772] function prologue generated even though it's only needed in an unlikely path
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107772 --- Comment #5 from Avi Kivity --- It indeed generates better code. However, it requires that I duplicate the function body, which can be hard at times (consider f == std::transform and "if (*b != 0) { *b = g(*b); }" as a lambda input.
[Bug rtl-optimization/107772] function prologue generated even though it's only needed in an unlikely path
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107772 --- Comment #2 from Avi Kivity --- I expect something like this: f(int*, int*): cmp rdi, rsi je .L10 .L4: cmp DWORD PTR [rsi], 0 jne .L14 .L3 add rsi, 4 cmp rsi, rdi jne .L4 .L10 ret .section .text.cold .L14: pushrsi pushrdi mov rax, DWORD PTR [rsi] callg(int) pop rdi pop rsi mov DWORD PTR [rsi], eax jmp .L3
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #16 from Avi Kivity --- Bug hasn't fixed itself as of ccb9c7b129206209cfc315ab1a0432b5f517bdd9.
[Bug rtl-optimization/107772] New: [missed optimization] function prologue generated even though it's only needed in an unlikely path
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107772 Bug ID: 107772 Summary: [missed optimization] function prologue generated even though it's only needed in an unlikely path Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- Consider int g(int); void f(int* b, int* e) { while (b != e) { if (__builtin_expect(*b != 0, false)) [[unlikely]] { *b = g(*b); } ++b; } } If we believe the __builtin_expect and/or unlikely annotations (had both for extra safety), the loop usually does nothing. So we would expect any register saving and restoring to be pushed to the unlikely section. Yet (-O3): f(int*, int*): cmp rdi, rsi je .L10 pushrbp mov rbp, rsi pushrbx mov rbx, rdi sub rsp, 8 .L4: mov edi, DWORD PTR [rbx] testedi, edi jne .L14 .L3: add rbx, 4 cmp rbp, rbx jne .L4 add rsp, 8 pop rbx pop rbp ret .L14: callg(int) mov DWORD PTR [rbx], eax jmp .L3 .L10: ret I count 8 instructions that could/should have been pushed to .L14.
[Bug c++/106847] deprecated class data member makes the class generate diagnostics even when the member is not used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106847 --- Comment #3 from Avi Kivity --- But I need it to be copied. It's a struct of options: struct options { bool frob = false; [[derecated]] bool quux = false; // will be removed soon bool grue = false; }; The fact that it's deprecated doesn't mean it can be coerced to false. Sure, I can create custom constructors and assignment operators, but I lose designated initializers. btw, the destructor doesn't trigger the deprecation warning, even though it does use the deprecated member. I think it's more reasonable to only emit the warning when the member is explicitly used.
[Bug c++/106847] deprecated class data member makes the class generate diagnostics even when the member is not used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106847 --- Comment #1 from Avi Kivity --- Clang equivalent: https://github.com/llvm/llvm-project/issues/55774
[Bug c++/106847] New: deprecated class data member makes the class generate diagnostics even when the member is not used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106847 Bug ID: 106847 Summary: deprecated class data member makes the class generate diagnostics even when the member is not used Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- A class with a deprecated member: struct foo { [[deprecated]] bool x; }; Cannot be used without lots of noise from the compiler: foo make_a_foo() { foo f; return f; } : In function 'foo make_a_foo()': :10:12: warning: 'f' is used uninitialized [-Wuninitialized] 10 | return f; |^ :9:9: note: 'f' declared here 9 | foo f; | ^ Compiler returned: 0 Although I didn't touch x, the compiler thinks I did. IMO synthesized constructors (and assignment operators) should suppress the warning and only explicit access by the user should trigger the warning.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #15 from Avi Kivity --- I see that both sides of the assignment are created in flatten_await_stmt, case case TARGET_EXPR.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #14 from Avi Kivity --- Do you confirm that my observation about the generated gimple corresponds to the bug? If so, is there a central point where this anonymous variables are named? Perhaps we can breakpoint on D.2741 and see where the copy is generated.
[Bug c++/63164] unnecessary calls to __dynamic_cast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63164 --- Comment #6 from Avi Kivity --- I already owe you many beers, but this one is special.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #11 from Avi Kivity --- Created attachment 52899 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52899=edit simple(r) reproducer I was able to reduce it to something simple (102 lines), but I was only able to verify that wrong code is generated on the gimple level (where my understanding is very limited). Still, to my inexperienced eyes it looks like an object with a copy constructor is being copied bitwise. Compile with (gcc 11.3.1) g++ -Wfatal-errors -w -std=gnu++20 -march=westmere -fvisibility=hidden -c -o table.o table.i -fdump-tree-gimple -g and observe in table.i.006t.gimple: struct lw_shared_ptr D.2740 [value-expr: frame_ptr->D.2740_4_5]; and then frame_ptr->D.2741_4_5.__old4 = frame_ptr->D.2740_4_5; in other places, variables of the same type are copied using the copy constructor.
[Bug analyzer/105287] [12/13 Regression] ICE in analyzer get_region_for_local on C++ await cond_var
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105287 Avi Kivity changed: What|Removed |Added CC||avi at scylladb dot com --- Comment #10 from Avi Kivity --- This causes a wrong-code regression, PR105426.
[Bug c++/105426] [wrong-code][regression][coroutines] range-for temporaries are not persisted in coroutines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105426 --- Comment #1 from Avi Kivity --- Created attachment 52898 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52898=edit reproducer
[Bug c++/105426] New: [wrong-code][regression][coroutines] range-for temporaries are not persisted in coroutines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105426 Bug ID: 105426 Summary: [wrong-code][regression][coroutines] range-for temporaries are not persisted in coroutines Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- The attached reproducer, compiled with: g++ --std=c++20 coroutine-unpersisted-range-for-temporary.cc -g -O3 generates correct results for gcc before 15a176a833f and for clang, but incorrect results in 15a176a833f and later.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #10 from Avi Kivity --- I have something like this: │ 2680 future<> system_keyspace_make(distributed& dist_db, distributed& dist_ss, sharded& dist_gossiper, db::config& cfg) { │ ... │ │ 2687 for (auto&& table : system_keyspace::all_tables(db_config)) { ... │ 2695 co_await db.create_keyspace(ksm, dist_ss.local().get_erm_factory(), true, replica::database::system_keyspace::yes); And I see a crash as if the std::vector returned by all_tables() is not persisted. This worked in previous gcc versions and works in clang. I don't now what the standard says, but it really should persist temporaries in range for that contains a suspension point, or a lot of buggy code will be written.
[Bug c++/63164] unnecessary calls to __dynamic_cast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63164 --- Comment #4 from Avi Kivity --- Oops, last comment intended for a separate bug. Curse the bug list thing.
[Bug c++/63164] unnecessary calls to __dynamic_cast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63164 --- Comment #3 from Avi Kivity --- I have something like this: │ 2680 future<> system_keyspace_make(distributed& dist_db, distributed& dist_ss, sharded& dist_gossiper, db::config& cfg) { │ │ 2681 register_virtual_tables(dist_db, dist_ss, dist_gossiper, cfg); ... │ │ 2687 for (auto&& table : system_keyspace::all_tables(db_config)) { ... │ 2695 co_await db.create_keyspace(ksm, dist_ss.local().get_erm_factory(), true, replica::database::system_keyspace::yes); And I see a crash as if the std::vector returned by all_tables() is not persisted. This worked in previous gcc versions and works in clang. I don't now what the standard says, but it really should persist temporaries in range for that contains a suspension point, or a lot of buggy code will be written.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #9 from Avi Kivity --- I think it makes it worse, now seeing similar crashes, but earlier.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #8 from Avi Kivity --- Hoping this is a duplicate of PR105287. Checking.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #7 from Avi Kivity --- To give more context: struct ._anon_973 D.2159793 [value-expr: frame_ptr->D.2159793_4_7]; struct lw_shared_ptr D.2159792 [value-expr: frame_ptr->D.2159792_4_7]; (I determined that _anon_973 is the non-coroutine-lambda that is giving me grief) frame_ptr->D.2159793_4_7.__this = _51; frame_ptr->D.2159793_4_7.__old4 = frame_ptr->D.2159792_4_7; _52 = frame_ptr->__closure; _53 = _52->__newtabs; frame_ptr->D.2159793_4_7.__newtabs = _53; Here I suspect copying __old4 is bypassing the copy constructor. The source of the copy (D.2159792) has lw_shared_ptr type. I'm not fluent in gimple so maybe I'm misinterpreting it.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #6 from Avi Kivity --- Some more findings: if I replace the lambda (which is not a coroutine, but is contained in a coroutine lambda) with an equivalent struct, the problem goes away, both at runtime and in terms of an __old4 copy. +struct inner_lambda { +table* zis; +lw_shared_ptr old4; +std::vector& newtabs; +inner_lambda(table* zis, lw_shared_ptr& old3, std::vector& newtabs) +: zis(zis), old4(old3), newtabs(newtabs) {} +future<> operator()() { +tlogger.info("updating cache {}", fmt::ptr(old4.get())); +return zis->update_cache(old4, newtabs); +} +}; + tlogger.info("before updating cache {}", fmt::ptr(old3.get())); -co_await with_scheduling_group(_config.memtable_to_cache_scheduling_group, [this, old4 = old3, ] () -> future<> { +co_await with_scheduling_group(_config.memtable_to_cache_scheduling_group, /* [this, old4 = old3, ] () mutable -> future<> { tlogger.info("updating cache {}", fmt::ptr(old4.get())); return update_cache(old4, newtabs); -}); +} */ inner_lambda(this, old3, newtabs)); tlogger.info("updating cache {} done", fmt::ptr(old3.get())); _memtables->erase(old3); tlogger.debug("Memtable for {}.{} replaced, into {} sstables", old3->schema()->ks_name(), old3->schema()->cf_name(), newtabs.size()); tlogger.info("try_flush_memtable_to_sstable post_flush:end old {} refcnt {}", fmt::ptr(old3.get()), old3.use_count()); co_return stop_iteration::yes;
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #5 from Avi Kivity --- Can you confirm my suspicion about the copying of __old4 in gimple? If that's the culprit I believe I can mechanically reduce the reproducer to something easily manageable.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #3 from Avi Kivity --- I see this alleged copy in gcc 10.2.1 too.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #2 from Avi Kivity --- I see it in 006t.gimple too: try { _50 = frame_ptr->__closure; _51 = _50->__this; frame_ptr->D.2159793_4_7.__this = _51; frame_ptr->D.2159793_4_7.__old4 = frame_ptr->D.2159792_4_7; _52 = frame_ptr->__closure; _53 = _52->__newtabs; frame_ptr->D.2159793_4_7.__newtabs = _53; newtabs is captured by reference, so the copy from _53 is reasonable. But old4 is captured by value, and so should go through the copy constructor. Looking at references to old3, it always has its address taken and then passed to the copy constructor.
[Bug c++/105373] miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 --- Comment #1 from Avi Kivity --- I randomly looked at 023t.ssa (mainly because I recognized the acronym). _45 = frame_ptr_182(D)->__closure; _46 = _45->__this; frame_ptr_182(D)->D.2159984_4_7 = _46->_config.memtable_to_cache_scheduling_group; _47 = _ptr_182(D)->D.2159792_4_7; _48 = frame_ptr_182(D)->__closure; _49 = &_48->__old3; seastar::lw_shared_ptr::lw_shared_ptr (_47, _49); _50 = frame_ptr_182(D)->__closure; _51 = _50->__this; frame_ptr_182(D)->D.2159793_4_7.__this = _51; frame_ptr_182(D)->D.2159793_4_7.__old4 = frame_ptr_182(D)->D.2159792_4_7; _52 = frame_ptr_182(D)->__closure; _53 = _52->__newtabs; frame_ptr_182(D)->D.2159793_4_7.__newtabs = _53; _54 = _ptr_182(D)->D.2159793_4_7; frame_ptr_182(D)->D.2160366_4_7 = seastar::with_scheduling_group, sstable_write_permit&&)::_ZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permit.Frame*)operator()(replica::table::try_flush_memtable_to_sstable(replica::table::try_flush_memtable_to_sstable(seastar::lw_shared_ptr, sstable_write_permit&&)::_ZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permit.Frame*):: mutable::_ZZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permitENUlvE_clEv.Frame*)operator()(replica::table::try_flush_memtable_to_sstable(replica::table::try_flush_memtable_to_sstable(seastar::lw_shared_ptr, sstable_write_permit&&)::_ZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permit.Frame*)operator()(replica::table::try_flush_memtable_to_sstable(replica::table::try_flush_memtable_to_sstable(seastar::lw_shared_ptr, sstable_write_permit&&)::_ZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permit.Frame*):: mutable::_ZZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permitENUlvE_clEv.Frame*):: mutable::_ZZZN7replica5table29try_flush_memtable_to_sstableEN7seastar13lw_shared_ptrINS_8memtableEEEO20sstable_write_permitENUlvE_clEvENUlvE1_clEv.Frame*):: > (frame_ptr_182(D)->D.2159984_4_7, _54); [return slot optimization] _55 = _ptr_182(D)->D.2160366_4_7; frame_ptr_182(D)->Aw5_4_7 = seastar::operator co_await (_55); [return slot optimization] In the line frame_ptr_182(D)->D.2159793_4_7.__old4 = frame_ptr_182(D)->D.2159792_4_7; Is this not a bitwise copy, which should be instead a call to the copy constructor?
[Bug c++/105373] New: miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373 Bug ID: 105373 Summary: miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor Product: gcc Version: 11.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- This is a bug in a complex piece of code, so I'll need guidance on what further information to provide (e.g. intermediate code dumps). It reproduces with various levels of debug information, and the code works with clang. At the heart there is a shared pointer, which I've enhanced so that all pointers that point to the same object keep track of each other (also the pointee points back to one of the pointers). The pointee keeps an incrementing generation count. In the end I have two pointer objects that are bitwise equal, even though they should have different generation counts and different doubly-linked-list pointers. This proves that a bitwise copy happened. It doesn't prove a miscompile, since my code could have decided to perform a bitwise copy, but the fact that it works in clang indicates it's a gcc bug. The bug happens with asan too, and with different sizeof(the smart pointer) so it's not some stray write. This is the snippet that causes the trouble: 0tlogger.info("before updating cache {}", fmt::ptr(old3.get())); 1co_await with_scheduling_group(_config.memtable_to_cache_scheduling_group, [this, old4 = old3, ] () -> future<> { 2tlogger.info("updating cache {}", fmt::ptr(old4.get())); 3return update_cache(old4, newtabs); 4}); old3 and old4 are all copies of the same smart pointer (there are also old and old1 and old2 elsewhere, but they are correct). In the call to update_cache(), we attempt to make a copy of old4, and an internal check finds the link list is corrupted. Inspecting old4 in the debugger (from the printout in line 0) and the source of the copy in line 3 shows they are the same, but have different addresses: 0 INFO 2022-04-25 12:04:14,722 [shard 0] table - before updating cache 0x60666c40 1 copying @0x6520bef8 <- @0x652108f0 0x60666c40 refcnt 6 gen 43 2 INFO 2022-04-25 12:04:14,722 [shard 0] table - updating cache 0x60666c40 3 (gdb) p this $1 = (const seastar::lw_shared_ptr * const) 0x6520bed0 4 (gdb) p *this 5 $2 = {_ptr = 0x60666c40, _next = 0x652108f0, _prev = 0x6597ae48, _generation = 43} 6 (gdb) x/4gx 0x6520bef8 7 0x6520bef8: 0x60666c40 0x652108f0 8 0x6520bf08: 0x6597ae48 0x002b 9 (gdb) x/4gx this 10 0x6520bed0: 0x60666c40 0x652108f0 11 0x6520bee0: 0x6597ae48 0x002b in line 3 I dump old4 (from the printout in line 1, where old3 is copied into old4). But "this" doesn't point to old4., it points to a bitwise copy of old4, as shown in lines 7-8 and 10-11. Note that both old4 and the bad copy "this" are members of the coroutine frame. I realize this isn't enough to analyze the situation, I'm happy to provide more information if you direct me how.
[Bug sanitizer/105336] truncated address sanitizer stack traces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105336 --- Comment #6 from Avi Kivity --- (the reproducer was executed by gcc 12 prerelease, not gcc 11)
[Bug sanitizer/105336] truncated address sanitizer stack traces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105336 --- Comment #5 from Avi Kivity --- I reduced it to a few lines (attached, intentionally triggers use-after-free). The culprit is -Og. With g++ coroutine-asan.cc -o coroutine-asan --std=c++20 -fsanitize=address -Og I see READ of size 8 at 0x60700020 thread T0 #0 0x4018e2 in test() (/home/avi/tests/coroutine-asan+0x4018e2) #1 0x40192b in main (/home/avi/tests/coroutine-asan+0x40192b) #2 0x7fb2e0a1d58f in __libc_start_call_main (/lib64/libc.so.6+0x2d58f) #3 0x7fb2e0a1d648 in __libc_start_main_alias_1 (/lib64/libc.so.6+0x2d648) #4 0x401144 in _start (/home/avi/tests/coroutine-asan+0x401144) 0x60700020 is located 0 bytes inside of 80-byte region [0x60700020,0x60700070) freed by thread T0 here: #0 0x7fb2e0fdebc8 in operator delete(void*) (/lib64/libasan.so.8+0xbbbc8) #1 0x40164f in test_coroutine(test_coroutine(std::__n4861::coroutine_handle&)::_Z14test_coroutineRNSt7__n486116coroutine_handleIvEE.Frame*) [clone .actor] (/home/avi/tests/coroutine-asan+0x40164f) #2 0x40200f (/home/avi/tests/coroutine-asan+0x40200f) previously allocated by thread T0 here: #0 0x7fb2e0fde188 in operator new(unsigned long) (/lib64/libasan.so.8+0xbb188) #1 0x4016c0 in test_coroutine(std::__n4861::coroutine_handle&) (/home/avi/tests/coroutine-asan+0x4016c0) The stack traces are truncated, compared to g++ coroutine-asan.cc -o coroutine-asan --std=c++20 -fsanitize=address which yields READ of size 8 at 0x60700020 thread T0 #0 0x401cef in std::__n4861::coroutine_handle::resume() const (/home/avi/tests/coroutine-asan+0x401cef) #1 0x401b0f in test() (/home/avi/tests/coroutine-asan+0x401b0f) #2 0x401b85 in main (/home/avi/tests/coroutine-asan+0x401b85) #3 0x7f79b8be958f in __libc_start_call_main (/lib64/libc.so.6+0x2d58f) #4 0x7f79b8be9648 in __libc_start_main_alias_1 (/lib64/libc.so.6+0x2d648) #5 0x4011a4 in _start (/home/avi/tests/coroutine-asan+0x4011a4) 0x60700020 is located 0 bytes inside of 80-byte region [0x60700020,0x60700070) freed by thread T0 here: #0 0x7f79b91aabc8 in operator delete(void*) (/lib64/libasan.so.8+0xbbbc8) #1 0x4019a5 in test_coroutine(test_coroutine(std::__n4861::coroutine_handle&)::_Z14test_coroutineRNSt7__n486116coroutine_handleIvEE.Frame*) [clone .actor] (/home/avi/tests/coroutine-asan+0x4019a5) #2 0x401cf7 in std::__n4861::coroutine_handle::resume() const (/home/avi/tests/coroutine-asan+0x401cf7) #3 0x401b0f in test() (/home/avi/tests/coroutine-asan+0x401b0f) #4 0x401b85 in main (/home/avi/tests/coroutine-asan+0x401b85) #5 0x7f79b8be958f in __libc_start_call_main (/lib64/libc.so.6+0x2d58f) previously allocated by thread T0 here: #0 0x7f79b91aa188 in operator new(unsigned long) (/lib64/libasan.so.8+0xbb188) #1 0x401293 in test_coroutine(std::__n4861::coroutine_handle&) (/home/avi/tests/coroutine-asan+0x401293) #2 0x401af9 in test() (/home/avi/tests/coroutine-asan+0x401af9) #3 0x401b85 in main (/home/avi/tests/coroutine-asan+0x401b85) #4 0x7f79b8be958f in __libc_start_call_main (/lib64/libc.so.6+0x2d58f) The traces go all the away to main.
[Bug sanitizer/105336] truncated address sanitizer stack traces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105336 --- Comment #4 from Avi Kivity --- Created attachment 52856 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52856=edit intentionally buggy reproducer
[Bug sanitizer/105336] truncated address sanitizer stack traces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105336 --- Comment #3 from Avi Kivity --- I have a multi-gigabyte reproducer. Unfortunately it's part of a huge program that didn't build with gcc until very recently. It will be quite a task to reduce it.
[Bug sanitizer/105336] truncated address sanitizer stack traces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105336 --- Comment #1 from Avi Kivity --- I guess I should mention the programs uses C++20 coroutines, and it could be the case that debug information for coroutines is generated incorrectly.
[Bug sanitizer/105336] New: truncated address sanitizer stack traces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105336 Bug ID: 105336 Summary: truncated address sanitizer stack traces Product: gcc Version: 11.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: sanitizer Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone: --- Trying to debug a program with gcc 11 branch (d26c3e4f733fcb07d90680491dd1d7a9d08c4705), I get truncated asan stack traces: seastar::internal::repeater(flush_permit&) const::{lambda()#1}> = ==313819==ERROR: AddressSanitizer: heap-use-after-free on address 0x6143f848 at pc 0x040627a3 bp 0x7fff62f15fb0 sp 0x7fff62f15fa8 READ of size 8 at 0x6143f848 thread T0 #0 0x40627a2 in seastar::debug_shared_ptr_counter_type::check() const seastar/include/seastar/core/shared_ptr_debug_helper.hh:63 #1 0x505eab6 in seastar::debug_shared_ptr_counter_type::operator long() const seastar/include/seastar/core/shared_ptr_debug_helper.hh:40 #2 0x505eab6 in seastar::lw_shared_ptr::use_count() const seastar/include/seastar/core/shared_ptr.hh:356 #3 0x505eab6 in operator() replica/table.cc:620 #4 0x5061947 in invoke)>&, seastar::future > seastar/include/seastar/core/future.hh:2141 #5 0x5061947 in operator() seastar/include/seastar/core/future.hh:1658 #6 0x5061947 in call seastar/include/seastar/util/noncopyable_function.hh:153 #7 0x45d1383 in seastar::noncopyable_function (seastar::future&&)>::operator()(seastar::future&&) const seastar/include/seastar/util/noncopyable_function.hh:209 #8 0x45d1383 in seastar::future::then_wrapped_nrvo, seastar::noncopyable_function (seastar::future&&)> >(seastar::noncopyable_function (seastar::future&&)>&&)::{lambda(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&)#1}::operator()(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&) const::{lambda()#1}::operator()() const seastar/include/seastar/core/future.hh:1674 #9 0x45d1383 in void seastar::futurize >::satisfy_with_result_of::then_wrapped_nrvo, seastar::noncopyable_function (seastar::future&&)> >(seastar::noncopyable_function (seastar::future&&)>&&)::{lambda(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&)#1}::operator()(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&) const::{lambda()#1}>(seastar::internal::promise_base_with_type&&, seastar::future::then_wrapped_nrvo, seastar::noncopyable_function (seastar::future&&)> >(seastar::noncopyable_function (seastar::future&&)>&&)::{lambda(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&)#1}::operator()(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&) const::{lambda()#1}&&) seastar/include/seastar/core/future.hh:2126 #10 0x45d2191 in seastar::future::then_wrapped_nrvo, seastar::noncopyable_function (seastar::future&&)> >(seastar::noncopyable_function (seastar::future&&)>&&)::{lambda(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&)#1}::operator()(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&) const seastar/include/seastar/core/future.hh:1673 #11 0x45d2191 in seastar::continuation, seastar::noncopyable_function (seastar::future&&)>, seastar::future::then_wrapped_nrvo, seastar::noncopyable_function (seastar::future&&)> >(seastar::noncopyable_function (seastar::future&&)>&&)::{lambda(seastar::internal::promise_base_with_type&&, seastar::noncopyable_function (seastar::future&&)>&, seastar::future_state&&)#1}, void>::run_and_dispose() seastar/include/seastar/core/future.hh:773 #12 0x17fc8b74 in seastar::reactor::run_tasks(seastar::reactor::task_queue&) seastar/src/core/reactor.cc:2344 #13 0x17fcd0ec in seastar::reactor::run_some_tasks() seastar/src/core/reactor.cc:2754 #14 0x17fd2b00 in seastar::reactor::do_run() seastar/src/core/reactor.cc:2923 #15 0x17fceba8 in seastar::reactor::run() seastar/src/core/reactor.cc:2806 #16 0x17d0a3e0 in seastar::app_template::run_deprecated(int, char**, std::function&&) seastar/src/core/app-template.cc:265 #17 0x17d07eb0 in seastar::app_template::run(int, char**, std::function ()>&&) seastar/src/core/app-template.cc:156 #18 0x3d67f67 in scylla_main
[Bug c++/104020] [coroutines] ICE in co_await function call with initializer_list arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104020 Avi Kivity changed: What|Removed |Added CC||avi at scylladb dot com --- Comment #1 from Avi Kivity --- Duplicate of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98056
[Bug c++/103328] [11 Regression] ICE in remap_gimple_stmt with coroutines since r11-7419-g0f161cc8494cf728
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103328 --- Comment #27 from Avi Kivity --- Wonderful, thank you.
[Bug c++/103328] [11/12 Regression] ICE in remap_gimple_stmt with coroutines since r11-7419-g0f161cc8494cf728
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103328 --- Comment #24 from Avi Kivity --- Many thanks for the patch and the commit. Can we have it backported to 11.x?
[Bug sanitizer/95137] Sanitizers seem to be missing support for coroutines
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95137 --- Comment #50 from Avi Kivity --- Your reproducer does pass in trunk. So perhaps just a missing backport.
[Bug c++/103328] [11/12 Regression] ICE in remap_gimple_stmt with coroutines since r11-7419-g0f161cc8494cf728
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103328 --- Comment #22 from Avi Kivity --- Maintainers, please review the patch posted in [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591907.html
[Bug c++/103328] [11/12 Regression] ICE in remap_gimple_stmt, at tree-inline.c:1921 since r11-7419-g0f161cc8494cf728
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103328 --- Comment #21 from Avi Kivity --- Benno, many thanks for the fix. Please consider posting it to gcc-patches as that may speed up review and merging.
[Bug rtl-optimization/104484] -freorder-block-and-partition not splitting into sections with __builin_expect()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104484 --- Comment #4 from Avi Kivity --- gcc could infer that the check for f2 and the call to very_heavy are cold, and move that little block to a .cold section, whether or not very_heavy is cold.
[Bug rtl-optimization/104484] -freorder-block-and-partition not splitting into sections with __builin_expect()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104484 --- Comment #2 from Avi Kivity --- (In reply to Andrew Pinski from comment #1) > But the assembly output does not make sense to the code you gave either. Apart from the missing .section directives, I didn't notice anything odd. Maybe movl/testl could be optimized into cmpl. Ah, I moved this code from C++ so the function signatures are bad, should be fun(void). With the signatures adjusted, I get light: .LFB2: .cfi_startproc subq$8, %rsp .cfi_def_cfa_offset 16 callfun movlf1(%rip), %edx testl %edx, %edx jne .L8 .L6: addq$8, %rsp .cfi_remember_state .cfi_def_cfa_offset 8 jmp fun .p2align 4,,10 .p2align 3 .L8: .cfi_restore_state movlf2(%rip), %eax testl %eax, %eax je .L6 callvery_heavy jmp .L6 .cfi_endproc Which still misses the .section directives and the peephole optimization to merge movl/testl.
[Bug tree-optimization/104484] New: -freorder-block-and-partition not splitting into sections with __builin_expect()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104484 Bug ID: 104484 Summary: -freorder-block-and-partition not splitting into sections with __builin_expect() Product: gcc Version: 11.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- I expected code guarded by __builtin_expect() to be pushed into .text.cold, but that's not happening: int f1, f2; inline int is() { return __builtin_expect(f1, 0) && f2; } void heavy() { extern void very_heavy(); if (is()) { very_heavy(); } } void fun(); void light() { fun(); heavy(); fun(); } with -O3 -freorder-blocks-and-partition: light: .LFB2: .cfi_startproc subq$8, %rsp .cfi_def_cfa_offset 16 xorl%eax, %eax callfun movlf1(%rip), %edx testl %edx, %edx jne .L8.p2align 4,,10 .p2align 3 .L8: .cfi_restore_state movlf2(%rip), %eax testl %eax, %eax je .L6 xorl%eax, %eax callvery_heavy jmp .L6 .cfi_endproc .L6: xorl%eax, %eax addq$8, %rsp .cfi_remember_state .cfi_def_cfa_offset 8 jmp fun I expected .L8 to be in another section, but it isn't.
[Bug tree-optimization/104292] [missed optimization] boolean addition generates suboptimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104292 --- Comment #1 from Avi Kivity --- btw, I see that the equivalent bool_and generates optimal code. bool_and(bool, bool): movl%esi, %eax andl%edi, %eax ret Perhaps bool is written with the expectation that any non-zero value is true? But aren't non-zero values other than 1 undefined behavior?
[Bug tree-optimization/104292] New: [missed optimization] boolean addition generates suboptimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104292 Bug ID: 104292 Summary: [missed optimization] boolean addition generates suboptimal code Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- bool bool_or(bool b1, bool b2) { return b1 + b2; } generates bool_or(bool, bool): movzbl %dil, %edi movzbl %sil, %esi addl%esi, %edi setne %al ret Whereas it could generate bool_or(bool, bool): or %edi, %esi mov %esi, %eax ret For a net gain of two instructions (and the final mov can often be elided, so up to three). I encountered this while using std::plus<> with boolean types. It could be optimized as a specialization of std::plus, but I think a magic transformation in the optimizer would make it apply in more places.
[Bug target/103554] -mavx generates worse code on scalar code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 --- Comment #7 from Avi Kivity --- Sorry, I was unclear. That's the entire program with a huge number of slp opportunities. Each iteration in the program is ~40k instructions. It's not directly related to the test case, which is artificial, and arose from me exploring the differences between gcc and clang code generation. It does demonstrate that clang's slp is counterproductive. I promise to repeat the experiment with gcc once it stops ICEing.
[Bug target/103554] -mavx generates worse code on scalar code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 --- Comment #5 from Avi Kivity --- Here's some big-picture data. Compiled with clang, which seems to ignore these STLF issues. no-slp: 42641.91 tps ( 75.1 allocs/op, 12.1 tasks/op, 44929 insns/op) 42446.41 tps ( 75.1 allocs/op, 12.1 tasks/op, 44870 insns/op) 42495.03 tps ( 75.1 allocs/op, 12.1 tasks/op, 44931 insns/op) 42703.40 tps ( 75.1 allocs/op, 12.1 tasks/op, 44916 insns/op) 42798.98 tps ( 75.1 allocs/op, 12.1 tasks/op, 44963 insns/op) slp: 41536.46 tps ( 75.1 allocs/op, 12.1 tasks/op, 44828 insns/op) 41482.05 tps ( 75.1 allocs/op, 12.1 tasks/op, 44802 insns/op) 41707.23 tps ( 75.1 allocs/op, 12.1 tasks/op, 44874 insns/op) 41811.10 tps ( 75.1 allocs/op, 12.1 tasks/op, 44847 insns/op) 41764.39 tps ( 75.1 allocs/op, 12.1 tasks/op, 44846 insns/op) So slp definitely has negative impact on ops/sec, even though it reduces instructions/op. This is on an older machine (newer ones have ~5X perf, with 3X higher IPC and the rest due to higher frequency).
[Bug target/103554] -mavx generates worse code on scalar code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 --- Comment #3 from Avi Kivity --- > _Note_ that likely the suboptimal solution presented here is faster because > it avoids STLF penalties from the calls stack setup which very likely uses > scalar or differently aligned vector moves. Interesting point. Agner says (Icelake): > A read that is bigger than the write, or a read that covers both written and > unwritten bytes, > fails to forward. The write-to-read latency is 19-20 clock cycles. However, the same code is generated when `in` is a reference, in which case it may not be in the store queue at all, so we're paying two extra instructions for nothing. movhps is also 2 uops, so we're paying 3 uops to load 2 elements.
[Bug target/103554] New: -mavx generates worse code on scalar code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 Bug ID: 103554 Summary: -mavx generates worse code on scalar code Product: gcc Version: 11.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: avi at scylladb dot com Target Milestone: --- Test case: struct s1 { long a, b, c, d, e, f, g, h; }; s1 move(s1 in) { s1 ret; ret.a = in.d; ret.b = in.e; ret.c = in.a; ret.d = in.b; return ret; } -O3 generates: move(s1): movq 8(%rsp), %xmm0 movq 32(%rsp), %xmm1 movq %rdi, %rax movhps 16(%rsp), %xmm0 movhps 40(%rsp), %xmm1 movups %xmm1, (%rdi) movups %xmm0, 16(%rdi) ret -O3 -mavx generates: move(s1): pushq %rbp movq%rdi, %rax movq%rsp, %rbp vmovq 16(%rbp), %xmm2 vmovq 40(%rbp), %xmm3 vpinsrq $1, 24(%rbp), %xmm2, %xmm1 vpinsrq $1, 48(%rbp), %xmm3, %xmm0 vinsertf128 $0x1, %xmm1, %ymm0, %ymm0 vmovdqu %ymm0, (%rdi) vzeroupper popq%rbp ret Clang -O3 generates this simple code, with or without -mavx (-mavx does use VEX instructions): move(s1): # @move(s1) movq %rdi, %rax movups 32(%rsp), %xmm0 movups %xmm0, (%rdi) movaps 8(%rsp), %xmm0 movups %xmm0, 16(%rdi) retq
[Bug c++/97848] [missed optimization] tls init function check emitted for consinit thread_local variables (C++20)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97848 Avi Kivity changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #2 from Avi Kivity --- Fixed on trunk (12.0) foo_good(): movq x@gottpoff(%rip), %rax movl %fs:(%rax), %eax ret set_foo(int): movq x@gottpoff(%rip), %rax movl %edi, %fs:(%rax) ret
[Bug tree-optimization/85516] [missed-optimization] gcc does not convert multiple compares against constants to a shift+bitmask test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85516 Avi Kivity changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #7 from Avi Kivity --- Marking as fixed, since at least 11.2.
[Bug tree-optimization/85516] [missed-optimization] gcc does not convert multiple compares against constants to a shift+bitmask test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85516 --- Comment #6 from Avi Kivity --- gcc 11.2 produces optimized code with the original example: check_mask(E): cmpl $9, %edi ja .L3 movl %edi, %ecx movl $562, %eax shrq %cl, %rax andl $1, %eax ret .L3: xorl %eax, %eax ret
[Bug c++/103319] [coroutines] ICE in is_this_parameter, at cp/semantics.c:10672
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103319 --- Comment #2 from Avi Kivity --- A gentle ping, can we backport this?