[Bug tree-optimization/112533] New: missed optimization (~A & C) == (~B & C) => (A & C) == (B & C)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112533

            Bug ID: 112533
           Summary: missed optimization (~A & C) == (~B & C) => (A & C) == (B & C)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this code

static bool is_even(unsigned a)
{
    return a % 2 == 0;
}

bool same_evenness(unsigned a, unsigned b)
{
    return is_even(a) == is_even(b);
}

GCC -O2 currently produces

same_evenness:
        notl    %esi            // (1)
        notl    %edi            // (2)
        andl    $1, %esi
        andl    $1, %edi
        cmpb    %dil, %sil
        sete    %al
        ret

The NOTs (1) and (2) are redundant. It would be great if GCC could optimize them out.
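The equivalence in the summary can be checked by brute force. The sketch below is not GCC code: the function names are made up for illustration, and it only covers 8-bit operands, on the assumption that wider values behave the same way because ~ and & act on each bit independently.

```cpp
#include <cassert>

// Form that GCC currently emits for same_evenness: NOT first, then mask.
bool same_evenness_with_nots(unsigned a, unsigned b)
{
    return (~a & 1u) == (~b & 1u);
}

// Form requested in the report: mask only.
bool same_evenness_without_nots(unsigned a, unsigned b)
{
    return (a & 1u) == (b & 1u);
}

// Checks (~A & C) == (~B & C)  <=>  (A & C) == (B & C)
// for every combination of 8-bit A, B and C.
bool identity_holds_for_8_bit_values()
{
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b)
            for (unsigned c = 0; c < 256; ++c)
                if (((~a & c) == (~b & c)) != ((a & c) == (b & c)))
                    return false;
    return true;
}
```

Both comparisons ask whether A and B agree on the bits selected by C, which is why complementing both sides does not change the answer.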
[Bug libstdc++/112480] optional::reset emits inefficient code when T is trivially-destructible
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112480

--- Comment #7 from Ivan Sorokin ---
(In reply to Jonathan Wakely from comment #6)
> +	// The following seems redundant but improves codegen, see PR 112480.
> +	if constexpr (is_trivially_destructible_v<_Tp>)
> +	  this->_M_engaged = false;
>        }

In theory, non-trivial destructors that can be optimized to a no-op can also benefit from the same optimization. I don't know how often non-trivial no-op destructors occur in practice. Perhaps we can ignore such cases.
[Bug libstdc++/112480] optional::reset emits inefficient code when T is trivially-destructible
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112480

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #5 from Ivan Sorokin ---
Perhaps something like this would do the trick?

void _M_reset()
{
    if (_M_engaged)
        _M_destroy();
    else
        _M_engaged = _M_engaged;
}

On the one hand, _M_engaged = _M_engaged allows merging the then and else branches without introducing new writes; on the other, it can be optimized to a no-op if the branches are not merged.
[Bug c++/112410] New: error when auto(x) is used in a variable initializer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112410 Bug ID: 112410 Summary: error when auto(x) is used in a variable initializer Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- int x = auto(42); // OK int y(auto(42)); // error On the second line GCC -std=c++23 gives an error: error: non-function 'y' declared as implicit template I believe the code is correct and should compile without errors.
[Bug target/99087] suboptimal codegen for division by constant 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99087 Ivan Sorokin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #2 from Ivan Sorokin --- Since GCC 12 the issue no longer reproduces. Closing as fixed. https://godbolt.org/z/ss7Y84a9f
[Bug tree-optimization/111718] Missed optimization of '(a+a)/a'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111718

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin ---
GCC does the optimization if the return from the function is replaced with __builtin_unreachable:

unsigned n1, n2;

void func1(unsigned a)
{
    if (a <= 10 || a >= 20)
        __builtin_unreachable();

    n1 = a + a;
    n2 = (a + a)/a;
}

func1(unsigned int):
        mov     DWORD PTR n2[rip], 2
        add     edi, edi
        mov     DWORD PTR n1[rip], edi
        ret

https://godbolt.org/z/Tjsz6neTs

Perhaps this issue has the same underlying cause as PR80015.
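The range information matters because unsigned addition wraps, so (a + a)/a equals 2 only while a + a does not overflow. The sketch below illustrates that point; the helper names are hypothetical and are not part of the bug report.

```cpp
#include <cassert>
#include <limits>

// (a + a) / a in unsigned arithmetic. The addition wraps modulo 2^N,
// so the result is 2 only while a + a does not overflow. Requires a != 0.
unsigned double_then_divide(unsigned a)
{
    return (a + a) / a;
}

// Smallest value for which a + a wraps around, i.e. 2^(N-1).
unsigned first_wrapping_value()
{
    return std::numeric_limits<unsigned>::max() / 2 + 1;
}
```

For any a in the range 11..19 used in func1 the quotient is 2, whereas at first_wrapping_value() the sum wraps to 0 and the quotient is 0, so the fold is only valid once the compiler knows the range of a.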
[Bug middle-end/111541] New: missing optimization x & ~c | (y | c) -> x | (y | c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111541

            Bug ID: 111541
           Summary: missing optimization x & ~c | (y | c) -> x | (y | c)
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this function clang generates shorter code:

unsigned foo(unsigned x, unsigned y, unsigned c)
{
    return x & ~c | (y | c);
}

Clang notices that the expression can be simplified to x | (y | c). It would be great if GCC could do the same.

https://gcc.godbolt.org/z/dMo4nEjrs

This issue is symmetric to the one described in PR 98710.

The idea behind this simplification is the following: when we are working with bitsets, "|" can be read as adding bits and "&~" as removing them. Therefore the expression "x & ~c | (y | c)" can be read as removing "c" from "x" and then adding "y | c". So the simplification x & ~c | (y | c) -> x | (y | c) means there is no need to remove "c" if we later add something containing "c".
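The simplification can be checked exhaustively on small operands. This is a sketch with made-up function names, limited to 8-bit values on the assumption that the result carries over to wider types because every operator involved works bit by bit.

```cpp
#include <cassert>

// Expression from the report. & binds tighter than |, so the explicit
// parentheses only spell out how "x & ~c | (y | c)" is parsed.
unsigned original_form(unsigned x, unsigned y, unsigned c)
{
    return (x & ~c) | (y | c);
}

// Form that clang produces according to the report.
unsigned simplified_form(unsigned x, unsigned y, unsigned c)
{
    return x | (y | c);
}

// Compares both forms for every combination of 8-bit x, y and c.
bool forms_agree_for_8_bit_values()
{
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned y = 0; y < 256; ++y)
            for (unsigned c = 0; c < 256; ++c)
                if (original_form(x, y, c) != simplified_form(x, y, c))
                    return false;
    return true;
}
```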
[Bug middle-end/98710] missing optimization (x | c) & ~(y | c) -> x & ~(y | c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98710

--- Comment #8 from Ivan Sorokin ---
> How often these show up, I have no idea.

Perhaps I should have written this in the original message.

The original expression "(x | c) & ~(y | c)" is obviously a reduced version of what happens in real code. The idea is the following:

When we are working with bitsets, "|" can be read as adding bits and "&~" as removing them. Therefore the expression can be read as first adding "c" to "x" and then removing "y | c" from the result.

So the simplification (x | c) & ~(y | c) -> x & ~(y | c) means there is no need to add "c" if we later remove something containing "c".
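Because |, & and ~ operate on each bit independently, it is enough to check the identity on single bits. The following sketch does that with an eight-row truth table; the function names are hypothetical and chosen to mirror the "adding" and "removing" reading above.

```cpp
#include <cassert>

// Left-hand side restricted to one bit: add c to x, then remove (y | c).
constexpr unsigned with_add(unsigned x, unsigned y, unsigned c)
{
    return ((x | c) & ~(y | c)) & 1u;
}

// Right-hand side restricted to one bit: just remove (y | c) from x.
constexpr unsigned without_add(unsigned x, unsigned y, unsigned c)
{
    return (x & ~(y | c)) & 1u;
}

// Walks all eight combinations of single-bit x, y and c.
constexpr bool truth_table_matches()
{
    for (unsigned x = 0; x < 2; ++x)
        for (unsigned y = 0; y < 2; ++y)
            for (unsigned c = 0; c < 2; ++c)
                if (with_add(x, y, c) != without_add(x, y, c))
                    return false;
    return true;
}

static_assert(truth_table_matches(), "the two forms differ for some bit pattern");
```

When the c bit is set, removing (y | c) clears the result on both sides; when it is clear, adding c changes nothing.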
[Bug middle-end/98710] missing optimization (x | c) & ~(y | c) -> x & ~(y | c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98710 --- Comment #7 from Ivan Sorokin --- (In reply to Andrew Pinski from comment #6) > Fixed. Thank you!
[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986 --- Comment #5 from Ivan Sorokin --- (In reply to CVS Commits from comment #4) > commit r14-2751-g2a3556376c69a1fb588dcf25225950575e42784f > Author: Drew Ross > Co-authored-by: Jakub Jelinek Thank you!
[Bug gcov-profile/110561] gcov counts closing bracket in a function as executable, lowering coverage statistics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110561

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin ---
Smaller reproducer:

struct non_trivial
{
    non_trivial();
    non_trivial(non_trivial const&);
    non_trivial& operator=(non_trivial const&);
    ~non_trivial();

    void escape();
};

non_trivial foobar()
{
    non_trivial result;
    result.escape();
    return result;
}

https://gcc.godbolt.org/z/s5fG6ezxd

Here is the code generated after escape() until the end of the function:

.LEHB1:
        call    non_trivial::escape()
.LEHE1:
        .loc 1 15 12
        nop
        mov     rax, QWORD PTR __gcov0.foobar()[rip+24]
        add     rax, 1
        mov     QWORD PTR __gcov0.foobar()[rip+24], rax
        jmp     .L5                     # unconditional jump to function exit
.L4:
        mov     rbx, rax                # exceptional case
        mov     rax, QWORD PTR __gcov0.foobar()[rip+16]
        add     rax, 1
        mov     QWORD PTR __gcov0.foobar()[rip+16], rax
        .loc 1 16 1
        mov     rax, QWORD PTR [rbp-24]
        mov     rdi, rax
        call    non_trivial::~non_trivial() [complete object destructor]
        mov     rax, QWORD PTR __gcov0.foobar()[rip+32] # increments the counter for }
        add     rax, 1
        mov     QWORD PTR __gcov0.foobar()[rip+32], rax
        mov     rax, rbx
        mov     rdi, rax
.LEHB2:
        call    _Unwind_Resume
.LEHE2:
.L5:
        mov     rax, QWORD PTR [rbp-24] # doesn't increment the counter for }
        mov     rbx, QWORD PTR [rbp-8]
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

From what I understand, the counter for } is incremented only on the exceptional codepath, which is executed when an exception is thrown from escape(). The normal codepath doesn't increment it.

This looks like a bug to me. The counter for } should either be incremented on both codepaths or not exist at all.
[Bug middle-end/110534] New: confusing -Wuninitialized when strict aliasing is violated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110534 Bug ID: 110534 Summary: confusing -Wuninitialized when strict aliasing is violated Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- GCC gives -Wuninitialized on this code: #include uint16_t test() { uint32_t foo32[4] = {0, 0, 0, 0}; uint16_t* foo16 = reinterpret_cast(&foo32[0]); return foo16[0]; } :7:19: warning: 'foo32' is used uninitialized [-Wuninitialized] 7 | return foo16[0]; | ^ :5:14: note: 'foo32' declared here 5 | uint32_t foo32[4] = {0, 0, 0, 0}; | ^ This issue was originally published on reddit: https://www.reddit.com/r/cpp/comments/14lc9w9/gcc_warnings_for_uninitialized_variables_is/ The poster found the warning quite confusing and I agree with them. I believe the ideal behavior would be to show -Wstrict-aliasing on this code and avoid showing -Wuninitialized.
[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986

--- Comment #3 from Ivan Sorokin ---
I tried to investigate why GCC is able to simplify `(a | b) ^ a` and `(a | ~b) ^ a` from comment 2, but not the similar-looking `(~a | b) ^ a` from comment 0.

`(a | b) ^ a` matches the following pattern from match.pd:

/* (X | Y) ^ X -> Y & ~ X*/
(simplify
 (bit_xor:c (convert1? (bit_ior:c @@0 @1)) (convert2? @0))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (convert (bit_and @1 (bit_not @0)))))

`(a | ~b) ^ a` matches another pattern:

/* (~X | C) ^ D -> (X | C) ^ (~D ^ C) if (~D ^ C) can be simplified.  */
(simplify
 (bit_xor:c (bit_ior:cs (bit_not:s @0) @1) @2)
  (bit_xor (bit_ior @0 @1) (bit_xor! (bit_not! @2) @1)))

With the substitution `X = b, C = a, D = a` it gives:

(b | a) ^ (~a ^ a)
(b | a) ^ -1
~(b | a)

`(~a | b) ^ a` is not simplifiable by this pattern because it requires that `~D ^ C` be simplifiable further, but `~a ^ b` is not. In any case, even if the pattern were applicable, it would produce `(a | b) ^ (~a ^ b)`, which has more operations than the original expression.
[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986 --- Comment #1 from Ivan Sorokin --- (In reply to Ivan Sorokin from comment #0) > int foo(int a, int b) > { > return (~a | b) ^ a; > } > > This can be optimized to `return ~(a | b);`. This transformation is done by > LLVM, but not by GCC. Correction: it can be optimized to `return ~(a & b);`.
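A quick way to confirm the correction is to evaluate all three expressions on concrete values. This is a sketch with hypothetical function names; it checks a small range of int values and is not meant as a proof for the full range.

```cpp
#include <cassert>

// Expression from the report.
int original_expr(int a, int b)
{
    return (~a | b) ^ a;
}

// Fold given in the bug summary and in the correction.
int corrected_fold(int a, int b)
{
    return ~(a & b);
}

// Fold written by mistake in comment 0.
int mistaken_fold(int a, int b)
{
    return ~(a | b);
}

// Compares the original expression with the corrected fold on [-128, 127].
bool corrected_fold_matches_for_small_values()
{
    for (int a = -128; a < 128; ++a)
        for (int b = -128; b < 128; ++b)
            if (original_expr(a, b) != corrected_fold(a, b))
                return false;
    return true;
}
```

The pair a = 0, b = 1 is enough to tell the two folds apart: the original expression and ~(a & b) both give -1, while ~(a | b) gives -2.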
[Bug middle-end/109986] New: missing fold (~a | b) ^ a => ~(a & b)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986 Bug ID: 109986 Summary: missing fold (~a | b) ^ a => ~(a & b) Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- int foo(int a, int b) { return (~a | b) ^ a; } This can be optimized to `return ~(a | b);`. This transformation is done by LLVM, but not by GCC.
[Bug tree-optimization/71990] Function multiversioning prohibits inlining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71990

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #5 from Ivan Sorokin ---
I encountered the same issue as in this PR.

GCC never inlines functions with the target_clones attribute. Therefore, when a function is small and would benefit from inlining, applying target_clones to it can pessimize it. It would be great if GCC were able to inline functions with the target_clones attribute and propagate the attribute, as comment #3 suggests.

On this example:

__attribute__((target_clones("default", "arch=x86-64-v3")))
static int foo(int a, int b)
{
    return a & ~b;
}

int bar(int a, int b)
{
    return foo(a, b);
}

the ideal behavior would be for foo to be inlined into bar (with bar becoming multiversioned) and for foo to be removed completely, as no uses of it remain.
[Bug analyzer/109570] detect fclose on unopened or NULL files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570

--- Comment #1 from Ivan Sorokin ---
Generalizing: perhaps free(NULL) can be detected similarly?

void* obj = malloc(...);
if (!obj)
{
    free(obj);
    return false;
}

Unlike fclose(NULL), free(NULL) is a completely well-defined operation, but it does nothing and perhaps should be removed.
[Bug analyzer/109570] New: detect fclose on unopened or NULL files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570 Bug ID: 109570 Summary: detect fclose on unopened or NULL files Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: analyzer Assignee: dmalcolm at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- While cleaning up one not particularly well written program I noticed this code fragment: FILE* file = fopen(...); if (!file) { fclose(file); return false; } Passing NULL to fclose is undefined behavior. Perhaps -fanalyzer could warn about code like this?
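For comparison, here is a sketch of how the error path could look without the call that the analyzer would flag. The function name and its purpose are hypothetical, since the original program is not shown in the report.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: reports whether a file can be opened for reading.
bool can_open_for_reading(const char* path)
{
    std::FILE* file = std::fopen(path, "r");
    if (!file)
        return false;   // nothing was opened, so there is nothing to close

    std::fclose(file);  // file is known to be non-null here
    return true;
}
```

The only change relative to the reported fragment is that fclose is reached solely on the path where fopen succeeded.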
[Bug rtl-optimization/109527] New: redundant register assignment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109527

            Bug ID: 109527
           Summary: redundant register assignment
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this function

short test(short* a)
{
    *a = 1;
    return *a;
}

latest gcc -O2 generates:

test(short*):
        mov     eax, 1
        mov     WORD PTR [rdi], ax
        mov     eax, 1
        ret

I believe the second assignment to eax is redundant and can be removed:

test(short*):
        mov     eax, 1
        mov     WORD PTR [rdi], ax
        ret
[Bug c++/108219] [12 Regression] requirement fails on a valid expression since r12-5253-g4df7f8c79835d569
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108219 --- Comment #5 from Ivan Sorokin --- (In reply to Patrick Palka from comment #4) > Fixed for GCC 13 so far Thank you very much!
[Bug c++/66968] Incorrect template argument shown in diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66968 --- Comment #10 from Ivan Sorokin --- One more case (from 108676): template struct X {}; template X f(); template X g(); int main() { g(); } Here 'X' is printed in the error message instead of 'X'.
[Bug c++/108676] template parameters are misprinted in function signature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108676 Ivan Sorokin changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #3 from Ivan Sorokin --- (In reply to Jonathan Wakely from comment #2) > Probably a dup of PR 66968 Yes, it looks similar enough. Thank you! *** This bug has been marked as a duplicate of bug 66968 ***
[Bug c++/66968] Incorrect template argument shown in diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66968 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #9 from Ivan Sorokin --- *** Bug 108676 has been marked as a duplicate of this bug. ***
[Bug c++/108676] template parameters are misprinted in function signature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108676 --- Comment #1 from Ivan Sorokin --- I added a broken link to godbolt, here is a valid one: https://godbolt.org/z/EE5eezW1r
[Bug c++/108676] New: GCC prints function signature incorrectly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108676 Bug ID: 108676 Summary: GCC prints function signature incorrectly Product: gcc Version: 12.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- Consider this code: template struct X {}; template X f(); template X g(); int main() { g(); } On GCC 12.2 it gives this error message: :13:12: error: no matching function for call to 'g()' 13 | g(); | ~~~^~ :9:7: note: candidate: 'template X g()' 9 | X g(); | ^ Please note that the return type of 'g' is printed incorrectly. It should say 'X' instead of 'X'. https://godbolt.org/z/EeWoo16M
[Bug c++/108219] New: requirement fails on a valid expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108219 Bug ID: 108219 Summary: requirement fails on a valid expression Product: gcc Version: 12.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- This code compiles OK on clang, MSVC and GCC prior to 12: template concept test = requires { new T[1]{{ 42 }}; }; struct foobar { foobar(int); }; int main() { static_assert(test); new foobar[1]{{ 42 }}; } But on GCC 12 it produces an error: :14:19: error: static assertion failed 14 | static_assert(test); | ^~~~ :14:19: note: constraints not satisfied :2:9: required by the constraints of 'template concept test' :2:16: in requirements [with T = foobar] :4:5: note: the required expression 'new T(1)' is invalid, because 4 | new T[1]{{ 42 }}; | ^~~~ :4:5: error: could not convert '()' from '' to 'foobar' 4 | new T[1]{{ 42 }}; | ^~~~ | | | I believe the error is incorrect and that this is a regression in GCC 12.
[Bug c++/107529] New: constexpr evaluator doesn't check for destroyed objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107529 Bug ID: 107529 Summary: constexpr evaluator doesn't check for destroyed objects Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- I believe this function contains undefined behavior and should not be allowed to evaluate at compile-time. The call to `std::destroy_at(p)` should end the lifetime of `*p` and accesses to `*p` after that should be invalid. #include struct mytype { constexpr mytype() : x(42) {} constexpr ~mytype() {} int x; }; constexpr int foo() { std::allocator alloc; mytype* p = alloc.allocate(1); std::construct_at(p); std::destroy_at(p); // destroy *p int result = p->x;// access alloc.deallocate(p, 1); return result; } static_assert(foo() == 42);
[Bug c++/107528] New: constexpr evaluator doesn't check for deallocate of mismatched size
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107528 Bug ID: 107528 Summary: constexpr evaluator doesn't check for deallocate of mismatched size Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- This functions causes undefined behavior and should not be evaluated at compile-time. The problem is the second argument of `deallocate` function (number of objects to deallocate). It must be equal to the number of objects that were allocated. #include constexpr int foo() { std::allocator alloc; int* p = alloc.allocate(1); alloc.deallocate(p, 3); return 42; } static_assert(foo() == 42);
[Bug c++/107161] gcc doesn't constant fold member if any other member is mutable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107161

--- Comment #2 from Ivan Sorokin ---
> Do constexpr/consteval work in such circumstances?

Yes, constexpr works for variables like "p.a":

extern constexpr mytype p = {1, 2};

int foo()
{
    constexpr int t = p.a + 10;
    return t;
}

foo():
        mov     eax, 11
        ret

https://godbolt.org/z/K9a69E4ar
[Bug c++/107161] New: gcc doesn't constant fold member if any other member is mutable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107161

            Bug ID: 107161
           Summary: gcc doesn't constant fold member if any other member is mutable
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this code:

struct mytype
{
    int a;
    mutable int b;
};

extern mytype const p = {1, 2};

int foo()
{
    return p.a + 10;
}

int bar()
{
    return p.b + 10;
}

GCC -O2 generates:

foo():
        mov     eax, DWORD PTR p[rip]
        add     eax, 10
        ret
bar():
        mov     eax, DWORD PTR p[rip+4]
        add     eax, 10
        ret

While clang folds "p.a + 10" into 11:

foo():                                # @foo()
        mov     eax, 11
        ret
bar():                                # @bar()
        mov     eax, dword ptr [rip + p+4]
        add     eax, 10
        ret

I think GCC should do the same.
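The difference between the two members is that p.b may legally change at run time even though p is const, while p.a may not. The sketch below illustrates this; it declares p as constexpr rather than extern const so that the static_assert compiles, and bump_mutable_member is a hypothetical helper that does not appear in the report.

```cpp
#include <cassert>

struct mytype
{
    int a;
    mutable int b;
};

// Assumption: constexpr instead of the report's "extern mytype const".
constexpr mytype p = {1, 2};

// p.a can never change, so it is usable in a constant expression.
static_assert(p.a + 10 == 11, "p.a is a compile-time constant");

// Legal even through a const reference, because b is mutable.
void bump_mutable_member(const mytype& m)
{
    ++m.b;
}

int foo()
{
    return p.a + 10;  // always 11, so folding is safe
}

int bar()
{
    return p.b + 10;  // must be loaded: p.b may have been modified
}
```

This is why folding foo() is valid while bar() has to keep its load.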
[Bug libstdc++/103382] condition_variable::wait() is not cancellable because it is marked noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382 Ivan Sorokin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #7 from Ivan Sorokin --- As the bug is fixed I'm closing the issue.
[Bug tree-optimization/101706] bool0^bool1^1 -> bool0 == bool1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101706

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #2 from Ivan Sorokin ---
This issue was fixed in PR106379 by Richard Biener.

After https://gcc.gnu.org/g:375668e0508fbe173af1ed519d8ae2b79f388d94 for both fa and fb we have:

fa(bool&, bool&, bool&):
        movzx   eax, BYTE PTR [rsi]
        cmp     BYTE PTR [rdi], al
        sete    BYTE PTR [rdx]
        ret
fb(bool&, bool&, bool&):
        movzx   eax, BYTE PTR [rsi]
        cmp     BYTE PTR [rdi], al
        sete    BYTE PTR [rdx]
        ret

I think the issue can be closed now.
[Bug tree-optimization/98709] gcc optimizes bitwise operations, but doesn't optimize logical ones
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98709 Ivan Sorokin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Ivan Sorokin --- This issue was fixed in PR106379 by Richard Biener. https://gcc.gnu.org/g:375668e0508fbe173af1ed519d8ae2b79f388d94
[Bug middle-end/19987] [meta-bug] fold missing optimizations in general
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987 Bug 19987 depends on bug 98709, which changed state. Bug 98709 Summary: gcc optimizes bitwise operations, but doesn't optimize logical ones https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98709 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug middle-end/105762] [12/13 Regression] -Warray-bounds false positives for integer-to-pointer casts since r12-2132-ga110855667782dac
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105762

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #4 from Ivan Sorokin ---
Perhaps the warning message could be improved? The warning talks about arrays, but there are no arrays in the original code.

I think it would be great if the warning said something about an {invalid/wild/cast from int} pointer. English is not my strong suit, but perhaps something like this:

warning: dereferencing wild pointer '(int*)1ul' is undefined
[Bug c++/105864] storing nullptr_t to memory should not generate any instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105864

--- Comment #5 from Ivan Sorokin ---
(In reply to Andrew Pinski from comment #4)
> nullptr_t t, t1 = nullptr;
> __builtin_memcpy(&a[0], &t, sizeof(t));

> So I suspect this should be marked as invalid.

The question is how GCC defines memcpy'ing from nullptr_t. Should it be required to read zero bytes? Or a null pointer value? What about systems where the value of a null pointer is not zero?

In any case, I don't think memcpy'ing nullptr_t into a different type is particularly useful or used anywhere (I might be wrong). So I suggest defining nullptr_t as an empty type containing only padding bytes. In this case memcpy should just read the padding bytes.
[Bug tree-optimization/105864] New: storing nullptr_t to memory should not generate any instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105864

            Bug ID: 105864
           Summary: storing nullptr_t to memory should not generate any instructions
           Product: gcc
           Version: 12.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Currently storing a nullptr_t to memory causes 0 to be written to that memory. As there is no way to read this value back without invoking undefined behavior, I believe GCC can omit storing it. This will make nullptr_t behave more like an empty struct that has only padding bytes in it.

using nullptr_t = decltype(nullptr);

void test(nullptr_t* p)
{
    *p = nullptr;
}

struct empty
{};

void test(empty* p)
{
    *p = empty();
}

test(decltype(nullptr)*):
        mov     QWORD PTR [rdi], 0
        ret
test(empty*):
        ret
[Bug middle-end/105862] New: missed inlining opportunity of _Sp_counted_deleter::_M_destroy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105862 Bug ID: 105862 Summary: missed inlining opportunity of _Sp_counted_deleter::_M_destroy Product: gcc Version: 12.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- This sample is reduced from a real usage of shared_ptr. #include #include struct sp_counted_base { sp_counted_base() noexcept : use_count(1) {} virtual void destroy() noexcept {} void release() noexcept { if (--use_count == 0) destroy(); } private: sp_counted_base(sp_counted_base const&) = delete; sp_counted_base& operator=(sp_counted_base const&) = delete; int use_count; }; struct sp_counted_deleter final : sp_counted_base { virtual void destroy() noexcept { ::operator delete(this); } }; void test() { sp_counted_deleter* mem = static_cast(::operator new(sizeof(sp_counted_deleter))); ::new (mem) sp_counted_deleter(); sp_counted_base* pi = mem; pi->release(); } https://godbolt.org/z/dG8h7f1Kn sp_counted_deleter::destroy(): jmp operator delete(void*) test(): sub rsp, 8 mov edi, 16 calloperator new(unsigned long) mov QWORD PTR [rax], OFFSET FLAT:vtable for sp_counted_deleter+16 mov rdi, rax mov DWORD PTR [rax+8], 0 add rsp, 8 jmp sp_counted_deleter::destroy() In the output assembly the call to sp_counted_deleter::destroy is left uninlined. I tested the same sample on Clang and it somehow manages to inline this function. It would be great if GCC was able to inline it too.
[Bug c++/104503] [12 regression][modules] bits/shared_ptr_base.h: error: must ‘#include ’ before using ‘typeid’
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104503 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #4 from Ivan Sorokin --- Could you please review the resolution? In 2.cpp nothing requires . Getting an error message about something that is not even used in the file can't be right.
[Bug sanitizer/105141] #pragma pack(1) causes incorrect UBSAN warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105141 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #8 from Ivan Sorokin --- Note that when __attribute__((packed)) is used GCC produces a warning: warning: taking address of packed member of '' may result in an unaligned pointer value [-Waddress-of-packed-member] 10 | int *d = &c.b; | ^~~~ Perhaps a similar warning should be reported for #pragma packed structs. https://godbolt.org/z/Yr13WhbG8 struct { char a; int b; } __attribute__((packed)) c; int main() { int *d = &c.b; __builtin_printf("%d\n", *d); }
[Bug c++/105099] New: In lookup for namespace name qualifiers only namespaces should be considered
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105099 Bug ID: 105099 Summary: In lookup for namespace name qualifiers only namespaces should be considered Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- Consider this code: namespace a { namespace c {} struct a {}; namespace b = a::c; // (1) using namespace a::c; // (2) } Currently GCC prints an error: file.cpp:9:22: error: 'c' is not a namespace-name 9 | namespace b = a::c; | ^ file.cpp:10:24: error: 'c' is not a namespace-name 10 | using namespace a::c; |^ If I interpret the standard correctly the code should compile without errors because during the lookup of the qualifier the struct "a" should be ignored and the namespace "a" should be found. [basic.lookup.udir]p1: In a using-directive or namespace-alias-definition, during the lookup for a namespace-name or for a name in a nested-name-specifier only namespace names are considered. https://eel.is/c++draft/basic.lookup.udir https://godbolt.org/z/vaWjx4cKj
[Bug c++/103566] New: confusing error message for typedefs with initializers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103566 Bug ID: 103566 Summary: confusing error message for typedefs with initializers Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- On this code GCC says: typedef int foo = 42; error: typedef 'foo' is initialized (use 'decltype' instead) I believe this error message is quite confusing. It wasn't clear to me how 'decltype' could help here. After a bit of git-blaming I found the original commit that added this message: author Zack Weinberg Sat, 19 Oct 2002 03:14:11 + (03:14 +) commit 4a7510cb22da4809d18e3bb3fc453cf671d6926a c-decl.c, decl.c (start_decl): Point users of the old initialized- typedef extension at __typeof__. - error ("typedef `%D' is initialized", decl); + error ("typedef `%D' is initialized (use __typeof__ instead)", decl); Unfortunately the commit wasn't accompanied by a testcase, but I assume in the past there was some GCC-specific extension "initialized typedef" that worked like decltype/typeof. I believe that the error message although beneficial in the past is confusing for users today. I would like to suggest removing "(use 'decltype' instead)" text.
[Bug tree-optimization/103559] Can't optimize away < 0 check on sqrt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103559

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #4 from Ivan Sorokin ---
(In reply to Andrew Pinski from comment #1)
> I think there is another bug about this.

Perhaps related to PR91645. The bug report itself is about a slightly different issue, but the comments discuss the same problem.
[Bug libstdc++/103382] condition_variable::wait() is not cancellable because it is marked noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382

--- Comment #3 from Ivan Sorokin ---
> Huh, I thought it was noexcept. Then yes, we should remove it.

Thank you very much! I'm looking forward to a fix.

> There are still lots of other places where the standard does require
> 'noexcept' and cancellation will terminate.

Do you have any specific functions in mind? If so, perhaps something can be done about them too.

Some people claim that noexcept and cancellation are mutually incompatible, but I don't think this is the case. I believe that by following a simple discipline noexcept and cancellation can interact very well.

First of all, not all noexcept functions are problematic: noexcept functions that don't call cancellation points are perfectly fine. The noexcept functions that do call cancellation points can be fixed by suppressing and then restoring cancellation. For example, a destructor that calls close(), which is a cancellation point, should just suppress cancellation around the call and restore it afterwards. The same goes for a destructor that calls pthread_join().

One might say that because of this we lose some cancellation points, and this is true, but I believe that noexcept functions are the places where the program cannot recover while preserving exception guarantees, so having cancellation suppressed in these places is perfectly fine.
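The suppress/restore discipline for destructors could be sketched as follows. This is an illustration only, assuming a POSIX system; cancellation_suppressor, fd_owner and current_cancel_state are hypothetical names and are not part of libstdc++.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Hypothetical RAII guard: disables cancellation for the calling thread
// and restores the previous state when it goes out of scope.
class cancellation_suppressor
{
public:
    cancellation_suppressor() noexcept
    {
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state_);
    }

    ~cancellation_suppressor()
    {
        int ignored;
        pthread_setcancelstate(old_state_, &ignored);
    }

    cancellation_suppressor(const cancellation_suppressor&) = delete;
    cancellation_suppressor& operator=(const cancellation_suppressor&) = delete;

private:
    int old_state_ = PTHREAD_CANCEL_ENABLE;
};

// Hypothetical owner of a file descriptor. Its implicitly noexcept
// destructor calls close(), which is a cancellation point.
class fd_owner
{
public:
    explicit fd_owner(int fd) noexcept : fd_(fd) {}

    ~fd_owner()
    {
        cancellation_suppressor guard;  // close() cannot be cancelled here
        if (fd_ >= 0)
            ::close(fd_);
    }

    fd_owner(const fd_owner&) = delete;
    fd_owner& operator=(const fd_owner&) = delete;

private:
    int fd_;
};

// Reads the calling thread's cancel state and leaves it unchanged.
int current_cancel_state()
{
    int state;
    int ignored;
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &state);
    pthread_setcancelstate(state, &ignored);
    return state;
}
```

A cancellation request that arrives while the guard is active stays pending and is acted upon at the next cancellation point after the destructor returns.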
[Bug libstdc++/103382] condition_variable::wait() is not cancellable because it is marked noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382 --- Comment #1 from Ivan Sorokin --- Please note there was a related issue PR67726. I hope it is possible to meet the requirements mentioned in the issue as well as enabling cancellation.
[Bug libstdc++/103382] New: condition_variable::wait() is not cancellable because it is marked noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382 Bug ID: 103382 Summary: condition_variable::wait() is not cancellable because it is marked noexcept Product: gcc Version: 10.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- At the moment condition_variable::wait() is marked noexcept. It means that if pthread_cond_wait() acts as cancellation point and throws an exception the program is terminated with std::terminate(). This program demonstrates the issue: #include #include int main() { std::mutex m; std::condition_variable cv; std::thread th([&] { std::unique_lock lock(m); cv.wait(lock); }); pthread_cancel(th.native_handle()); th.join(); } This program terminates with SIGABRT. Because of this using condition_variable::wait() in cancellable threads is tricky: the programmer has to guard all calls to condition_variable::wait() with disabling/restoring cancellation state. Also this stops the thread from being cancellable while in wait(). Therefore the outer thread has to know which condition_variable the thread waits and notify this condition_variable after pthread_cancel(). Also one should add cancellation point pthread_testcancel() immediately after restoring cancellation state after wait(). I believe it would be great if condition_variable::wait interacted nicer with POSIX-cancellation. I would like to suggest removing noexcept from condition_variable::wait(). This also matches the C++ standard very well [thread.condition.condvar] where condition_variable::wait() is not marked as noexcept.
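The workaround described above (disable cancellation, wait, restore, then test for cancellation) might look like the sketch below. The helper name is hypothetical, and the predicate overload of wait() is used only to keep the example short.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <pthread.h>

// Hypothetical helper: waits on cv with cancellation disabled, so a
// cancellation request cannot unwind through the noexcept wait().
template <typename Predicate>
void wait_with_deferred_cancellation(std::condition_variable& cv,
                                     std::unique_lock<std::mutex>& lock,
                                     Predicate pred)
{
    int old_state;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);

    cv.wait(lock, pred);  // not a cancellation point while disabled

    int ignored;
    pthread_setcancelstate(old_state, &ignored);

    // Act on a cancellation request that arrived during the wait.
    pthread_testcancel();
}
```

As noted in the report, the thread is not cancellable while it is blocked inside wait(), so the thread that calls pthread_cancel() still has to notify the condition variable (and make the predicate true) for the request to take effect.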
[Bug c++/102881] gcc totally broken when trailing return type combine with decltype lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102881 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #2 from Ivan Sorokin --- PR92707 also features lambda inside decltype. Perhaps they are related.
[Bug tree-optimization/102888] New: missing case for combining / and % into one operation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102888

Bug ID: 102888
Summary: missing case for combining / and % into one operation
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Normally GCC combines a/b and a%b into one operation when they are computed in the same basic block. The example below has two functions. For one of them GCC is able to combine the operations and for the other it is not (presumably because of the more complicated control flow). I believe the two functions are functionally equivalent.

unsigned long long reduce(unsigned long long a, unsigned long long b)
{
    while ((a % b) == 0)
        a /= b;
    return a;
}

unsigned long long reduce_opt(unsigned long long a, unsigned long long b)
{
    for (;;)
    {
        unsigned long long quot = a / b;
        unsigned long long rem = a % b;
        if (rem != 0)
            break;
        a = quot;
    }
    return a;
}

reduce.L3:
        mov     rax, r8
        xor     edx, edx
        div     rsi
        xor     edx, edx
        mov     r8, rax
        div     rsi
        test    rdx, rdx
        je      .L3

reduce_opt.L8:
        xor     edx, edx
        mov     r8, rax
        div     rsi
        test    rdx, rdx
        je      .L8

https://godbolt.org/z/9dqs8avE5

It would be great if GCC generated the same code for both of these functions.
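The claim that the two functions are equivalent can be spot-checked with a small harness. This is only a sketch and not part of the original report: the helper name reduce_variants_agree and the tested ranges are assumptions chosen for illustration. The ranges start at a = 1 and b = 2 because both loops never terminate for a == 0 or b == 1, and b == 0 divides by zero.

```cpp
// Copies of the two functions from the report.
unsigned long long reduce(unsigned long long a, unsigned long long b)
{
    while ((a % b) == 0)
        a /= b;
    return a;
}

unsigned long long reduce_opt(unsigned long long a, unsigned long long b)
{
    for (;;)
    {
        unsigned long long quot = a / b;
        unsigned long long rem = a % b;
        if (rem != 0)
            break;
        a = quot;
    }
    return a;
}

// Hypothetical helper: compares both variants for 1 <= a <= max_a and
// 2 <= b <= max_b. Returns true when they agree on every pair.
bool reduce_variants_agree(unsigned long long max_a, unsigned long long max_b)
{
    for (unsigned long long a = 1; a <= max_a; ++a)
        for (unsigned long long b = 2; b <= max_b; ++b)
            if (reduce(a, b) != reduce_opt(a, b))
                return false;
    return true;
}
```

A brute-force comparison like this does not prove equivalence, but it is a cheap sanity check before comparing the generated assembly.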
[Bug c++/102704] New: NRVO for throw expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102704

Bug ID: 102704
Summary: NRVO for throw expression
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Consider this code:

struct mytype
{
    mytype();
    mytype(mytype const&);
    mytype(mytype&&);
};

void test()
{
    mytype e;
    throw e;
}

Currently for function test() GCC generates the following sequence of calls (pseudocode):

char e[sizeof(mytype)];
mytype_default_ctor(e);
p = __cxa_allocate_exception();
mytype_move_ctor(p, e);
__cxa_throw(p);

I believe a trick similar to NRVO for returns can be applied here. When a variable meets the NRVO criteria, the compiler can remove the local variable and replace it with the storage allocated by __cxa_allocate_exception. Here is what I believe can be generated:

p = __cxa_allocate_exception();
mytype_default_ctor(p);
__cxa_throw(p);
[Bug c++/61355] gcc doesn't normalize type in non-type template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61355 --- Comment #6 from Ivan Sorokin --- (In reply to Patrick Palka from comment #5) > Fixed for GCC 12. Thanks!
[Bug target/102355] New: excessive stack usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102355

Bug ID: 102355
Summary: excessive stack usage
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

void escape(unsigned long long& a);

void foobar()
{
    unsigned long long local;
    escape(local);
}

For the function "foobar" GCC allocates excessive stack space:

foobar():
        sub     rsp, 24
        lea     rdi, [rsp+8]
        call    escape(unsigned long long&)
        add     rsp, 24
        ret

The function "foobar" only needs 8 bytes of stack space, but GCC allocates 24. Please note that this excessive allocation isn't needed for stack alignment: 8 bytes of local variables are enough to keep the stack aligned. I also tested Clang and it allocates 8 bytes.

GCC makes this stack layout:

8 bytes  padding
8 bytes  variable "local"
8 bytes  padding
8 bytes  return address

I believe the problem is related to the fact that GCC aligns the stack twice: the first time after the return address is placed and the second time after the local variables are placed. Playing with -mpreferred-stack-boundary confirms this:

-mpreferred-stack-boundary | stack usage
3                          | 8
4 (default)                | 24
5                          | 56
6                          | 120

https://godbolt.org/z/h56aoKvvh

In all cases the stack usage is twice the required alignment (minus 8 bytes for the return address). I believe stack space can be conserved by doing the alignment only once.
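The pattern in the measurements above can be written down as a formula: with -mpreferred-stack-boundary=n the alignment is 2^n bytes, and the observed usage is 2 * 2^n - 8. The sketch below merely restates the reported numbers. The function name observed_stack_usage is a hypothetical helper, not anything in GCC, and the formula is an observation about this one test case, not a description of GCC's frame layout code.

```cpp
// Hypothetical helper: stack bytes GCC was observed to allocate for
// foobar() at -mpreferred-stack-boundary=boundary_log2, according to
// the numbers in the report.
constexpr unsigned observed_stack_usage(unsigned boundary_log2)
{
    const unsigned alignment = 1u << boundary_log2; // 2^n bytes
    return 2 * alignment - 8;                       // twice the alignment, minus the return address
}

// The four data points from the report.
static_assert(observed_stack_usage(3) == 8, "boundary 3");
static_assert(observed_stack_usage(4) == 24, "boundary 4 (default)");
static_assert(observed_stack_usage(5) == 56, "boundary 5");
static_assert(observed_stack_usage(6) == 120, "boundary 6");
```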
[Bug tree-optimization/98774] gcc -O3 does not vectorize some operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774 --- Comment #4 from Ivan Sorokin --- I retested the sample on GCC 11.2. https://godbolt.org/z/xrarP3zbY Compared to Clang 12.0.1 GCC still generates 6 more instructions in total and does 6 mulpd against Clang's 4 mulpd.
[Bug c++/102335] New: gcc misses -Wunused-value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102335

Bug ID: 102335
Summary: gcc misses -Wunused-value
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

struct mytype
{
    int memfun [[gnu::pure]] ();
};

void test()
{
    mytype x;
    x.memfun();        // -Wunused-value
    mytype().memfun(); // no -Wunused-value
}

https://godbolt.org/z/vc49jWGqn

The code above contains two calls to a [[gnu::pure]] function whose return value is ignored. GCC detects only the first one. I believe the second one deserves a warning too. Clang shows a warning for both calls.
[Bug rtl-optimization/3507] appalling optimisation with sub/cmp on multiple targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507

Ivan Sorokin changed:

What |Removed |Added
CC   |        |vanyacpp at gmail dot com

--- Comment #60 from Ivan Sorokin ---
Another similar case. On this function:

unsigned wrap(unsigned index, unsigned limit)
{
    if (index >= limit)
        index -= limit;
    return index;
}

GCC 11.1 -O2 generates:

wrap(unsigned int, unsigned int):
        mov     edx, edi
        mov     eax, edi
        sub     edx, esi
        cmp     edi, esi
        cmovnb  eax, edx
        ret

I believe cmp here is redundant as the flags are already set after sub. After removing cmp we get:

wrap(unsigned int, unsigned int):
        mov     edx, edi
        mov     eax, edi
        sub     edx, esi
        cmovnb  eax, edx
        ret

Now the register edx becomes unneeded:

wrap(unsigned int, unsigned int):
        mov     eax, edi
        sub     edi, esi
        cmovnb  eax, edi
        ret
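The idea that a single subtraction already yields both the difference and the comparison result can be expressed at the source level with the GCC/Clang builtin __builtin_sub_overflow, which for unsigned operands reports a borrow exactly when index < limit. This is only an illustrative sketch: the names wrap_single_sub and wrap_variants_agree are made up here, and no claim is made about which instructions any particular compiler version emits for it.

```cpp
// The function from the comment above.
unsigned wrap(unsigned index, unsigned limit)
{
    if (index >= limit)
        index -= limit;
    return index;
}

// Hypothetical variant: one subtraction produces both the difference and
// the borrow flag. For unsigned operands a borrow means index < limit,
// in which case the original index is kept.
unsigned wrap_single_sub(unsigned index, unsigned limit)
{
    unsigned diff;
    const bool borrow = __builtin_sub_overflow(index, limit, &diff);
    return borrow ? index : diff;
}

// Hypothetical helper: compares both variants for all pairs below bound.
bool wrap_variants_agree(unsigned bound)
{
    for (unsigned index = 0; index < bound; ++index)
        for (unsigned limit = 0; limit < bound; ++limit)
            if (wrap(index, limit) != wrap_single_sub(index, limit))
                return false;
    return true;
}
```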
[Bug middle-end/95014] gcc fails to merge two identical returns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95014 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #1 from Ivan Sorokin --- I think this might be a duplicate of 82689.
[Bug middle-end/99797] accessing uninitialized automatic variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99797

Ivan Sorokin changed:

What |Removed |Added
CC   |        |vanyacpp at gmail dot com

--- Comment #10 from Ivan Sorokin ---
(Disclaimer: I'm not a GCC developer, I'm just a random guy who reads bugzilla and tried making some simple changes to GCC a few times)

(In reply to Martin Uecker from comment #9)
> The behavior of GCC is dangerous as the example in comment #1 show. You can
> not reason at all about the generated code.

My reasoning normally boils down to this: as the program invokes UB, the exact behavior depends on the compiler, the compiler version, the OS and other factors.

I would like to note that the optimizations performed by the compiler are not designed to break the user's code. They were designed to remove some typical redundancies in programs. It just happens that their combination unpredictably breaks code that invokes UB. Normally it is difficult or impossible to avoid breaking such code without regressing some optimizations. Also, the optimizations performed by the compiler change over time, so the exact result of the breakage inevitably depends on the specific compiler version.

In theory GCC already has an option that limits the effects of UB: -O0. I believe this is the only forward-compatible option for that. If we want to be more precise we can pass only -fno-tree-ccp, but these fine-grained optimization options change from one compiler version to another.

> The "optimize based on the assumption that UB can not happen" philosophy
> amplifies even minor programming errors into something dangerous.

Unfortunately this is easier said than done. As far as I know all major compilers do optimizations based on UB. Consider this:

const int PI = 3;

int tau()
{
    return 2 * PI; // can this be folded into 6?
}

GCC folds 2 * PI into 6 even with -O0. This optimization is based on UB, because in some other function one can write:

void evil()
{
    const_cast<int&>(PI) = 4;
}

Some usages of PI can be folded and some cannot. The ones that were folded would see PI = 3, the ones that were not folded would see PI = 4.

One can argue that constant folding is fundamentally an optimization based on UB. I believe few optimizations will be left if we disable all that rely on UB.

> This, of course, also applies to other UB (in varying degrees). For signed
> overflow we have -fsanitize=signed-integer-overflow which can help detect and
> mitigate such errors, e.g. by trapping at run-time. And also this is allowed
> by UB.
> In case of UB the choice of what to do lies with the compiler, but I think it
> is a bug if this choice is unreasonable and does not serve its users well.

Do you have some specific proposal in mind? Currently a user has these 5 options:

1. Using -O0, suppressing optimizations.
2. Using -fno-tree-ccp, suppressing this specific optimization.
3. Using -Wall and relying on warnings.
4. (in theory) Using the static analyzer, -fanalyzer. It doesn't detect this error at the moment, but I believe it can be taught to detect it.
5. Using a dynamic analyzer like valgrind.

It seems that you find the existing options insufficient and want another one.
[Bug analyzer/94355] support for C++ new expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94355 --- Comment #7 from Ivan Sorokin --- For me the support for operator new works well for trivially constructible types. For a non-trivially constructible type I got a false positive: struct foo { foo(); }; int main() { delete new foo(); } In function 'int main()': cc1plus: warning: use of possibly-NULL 'operator new(1)' where non-null expected [CWE-690] [-Wanalyzer-possible-null-argument] 'int main()': event 1 | |:5:20: |5 | delete new foo(); | |^ | || | |(1) this call could return NULL | 'int main()': event 2 | |cc1plus: | (2): argument 'this' ('operator new(1)') from (1) could be NULL where non-null expected | :1:14: note: argument 'this' of 'foo::foo()' must be non-null 1 | struct foo { foo(); }; | ^~~ Compiler returned: 0 https://godbolt.org/z/nPff9EGsY Also the error location seems to be wrong. Removing "()" from "delete new foo()" fixes the error location.
[Bug analyzer/94355] support for C++ new expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94355

--- Comment #6 from Ivan Sorokin ---
I played with -fanalyzer on godbolt (GCC trunk). I noticed that -fanalyzer doesn't report a double free in this (convoluted) case:

#include <cstdlib>

int main()
{
    int* p = new int;
    delete p;
    free(p);
}
[Bug c++/100039] New: GCC can not bind lvalue to lvalue reference in brace-initialized-temporary expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100039

Bug ID: 100039
Summary: GCC can not bind lvalue to lvalue reference in brace-initialized-temporary expression
Product: gcc
Version: 10.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Consider this program:

typedef int& ref;

int main()
{
    int a;
    ref{a};
}

This is accepted by clang, msvc and icc. GCC 10.3 rejects this code with the message:

error: cannot bind non-const lvalue reference of type 'ref' {aka 'int&'} to an rvalue of type 'int'

I believe the error message is incorrect, because "a" is not an rvalue here. It is an lvalue, therefore it should be allowed to bind to an lvalue reference.

https://godbolt.org/z/TWY9GPq3E
[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418 --- Comment #8 from Ivan Sorokin --- If I understand #c5 correctly the minimal reproducer should be this: void g(int&); void f() { int a[10]; int& p = a[10]; // (1) g(a[10]); // (2) } Both (1) and (2) are undefined and -fsanitize=bounds can help checking this.
[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418 --- Comment #7 from Ivan Sorokin --- (In reply to Martin Liška from comment #3) > That said, can we close it as resolved? I'm sorry for not being clear from the beginning. The original report was about -fsanitize=bounds sanitizer which sometimes allows accessing one past the end element. Now after #c4 I see that language rules make it excessively complicated for compiler to do this. I believe that one past the end is important error to check for, but I understand why compilers might choose to avoid doing it. Feel free to close the issue if implementing it is infeasible.
[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

--- Comment #6 from Ivan Sorokin ---
(In reply to Jakub Jelinek from comment #4)
> Asan can't by design detect neither #c0 nor #c1, only ubsan can.
> The reason why ubsan has that off by one stuff is that in C/C++,
> &mas[n - 1][m] is not undefined behavior, only mas[n - 1][m] is.

That is very unfortunate. For standard containers, subscripting with a wrong index is undefined behavior regardless of whether the address of the result is then taken. I assumed the same rules apply to built-in arrays. If one needs just a pointer, one can easily write a + n instead of &a[n]. Now I see that this is not the case and built-in arrays behave differently.

> For #c1, the big question is what exactly is UB in C++, whether already
> binding a reference to the object after the end of the array or only
> actually accessing that reference. If the former, ubsan could treat
> REFERENCE_TYPE differently, if the latter, then I'm afraid it can't do that,
> and ubsan by design has to be done early before all the optimizations change
> the IL so much that it is completely lost what were the user errors in it.
> For the method calls, there really isn't a reference in the IL either, this
> argument is a pointer, but .UBSAN_BOUNDS calls are added in the FE and so
> perhaps it could know it is a method call and treat it as a reference.
> So, something can be done but we need answers on where the UB in C++ exactly
> happens.

For -fsanitize=null the rules are quite subtle: dereferencing by itself (*p) doesn't check for nullptr, but binding a reference (int& q = *p;) does. Perhaps similar rules can be employed for the past-the-end element: taking a pointer to it is fine, but passing that pointer as the this argument of a member function is UB? At least this would be consistent with null pointers.
[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418 --- Comment #2 from Ivan Sorokin --- It looks like this is related to ignore_off_by_one parameter of ubsan_instrument_bounds. As can be seen in gimple the problematic .UBSAN_BOUNDS checks against array size plus 1.
[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418 --- Comment #1 from Ivan Sorokin --- Here is the reduced example. It doesn't SIGSEGV, but it doesn't report any sanitizer errors either: $ g++ -g -fsanitize=bounds 3.cpp $ cat 3.cpp #include void escape(int& a) {} void test(size_t n, size_t m) { int mas[n][m]; escape(mas[n - 1][m]); } int main() { test(4, 3); } Surprisingly if I replace taking a reference with writing to the array it will show an error.
[Bug sanitizer/99418] New: sanitizer checks for accessing multidimentional VLA-array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

Bug ID: 99418
Summary: sanitizer checks for accessing multidimentional VLA-array
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: sanitizer
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at gcc dot gnu.org
Target Milestone: ---

The example below accesses an array past its size, but the sanitizers don't show any errors. If I change index m to m + 1 an error will be shown. This makes me think that the compiler does some checks, but perhaps they are incomplete for multidimensional VLA arrays. GCC 10.2.

#include <string>

std::string shortest_match(size_t n, size_t m)
{
    std::string mas[n][m];
    mas[n - 1][m] = ""; // mas[n - 1][m + 1] will show an error
    return mas[n - 1][m - 1];
}

int main()
{
    shortest_match(4, 3);
}

$ g++ -g -fsanitize=address,undefined -std=c++17 2.cpp && ./a.out
AddressSanitizer:DEADLYSIGNAL
=
==26974==ERROR: AddressSanitizer: SEGV on unknown address 0x (pc 0x7f59ea2ad2d6 bp 0x sp 0x7ffc78389ea0 T0)
==26974==The signal is caused by a WRITE memory access.
==26974==Hint: address points to the zero page.
    #0 0x7f59ea2ad2d6 in std::__cxx11::basic_string, std::allocator >::_M_replace(unsigned long, unsigned long, char const*, unsigned long) (/lib/libstdc++.so.6+0x13c2d6)
    #1 0x401658 in shortest_match[abi:cxx11](unsigned long, unsigned long) /home/ivan/2.cpp:6
    #2 0x4019eb in main /home/ivan/2.cpp:13
    #3 0x7f59e950ec7c in __libc_start_main (/lib/libc.so.6+0x23c7c)
    #4 0x4011a9 in _start (/home/ivan/a.out+0x4011a9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/libstdc++.so.6+0x13c2d6) in std::__cxx11::basic_string, std::allocator >::_M_replace(unsigned long, unsigned long, char const*, unsigned long)
==26974==ABORTING
[Bug middle-end/99087] New: suboptimal codegen for division by constant 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99087

Bug ID: 99087
Summary: suboptimal codegen for division by constant 3
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

These two functions are functionally the same, but generate different code with g++ -O2:

unsigned long long foo(unsigned long long a)
{
    return a / 3;
}

unsigned long long bar(unsigned long long a)
{
    return (unsigned __int128)a * 0xAAAA'AAAA'AAAA'AAAB >> 65;
}

foo(unsigned long long):
        movabs  rdx, -6148914691236517205
        mov     rax, rdi
        mul     rdx
        mov     rax, rdx
        shr     rax
        ret

bar(unsigned long long):
        movabs  rax, -6148914691236517205
        mul     rdi
        mov     rax, rdx
        shr     rax
        ret

For some reason GCC chooses a different argument order for the division, which causes the generation of one extra mov.
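The equivalence of the two functions rests on the usual multiply-and-shift trick: 0xAAAAAAAAAAAAAAAB is ceil(2^65 / 3), which is also the unsigned reading of the movabs immediate -6148914691236517205 in the listings. The sketch below spot-checks that at a few boundary values. The names div3_plain, div3_mul and div3_variants_agree are hypothetical, and unsigned __int128 is a GCC/Clang extension rather than standard C++.

```cpp
// Plain division, as in foo() from the report.
unsigned long long div3_plain(unsigned long long a)
{
    return a / 3;
}

// Multiply by ceil(2^65 / 3) in 128-bit arithmetic, then shift right by 65,
// as in bar() from the report.
unsigned long long div3_mul(unsigned long long a)
{
    return (unsigned __int128)a * 0xAAAAAAAAAAAAAAABull >> 65;
}

// Hypothetical helper: compares both variants on small values and on
// values near the top of the 64-bit range.
bool div3_variants_agree()
{
    for (unsigned long long a = 0; a < 100000; ++a)
        if (div3_plain(a) != div3_mul(a))
            return false;
    for (unsigned long long k = 0; k < 100000; ++k)
        if (div3_plain(~0ull - k) != div3_mul(~0ull - k))
            return false;
    return true;
}
```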
[Bug target/91400] __builtin_cpu_supports conjunction is optimized poorly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400 --- Comment #2 from Ivan Sorokin --- I've sent a patch to gcc-patches mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564663.html
[Bug c++/82640] gcc doesn't show errors on anonymous local variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82640 Ivan Sorokin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #1 from Ivan Sorokin --- It looks like this issue was fixed in GCC 8. Closing.
[Bug tree-optimization/98774] gcc -O3 does not vectorize some operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774 --- Comment #3 from Ivan Sorokin --- (In reply to Hongtao.liu from comment #1) > It's fixed in current trunk https://godbolt.org/z/63576n I can confirm that now GCC does use packed multiplication mulpd. Although it is used somewhat inefficiently. The original program contained 8 multiplications and clang does 4 packed multiplication. GCC trunk does 6 packed multiplications. https://godbolt.org/z/EabPxT
[Bug c++/98814] Add fix-it hints for missing asterisk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98814 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #2 from Ivan Sorokin --- PR87850 looks similar. It discusses only pointers, but I think it can be generalized to any type that has operator*: pointers, iterators and smart-pointers.
[Bug tree-optimization/98775] missing optimization opportunity on nbody
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98775 --- Comment #1 from Ivan Sorokin --- Created attachment 50016 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50016&action=edit nbody-unrolled.cpp
[Bug tree-optimization/98775] New: missing optimization opportunity on nbody
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98775 Bug ID: 98775 Summary: missing optimization opportunity on nbody Product: gcc Version: 10.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- Created attachment 50015 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50015&action=edit nbody.cpp On the attached sample (208 LOC), clang 11.0 generates the code that is almost twice as fast as the one generated by GCC 10.2 (-O3 -ffast-math -flto). $ ./nbody 5000 4.0s for clang vs 7.5s for GCC. A quick look at the generated code shows that clang aggressively unrolled all inner loops. If I unroll all inner loops manually I get: $ ./nbody-unrolled 5000 3.7s for clang vs 6.3s for GCC. 17.6B instructions for clang vs 29.6B instructions for GCC. While the first sample is a subject to unrolling heuristic, the second is about optimizing the completely linear chunk of code with many floating point multiplications and additions. I tried reducing the sample further, but I only came up with PR98774.
[Bug tree-optimization/98774] New: gcc -O3 does not vectorize multiplication
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774

Bug ID: 98774
Summary: gcc -O3 does not vectorize multiplication
Product: gcc
Version: 10.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Created attachment 50014 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50014&action=edit
nbody-update-velocity.cpp

In the following sample GCC (-O3 -ffast-math) fails to vectorize the operations. The result is that GCC 10.2 does 8 mulsd, while clang 11.0 does 4 mulpd.

struct vec3
{
    double x, y, z;
};

void update_velocities(vec3* __restrict velocity,
                       double const* __restrict mass,
                       vec3 const* __restrict dpos,
                       double const* __restrict mag)
{
    velocity[0] -= dpos[0] * (mass[1] * mag[0]);
    velocity[1] += dpos[0] * (mass[0] * mag[0]);
}

See the attachment for the complete sample.
[Bug middle-end/98710] New: missing optimization (x | c) & ~(y | c) -> x & ~(y | c)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98710 Bug ID: 98710 Summary: missing optimization (x | c) & ~(y | c) -> x & ~(y | c) Product: gcc Version: 10.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: vanyacpp at gmail dot com Target Milestone: --- On this function clang generates slightly shorter code unsigned foo(unsigned x, unsigned y, unsigned c) { return (x | c) & ~(y | c); } because it notices that the expression can be simplified to x & ~(y | c). It would be great if GCC can do the same. https://godbolt.org/z/3ob6eb
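The simplification follows from De Morgan: (x | c) & ~(y | c) = (x | c) & ~y & ~c, and the c term vanishes because c & ~c = 0, leaving x & ~y & ~c = x & ~(y | c). Since the identity is bitwise, checking it on small values exercises every bit combination. A minimal sketch, with hypothetical function names that are not part of the original report:

```cpp
// The expression from the report.
unsigned foo_original(unsigned x, unsigned y, unsigned c)
{
    return (x | c) & ~(y | c);
}

// The simplified form clang arrives at, according to the report.
unsigned foo_simplified(unsigned x, unsigned y, unsigned c)
{
    return x & ~(y | c);
}

// Hypothetical helper: compares both forms for all x, y, c below bound.
bool foo_forms_agree(unsigned bound)
{
    for (unsigned x = 0; x < bound; ++x)
        for (unsigned y = 0; y < bound; ++y)
            for (unsigned c = 0; c < bound; ++c)
                if (foo_original(x, y, c) != foo_simplified(x, y, c))
                    return false;
    return true;
}
```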
[Bug middle-end/98709] New: gcc optimizes bitwise operations, but doesn't optimize logical ones
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98709

Bug ID: 98709
Summary: gcc optimizes bitwise operations, but doesn't optimize logical ones
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

GCC 10.2 produces very good code for this function, noticing that both sides of the conjunction are the same:

unsigned foo_bitwise(unsigned a, unsigned b)
{
    return (~a ^ b) & ~(a ^ b);
}

foo_bitwise(unsigned int, unsigned int):
        xor     edi, esi
        mov     eax, edi
        not     eax
        ret

But when I write a similar function with logical operations it doesn't notice that:

bool foo_logical(bool a, bool b)
{
    return (!a ^ b) & !(a ^ b);
}

foo_logical(bool, bool):
        mov     eax, esi
        xor     eax, edi
        xor     eax, 1
        cmp     dil, sil
        sete    dl
        and     eax, edx
        ret

I believe that in a similar manner it can be optimized to something like this:

foo_logical(bool, bool):
        xor     edi, esi
        mov     eax, edi
        xor     eax, 1
        ret
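For bool operands both sides of the conjunction reduce to !(a ^ b), which is simply a == b. With only four input combinations this can be checked exhaustively. A small sketch, where the names foo_logical_simplified and foo_logical_forms_agree are made up for illustration:

```cpp
// The function from the report.
bool foo_logical(bool a, bool b)
{
    return (!a ^ b) & !(a ^ b);
}

// Hypothetical simplified form: both operands of & equal !(a ^ b),
// and for bool that is the same as a == b.
bool foo_logical_simplified(bool a, bool b)
{
    return a == b;
}

// Hypothetical helper: checks all four input combinations.
bool foo_logical_forms_agree()
{
    for (int a = 0; a < 2; ++a)
        for (int b = 0; b < 2; ++b)
            if (foo_logical(a != 0, b != 0) != foo_logical_simplified(a != 0, b != 0))
                return false;
    return true;
}
```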
[Bug c++/98660] -Wold-style-cast should not warn on casts that look like (decltype(x))(x)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98660

Ivan Sorokin changed:

What |Removed |Added
CC   |        |vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin ---
I'm not a GCC developer, I'm just curious. Why is a C-style cast required here? Could you use static_cast instead? I mean, instead of `(decltype(x))(x)`, using `static_cast<decltype(x)>(x)`? Perhaps wrapping it in some macro to avoid writing `x` twice.
[Bug rtl-optimization/98555] Functions optimized to zero length break function pointer inequality
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98555 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #4 from Ivan Sorokin --- (In reply to Richard Biener from comment #1) > [for QOI/security/whatever we probably want to at least emit a ret > instruction] RET might be dangerous when the return type is non-void. Perhaps UD2 or INT3 would be better?
[Bug c++/98501] potential optimization for base<->derived pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98501 --- Comment #2 from Ivan Sorokin --- (In reply to Richard Biener from comment #1) > I think there's a duplicate of this PR. I searched the list of bugs and I found PR95663. Is it it?
[Bug c++/98501] New: potential optimization for base<->derived pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98501

Bug ID: 98501
Summary: potential optimization for base<->derived pointer casts
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Consider this code:

struct base1 { int a; };
struct base2 { int b; };
struct derived : base1, base2 {};

derived& to_derived_bad(base2* b)
{
    return *static_cast<derived*>(b);
}

derived& to_derived_good(base2* b)
{
    return static_cast<derived&>(*b);
}

I believe these two functions are functionally equivalent and should generate the same code. Both functions cast a pointer from base to derived if it is not nullptr, and both cause undefined behavior if it is nullptr.

GCC optimizes to_derived_good() to a single subtraction, but it inserts a nullptr check into to_derived_bad():

to_derived_good(base2*):
        lea     rax, [rdi-4]
        ret

to_derived_bad(base2*):
        lea     rax, [rdi-4]
        test    rdi, rdi
        mov     edx, 0
        cmove   rax, rdx
        ret

Could GCC omit the nullptr check in to_derived_bad?
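For a non-null argument both casts must yield the same derived object, which can be confirmed with a tiny check. This is only a sketch: the template arguments of the casts are written out as derived* and derived&, which is the only reading consistent with the return type, and the helper name casts_agree is hypothetical.

```cpp
struct base1 { int a; };
struct base2 { int b; };
struct derived : base1, base2 {};

// Pointer cast followed by dereference.
derived& to_derived_bad(base2* b)
{
    return *static_cast<derived*>(b);
}

// Dereference followed by reference cast.
derived& to_derived_good(base2* b)
{
    return static_cast<derived&>(*b);
}

// Hypothetical helper: for a valid (non-null) base2 subobject, both
// functions are expected to return the enclosing derived object.
bool casts_agree()
{
    derived d{};
    base2* p = &d; // implicit derived-to-base conversion adjusts the address
    return &to_derived_bad(p) == &d && &to_derived_good(p) == &d;
}
```

The two functions differ only in what a static_cast of a null pointer has to do; once the result is dereferenced, the null case is undefined either way, which is the argument made in the report.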
[Bug c++/80016] error is positioned incorrectly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80016 Ivan Sorokin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #6 from Ivan Sorokin --- (In reply to Jonathan Wakely from comment #5) > I'd even argue the stating location still isn't right in this version, as > the error comes from ns::trait::value not the logical expression > containing it. On GCC 10.2 the error location is very nice: :13:46: error: incomplete type 'ns::trait' used in nested name specifier 13 | && ns::trait::value; | ^ https://godbolt.org/z/E7P7P8 I believe the bug can be considered fixed now. Thank you!
[Bug tree-optimization/47579] STL size() == 0 does unnecessary shift
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47579 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #3 from Ivan Sorokin --- Since 7.1 GCC doesn't produce any shifts on the test code as well as on the examples from comment #2. https://godbolt.org/z/f48EqP I think the bug can be closed now.
[Bug middle-end/56719] missed optimization: i > 0xffff || i*4 > 0xffff
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56719

Ivan Sorokin changed:

What |Removed |Added
CC   |        |vanyacpp at gmail dot com

--- Comment #8 from Ivan Sorokin ---
On the test code clang since 3.5 and before 9.0 does something very surprising. It optimizes (A > 0xffff || B > 0xffff) into (A | B) > 0xffff. I don't think this is what the reporter expected, but it is still a potential optimization for GCC. See https://godbolt.org/z/WqPhbW
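The transformation is valid for unsigned operands because an unsigned value exceeds 0xffff exactly when some bit above bit 15 is set, and A | B has such a bit exactly when A or B does. The sketch below uses the 0xffff constant from the bug title. The function names are hypothetical, and the code only checks the identity at a handful of boundary values.

```cpp
// Two separate comparisons joined by ||.
bool either_above(unsigned a, unsigned b)
{
    return a > 0xffff || b > 0xffff;
}

// Single comparison on the bitwise OR of the operands.
bool merged_above(unsigned a, unsigned b)
{
    return (a | b) > 0xffff;
}

// Hypothetical helper: compares both forms on values around the 0xffff
// boundary and at the extremes of the unsigned range.
bool comparison_forms_agree()
{
    const unsigned samples[] = {0u, 1u, 0x7fffu, 0xfffeu, 0xffffu,
                                0x10000u, 0x10001u, 0x12345u, ~0u};
    for (unsigned a : samples)
        for (unsigned b : samples)
            if (either_above(a, b) != merged_above(a, b))
                return false;
    return true;
}
```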
[Bug rtl-optimization/48877] Inline asm for rdtsc generates silly code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48877 Ivan Sorokin changed: What|Removed |Added CC||vanyacpp at gmail dot com --- Comment #2 from Ivan Sorokin --- Modern GCC doesn't generate excessive moves for this example. It looks like the problem was fixed in 4.9.0: https://godbolt.org/z/MqE7sP . I think the bug can be closed now.
[Bug target/94852] -ffloat-store on x64 target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94852

--- Comment #6 from Ivan Sorokin ---
(In reply to Richard Biener from comment #1)
> @item -ffloat-store
> @opindex ffloat-store
> Do not store floating-point variables in registers, and inhibit other
> options that might change whether a floating-point value is taken from a
> register or memory.
>
> I think it does what it says?

This is a follow-up to my previous comment. Perhaps I haven't explained myself properly, so let me explain why I find the existing behavior a bit confusing.

From the documentation on -ffloat-store: "This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture."

When a person uses -ffloat-store, the desired effect is not the additional loads/stores, but reproducible results across different compiler versions and optimization options. It just happens that the cheapest way to do so is adding additional loads/stores. I'm pretty sure most users would be in favor of removing the extra loads/stores when they don't affect the results.

I understand that perhaps there are reasons why -ffloat-store should work the way it works now. If this is true, I would recommend updating the documentation to describe the cases (if they exist) when one might want to use -ffloat-store on x86-64. From what I understand now, using -ffloat-store on x86-64 is just a mistake.
[Bug target/94852] -ffloat-store on x64 target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94852 --- Comment #4 from Ivan Sorokin --- (In reply to Richard Biener from comment #1) > @item -ffloat-store > @opindex ffloat-store > Do not store floating-point variables in registers, and inhibit other > options that might change whether a floating-point value is taken from a > register or memory. > > I think it does what it says? Yes, the behavior of the compiler and the documentation matches very well. The compiler works as intended. My report is not about a bug, but about a possible improvement. If ignoring or implementing a warning is considered undesirable, I would suggest expanding the documentation by clarifying the interaction between -ffloat-store and -mfpmath=sse. Something like this in the documentation would help: "If used together with -mfpmath=sse, -ffloat-store doesn't change the results of floating point operations. The only effect it has is severely pessimizing the generated code."
[Bug target/94852] New: -ffloat-store on x64 target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94852

Bug ID: 94852
Summary: -ffloat-store on x64 target
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

At the moment -ffloat-store significantly pessimizes the code generation regardless of whether -mfpmath=sse -msse2 are used or not:

float f(float a, float b)
{
    return a + b;
}

-O2:
        addss   xmm0, xmm1
        ret

-O2 -ffloat-store:
        movss   DWORD PTR [rsp-20], xmm0
        movss   xmm0, DWORD PTR [rsp-20]
        movss   DWORD PTR [rsp-24], xmm1
        addss   xmm0, DWORD PTR [rsp-24]
        movss   DWORD PTR [rsp-4], xmm0
        movss   xmm0, DWORD PTR [rsp-4]
        ret

Note that -mfpmath=sse -msse2 are the defaults on x86-64.

My understanding is that -ffloat-store doesn't affect the result of floating-point operations when SSE math is used. If this is true, -ffloat-store pessimizes the generated code without any change in observable behavior.

Recently I found a Steam game that targets x86-64 and was compiled with -ffloat-store (presumably by mistake). For details see: https://forums.factorio.com/viewtopic.php?f=30&t=81134 . When -ffloat-store was removed, a developer reported a 35% speedup of the Linux version of the game.

My guess is that -ffloat-store might be used by mistake when someone tries to get reproducible results on x86 without realizing that the same flag affects performance negatively on x86-64.

To prevent issues like this in the future I think GCC could do one of two things:

1. Ignore -ffloat-store when it doesn't affect the result of floating-point operations, i.e. act as if the redundant loads/stores had been optimized away.
2. Issue a warning when -ffloat-store doesn't affect the result of floating-point operations, because there is no point in using a flag whose only effect is pessimizing code generation.
[Bug analyzer/94355] New: support for C++ new expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94355

Bug ID: 94355
Summary: support for C++ new expression
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: analyzer
Assignee: dmalcolm at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

At the moment the static analyzer warns about leaked malloc allocations. It would be great if C++ new expressions were also supported.

Example:

void f()
{
    char* p = new char;
}

Expected diagnostic:

warning: leak of 'p' [CWE-401] [-Wanalyzer-malloc-leak]
    3 |     char* p = new char;
[Bug target/91824] unnecessary sign-extension after _mm_movemask_epi8 or __builtin_popcount
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91824 --- Comment #7 from Ivan Sorokin --- (In reply to Jakub Jelinek from comment #6) > Fixed. Thank you!
[Bug c++/93211] New: equivalence of dependent function calls doesn't check if the call is eligible for ADL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93211

Bug ID: 93211
Summary: equivalence of dependent function calls doesn't check if the call is eligible for ADL
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Consider this code:

// https://gcc.godbolt.org/z/3U2TTd
#include
template void g(T);
template void f() {} // (1)
template void f() {} // (2)

The question is whether (1) and (2) are (re)definitions of the same function or definitions of two different functions. Currently GCC believes that this is a redefinition of the same function.

I think (1) and (2) should be definitions of two different functions, because "decltype(g(T{}))" and "decltype(::g(T{}))" are susceptible to different SFINAE errors: the overload candidate set of (2) is fixed, while the overload candidate set of (1) can be extended arbitrarily by ADL.
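The difference between the two expressions can be sketched with a pair of detection traits (C++17). This is an illustration only, not the reporter's test case: the namespace lib, the type widget, the overload g(int) and both trait names are hypothetical.

```cpp
#include <type_traits>

void g(int);  // the only overload that the qualified call ::g(...) can see

namespace lib {
struct widget {};
void g(widget);  // reachable from outside lib only through ADL
}  // namespace lib

// True when the unqualified, ADL-eligible call g(T{}) is well-formed.
template <typename T, typename = void>
struct unqualified_call_ok : std::false_type {};
template <typename T>
struct unqualified_call_ok<T, std::void_t<decltype(g(T{}))>> : std::true_type {};

// True when the qualified call ::g(T{}) is well-formed (no ADL).
template <typename T, typename = void>
struct qualified_call_ok : std::false_type {};
template <typename T>
struct qualified_call_ok<T, std::void_t<decltype(::g(T{}))>> : std::true_type {};

// For int both forms resolve to ::g(int).
static_assert(unqualified_call_ok<int>::value);
static_assert(qualified_call_ok<int>::value);

// For lib::widget only the unqualified form finds a candidate.
static_assert(unqualified_call_ok<lib::widget>::value);
static_assert(!qualified_call_ok<lib::widget>::value);
```

Since the two expressions fail substitution for different sets of types, treating them as equivalent would merge declarations that can behave differently.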
[Bug c++/92707] New: type alias on type alias on lambda in unevaluated context does not work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92707

Bug ID: 92707
Summary: type alias on type alias on lambda in unevaluated context does not work
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

GCC shows an error on this code:

template using foo = decltype([] {});
template using bar = foo;
extern foo a;
extern bar a; // error: 'bar' does not name a type

The error is wrong because bar is a regular type alias. Clearly it names a type. If I replace the lambda with an integer the error goes away.
[Bug middle-end/91824] New: unnecessary sign-extension after _mm_movemask_epi8 or __builtin_popcount
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91824

Bug ID: 91824
Summary: unnecessary sign-extension after _mm_movemask_epi8 or __builtin_popcount
Product: gcc
Version: 9.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

gcc -O2 -mpopcnt leaves an unnecessary cdqe:

#include
#include

void f(uint64_t& val, __m128i mask)
{
    val += __builtin_popcount(_mm_movemask_epi8(mask));
}

void g(uint64_t& val, __m128i mask)
{
    val += __builtin_popcountll(_mm_movemask_epi8(mask));
}

f:
        pmovmskb eax, xmm0
        popcnt  eax, eax
        cdqe
        add     QWORD PTR [rdi], rax
        ret
g:
        pmovmskb eax, xmm0
        cdqe
        popcnt  rax, rax
        add     QWORD PTR [rdi], rax
        ret

Both cdqe instructions are unnecessary, because the results of both pmovmskb and __builtin_popcount cannot be negative. Only the lower 16 bits of the pmovmskb result can be non-zero, and the image of popcnt is either [0..32] or [0..64] depending on the operand width.
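The argument can be checked with a small C++17 sketch: for any value in the ranges named in the report, sign-extension (what cdqe performs) and zero-extension (what a 32-bit register write already gives on x86-64) produce the same 64-bit result. The helper names and the two constants are hypothetical, chosen only for this illustration.

```cpp
#include <cstdint>

// What cdqe does: widen 32 bits to 64 bits by copying the sign bit.
constexpr std::uint64_t sign_extend(std::int32_t v) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(v));
}

// Widening with the upper 32 bits cleared.
constexpr std::uint64_t zero_extend(std::int32_t v) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}

// Upper bounds taken from the report's reasoning.
constexpr std::int32_t max_movemask = 0xFFFF;  // only the low 16 bits can be set
constexpr std::int32_t max_popcount = 64;      // popcnt never exceeds the bit width

// Non-negative inputs: both extensions agree, so cdqe changes nothing.
static_assert(sign_extend(0) == zero_extend(0));
static_assert(sign_extend(max_movemask) == zero_extend(max_movemask));
static_assert(sign_extend(max_popcount) == zero_extend(max_popcount));

// A negative input is the only case where the two differ.
static_assert(sign_extend(-1) != zero_extend(-1));
```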
[Bug tree-optimization/91400] New: __builtin_cpu_supports conjunction is optimized poorly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400

Bug ID: 91400
Summary: __builtin_cpu_supports conjunction is optimized poorly
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Clang 8 optimizes both f() and g() to the same code:

bool f()
{
    return __builtin_cpu_supports("popcnt") && __builtin_cpu_supports("ssse3");
}

bool g()
{
    extern unsigned int cpu_model;
    return (cpu_model & 64) && (cpu_model & 4);
}

f()/g():
        mov     eax, dword ptr [rip + cpu_model]
        and     eax, 68
        cmp     eax, 68
        sete    al
        ret

GCC generates this code only for g(). For f() GCC generates less optimal code:

f():
        mov     edx, DWORD PTR __cpu_model[rip+12]
        mov     eax, edx
        shr     eax, 6
        and     eax, 1
        and     edx, 4
        mov     edx, 0
        cmove   eax, edx
        ret

I believe it would be great if GCC were able to generate the same code for f() too.
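The transformation Clang applies rests on a simple identity: testing two bits separately and combining with && is the same as masking with both bits and comparing against the mask. A minimal C++17 sketch, using the two bit values from g() in the report; the function and constant names are hypothetical.

```cpp
// The two bit values tested by g() in the report.
constexpr unsigned bit_a = 64;
constexpr unsigned bit_b = 4;

// The form written in the source: two separate tests.
constexpr bool tested_separately(unsigned model) {
    return (model & bit_a) && (model & bit_b);
}

// The form in the optimized assembly: and with 68, compare with 68.
constexpr bool tested_together(unsigned model) {
    return (model & (bit_a | bit_b)) == (bit_a | bit_b);
}

static_assert(tested_separately(68) && tested_together(68));     // both bits set
static_assert(!tested_separately(64) && !tested_together(64));   // only one bit
static_assert(!tested_separately(4) && !tested_together(4));     // only the other
static_assert(!tested_separately(0) && !tested_together(0));     // neither
static_assert(tested_separately(0xFF) && tested_together(0xFF)); // extra bits ignored
```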
[Bug middle-end/90345] too pessimistic check whether pointer may alias a local variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90345

--- Comment #4 from Ivan Sorokin ---
Making points-to analysis aware of SESE regions will definitely help here and is a nice thing to have.

There is one more option. In my reduced test case the body of 'push_back' is unavailable, but when it is available, it can be analysed and an attribute can be added stating that 'push_back' only uses the received reference internally and does not let it escape.

From my experiments this is what clang does: even when the body of 'push_back' is not inlined, it generates different code for 'operator*=' depending on whether push_back lets the received reference escape or not:

void push_back(uint32_t const&) __attribute__((noinline));

void big_integer::push_back(uint32_t const& a)
{
    __asm__("" : : : "memory");
    //__asm__("" : : "g"(&a) : "memory");
}

I guess with LTO enabled this type of analysis is quite powerful, as many 'const&' and 'this' parameters in C++ don't really escape.
[Bug middle-end/90345] New: too pessimistic check whether pointer may alias local variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90345

Bug ID: 90345
Summary: too pessimistic check whether pointer may alias local variable
Product: gcc
Version: 9.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Consider the following example (reduced from a real program):

#include
#include

struct big_integer
{
    void push_back(uint32_t const&);

    size_t size;
    uint32_t* digits;
};

big_integer& operator*=(big_integer& a, uint32_t b)
{
    uint64_t const BASE = 1ull << 32;
    uint32_t carry = 0;
    for (size_t i = 0; i != a.size; i++)
    {
        uint64_t sum = 1ull * a.digits[i] * b + carry;
        carry = static_cast(sum / BASE);
        a.digits[i] = static_cast(sum % BASE);
    }
    if (carry)
    {
        a.push_back(carry);
        //a.push_back(uint32_t(carry));
    }
    return a;
}

GCC 9.1 compiles the inner loop to this:

.L9:
        mov     esi, DWORD PTR [rsp+12]   ; load carry
.L5:
        mov     edx, DWORD PTR [rcx]
        add     rcx, 4
        imul    rdx, r8
        add     rdx, rsi
        mov     rsi, rdx
        shr     rsi, 32
        mov     DWORD PTR [rsp+12], esi   ; store carry
        mov     DWORD PTR [rcx-4], edx
        cmp     r9, rcx
        jne     .L9

As one can see, carry is spilled to the stack and is loaded and stored at each iteration of the loop.

Loading and storing carry at each iteration is not needed: it is a local variable and its address is not taken inside the loop. My guess is that GCC believes that it escapes because of the push_back after the loop. At least if I make a copy of carry before push_back'ing it (as shown in the comment) the problem goes away.

I think that alias analysis can be improved here: carry may not alias a.digits[i] because it escapes only after the loop.
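The workaround from the commented-out line can be sketched as a self-contained program. This is a simplified stand-in and not the reporter's code: it stores the digits in a std::vector (an assumption made to keep the example short) and defines push_back inline, whereas in the report the body of push_back is not visible to the compiler, which is what triggers the problem.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for the report's big_integer; digits are stored
// least significant first (an assumption of this sketch).
struct big_integer {
    std::vector<std::uint32_t> digits;

    void push_back(std::uint32_t const& digit) { digits.push_back(digit); }
};

big_integer& operator*=(big_integer& a, std::uint32_t b) {
    std::uint32_t carry = 0;
    for (std::uint32_t& digit : a.digits) {
        std::uint64_t sum = std::uint64_t{digit} * b + carry;
        carry = static_cast<std::uint32_t>(sum >> 32);  // high half
        digit = static_cast<std::uint32_t>(sum);        // low half
    }
    if (carry) {
        // Workaround from the report: pass a temporary copy, so that a
        // reference to the loop variable itself is never handed out.
        a.push_back(std::uint32_t(carry));
    }
    return a;
}

// Self-check: 0xFFFFFFFF * 2 = 0x1FFFFFFFE, i.e. digits {0xFFFFFFFE, 1}.
void check_multiply_with_carry() {
    big_integer n{{0xFFFFFFFFu}};
    n *= 2;
    assert(n.digits.size() == 2);
    assert(n.digits[0] == 0xFFFFFFFEu && n.digits[1] == 1u);
}
```

The copy costs nothing at run time; its only purpose is to keep the address of carry from being taken.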
[Bug c++/86346] New: internal compiler error related to duduction guides
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86346

Bug ID: 86346
Summary: internal compiler error related to duduction guides
Product: gcc
Version: 8.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Here is the code:

template struct bool_constant {};
using true_type = bool_constant;

template constexpr bool is_same_v = false;
template constexpr bool is_same_v = true;

template struct vector
{
    template static constexpr bool v = is_same_v || false;
    vector(bool_constant>, T); // (1)
};

template<> struct vector // (2)
{
    vector(true_type, bool);
};

vector v { true_type{}, false }; // (3)

The problem is that the deduction guide generated from the constructor (1) refers to a member v of the primary template. At (3) T is then deduced to be bool, but the substitution T->bool should never be applied to the primary template, as there is an explicit specialization vector (2).

Clang gives an error on this code: "note: candidate template ignored: substitution failure [with T = bool]: cannot reference member of primary template because deduced class template specialization 'vector' is an explicit specialization". I believe GCC should provide a similar message in this case instead of crashing.
[Bug c++/82910] New: marking data members private affects code generation of copying
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82910

Bug ID: 82910
Summary: marking data members private affects code generation of copying
Product: gcc
Version: 7.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vanyacpp at gmail dot com
Target Milestone: ---

Consider the following piece of code:

struct pair
{
private:
    void* first;
    unsigned second;
};

struct other
{
    pair get() const;
};

struct my
{
    pair get(other const& other);

    pair current;
    pair* target;
};

pair my::get(other const& other)
{
    *target = other.get();
    return current;
}

For the function my::get() GCC generates the following (quite inefficient) code:

my::get(other const&):
        pushq   %rbx
        movq    %rdi, %rbx
        movq    %rsi, %rdi
        subq    $16, %rsp
        call    other::get() const
        movq    16(%rbx), %rcx
        movq    %rax, (%rsp)
        movq    %rdx, 8(%rsp)
        movq    %rax, (%rcx)
        movl    8(%rsp), %eax
        movl    %eax, 8(%rcx)
        movq    (%rbx), %rax
        movq    8(%rbx), %rdx
        addq    $16, %rsp
        popq    %rbx
        ret

The expected generated code is:

my::get(other const&):
        pushq   %rbp
        pushq   %rbx
        movq    %rdi, %rbx
        subq    $8, %rsp
        movq    16(%rdi), %rbp
        movq    %rsi, %rdi
        call    other::get() const
        movq    %rax, 0(%rbp)   # just storing to *my::target...
        movq    %rdx, 8(%rbp)
        movq    (%rbx), %rax    # ... and then loading my::current
        movq    8(%rbx), %rdx
        addq    $8, %rsp
        popq    %rbx
        popq    %rbp
        ret

The issue can be worked around. One way to do this is to make the data members of pair public. Another way is to change the type of pair::second to unsigned long (to match the size of a pointer).

It would be great if GCC generated the second code irrespective of the private-ness or the size of pair::second.
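A possible explanation, offered here as an assumption and not stated in the report: under the Itanium C++ ABI, the tail padding of a class with private data members may be reused by a derived class, so a copy of such an object may be restricted to the bytes the members actually occupy. Both workarounds would then help for the same reason: one makes the tail padding non-reusable, the other removes it. The C++17 sketch below only probes the layout side of that hypothesis; all type names are hypothetical.

```cpp
// Same members as pair in the report, private as in the slow case.
struct private_pair {
private:
    void* first;
    unsigned second;
};

// Same members, public as in the first workaround.
struct public_pair {
    void* first;
    unsigned second;
};

// Hypothetical probes: derive and add one byte, then compare sizes.
struct private_probe : private_pair { char extra; };
struct public_probe : public_pair { char extra; };

// The two pairs themselves always have the same size.
static_assert(sizeof(private_pair) == sizeof(public_pair));

// If tail padding is reused, the extra byte fits inside private_pair's
// padding and private_probe is smaller; otherwise the sizes are equal.
static_assert(sizeof(private_probe) <= sizeof(public_probe));

constexpr bool tail_padding_reused =
    sizeof(private_probe) < sizeof(public_probe);
```

Whether this layout rule is what actually drives the code generation difference would need confirmation from the GCC developers.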
[Bug target/82693] gcc/clang calling convension mismatch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82693

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #8 from Ivan Sorokin ---
Yep. It is. And the proposed resolution is exactly what I would like to see. Closing as duplicate. Thank you.

*** This bug has been marked as a duplicate of bug 60336 ***
[Bug c++/60336] empty struct value is passed differently in C and C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60336

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vanyacpp at gmail dot com

--- Comment #49 from Ivan Sorokin ---
*** Bug 82693 has been marked as a duplicate of this bug. ***
[Bug target/82693] gcc/clang calling convension mismatch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82693

--- Comment #6 from Ivan Sorokin ---
I added files to reproduce the issue: caller.cpp and callee.cpp are the files that need to be compiled with different compilers. empty.h is the common header. build.sh is a shell script that compiles and runs all four caller/callee combinations of gcc and clang.
[Bug target/82693] gcc/clang calling convension mismatch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82693

Ivan Sorokin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #42451|0                           |1
        is obsolete|                            |

--- Comment #5 from Ivan Sorokin ---
Created attachment 42454
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42454&action=edit
caller.cpp