[Bug tree-optimization/112533] New: missed optimization (~A & C) == (~B & C) => (A & C) == (B & C)

2023-11-14 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112533

Bug ID: 112533
   Summary: missed optimization (~A & C) == (~B & C) => (A & C) ==
(B & C)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this code

static bool is_even(unsigned a)
{
return a % 2 == 0;
}

bool same_evenness(unsigned a, unsigned b)
{
return is_even(a) == is_even(b);
}

GCC -O2 currently produces

same_evenness:
notl %esi // (1)
notl %edi // (2)
andl $1, %esi
andl $1, %edi
cmpb %dil, %sil
sete %al
ret

The NOTs (1) and (2) are redundant. It would be great if GCC could optimize
them out.
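
The equivalence behind the requested fold can be checked by brute force. This is only a minimal sketch: the helper names are made up for illustration and are not GCC internals, and the check covers just the 4-bit values of A, B and C.

```cpp
// Hypothetical helpers illustrating the requested fold; not GCC code.

// The form GCC currently emits, generalized to an arbitrary mask c.
constexpr bool with_nots(unsigned a, unsigned b, unsigned c)
{
    return (~a & c) == (~b & c);
}

// The proposed form without the NOTs.
constexpr bool without_nots(unsigned a, unsigned b, unsigned c)
{
    return (a & c) == (b & c);
}

// Compares both forms for every 4-bit value of a, b and c.
constexpr bool agree_on_small_inputs()
{
    for (unsigned a = 0; a < 16; ++a)
        for (unsigned b = 0; b < 16; ++b)
            for (unsigned c = 0; c < 16; ++c)
                if (with_nots(a, b, c) != without_nots(a, b, c))
                    return false;
    return true;
}

static_assert(agree_on_small_inputs(), "both forms must agree");
```

With c == 1 this is exactly the same_evenness case from the report.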

[Bug libstdc++/112480] optional::reset emits inefficient code when T is trivially-destructible

2023-11-13 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112480

--- Comment #7 from Ivan Sorokin  ---
(In reply to Jonathan Wakely from comment #6)

> +   // The following seems redundant but improves codegen, see PR 112480.
> +   if constexpr (is_trivially_destructible_v<_Tp>)
> + this->_M_engaged = false;
>}


In theory, non-trivial destructors that can be optimized to a no-op could also
benefit from the same optimization.

I don't know how often non-trivial no-op destructors occur in practice. Perhaps
we can ignore such cases.

[Bug libstdc++/112480] optional::reset emits inefficient code when T is trivially-destructible

2023-11-13 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112480

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #5 from Ivan Sorokin  ---
Perhaps something like this would do the trick?

void _M_reset()
{
if (_M_engaged)
_M_destroy();
else
_M_engaged = _M_engaged;
}

On the one hand, _M_engaged = _M_engaged allows merging the then and else
branches without introducing new writes; on the other hand, it can be optimized
to a no-op if the branches are not merged.
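
To make the idea concrete, here is a self-contained sketch. The `payload` type and the `engaged_after_reset` helper are hypothetical and heavily simplified. This is not the libstdc++ implementation, and it only demonstrates that the extra self-assignment leaves the observable behavior unchanged, not what code GCC generates for it.

```cpp
// Hypothetical, heavily simplified stand-in for the optional payload.
// This is not the libstdc++ implementation.
struct payload
{
    bool engaged = false;
    int value = 0;                  // a trivially destructible stored value

    constexpr void destroy()
    {
        engaged = false;            // nothing else to do for a trivial type
    }

    constexpr void reset()
    {
        if (engaged)
            destroy();
        else
            engaged = engaged;      // stores the value that is already there
    }
};

// Returns the engaged flag observed after calling reset().
constexpr bool engaged_after_reset(bool initially_engaged)
{
    payload p{};
    p.engaged = initially_engaged;
    p.reset();
    return p.engaged;
}

static_assert(!engaged_after_reset(true), "reset must disengage");
static_assert(!engaged_after_reset(false), "reset must keep it disengaged");
```

Both branches end with the flag being false, which is what would allow a compiler to merge them into a single unconditional store.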

[Bug c++/112410] New: error when auto(x) is used in a variable initializer

2023-11-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112410

Bug ID: 112410
   Summary: error when auto(x) is used in a variable initializer
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

int x = auto(42); // OK
int y(auto(42));  // error

On the second line GCC -std=c++23 gives an error:

error: non-function 'y' declared as implicit template

I believe the code is correct and should compile without errors.

[Bug target/99087] suboptimal codegen for division by constant 3

2023-10-19 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99087

Ivan Sorokin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Ivan Sorokin  ---
Since GCC 12 the issue no longer reproduces. Closing as fixed.

https://godbolt.org/z/ss7Y84a9f

[Bug tree-optimization/111718] Missed optimization of '(a+a)/a'

2023-10-07 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111718

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin  ---
GCC does the optimization if the return from the function is replaced with
__builtin_unreachable:

unsigned n1, n2;

void func1(unsigned a)
{
if (a <= 10 || a >= 20)
__builtin_unreachable();

n1 = a + a;
n2 = (a + a)/a;
}

func1(unsigned int):
mov DWORD PTR n2[rip], 2
add edi, edi
mov DWORD PTR n1[rip], edi
ret

https://godbolt.org/z/Tjsz6neTs

Perhaps this issue has the same underlying cause as PR80015.
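
The range information matters because, for unsigned operands, a + a may wrap around. A small sketch (the helper names are hypothetical) showing that the fold to 2 is valid inside the range from the example and invalid once wrap-around is possible:

```cpp
#include <climits>

// Hypothetical helper: the expression from the report. a must be non-zero.
constexpr unsigned doubled_over_self(unsigned a)
{
    return (a + a) / a;
}

// For 10 < a < 20 (the range left after __builtin_unreachable)
// a + a cannot wrap, so the result is always 2.
constexpr bool always_two_in_range()
{
    for (unsigned a = 11; a < 20; ++a)
        if (doubled_over_self(a) != 2)
            return false;
    return true;
}

static_assert(always_two_in_range(), "fold to 2 is valid in range");

// Without range information the fold is not valid in general:
// for a == 2^(N-1) the sum wraps to 0, and 0 / a == 0.
static_assert(doubled_over_self(UINT_MAX / 2 + 1) == 0, "wrap-around case");
```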

[Bug middle-end/111541] New: missing optimization x & ~c | (y | c) -> x | (y | c)

2023-09-22 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111541

Bug ID: 111541
   Summary: missing optimization x & ~c | (y | c) -> x | (y | c)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this function clang generates shorter code:

unsigned foo(unsigned x, unsigned y, unsigned c)
{
return x & ~c | (y | c);
}

Clang notices that the expression can be simplified to x | (y | c). It would be
great if GCC could do the same.

https://gcc.godbolt.org/z/dMo4nEjrs

This issue is symmetric to the one described in PR 98710. The idea behind this
simplification is the following: when we are working with bitsets, "|" can be
read as adding bits and "&~" as removing. Therefore the expression "x & ~c | (y
| c)" can be read as removing "c" from "x" and then adding "y | c".

So the simplification

x & ~c | (y | c) -> x | (y | c)

means there is no need to remove "c" if later we add something containing "c".
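
A brute-force sanity check of this simplification, as a minimal sketch. The function names are made up for illustration, and only 4-bit values are enumerated.

```cpp
// Hypothetical helpers; they only restate the two expressions.
constexpr unsigned original(unsigned x, unsigned y, unsigned c)
{
    return (x & ~c) | (y | c);      // remove c from x, then add y | c
}

constexpr unsigned simplified(unsigned x, unsigned y, unsigned c)
{
    return x | (y | c);             // just add y | c
}

// Compares both expressions for every 4-bit x, y and c.
constexpr bool forms_agree()
{
    for (unsigned x = 0; x < 16; ++x)
        for (unsigned y = 0; y < 16; ++y)
            for (unsigned c = 0; c < 16; ++c)
                if (original(x, y, c) != simplified(x, y, c))
                    return false;
    return true;
}

static_assert(forms_agree(), "x & ~c | (y | c) == x | (y | c)");
```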

[Bug middle-end/98710] missing optimization (x | c) & ~(y | c) -> x & ~(y | c)

2023-09-22 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98710

--- Comment #8 from Ivan Sorokin  ---
> How often these show up, I have no idea.

Perhaps I should have written this in the original message.

The original expression "(x | c) & ~(y | c)" is obviously a reduced version of
what happens in real code. The idea is the following: When we are working with
bitsets "|" can be read as adding bits and "&~" as removing. Therefore the
expression can be read as first adding "c" to "x" and then removing "y | c"
from the result.

So the simplification

(x | c) & ~(y | c) -> x & ~(y | c)

means there is no need to add "c" if later we remove something containing "c".
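
The "adding" and "removing" reading can be written down directly. In this sketch `add_bits` and `remove_bits` are hypothetical names for "|" and "&~", and the check enumerates 4-bit values only.

```cpp
// Hypothetical bitset helpers matching the wording above.
constexpr unsigned add_bits(unsigned set, unsigned bits)
{
    return set | bits;
}

constexpr unsigned remove_bits(unsigned set, unsigned bits)
{
    return set & ~bits;
}

// (x | c) & ~(y | c): add c to x, then remove y | c.
constexpr unsigned add_then_remove(unsigned x, unsigned y, unsigned c)
{
    return remove_bits(add_bits(x, c), add_bits(y, c));
}

// x & ~(y | c): skip the add, only remove y | c.
constexpr unsigned remove_only(unsigned x, unsigned y, unsigned c)
{
    return remove_bits(x, add_bits(y, c));
}

// Compares both expressions for every 4-bit x, y and c.
constexpr bool adding_c_is_redundant()
{
    for (unsigned x = 0; x < 16; ++x)
        for (unsigned y = 0; y < 16; ++y)
            for (unsigned c = 0; c < 16; ++c)
                if (add_then_remove(x, y, c) != remove_only(x, y, c))
                    return false;
    return true;
}

static_assert(adding_c_is_redundant(), "(x | c) & ~(y | c) == x & ~(y | c)");
```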

[Bug middle-end/98710] missing optimization (x | c) & ~(y | c) -> x & ~(y | c)

2023-09-22 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98710

--- Comment #7 from Ivan Sorokin  ---
(In reply to Andrew Pinski from comment #6)
> Fixed.

Thank you!

[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)

2023-07-27 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986

--- Comment #5 from Ivan Sorokin  ---
(In reply to CVS Commits from comment #4)
> commit r14-2751-g2a3556376c69a1fb588dcf25225950575e42784f
> Author: Drew Ross 
> Co-authored-by: Jakub Jelinek 

Thank you!

[Bug gcov-profile/110561] gcov counts closing bracket in a function as executable, lowering coverage statistics

2023-07-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110561

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin  ---
Smaller reproducer:

struct non_trivial
{
non_trivial();
non_trivial(non_trivial const&);
non_trivial& operator=(non_trivial const&);
~non_trivial();

void escape();
};

non_trivial foobar()
{
non_trivial result;
result.escape();
return result;
}

https://gcc.godbolt.org/z/s5fG6ezxd

Here is the code generated after escape() until the end of function:

.LEHB1:
call non_trivial::escape()
.LEHE1:
.loc 1 15 12
nop
mov rax, QWORD PTR __gcov0.foobar()[rip+24]
add rax, 1
mov QWORD PTR __gcov0.foobar()[rip+24], rax
jmp .L5 # unconditional jump to function exit
.L4:
mov rbx, rax # exceptional case
mov rax, QWORD PTR __gcov0.foobar()[rip+16]
add rax, 1
mov QWORD PTR __gcov0.foobar()[rip+16], rax
.loc 1 16 1
mov rax, QWORD PTR [rbp-24]
mov rdi, rax
call non_trivial::~non_trivial() [complete object destructor]
mov rax, QWORD PTR __gcov0.foobar()[rip+32] # increments the counter for }
add rax, 1
mov QWORD PTR __gcov0.foobar()[rip+32], rax
mov rax, rbx
mov rdi, rax
.LEHB2:
call _Unwind_Resume
.LEHE2:
.L5:
mov rax, QWORD PTR [rbp-24] # doesn't increment the counter for }
mov rbx, QWORD PTR [rbp-8]
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc

From what I understand, the counter for } is incremented in the exceptional
codepath that is executed when an exception is thrown from escape(). The normal
codepath doesn't increment it.

This looks like a bug to me. The counter for } should be either incremented on
both codepaths or should not exist at all.

[Bug middle-end/110534] New: confusing -Wuninitialized when strict aliasing is violated

2023-07-03 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110534

Bug ID: 110534
   Summary: confusing -Wuninitialized when strict aliasing is
violated
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

GCC gives -Wuninitialized on this code:

#include <cstdint>
uint16_t test()
{
uint32_t foo32[4] = {0, 0, 0, 0};
uint16_t* foo16 = reinterpret_cast<uint16_t*>(&foo32[0]);
return foo16[0];
}

:7:19: warning: 'foo32' is used uninitialized [-Wuninitialized]
7 | return foo16[0];
  |   ^
:5:14: note: 'foo32' declared here
5 | uint32_t foo32[4] = {0, 0, 0, 0};
  |  ^

This issue was originally published on reddit:
https://www.reddit.com/r/cpp/comments/14lc9w9/gcc_warnings_for_uninitialized_variables_is/

The poster found the warning quite confusing and I agree with them.

I believe the ideal behavior would be to show -Wstrict-aliasing on this code
and avoid showing -Wuninitialized.
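
For comparison, a variant that reads the same bytes without violating strict aliasing. This is my own sketch, not something from the reddit thread: `read_u16` is a hypothetical helper and `std::memcpy` is simply the usual well-defined way to reinterpret bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the first two bytes at src into a uint16_t.
// Going through memcpy avoids accessing a uint32_t object via a uint16_t lvalue.
std::uint16_t read_u16(const void* src)
{
    assert(src != nullptr);
    std::uint16_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}

// Same shape as the function from the report, minus the reinterpret_cast.
std::uint16_t test()
{
    std::uint32_t foo32[4] = {0, 0, 0, 0};
    return read_u16(&foo32[0]);
}
```

Since every byte of foo32 is zero, the result does not depend on endianness.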

[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)

2023-06-24 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986

--- Comment #3 from Ivan Sorokin  ---
I tried to investigate why GCC is able to simplify `(a | b) ^ a` and `(a | ~b)
^ a` from comment 2, but not similarly looking `(~a | b) ^ a` from comment 0.

`(a | b) ^ a` matches the following pattern from match.pd:

/* (X | Y) ^ X -> Y & ~ X*/
(simplify
 (bit_xor:c (convert1? (bit_ior:c @@0 @1)) (convert2? @0))
 (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  (convert (bit_and @1 (bit_not @0)))))

`(a | ~b) ^ a` matches another pattern:

/* (~X | C) ^ D -> (X | C) ^ (~D ^ C) if (~D ^ C) can be simplified.  */
(simplify
 (bit_xor:c (bit_ior:cs (bit_not:s @0) @1) @2)
  (bit_xor (bit_ior @0 @1) (bit_xor! (bit_not! @2) @1)))

With substitution `X = b, C = a, D = a` it gives:

(b | a) ^ (~a ^ a)
(b | a) ^ -1
~(b | a)

`(~a | b) ^ a` is not simplifiable by this pattern because it requires that `~D
^ C` is simplifiable further, but `~a ^ b` is not. In any case, even if it were
applicable it would produce `(a | b) ^ (~a ^ b)` which has more operations than
the original expression.
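
The three identities discussed above can be verified by enumeration. This is a minimal sketch with hypothetical function names. It checks 4-bit operand values only, which is enough for an illustration because the operations are purely bitwise.

```cpp
// The expression from comment 0.
constexpr int fold_target(int a, int b)
{
    return (~a | b) ^ a;
}

// Checks the two folds GCC already performs and the missing one.
constexpr bool identities_hold()
{
    for (int a = 0; a < 16; ++a)
        for (int b = 0; b < 16; ++b)
        {
            if (((a | b) ^ a) != (b & ~a))       // (X | Y) ^ X -> Y & ~X
                return false;
            if (((a | ~b) ^ a) != ~(a | b))      // via (~X | C) ^ D
                return false;
            if (fold_target(a, b) != ~(a & b))   // the missing fold
                return false;
        }
    return true;
}

static_assert(identities_hold(), "all three identities must hold");
```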

[Bug middle-end/109986] missing fold (~a | b) ^ a => ~(a & b)

2023-05-26 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986

--- Comment #1 from Ivan Sorokin  ---
(In reply to Ivan Sorokin from comment #0)
> int foo(int a, int b)
> {
> return (~a | b) ^ a;
> }
> 
> This can be optimized to `return ~(a | b);`. This transformation is done by
> LLVM, but not by GCC.

Correction: it can be optimized to `return ~(a & b);`.

[Bug middle-end/109986] New: missing fold (~a | b) ^ a => ~(a & b)

2023-05-26 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109986

Bug ID: 109986
   Summary: missing fold (~a | b) ^ a => ~(a & b)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

int foo(int a, int b)
{
return (~a | b) ^ a;
}

This can be optimized to `return ~(a | b);`. This transformation is done by
LLVM, but not by GCC.

[Bug tree-optimization/71990] Function multiversioning prohibits inlining

2023-05-22 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71990

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #5 from Ivan Sorokin  ---
I encountered the same issue as in this PR. GCC never inlines functions with
the target_clones attribute. Therefore, when a function is small and benefits
from inlining, applying target_clones to it can pessimize it.

It would be great if GCC were able to inline functions with the target_clones
attribute and propagate the attribute as comment #3 suggests.

On this example:

__attribute__((target_clones("default", "arch=x86-64-v3")))
static int foo(int a, int b)
{
return a & ~b;
}

int bar(int a, int b)
{
return foo(a, b);
}

The ideal behavior would be for foo to be inlined into bar (with bar becoming
multiversioned) and for foo to be removed completely, as no uses of it remain.

[Bug analyzer/109570] detect fclose on unopened or NULL files

2023-04-20 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570

--- Comment #1 from Ivan Sorokin  ---
Generalizing. Perhaps similarly free(NULL) can be detected?

void* obj = malloc(...);
if (!obj)
{
free(obj);
return false;
}

Unlike fclose(NULL), free(NULL) is a completely well-defined operation, but it
does nothing and perhaps should be removed.

[Bug analyzer/109570] New: detect fclose on unopened or NULL files

2023-04-20 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570

Bug ID: 109570
   Summary: detect fclose on unopened or NULL files
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

While cleaning up a not particularly well-written program I noticed this code
fragment:

FILE* file = fopen(...);
if (!file)
{
fclose(file);
return false;
}

Passing NULL to fclose is undefined behavior. Perhaps -fanalyzer could warn
about code like this?
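
For reference, this is how I would expect the fragment to look once fixed. It is a sketch only: `can_open` and its "r" open mode are hypothetical, since the real arguments are elided above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: reports whether path can be opened for reading.
bool can_open(const char* path)
{
    assert(path != nullptr);

    std::FILE* file = std::fopen(path, "r");
    if (!file)
        return false;       // nothing was opened, so there is nothing to close

    std::fclose(file);      // close only a stream that fopen actually returned
    return true;
}
```

The failure branch simply returns, so fclose never sees a null pointer.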

[Bug rtl-optimization/109527] New: redundant register assignment

2023-04-15 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109527

Bug ID: 109527
   Summary: redundant register assignment
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this function

short test(short* a)
{
*a = 1;
return *a;
}


latest gcc -O2 generates:

test(short*):
mov eax, 1
mov WORD PTR [rdi], ax
mov eax, 1
ret

I believe the second assignment to eax is redundant and can be removed:

test(short*):
mov eax, 1
mov WORD PTR [rdi], ax
ret

[Bug c++/108219] [12 Regression] requirement fails on a valid expression since r12-5253-g4df7f8c79835d569

2023-03-03 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108219

--- Comment #5 from Ivan Sorokin  ---
(In reply to Patrick Palka from comment #4)
> Fixed for GCC 13 so far

Thank you very much!

[Bug c++/66968] Incorrect template argument shown in diagnostic

2023-02-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66968

--- Comment #10 from Ivan Sorokin  ---
One more case (from 108676):

template 
struct X
{};

template 
X f();

template 
X g();

int main()
{
g();
}

Here 'X' is printed in the error message instead of 'X'.

[Bug c++/108676] template parameters are misprinted in function signature

2023-02-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108676

Ivan Sorokin  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #3 from Ivan Sorokin  ---
(In reply to Jonathan Wakely from comment #2)
> Probably a dup of PR 66968

Yes, it looks similar enough. Thank you!

*** This bug has been marked as a duplicate of bug 66968 ***

[Bug c++/66968] Incorrect template argument shown in diagnostic

2023-02-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66968

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #9 from Ivan Sorokin  ---
*** Bug 108676 has been marked as a duplicate of this bug. ***

[Bug c++/108676] template parameters are misprinted in function signature

2023-02-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108676

--- Comment #1 from Ivan Sorokin  ---
I added a broken link to godbolt, here is a valid one:
https://godbolt.org/z/EE5eezW1r

[Bug c++/108676] New: GCC prints function signature incorrectly

2023-02-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108676

Bug ID: 108676
   Summary: GCC prints function signature incorrectly
   Product: gcc
   Version: 12.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider this code:

template 
struct X
{};

template 
X f();

template 
X g();

int main()
{
g();
}

On GCC 12.2 it gives this error message:

:13:12: error: no matching function for call to 'g()'
   13 | g();
  | ~~~^~
:9:7: note: candidate: 'template X g()'
9 | X g();
  |   ^

Please note that the return type of 'g' is printed incorrectly. It should say
'X' instead of 'X'.

https://godbolt.org/z/EeWoo16M

[Bug c++/108219] New: requirement fails on a valid expression

2022-12-24 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108219

Bug ID: 108219
   Summary: requirement fails on a valid expression
   Product: gcc
   Version: 12.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

This code compiles OK on clang, MSVC and GCC prior to 12:

template <typename T>
concept test = requires
{
new T[1]{{ 42 }};
};

struct foobar
{
foobar(int);
};

int main()
{
static_assert(test<foobar>);
new foobar[1]{{ 42 }};
}

But on GCC 12 it produces an error:

:14:19: error: static assertion failed
   14 | static_assert(test);
  |   ^~~~
:14:19: note: constraints not satisfied
:2:9:   required by the constraints of 'template concept test'
:2:16:   in requirements  [with T = foobar]
:4:5: note: the required expression 'new T(1)' is invalid, because
4 | new T[1]{{ 42 }};
  | ^~~~
:4:5: error: could not convert '()'
from '' to 'foobar'
4 | new T[1]{{ 42 }};
  | ^~~~
  | |
  | 

I believe the error is incorrect and that this is a regression in GCC 12.

[Bug c++/107529] New: constexpr evaluator doesn't check for destroyed objects

2022-11-04 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107529

Bug ID: 107529
   Summary: constexpr evaluator doesn't check for destroyed
objects
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

I believe this function contains undefined behavior and should not be allowed
to evaluate at compile-time.

The call to `std::destroy_at(p)` should end the lifetime of `*p` and accesses
to `*p` after that should be invalid.

#include <memory>

struct mytype
{
constexpr mytype() : x(42) {}
constexpr ~mytype() {}

int x;
};

constexpr int foo()
{
std::allocator<mytype> alloc;
mytype* p = alloc.allocate(1);
std::construct_at(p);
std::destroy_at(p);   // destroy *p
int result = p->x;    // access
alloc.deallocate(p, 1);
return result;
}

static_assert(foo() == 42);

[Bug c++/107528] New: constexpr evaluator doesn't check for deallocate of mismatched size

2022-11-04 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107528

Bug ID: 107528
   Summary: constexpr evaluator doesn't check for deallocate of
mismatched size
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

This function causes undefined behavior and should not be evaluated at
compile-time.

The problem is the second argument of the `deallocate` function (the number of
objects to deallocate). It must be equal to the number of objects that were
allocated.

#include <memory>

constexpr int foo()
{
std::allocator<int> alloc;
int* p = alloc.allocate(1);
alloc.deallocate(p, 3);
return 42;
}

static_assert(foo() == 42);

[Bug c++/107161] gcc doesn't constant fold member if any other member is mutable

2022-10-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107161

--- Comment #2 from Ivan Sorokin  ---
> Do constexpr/consteval work in such circumstances?

Yes, constexpr works for variables like "p.a":

extern constexpr mytype p = {1, 2};

int foo()
{
constexpr int t = p.a + 10;
return t;
}

foo():
mov eax, 11
ret

https://godbolt.org/z/K9a69E4ar

[Bug c++/107161] New: gcc doesn't constant fold member if any other member is mutable

2022-10-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107161

Bug ID: 107161
   Summary: gcc doesn't constant fold member if any other member
is mutable
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this code:

struct mytype
{
int a;
mutable int b;
};

extern mytype const p = {1, 2};

int foo()
{
return p.a + 10;
}

int bar()
{
return p.b + 10;
}

GCC -O2 generates:
foo():
mov eax, DWORD PTR p[rip]
add eax, 10
ret
bar():
mov eax, DWORD PTR p[rip+4]
add eax, 10
ret

While clang folds "p.a + 10" into 11:
foo():# @foo()
mov eax, 11
ret
bar():# @bar()
mov eax, dword ptr [rip + p+4]
add eax, 10
ret

I think GCC should do the same.

[Bug libstdc++/103382] condition_variable::wait() is not cancellable because it is marked noexcept

2022-09-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382

Ivan Sorokin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Ivan Sorokin  ---
As the bug is fixed I'm closing the issue.

[Bug tree-optimization/101706] bool0^bool1^1 -> bool0 == bool1

2022-08-10 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101706

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #2 from Ivan Sorokin  ---
This issue was fixed in PR106379 by Richard Biener.

After https://gcc.gnu.org/g:375668e0508fbe173af1ed519d8ae2b79f388d94 for both
fa and fb we have:

fa(bool&, bool&, bool&):
movzx eax, BYTE PTR [rsi]
cmp BYTE PTR [rdi], al
sete BYTE PTR [rdx]
ret
fb(bool&, bool&, bool&):
movzx eax, BYTE PTR [rsi]
cmp BYTE PTR [rdi], al
sete BYTE PTR [rdx]
ret

I think the issue can be closed now.

[Bug tree-optimization/98709] gcc optimizes bitwise operations, but doesn't optimize logical ones

2022-08-10 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98709

Ivan Sorokin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Ivan Sorokin  ---
This issue was fixed in PR106379 by Richard Biener.

https://gcc.gnu.org/g:375668e0508fbe173af1ed519d8ae2b79f388d94

[Bug middle-end/19987] [meta-bug] fold missing optimizations in general

2022-08-10 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19987
Bug 19987 depends on bug 98709, which changed state.

Bug 98709 Summary: gcc optimizes bitwise operations, but doesn't optimize 
logical ones
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98709

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/105762] [12/13 Regression] -Warray-bounds false positives for integer-to-pointer casts since r12-2132-ga110855667782dac

2022-08-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105762

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #4 from Ivan Sorokin  ---
Perhaps the warning message could be improved? The warning talks about arrays,
but there are no arrays in the original code.

I think it would be great if the warning said something about
{invalid/wild/cast from int} pointer. English is not my strong suit, perhaps
something like this:

warning: dereferencing wild pointer '(int*)1ul' is undefined

[Bug c++/105864] storing nullptr_t to memory should not generate any instructions

2022-06-21 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105864

--- Comment #5 from Ivan Sorokin  ---
(In reply to Andrew Pinski from comment #4)
>   nullptr_t t, t1 = nullptr;
>   __builtin_memcpy(&a[0], &t, sizeof(t));

> So I suspect this should be marked as invalid.

The question is how GCC defines memcpy'ing from nullptr_t.

Should it be required to read zero bytes? Or the null pointer value? What about
systems where the value of a null pointer is not zero?

In any case I don't think memcpy'ing nullptr_t into a different type is
particularly useful or used anywhere (I might be wrong). So I suggest defining
nullptr_t as an empty type containing only padding bytes. In this case memcpy
should just read the padding bytes.

[Bug tree-optimization/105864] New: storing nullptr_t to memory should not generate any instructions

2022-06-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105864

Bug ID: 105864
   Summary: storing nullptr_t to memory should not generate any
instructions
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Currently storing a nullptr_t to memory causes 0 to be written to that memory.
As there is no way to read this value back without invoking undefined behavior
I believe GCC can omit storing it.

This will make nullptr_t behave more similar to an empty struct that has only
padding bytes in it.

using nullptr_t = decltype(nullptr);

void test(nullptr_t* p)
{
*p = nullptr;
}

struct empty
{};

void test(empty* p)
{
*p = empty();
}

test(decltype(nullptr)*):
mov QWORD PTR [rdi], 0
ret
test(empty*):
ret

[Bug middle-end/105862] New: missed inlining opportunity of _Sp_counted_deleter::_M_destroy

2022-06-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105862

Bug ID: 105862
   Summary: missed inlining opportunity of
_Sp_counted_deleter::_M_destroy
   Product: gcc
   Version: 12.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

This sample is reduced from a real usage of shared_ptr.

#include 
#include 


struct sp_counted_base
{
sp_counted_base() noexcept
: use_count(1)
{}

virtual void destroy() noexcept
{}

void release() noexcept
{
if (--use_count == 0)
destroy();
}

private:
sp_counted_base(sp_counted_base const&) = delete;
sp_counted_base& operator=(sp_counted_base const&) = delete;

int use_count;
};

struct sp_counted_deleter final : sp_counted_base
{
virtual void destroy() noexcept
{
::operator delete(this);
}
};


void test()
{
sp_counted_deleter* mem = static_cast<sp_counted_deleter*>(
::operator new(sizeof(sp_counted_deleter)));
::new (mem) sp_counted_deleter();
sp_counted_base* pi = mem;
pi->release();
}

https://godbolt.org/z/dG8h7f1Kn


sp_counted_deleter::destroy():
jmp operator delete(void*)
test():
sub rsp, 8
mov edi, 16
call operator new(unsigned long)
mov QWORD PTR [rax], OFFSET FLAT:vtable for sp_counted_deleter+16
mov rdi, rax
mov DWORD PTR [rax+8], 0
add rsp, 8
jmp sp_counted_deleter::destroy()

In the output assembly the call to sp_counted_deleter::destroy is left
uninlined. I tested the same sample on Clang and it somehow manages to inline
this function. It would be great if GCC were able to inline it too.

[Bug c++/104503] [12 regression][modules] bits/shared_ptr_base.h: error: must ‘#include <typeinfo>’ before using ‘typeid’

2022-05-10 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104503

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #4 from Ivan Sorokin  ---
Could you please review the resolution? In 2.cpp nothing requires <typeinfo>.
Getting an error message about something that is not even used in the file
can't be right.

[Bug sanitizer/105141] #pragma pack(1) causes incorrect UBSAN warning

2022-04-04 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105141

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #8 from Ivan Sorokin  ---
Note that when __attribute__((packed)) is used GCC produces a warning:

warning: taking address of packed member of '' may result in an
unaligned pointer value [-Waddress-of-packed-member]
   10 | int *d = &c.b;
  |  ^~~~

Perhaps a similar warning should be reported for structs packed with #pragma
pack.

https://godbolt.org/z/Yr13WhbG8

struct
{
  char a;
  int b;
} __attribute__((packed)) c;

int main()
{
int *d = &c.b;
__builtin_printf("%d\n", *d);
}

[Bug c++/105099] New: In lookup for namespace name qualifiers only namespaces should be considered

2022-03-29 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105099

Bug ID: 105099
   Summary: In lookup for namespace name qualifiers only
namespaces should be considered
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider this code:

namespace a
{
namespace c
{}

struct a
{};

namespace b = a::c;   // (1)
using namespace a::c; // (2)
}

Currently GCC prints an error:

file.cpp:9:22: error: 'c' is not a namespace-name
9 | namespace b = a::c;
  |  ^
file.cpp:10:24: error: 'c' is not a namespace-name
   10 | using namespace a::c;
  |^

If I interpret the standard correctly the code should compile without errors
because during the lookup of the qualifier the struct "a" should be ignored and
the namespace "a" should be found.

[basic.lookup.udir]p1: In a using-directive or namespace-alias-definition,
during the lookup for a namespace-name or for a name in a nested-name-specifier
only namespace names are considered.

https://eel.is/c++draft/basic.lookup.udir
https://godbolt.org/z/vaWjx4cKj

[Bug c++/103566] New: confusing error message for typedefs with initializers

2021-12-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103566

Bug ID: 103566
   Summary: confusing error message for typedefs with initializers
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this code GCC says:

typedef int foo = 42;
error: typedef 'foo' is initialized (use 'decltype' instead)

I believe this error message is quite confusing. It wasn't clear to me how
'decltype' could help here. After a bit of git-blaming I found the original
commit that added this message:

author  Zack Weinberg 
Sat, 19 Oct 2002 03:14:11 +0000 (03:14 +0000)
commit  4a7510cb22da4809d18e3bb3fc453cf671d6926a

c-decl.c, decl.c (start_decl): Point users of the old initialized- typedef
extension at __typeof__.

-   error ("typedef `%D' is initialized", decl);
+   error ("typedef `%D' is initialized (use __typeof__ instead)", decl);

Unfortunately the commit wasn't accompanied by a testcase, but I assume in the
past there was some GCC-specific extension "initialized typedef" that worked
like decltype/typeof.

I believe that the error message, although beneficial in the past, is confusing
for users today. I would like to suggest removing the "(use 'decltype' instead)"
text.
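
For readers puzzled by the hint, here is a minimal sketch of what it seems to
point at: naming the type of an expression rather than initializing a typedef.
The alias names are hypothetical and nothing here is taken from GCC's
testsuite.

```cpp
#include <type_traits>

// A typedef only names a type; it cannot carry an initializer.
typedef int foo;

// If the intent was "the type of the expression 42", decltype spells it out.
typedef decltype(42) foo_from_expr;

static_assert(std::is_same<foo, int>::value, "foo names int");
static_assert(std::is_same<foo_from_expr, int>::value, "decltype(42) is int");
```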

[Bug tree-optimization/103559] Can't optimize away < 0 check on sqrt

2021-12-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103559

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #4 from Ivan Sorokin  ---
(In reply to Andrew Pinski from comment #1)
> I think there is another bug about this.

Perhaps related to PR91645. The bug report itself is about a slightly different
issue, but the comments discuss the same problem.

[Bug libstdc++/103382] condition_variable::wait() is not cancellable because it is marked noexcept

2021-11-24 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382

--- Comment #3 from Ivan Sorokin  ---
> Huh, I thought it was noexcept. Then yes, we should remove it.

Thank you very much! I'm looking forward to a fix.

> There are still lots of other places where the standard does require
> 'noexcept' and cancellation will terminate.

Do you have any specific functions in mind? If so perhaps something can be done
about them too.

Some people claim that noexcept and cancellation are mutually incompatible, but
I don't think this is the case. I believe that by following a simple discipline
noexcept and cancellation can interact very well.

First of all not all noexcept functions are problematic: noexcept functions
that don't call cancellation points are perfectly fine.

The noexcept functions that do call some cancellation points can be fixed by
suppressing and then restoring cancellation. For example, a destructor that
calls close(), which is a cancellation point, should just suppress/restore
cancellation. The same goes for a destructor that calls pthread_join(). One
might say that because of this we lose some cancellation points, and this is
true, but I believe that noexcept functions are the places where the program
cannot recover while preserving exception guarantees, so having cancellation
suppressed in these places is perfectly fine.
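
A minimal sketch of that discipline, assuming plain POSIX threads. The class
names cancellation_suppressor and fd_owner are hypothetical and exist only for
this illustration; pthread_setcancelstate() and close() are the real POSIX
calls.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Hypothetical RAII helper: disables POSIX cancellation for the calling
// thread on construction and restores the previous state on destruction.
class cancellation_suppressor
{
public:
    cancellation_suppressor() noexcept
    {
        int rc = pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state_);
        assert(rc == 0);
        (void)rc;
    }

    ~cancellation_suppressor()
    {
        int ignored = 0;
        pthread_setcancelstate(old_state_, &ignored);
    }

    cancellation_suppressor(cancellation_suppressor const&) = delete;
    cancellation_suppressor& operator=(cancellation_suppressor const&) = delete;

private:
    int old_state_ = PTHREAD_CANCEL_ENABLE;
};

// Hypothetical owner of a file descriptor. close() is a cancellation point,
// so the (implicitly noexcept) destructor suppresses cancellation around it.
class fd_owner
{
public:
    explicit fd_owner(int fd) noexcept : fd_(fd) {}

    ~fd_owner()
    {
        cancellation_suppressor guard;
        if (fd_ >= 0)
            ::close(fd_);
    }

    fd_owner(fd_owner const&) = delete;
    fd_owner& operator=(fd_owner const&) = delete;

private:
    int fd_;
};
```

With this shape the cancellation request is not lost; it stays pending and is
acted upon at the next cancellation point after the destructor returns.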

[Bug libstdc++/103382] condition_variable::wait() is not cancellable because it is marked noexcept

2021-11-23 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382

--- Comment #1 from Ivan Sorokin  ---
Please note there was a related issue PR67726. I hope it is possible to meet
the requirements mentioned in the issue as well as enabling cancellation.

[Bug libstdc++/103382] New: condition_variable::wait() is not cancellable because it is marked noexcept

2021-11-23 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103382

Bug ID: 103382
   Summary: condition_variable::wait() is not cancellable because
it is marked noexcept
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

At the moment condition_variable::wait() is marked noexcept. This means that if
pthread_cond_wait() acts as a cancellation point and throws an exception, the
program is terminated with std::terminate(). This program demonstrates the
issue:

#include <condition_variable>
#include <mutex>
#include <thread>

#include <pthread.h>

int main()
{
    std::mutex m;
    std::condition_variable cv;

    std::thread th([&]
    {
        std::unique_lock lock(m);
        cv.wait(lock);
    });

    pthread_cancel(th.native_handle());
    th.join();
}

This program terminates with SIGABRT.

Because of this, using condition_variable::wait() in cancellable threads is
tricky: the programmer has to guard every call to condition_variable::wait()
by disabling and then restoring the cancellation state. This also stops the
thread from being cancellable while it is in wait(). Therefore the outer thread
has to know which condition_variable the thread is waiting on and notify it
after pthread_cancel(). In addition, one should add a cancellation point,
pthread_testcancel(), immediately after restoring the cancellation state after
wait().

I believe it would be great if condition_variable::wait() interacted better
with POSIX cancellation. I would like to suggest removing noexcept from
condition_variable::wait(). This also matches the C++ standard
([thread.condition.condvar]), where condition_variable::wait() is not marked
noexcept.
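
The workaround described above can be sketched roughly as follows. The
function name wait_uncancellable is hypothetical, and the predicate overload of
wait() is used here only so that the sketch can be exercised without a second
thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <pthread.h>

// Hypothetical wrapper: waits with cancellation disabled, so a pending
// pthread_cancel() cannot unwind through the noexcept wait().
template <typename Predicate>
void wait_uncancellable(std::condition_variable& cv,
                        std::unique_lock<std::mutex>& lock,
                        Predicate pred)
{
    assert(lock.owns_lock());

    int old_state = PTHREAD_CANCEL_ENABLE;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);

    cv.wait(lock, pred);

    int ignored = 0;
    pthread_setcancelstate(old_state, &ignored);

    // Act on a cancellation request that arrived while we were waiting.
    pthread_testcancel();
}
```

Since the wait is no longer a cancellation point, the cancelling thread still
has to make the predicate true (for example through a stop flag that the
predicate reads) and notify the condition variable after pthread_cancel().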

[Bug c++/102881] gcc totally broken when trailing return type combine with decltype lambda

2021-10-22 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102881

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #2 from Ivan Sorokin  ---
PR92707 also features lambda inside decltype. Perhaps they are related.

[Bug tree-optimization/102888] New: missing case for combining / and % into one operation

2021-10-21 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102888

Bug ID: 102888
   Summary: missing case for combining / and % into one operation
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Normally GCC combines a/b and a%b into one operation when they are computed in
the same basic block. The example below has two functions. For one GCC is able
to combine the operations and for the other it is not (presumably because of
the more complicated control flow). I believe the two functions are
functionally equivalent.

unsigned long long reduce(unsigned long long a, unsigned long long b)
{
    while ((a % b) == 0)
        a /= b;

    return a;
}

unsigned long long reduce_opt(unsigned long long a, unsigned long long b)
{
    for (;;)
    {
        unsigned long long quot = a / b;
        unsigned long long rem = a % b;
        if (rem != 0)
            break;
        a = quot;
    }

    return a;
}

reduce.L3:
        mov     rax, r8
        xor     edx, edx
        div     rsi
        xor     edx, edx
        mov     r8, rax
        div     rsi
        test    rdx, rdx
        je      .L3

reduce_opt.L8:
        xor     edx, edx
        mov     r8, rax
        div     rsi
        test    rdx, rdx
        je      .L8

https://godbolt.org/z/9dqs8avE5

It would be great if GCC generated the same code for both of these functions.
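
As a sanity check of the equivalence claim, here is a small sketch that
evaluates both functions at compile time. Marking them constexpr is an addition
made only for this illustration (it needs C++14), and the sample values are
arbitrary.

```cpp
// Copies of the two functions from the report, marked constexpr so the
// comparison below can run at compile time.
constexpr unsigned long long reduce(unsigned long long a, unsigned long long b)
{
    while ((a % b) == 0)
        a /= b;

    return a;
}

constexpr unsigned long long reduce_opt(unsigned long long a, unsigned long long b)
{
    for (;;)
    {
        unsigned long long quot = a / b;
        unsigned long long rem = a % b;
        if (rem != 0)
            break;
        a = quot;
    }

    return a;
}

// Both loops only terminate for a != 0 and b > 1, so the spot checks stay
// inside that range.
static_assert(reduce(48, 2) == 3 && reduce_opt(48, 2) == 3, "48 = 2^4 * 3");
static_assert(reduce(81, 3) == 1 && reduce_opt(81, 3) == 1, "81 = 3^4");
static_assert(reduce(7, 5) == 7 && reduce_opt(7, 5) == 7, "7 is not divisible by 5");
```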

[Bug c++/102704] New: NRVO for throw expression

2021-10-12 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102704

Bug ID: 102704
   Summary: NRVO for throw expression
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider this code:

struct mytype
{
mytype();
mytype(mytype const&);
mytype(mytype&&);
};

void test()
{
mytype e;
throw e;
}

Currently for function test() GCC generates the following sequence of calls
(pseudocode):

char e[sizeof(mytype)];
mytype_default_ctor(e);
p = __cxa_allocate_exception();
mytype_move_ctor(p, e);
__cxa_throw(p);

I believe a trick similar to NRVO for returns can be applied here. When a
variable meets the NRVO criteria, the compiler can remove the local variable and
replace it with the storage allocated by __cxa_allocate_exception. Here is what
I believe could be generated:

p = __cxa_allocate_exception();
mytype_default_ctor(p);
__cxa_throw(p);
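
One way to observe which sequence a given compiler produces is to count the
constructor calls. This is only a sketch: the type counted and the helper
functions are hypothetical names, and the number it reports may be 0 or 1
depending on whether the compiler constructs the local directly in the
exception storage.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts which special members run.
struct counted
{
    static int default_ctors;
    static int copy_ctors;
    static int move_ctors;

    counted() { ++default_ctors; }
    counted(counted const&) { ++copy_ctors; }
    counted(counted&&) { ++move_ctors; }
};

int counted::default_ctors = 0;
int counted::copy_ctors = 0;
int counted::move_ctors = 0;

void throw_local()
{
    counted e;
    throw e; // e is a local variable that is about to go out of scope
}

// Runs throw_local() once and returns how many copy/move constructions were
// needed to create the exception object.
int transfers_for_one_throw()
{
    int before = counted::copy_ctors + counted::move_ctors;
    bool caught = false;
    try
    {
        throw_local();
    }
    catch (counted const&)
    {
        caught = true;
    }
    assert(caught); // throw_local() always throws
    return counted::copy_ctors + counted::move_ctors - before;
}
```

A result of 1 corresponds to the first pseudocode sequence above (default
constructor followed by a move); a result of 0 would correspond to the proposed
one.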

[Bug c++/61355] gcc doesn't normalize type in non-type template parameters

2021-10-10 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61355

--- Comment #6 from Ivan Sorokin  ---
(In reply to Patrick Palka from comment #5)
> Fixed for GCC 12.

Thanks!

[Bug target/102355] New: excessive stack usage

2021-09-15 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102355

Bug ID: 102355
   Summary: excessive stack usage
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

void escape(unsigned long long& a);

void foobar()
{
unsigned long long local;
escape(local);
}

For the function "foobar" GCC allocates excessive stack space:

foobar():
        sub     rsp, 24
        lea     rdi, [rsp+8]
        call    escape(unsigned long long&)
        add     rsp, 24
        ret

The function "foobar" only needs 8 bytes of stack space, but GCC allocates 24.
Please note that this excessive allocation isn't needed for stack alignment: 8
bytes of local variables are enough to keep the stack aligned. I also tested
Clang and it allocates 8 bytes.

GCC makes this stack layout:
8 bytes padding
8 bytes variable "local"
8 bytes padding
8 bytes return address

I believe the problem is related to the fact that GCC aligns the stack twice:
the first time after the return address placement and the second time after the
local variables are placed. Playing with -mpreferred-stack-boundary confirms
this:

-mpreferred-stack-boundary | stack usage (bytes)
 3                         |   8
 4 (default)               |  24
 5                         |  56
 6                         | 120
https://godbolt.org/z/h56aoKvvh

In all cases the stack usage is twice the required alignment, minus 8 bytes for
the return address. I believe stack space can be conserved by doing the
alignment only once.
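
The arithmetic behind that observation, written out as a small sketch. The
function names are hypothetical, and single_alignment_usage() models only the
suggestion made here (align once, after placing both the return address and
the locals); it is not a description of what GCC or Clang actually implement.

```cpp
// Observed usage: 2 * alignment - 8, where alignment = 2^boundary bytes and
// 8 bytes are taken by the return address.
constexpr unsigned observed_stack_usage(unsigned boundary)
{
    return 2u * (1u << boundary) - 8u;
}

// Assumed model for aligning only once: round (return address + locals) up
// to the alignment, then subtract the 8 bytes of the return address.
constexpr unsigned single_alignment_usage(unsigned boundary, unsigned locals)
{
    return ((8u + locals + (1u << boundary) - 1u) >> boundary << boundary) - 8u;
}

// The four measurements from the report.
static_assert(observed_stack_usage(3) == 8, "boundary 3");
static_assert(observed_stack_usage(4) == 24, "boundary 4 (default)");
static_assert(observed_stack_usage(5) == 56, "boundary 5");
static_assert(observed_stack_usage(6) == 120, "boundary 6");

// With a single alignment step, 8 bytes of locals need only 8 bytes at the
// default boundary, which matches what Clang is reported to allocate.
static_assert(single_alignment_usage(4, 8) == 8, "default boundary, one 8-byte local");
```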

[Bug tree-optimization/98774] gcc -O3 does not vectorize some operations

2021-09-14 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774

--- Comment #4 from Ivan Sorokin  ---
I retested the sample on GCC 11.2.

https://godbolt.org/z/xrarP3zbY

Compared to Clang 12.0.1 GCC still generates 6 more instructions in total and
does 6 mulpd against Clang's 4 mulpd.

[Bug c++/102335] New: gcc misses -Wunused-value

2021-09-14 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102335

Bug ID: 102335
   Summary: gcc misses -Wunused-value
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

struct mytype
{
    int memfun [[gnu::pure]] ();
};

void test()
{
    mytype x;
    x.memfun();        // -Wunused-value
    mytype().memfun(); // no -Wunused-value
}

https://godbolt.org/z/vc49jWGqn

The code above contains two uses of a [[gnu::pure]] function with an ignored
return value. GCC detects only the first one. I believe the second one deserves
a warning too.

Clang shows a warning on both usages.

[Bug rtl-optimization/3507] appalling optimisation with sub/cmp on multiple targets

2021-06-14 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3507

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #60 from Ivan Sorokin  ---
Another similar case. On this function:

unsigned wrap(unsigned index, unsigned limit)
{
if (index >= limit)
index -= limit;
return index;
}

GCC 11.1 -O2 generates:

wrap(unsigned int, unsigned int):
mov edx, edi
mov eax, edi
sub edx, esi
cmp edi, esi
cmovnb  eax, edx
ret

I believe cmp here is redundant as the flags are already set after sub. After
removing cmp we get:

wrap(unsigned int, unsigned int):
mov edx, edi
mov eax, edi
sub edx, esi
cmovnb  eax, edx
ret

Now the register edx becomes unneeded:

wrap(unsigned int, unsigned int):
mov eax, edi
sub edi, esi
cmovnb  eax, edi
ret
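
At the source level the same idea can be expressed with a single subtraction
whose borrow decides the result. The sketch below uses the GCC/Clang builtin
__builtin_sub_overflow; the name wrap_single_sub is hypothetical, and whether a
particular compiler version turns it into the three-instruction sequence above
has not been checked here.

```cpp
#include <cassert>

// The function from the comment above.
unsigned wrap(unsigned index, unsigned limit)
{
    if (index >= limit)
        index -= limit;
    return index;
}

// Same result computed from a single subtraction. For unsigned operands the
// builtin returns true exactly when index < limit, which is the borrow that
// sub leaves in the carry flag.
unsigned wrap_single_sub(unsigned index, unsigned limit)
{
    unsigned diff = 0;
    bool borrow = __builtin_sub_overflow(index, limit, &diff);
    return borrow ? index : diff;
}

// Spot checks that both versions agree, including the boundary cases.
void check_wrap()
{
    const unsigned samples[][2] = {{0, 1}, {2, 3}, {3, 3}, {5, 3}, {0, 0}, {~0u, 1}};
    for (auto const& s : samples)
        assert(wrap(s[0], s[1]) == wrap_single_sub(s[0], s[1]));
}
```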

[Bug middle-end/95014] gcc fails to merge two identical returns

2021-04-30 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95014

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin  ---
I think this might be a duplicate of 82689.

[Bug middle-end/99797] accessing uninitialized automatic variables

2021-04-19 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99797

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #10 from Ivan Sorokin  ---
(Disclaimer: I'm not a GCC developer, I'm just a random guy who reads bugzilla
and tried making some simple changes to GCC a few times)

(In reply to Martin Uecker from comment #9)
> The behavior of GCC is dangerous as the example in comment #1 show. You can
> not reason at all about the generated code.

My reasoning normally boils down to this: since the program invokes UB, the
exact behavior depends on the compiler, the compiler version, the OS and other
factors.

I would like to note that the optimizations performed by the compiler are not
designed to break users' code. They were designed to remove some typical
redundancies in programs. It just happens that their combination unpredictably
breaks code that invokes UB.

Normally it is difficult or impossible to avoid breaking code that invokes UB
without regressing some optimizations.

Also, the optimizations performed by the compiler change over time, so the
exact result of the breakage inevitably depends on the specific compiler
version.

In theory GCC already has an option that limits the effects of UB: -O0. I
believe this is the only forward-compatible option for that. If we want to
be more precise we can disable only this one optimization with -fno-tree-ccp,
but these fine-grained optimization options change from one compiler version
to another.

> The "optimize based on the assumption that UB can not happen" philosophy
> amplifies even minor programming errors into something dangerous.

Unfortunately this is easier said than done. As far as I know, all major
compilers perform optimizations based on UB. Consider this:

const int PI = 3;

int tau()
{
   return 2 * PI; // can this be folded into 6?
}

GCC folds 2 * PI into 6 even with -O0. This optimization is based on UB.
Because in some other function one can write:

void evil()
{
    const_cast<int&>(PI) = 4;
}

Some usages of PI can be folded and some cannot. The ones that were
folded would see PI = 3, the ones that were not folded would see PI = 4.

One can argue that constant folding is fundamentally an optimization
based on UB. I believe few optimizations would be left if we disabled all
that rely on UB.

> This, of  course, also applies to other UB (in varying degrees). For signed
> overflow we have -fsanitize=signed-integer-overflow which can help detect and
> mitigate such errors, e.g. by trapping at run-time. And also this is allowed
> by UB. 

> In case of UB the choice of what to do lies with the compiler, but I think it
> is a bug if this choice is unreasonable and does not serve its users well.

Do you have some specific proposal in mind?

Currently a user has these 5 options:
1. Using -O0 to suppress optimizations.
2. Using -fno-tree-ccp to suppress this specific optimization.
3. Using -Wall and relying on warnings.
4. (in theory) Using the static analyzer, -fanalyzer. It doesn't detect this
   error at the moment, but I believe it can be taught to detect it.
5. Using a dynamic analyzer like valgrind.

It seems that you find the existing options insufficient and want another one.

[Bug analyzer/94355] support for C++ new expression

2021-04-12 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94355

--- Comment #7 from Ivan Sorokin  ---
For me the support for operator new works well for trivially constructible
types. For a non-trivially constructible type I got a false positive:

struct foo { foo(); };

int main()
{
delete new foo();
}

In function 'int main()':
cc1plus: warning: use of possibly-NULL 'operator new(1)' where non-null
expected [CWE-690] [-Wanalyzer-possible-null-argument]
  'int main()': event 1
|
|:5:20:
|5 | delete new foo();
|  |^
|  ||
|  |(1) this call could return NULL
|
  'int main()': event 2
|
|cc1plus:
| (2): argument 'this' ('operator new(1)') from (1) could be NULL where
non-null expected
|
:1:14: note: argument 'this' of 'foo::foo()' must be non-null
1 | struct foo { foo(); };
  |  ^~~
Compiler returned: 0

https://godbolt.org/z/nPff9EGsY

Also the error location seems to be wrong. Removing "()" from "delete new
foo()" fixes the error location.

[Bug analyzer/94355] support for C++ new expression

2021-04-12 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94355

--- Comment #6 from Ivan Sorokin  ---
I played with -fanalyzer on godbolt (GCC trunk). I noticed that -fanalyzer
doesn't report double free in this (convoluted) case:

#include <stdlib.h>

int main()
{
    int* p = new int;
    delete p;
    free(p);
}

[Bug c++/100039] New: GCC can not bind lvalue to lvalue reference in brace-initialized-temporary expression

2021-04-11 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100039

Bug ID: 100039
   Summary: GCC can not bind lvalue to lvalue reference in
brace-initialized-temporary expression
   Product: gcc
   Version: 10.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider this program:

typedef int& ref;

int main()
{
int a;
ref{a};
}

This is accepted by clang, msvc and icc. GCC 10.3 rejects this code with a
message:

error: cannot bind non-const lvalue reference of type 'ref' {aka 'int&'} to an
rvalue of type 'int'

I believe the error message is incorrect, because "a" is not an rvalue here. It
is an lvalue, so it should be allowed to bind to an lvalue reference.

https://godbolt.org/z/TWY9GPq3E

[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array

2021-03-09 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

--- Comment #8 from Ivan Sorokin  ---
If I understand #c5 correctly the minimal reproducer should be this:

void g(int&);

void f()
{
int a[10];
int& p = a[10]; // (1)
g(a[10]);   // (2)
}

Both (1) and (2) are undefined, and -fsanitize=bounds can help check for this.

[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array

2021-03-09 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

--- Comment #7 from Ivan Sorokin  ---
(In reply to Martin Liška from comment #3)

> That said, can we close it as resolved?

I'm sorry for not being clear from the beginning. The original report was about
the -fsanitize=bounds sanitizer, which sometimes allows accessing the
one-past-the-end element.

Now after #c4 I see that the language rules make it excessively complicated for
the compiler to do this.

I believe that one-past-the-end access is an important error to check for, but I
understand why compilers might choose to avoid doing it. Feel free to close the
issue if implementing it is infeasible.

[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array

2021-03-09 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

--- Comment #6 from Ivan Sorokin  ---
(In reply to Jakub Jelinek from comment #4)
> Asan can't by design detect neither #c0 nor #c1, only ubsan can.
> The reason why ubsan has that off by one stuff is that in C/C++,
> &mas[n - 1][m] is not undefined behavior, only mas[n - 1][m] is.

That is very unfortunate. For standard containers, subscripting with a wrong
index is undefined behavior regardless of whether the address is then taken.
I assumed the same rules applied to built-in arrays. If one needs just a
pointer, one can easily write a + n instead of &a[n]. Now I see that this is
not the case and built-in arrays behave differently.

> For #c1, the big question is what exactly is UB in C++, whether already
> binding a reference to the object after the end of the array or only
> actually accessing that reference.  If the former, ubsan could treat
> REFERENCE_TYPE differently, if the latter, then I'm afraid it can't do that,
> and ubsan by design has to be done early before all the optimizations change
> the IL so much that it is completely lost what were the user errors in it.
> For the method calls, there really isn't a reference in the IL either, this
> argument is a pointer, but .UBSAN_BOUNDS calls are added in the FE and so
> perhaps it could know it is a method call and treat it as a reference.
> So, something can be done but we need answers on where the UB in C++ exactly
> happens.

For -fsanitize=null the rules are quite subtle: dereferencing by itself (*p)
doesn't check for nullptr, but binding a reference (int& q = *p;) does.
Perhaps similar rules can be employed for the past-the-end element: taking a
pointer to it is fine, but passing the pointer as the this parameter to a
function is UB? At least this would be consistent with null pointers.

[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array

2021-03-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

--- Comment #2 from Ivan Sorokin  ---
It looks like this is related to ignore_off_by_one parameter of
ubsan_instrument_bounds.

As can be seen in the GIMPLE dump, the problematic .UBSAN_BOUNDS call checks
against the array size plus 1.

[Bug sanitizer/99418] sanitizer checks for accessing multidimentional VLA-array

2021-03-06 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

--- Comment #1 from Ivan Sorokin  ---
Here is the reduced example. It doesn't SIGSEGV, but it doesn't report any
sanitizer errors either:

$ g++ -g -fsanitize=bounds 3.cpp
$ cat 3.cpp
#include <stddef.h>

void escape(int& a)
{}

void test(size_t n, size_t m)
{
    int mas[n][m];
    escape(mas[n - 1][m]);
}

int main()
{
    test(4, 3);
}

Surprisingly if I replace taking a reference with writing to the array it will
show an error.

[Bug sanitizer/99418] New: sanitizer checks for accessing multidimentional VLA-array

2021-03-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99418

Bug ID: 99418
   Summary: sanitizer checks for accessing multidimentional
VLA-array
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

The example below accesses an array past its end, but the sanitizers don't show
any errors. If I change the index m to m + 1, an error is shown. This makes me
think that the compiler does some checks, but perhaps they are incomplete for
multidimensional VLA arrays.

GCC 10.2.

#include <string>

std::string shortest_match(size_t n, size_t m)
{
    std::string mas[n][m];
    mas[n - 1][m] = ""; // mas[n - 1][m + 1] will show an error

    return mas[n - 1][m - 1];
}

int main()
{
    shortest_match(4, 3);
}

$ g++ -g -fsanitize=address,undefined -std=c++17 2.cpp && ./a.out 
AddressSanitizer:DEADLYSIGNAL
=
==26974==ERROR: AddressSanitizer: SEGV on unknown address 0x (pc
0x7f59ea2ad2d6 bp 0x sp 0x7ffc78389ea0 T0)
==26974==The signal is caused by a WRITE memory access.
==26974==Hint: address points to the zero page.
#0 0x7f59ea2ad2d6 in std::__cxx11::basic_string, std::allocator >::_M_replace(unsigned long,
unsigned long, char const*, unsigned long) (/lib/libstdc++.so.6+0x13c2d6)
#1 0x401658 in shortest_match[abi:cxx11](unsigned long, unsigned long)
/home/ivan/2.cpp:6
#2 0x4019eb in main /home/ivan/2.cpp:13
#3 0x7f59e950ec7c in __libc_start_main (/lib/libc.so.6+0x23c7c)
#4 0x4011a9 in _start (/home/ivan/a.out+0x4011a9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/libstdc++.so.6+0x13c2d6) in
std::__cxx11::basic_string, std::allocator
>::_M_replace(unsigned long, unsigned long, char const*, unsigned long)
==26974==ABORTING

[Bug middle-end/99087] New: suboptimal codegen for division by constant 3

2021-02-13 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99087

Bug ID: 99087
   Summary: suboptimal codegen for division by constant 3
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

These two are functionally the same, but generate different code with g++ -O2:

unsigned long long foo(unsigned long long a)
{
    return a / 3;
}

unsigned long long bar(unsigned long long a)
{
    return (unsigned __int128)a * 0xAAAA'AAAA'AAAA'AAAB >> 65;
}

foo(unsigned long long):
movabs  rdx, -6148914691236517205
mov rax, rdi
mul rdx
mov rax, rdx
shr rax
ret
bar(unsigned long long):
movabs  rax, -6148914691236517205
mul rdi
mov rax, rdx
shr rax
ret

For some reason, for the division GCC chooses a different operand order, which
causes the generation of one extra mov.
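
For reference, the multiplier in the assembly, -6148914691236517205, is the
64-bit value 0xAAAAAAAAAAAAAAAB, which equals (2^65 + 1) / 3. A short sketch
that checks the multiply-and-shift form against plain division at compile
time; the name div3_mul is hypothetical, and unsigned __int128 is a GCC/Clang
extension.

```cpp
// Multiply-and-shift division by 3, evaluated in 128-bit arithmetic.
constexpr unsigned long long div3_mul(unsigned long long a)
{
    return static_cast<unsigned long long>(
        (static_cast<unsigned __int128>(a) * 0xAAAA'AAAA'AAAA'AAABull) >> 65);
}

constexpr unsigned long long all_ones = ~0ull;

// 3 * constant == 2^65 + 1, which is 1 modulo 2^64.
static_assert(0xAAAA'AAAA'AAAA'AAABull * 3 == 1, "constant is (2^65 + 1) / 3");

static_assert(div3_mul(0) == 0, "0 / 3");
static_assert(div3_mul(2) == 0, "2 / 3");
static_assert(div3_mul(3) == 1, "3 / 3");
static_assert(div3_mul(all_ones) == all_ones / 3, "largest 64-bit input");
```

The rounding error of the multiplier contributes a / (3 * 2^65), which stays
below 1/6 for every 64-bit a, so the truncated result matches a / 3.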

[Bug target/91400] __builtin_cpu_supports conjunction is optimized poorly

2021-02-04 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400

--- Comment #2 from Ivan Sorokin  ---
I've sent a patch to gcc-patches mailing list:
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564663.html

[Bug c++/82640] gcc doesn't show errors on anonymous local variables

2021-01-30 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82640

Ivan Sorokin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Ivan Sorokin  ---
It looks like this issue was fixed in GCC 8. Closing.

[Bug tree-optimization/98774] gcc -O3 does not vectorize some operations

2021-01-27 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774

--- Comment #3 from Ivan Sorokin  ---
(In reply to Hongtao.liu from comment #1)
> It's fixed in current trunk https://godbolt.org/z/63576n

I can confirm that GCC now does use the packed multiplication mulpd, although it
is used somewhat inefficiently. The original program contained 8 multiplications
and clang does 4 packed multiplications. GCC trunk does 6 packed
multiplications.

https://godbolt.org/z/EabPxT

[Bug c++/98814] Add fix-it hints for missing asterisk

2021-01-26 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98814

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #2 from Ivan Sorokin  ---
PR87850 looks similar. It discusses only pointers, but I think it can be
generalized to any type that has operator*: pointers, iterators and
smart-pointers.

[Bug tree-optimization/98775] missing optimization opportunity on nbody

2021-01-20 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98775

--- Comment #1 from Ivan Sorokin  ---
Created attachment 50016
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50016&action=edit
nbody-unrolled.cpp

[Bug tree-optimization/98775] New: missing optimization opportunity on nbody

2021-01-20 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98775

Bug ID: 98775
   Summary: missing optimization opportunity on nbody
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Created attachment 50015
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50015&action=edit
nbody.cpp

On the attached sample (208 LOC), clang 11.0 generates code that is almost
twice as fast as the code generated by GCC 10.2 (-O3 -ffast-math -flto).

$ ./nbody 5000
4.0s for clang vs 7.5s for GCC.

A quick look at the generated code shows that clang aggressively unrolled all
inner loops. If I unroll all inner loops manually I get:

$ ./nbody-unrolled 5000
3.7s for clang vs 6.3s for GCC.
17.6B instructions for clang vs 29.6B instructions for GCC.

While the first sample is subject to the unrolling heuristic, the second is
about optimizing a completely linear chunk of code with many floating point
multiplications and additions.

I tried reducing the sample further, but I only came up with PR98774.

[Bug tree-optimization/98774] New: gcc -O3 does not vectorize multiplication

2021-01-20 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98774

Bug ID: 98774
   Summary: gcc -O3 does not vectorize multiplication
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Created attachment 50014
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50014&action=edit
nbody-update-velocity.cpp

In the following sample GCC (-O3 -ffast-math) fails to vectorize operations.
The result is that GCC 10.2 does 8 mulsd, while clang 11.0 does 4 mulpd.

struct vec3 { double x, y, z; };

void update_velocities(vec3* __restrict velocity,
   double const* __restrict mass,
   vec3 const* __restrict dpos,
   double const* __restrict mag)
{
velocity[0] -= dpos[0] * (mass[1] * mag[0]);
velocity[1] += dpos[0] * (mass[0] * mag[0]);
}

See the attachment for the complete sample.
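
For readers without the attachment, a self-contained version of the snippet
could look like the sketch below. The three operator definitions are
assumptions written for this illustration (the attachment presumably has its
own), and check_update_velocities() is a hypothetical usage example.

```cpp
#include <cassert>

struct vec3 { double x, y, z; };

// Assumed operator definitions; not copied from the attachment.
inline vec3 operator*(vec3 const& v, double s)
{
    return {v.x * s, v.y * s, v.z * s};
}

inline vec3& operator+=(vec3& a, vec3 const& b)
{
    a.x += b.x;
    a.y += b.y;
    a.z += b.z;
    return a;
}

inline vec3& operator-=(vec3& a, vec3 const& b)
{
    a.x -= b.x;
    a.y -= b.y;
    a.z -= b.z;
    return a;
}

void update_velocities(vec3* __restrict velocity,
                       double const* __restrict mass,
                       vec3 const* __restrict dpos,
                       double const* __restrict mag)
{
    velocity[0] -= dpos[0] * (mass[1] * mag[0]);
    velocity[1] += dpos[0] * (mass[0] * mag[0]);
}

// Usage with values that are exactly representable as doubles.
void check_update_velocities()
{
    vec3 velocity[2] = {{0, 0, 0}, {0, 0, 0}};
    double mass[2] = {2, 4};
    vec3 dpos[1] = {{1, 2, 3}};
    double mag[1] = {0.5};

    update_velocities(velocity, mass, dpos, mag);

    assert(velocity[0].x == -2 && velocity[0].y == -4 && velocity[0].z == -6);
    assert(velocity[1].x == 1 && velocity[1].y == 2 && velocity[1].z == 3);
}
```

With these definitions the function performs 8 scalar multiplications: two for
the mass * mag products and six for scaling dpos[0] twice.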

[Bug middle-end/98710] New: missing optimization (x | c) & ~(y | c) -> x & ~(y | c)

2021-01-16 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98710

Bug ID: 98710
   Summary: missing optimization (x | c) & ~(y | c) -> x & ~(y |
c)
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

On this function clang generates slightly shorter code

unsigned foo(unsigned x, unsigned y, unsigned c)
{
return (x | c) & ~(y | c);
}

because it notices that the expression can be simplified to x & ~(y | c). It
would be great if GCC could do the same.

https://godbolt.org/z/3ob6eb
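
The identity follows from (x | c) & ~(y | c) = (x & ~(y | c)) | (c & ~(y | c)),
where the second term is always zero. A compile-time check of it, as a sketch
(the function names are hypothetical):

```cpp
constexpr unsigned original(unsigned x, unsigned y, unsigned c)
{
    return (x | c) & ~(y | c);
}

constexpr unsigned simplified(unsigned x, unsigned y, unsigned c)
{
    return x & ~(y | c);
}

// Both expressions act on each bit position independently, so comparing them
// on all 8 combinations of one-bit inputs covers every case.
constexpr bool identity_holds()
{
    for (unsigned x = 0; x < 2; ++x)
        for (unsigned y = 0; y < 2; ++y)
            for (unsigned c = 0; c < 2; ++c)
                if (original(x, y, c) != simplified(x, y, c))
                    return false;
    return true;
}

static_assert(identity_holds(), "(x | c) & ~(y | c) == x & ~(y | c)");
```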

[Bug middle-end/98709] New: gcc optimizes bitwise operations, but doesn't optimize logical ones

2021-01-16 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98709

Bug ID: 98709
   Summary: gcc optimizes bitwise operations, but doesn't optimize
logical ones
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

GCC 10.2 produces very good code for this function, noticing that both sides of
the conjunction are the same:

unsigned foo_bitwise(unsigned a, unsigned b)
{
return (~a ^ b) & ~(a ^ b);
}

foo_bitwise(unsigned int, unsigned int):
xor edi, esi
mov eax, edi
not eax
ret

But when I write a similar function with logical operations it doesn't notice
that:

bool foo_logical(bool a, bool b)
{
return (!a ^ b) & !(a ^ b);
}

foo_logical(bool, bool):
        mov     eax, esi
        xor     eax, edi
        xor     eax, 1
        cmp     dil, sil
        sete    dl
        and     eax, edx
        ret

I believe that in a similar manner it can be optimized to something like this:

foo_logical(bool, bool):
xor edi, esi
mov eax, edi
xor eax, 1
ret
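
That both sides of the conjunction agree, so that the whole expression reduces
to !(a ^ b), can be confirmed by enumerating the four inputs. A small sketch;
marking the function constexpr and the name same_value are additions made only
for this illustration.

```cpp
// The function from the report, marked constexpr for compile-time checking.
constexpr bool foo_logical(bool a, bool b)
{
    return (!a ^ b) & !(a ^ b);
}

// The simplified form: true exactly when a and b are equal.
constexpr bool same_value(bool a, bool b)
{
    return !(a ^ b);
}

static_assert(foo_logical(false, false) == same_value(false, false), "0, 0");
static_assert(foo_logical(false, true) == same_value(false, true), "0, 1");
static_assert(foo_logical(true, false) == same_value(true, false), "1, 0");
static_assert(foo_logical(true, true) == same_value(true, true), "1, 1");
```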

[Bug c++/98660] -Wold-style-cast should not warn on casts that look like (decltype(x))(x)

2021-01-13 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98660

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #1 from Ivan Sorokin  ---
I'm not a GCC developer, I'm just curious.

Why is the use of a C-style cast required here? Could you use static_cast
instead? I mean, instead of `(decltype(x))(x)`, using
`static_cast<decltype(x)>(x)`? Perhaps wrapping it in some macro in order not
to write `x` twice.
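
A sketch of what such a macro could look like. The macro name
CAST_TO_DECLTYPE and the helper forwards_as_rvalue are hypothetical; the
helper is there only to show that the macro preserves the value category of a
forwarding reference.

```cpp
#include <type_traits>

// Hypothetical macro: casts x to its own declared type without a C-style
// cast, so -Wold-style-cast has nothing to report.
#define CAST_TO_DECLTYPE(x) static_cast<decltype(x)>(x)

// For a forwarding reference, decltype(x) is T&& after reference collapsing,
// so the macro behaves like std::forward<T>(x).
template <typename T>
constexpr bool forwards_as_rvalue(T&& x)
{
    return std::is_rvalue_reference<decltype(CAST_TO_DECLTYPE(x))>::value;
}

constexpr int sample_lvalue = 0;

static_assert(forwards_as_rvalue(42), "an rvalue argument stays an rvalue");
static_assert(!forwards_as_rvalue(sample_lvalue), "an lvalue argument stays an lvalue");
```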

[Bug rtl-optimization/98555] Functions optimized to zero length break function pointer inequality

2021-01-09 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98555

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #4 from Ivan Sorokin  ---
(In reply to Richard Biener from comment #1)
> [for QOI/security/whatever we probably want to at least emit a ret
> instruction]

RET might be dangerous when the return type is non-void. Perhaps UD2 or INT3
would be better?

[Bug c++/98501] potential optimization for base<->derived pointer casts

2021-01-05 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98501

--- Comment #2 from Ivan Sorokin  ---
(In reply to Richard Biener from comment #1)
> I think there's a duplicate of this PR.

I searched the list of bugs and I found PR95663. Is that the one?

[Bug c++/98501] New: potential optimization for base<->derived pointer casts

2021-01-02 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98501

Bug ID: 98501
   Summary: potential optimization for base<->derived pointer
casts
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider this code:

struct base1 { int a; };
struct base2 { int b; };
struct derived : base1, base2 {};

derived& to_derived_bad(base2* b)
{
return *static_cast<derived*>(b);
}

derived& to_derived_good(base2* b)
{
return static_cast<derived&>(*b);
}

I believe both of these functions are functionally equivalent and should
generate the same code. Both functions cast the pointer from base to derived if
it is not nullptr and both cause undefined behavior if it is nullptr.

GCC optimizes to_derived_good() to a single subtraction, but it inserts a
nullptr check into to_derived_bad():

to_derived_good(base2*):
lea rax, [rdi-4]
ret
to_derived_bad(base2*):
lea rax, [rdi-4]
test rdi, rdi
mov edx, 0
cmove rax, rdx
ret

Could GCC omit the nullptr-check in to_derived_bad?
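
A small self-contained sketch of where the check comes from: a static_cast
between pointer types has to map a null base2* to a null derived*, so the
pointer adjustment is conditional, while the reference form already assumes
an object exists. The helper functions casts_agree and null_cast_stays_null
are made up for this illustration.

```cpp
#include <cassert>

struct base1 { int a; };
struct base2 { int b; };
struct derived : base1, base2 {};

// Pointer form: the cast itself must preserve null pointers.
derived& to_derived_bad(base2* b)
{
    return *static_cast<derived*>(b);
}

// Reference form: *b is assumed to refer to an object.
derived& to_derived_good(base2* b)
{
    return static_cast<derived&>(*b);
}

// For a non-null argument both forms yield the same derived object.
bool casts_agree()
{
    derived d{};
    base2* b = &d;
    return &to_derived_bad(b) == &d && &to_derived_good(b) == &d;
}

// A null base2* converts to a null derived*, not to (null - offset).
bool null_cast_stays_null()
{
    base2* b = nullptr;
    return static_cast<derived*>(b) == nullptr;
}
```

In to_derived_bad the result of the cast is dereferenced immediately, so the
null case is undefined behavior anyway, which is why the check looks
removable.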

[Bug c++/80016] error is positioned incorrectly

2021-01-02 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80016

Ivan Sorokin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Ivan Sorokin  ---
(In reply to Jonathan Wakely from comment #5)
> I'd even argue the stating location still isn't right in this version, as
> the error comes from ns::trait::value not the logical expression
> containing it.

On GCC 10.2 the error location is very nice:

:13:46: error: incomplete type 'ns::trait' used in nested name
specifier
   13 | && ns::trait::value;
  |  ^

https://godbolt.org/z/E7P7P8

I believe the bug can be considered fixed now. Thank you!

[Bug tree-optimization/47579] STL size() == 0 does unnecessary shift

2020-12-25 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47579

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #3 from Ivan Sorokin  ---
Since 7.1 GCC doesn't produce any shifts on the test code as well as on the
examples from comment #2. https://godbolt.org/z/f48EqP

I think the bug can be closed now.

[Bug middle-end/56719] missed optimization: i > 0xffff || i*4 > 0xffff

2020-12-25 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56719

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #8 from Ivan Sorokin  ---
On the test code clang since 3.5 and before 9.0 does something very surprising.
It optimizes (A > 0xffff || B > 0xffff) into (A | B) > 0xffff. I don't think
this is what the reporter expected, but it is still a potential optimization for
GCC.

See https://godbolt.org/z/WqPhbW
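
A quick sanity check of that rewrite, as a sketch for unsigned 32-bit
operands and the constant 0xffff from the bug title. The identity relies on
the constant having the form 2^k - 1: a value exceeds it exactly when some
bit at position k or higher is set, and OR-ing the operands preserves those
bits. Function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two comparisons joined by a logical OR.
bool either_exceeds(std::uint32_t a, std::uint32_t b)
{
    return a > 0xffff || b > 0xffff;
}

// Rewritten form: one comparison on the bitwise OR of the operands.
bool or_exceeds(std::uint32_t a, std::uint32_t b)
{
    return (a | b) > 0xffff;
}

// Compares both forms on a set of boundary values.
bool comparison_forms_agree()
{
    const std::uint32_t samples[] = {
        0u, 1u, 0xfffeu, 0xffffu, 0x10000u, 0x10001u,
        0x7fffffffu, 0x80000000u, 0xffffffffu
    };
    for (std::uint32_t a : samples)
        for (std::uint32_t b : samples)
            if (either_exceeds(a, b) != or_exceeds(a, b))
                return false;
    return true;
}
```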

[Bug rtl-optimization/48877] Inline asm for rdtsc generates silly code

2020-12-25 Thread vanyacpp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48877

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #2 from Ivan Sorokin  ---
Modern GCC doesn't generate excessive moves for this example. It looks like the
problem was fixed in 4.9.0: https://godbolt.org/z/MqE7sP .

I think the bug can be closed now.

[Bug target/94852] -ffloat-store on x64 target

2020-04-29 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94852

--- Comment #6 from Ivan Sorokin  ---
(In reply to Richard Biener from comment #1)
> @item -ffloat-store
> @opindex ffloat-store
> Do not store floating-point variables in registers, and inhibit other
> options that might change whether a floating-point value is taken from a
> register or memory.
> 
> I think it does what it says?

This is a follow-up for my previous comment.

Perhaps I haven't explained myself properly; let me explain why I find the
existing behavior a bit confusing.

From the documentation on -ffloat-store:
"This option prevents undesirable excess precision on machines such as the
68000 where the floating registers (of the 68881) keep more precision than a
double is supposed to have. Similarly for the x86 architecture."

When a person uses -ffloat-store the desired effect is not the additional
loads/stores, but reproducible results across different compiler
versions/optimization options. It just so happens that the cheapest way to do so
is adding additional loads/stores.

I'm pretty sure most users would be in favor of removing extra loads/stores
when they don't affect the results.

I understand that perhaps there are reasons why -ffloat-store should work the
way it works now. If this is true, I would recommend updating the documentation
by reflecting the cases (if they exist) when one might want to use
-ffloat-store on x86-64. From what I understand now, using -ffloat-store on
x86-64 is just a mistake.

[Bug target/94852] -ffloat-store on x64 target

2020-04-29 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94852

--- Comment #4 from Ivan Sorokin  ---
(In reply to Richard Biener from comment #1)
> @item -ffloat-store
> @opindex ffloat-store
> Do not store floating-point variables in registers, and inhibit other
> options that might change whether a floating-point value is taken from a
> register or memory.
> 
> I think it does what it says?

Yes, the behavior of the compiler and the documentation match very well. The
compiler works as intended. My report is not about a bug, but about a possible
improvement.

If ignoring or implementing a warning is considered undesirable, I would
suggest expanding the documentation by clarifying the interaction between
-ffloat-store and -mfpmath=sse.

Something like this in the documentation would help: "If used together with
-mfpmath=sse, -ffloat-store doesn't change the results of floating point
operations. The only effect it has is severely pessimizing the generated code."

[Bug target/94852] New: -ffloat-store on x64 target

2020-04-29 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94852

Bug ID: 94852
   Summary: -ffloat-store on x64 target
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

At the moment -ffloat-store significantly pessimizes the code generation
regardless of whether -mfpmath=sse -msse2 are used or not:

float f(float a, float b)
{
return a + b;
}

-O2:
addss   xmm0, xmm1
ret

-O2 -ffloat-store:
movss   DWORD PTR [rsp-20], xmm0
movss   xmm0, DWORD PTR [rsp-20]
movss   DWORD PTR [rsp-24], xmm1
addss   xmm0, DWORD PTR [rsp-24]
movss   DWORD PTR [rsp-4], xmm0
movss   xmm0, DWORD PTR [rsp-4]
ret

Note that -mfpmath=sse -msse2 are the defaults on x86-64. My understanding is
that -ffloat-store doesn't affect the result of floating point operations when
SSE math is used. If this is true -ffloat-store pessimizes generated code
without any change in observable behavior.

Recently I have found a Steam game that targets x86-64 and was compiled with
-ffloat-store (presumably by mistake). For details see:
https://forums.factorio.com/viewtopic.php?f=30&t=81134 . When -ffloat-store was
removed a developer reported a 35% speedup of the Linux version of the game.

My guess is that -ffloat-store might be used by mistake when someone tries to
get reproducible results on x86 without realizing that the same flag affects
performance negatively on x86-64.

To prevent issues like this in the future I think GCC could do two things:
1. Ignore -ffloat-store when it doesn't affect the result of floating-point
operations, treating the redundant loads/stores as optimized away.
2. Issue a warning when -ffloat-store doesn't affect the result of
floating-point operations, because there is no point in using a flag whose only
effect is pessimizing code generation.
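
As a rough illustration of when the flag could matter at all, a program can
inspect the standard FLT_EVAL_METHOD macro from <cfloat>: a value of 0 means
operations are evaluated in the range and precision of their own type, i.e.
without excess precision. This is only a sketch under the assumption that
excess precision is the sole thing -ffloat-store is meant to guard against;
it is not how GCC itself decides anything, and the function names are made up.

```cpp
#include <cassert>
#include <cfloat>

// Reports the floating-point evaluation method of this build.
// 0: evaluate in the operand type; 1 or 2: evaluate in a wider type;
// negative: indeterminable.
int evaluation_method()
{
    return FLT_EVAL_METHOD;
}

// Hypothetical predicate: under the stated assumption, forcing values
// through memory can only change results when excess precision exists.
bool float_store_could_change_results()
{
    return evaluation_method() != 0;
}

// The function from the report.
float f(float a, float b)
{
    return a + b;
}
```

With SSE math, which the report notes is the default on x86-64, one would
expect evaluation_method() to report 0.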

[Bug analyzer/94355] New: support for C++ new expression

2020-03-27 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94355

Bug ID: 94355
   Summary: support for C++ new expression
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: analyzer
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

At the moment the static analyzer warns about leaked malloc allocations. It
would be great if C++ new expressions were also supported.

Example:

void f()
{
char* p = new char;
}

Expected diagnostic:

warning: leak of 'p' [CWE-401] [-Wanalyzer-malloc-leak]

3 | char* p = new char;

[Bug target/91824] unnecessary sign-extension after _mm_movemask_epi8 or __builtin_popcount

2020-02-01 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91824

--- Comment #7 from Ivan Sorokin  ---
(In reply to Jakub Jelinek from comment #6)
> Fixed.

Thank you!

[Bug c++/93211] New: equivalence of dependent function calls doesn't check if the call is eligible for ADL

2020-01-09 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93211

Bug ID: 93211
   Summary: equivalence of dependent function calls doesn't check
if the call is eligible for ADL
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider this code:

// https://gcc.godbolt.org/z/3U2TTd
#include 

template
void g(T);

template
void f() {} // (1)

template
void f() {} // (2)

The question is whether (1) and (2) are (re)definitions of the same function or
definitions of two different functions. Currently GCC believes that this is a
redefinition of the same function.

I think (1) and (2) should be definitions of two different functions, because
"decltype(g(T{}))" and "decltype(::g(T{}))" are susceptible to different SFINAE
errors: the overload candidate set of (2) is fixed and the overload candidate
set of (1) can be extended arbitrarily by ADL.

[Bug c++/92707] New: type alias on type alias on lambda in unevaluated context does not work

2019-11-28 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92707

Bug ID: 92707
   Summary: type alias on type alias on lambda in unevaluated
context does not work
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

GCC shows an error on this code:

template 
using foo = decltype([] {});

template 
using bar = foo;

extern foo a;
extern bar a; // error: 'bar' does not name a type

The error is wrong because bar is a regular type alias. Clearly it names a
type.

If I replace the lambda with an integer the error goes away.

[Bug middle-end/91824] New: unnecessary sign-extension after _mm_movemask_epi8 or __builtin_popcount

2019-09-19 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91824

Bug ID: 91824
   Summary: unnecessary sign-extension after _mm_movemask_epi8 or
__builtin_popcount
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

gcc -O2 -mpopcnt leaves unnecessary cdqe:

#include <stdint.h>
#include <immintrin.h>

void f(uint64_t& val, __m128i mask)
{
val += __builtin_popcount(_mm_movemask_epi8(mask));
}

void g(uint64_t& val, __m128i mask)
{
val += __builtin_popcountll(_mm_movemask_epi8(mask));
}

f:
  pmovmskb eax, xmm0
  popcnt eax, eax
  cdqe
  add QWORD PTR [rdi], rax
  ret
g:
  pmovmskb eax, xmm0
  cdqe
  popcnt rax, rax
  add QWORD PTR [rdi], rax
  ret

Both cdqe instructions are unnecessary, because the results of both pmovmskb
and __builtin_popcount cannot be negative.

Only the lower 16 bits of the pmovmskb result can be non-zero. And the range of
popcnt is either [0..32] or [0..64] depending on the argument.
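
The range argument can be checked by brute force. The sketch avoids SSE
intrinsics so that it builds on any target with GCC or Clang: it walks over
every value that fits in the lower 16 bits and confirms that
__builtin_popcount never returns a negative number (and in fact never more
than 16), so sign extension and zero extension of the result coincide. The
function name is made up for this illustration.

```cpp
#include <cassert>

// Returns true if the population count of every 16-bit mask value lies
// in [0, 16], which means the int result is never negative.
bool popcount_of_16bit_masks_is_nonnegative()
{
    for (unsigned mask = 0; mask <= 0xffffu; ++mask)
    {
        int bits = __builtin_popcount(mask);
        if (bits < 0 || bits > 16)
            return false;
    }
    return true;
}
```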

[Bug tree-optimization/91400] New: __builtin_cpu_supports conjunction is optimized poorly

2019-08-08 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91400

Bug ID: 91400
   Summary: __builtin_cpu_supports conjunction is optimized poorly
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Clang 8 optimizes both f() and g() to the same code:

bool f()
{
return __builtin_cpu_supports("popcnt") && __builtin_cpu_supports("ssse3");
}

bool g()
{
extern unsigned int cpu_model;
return (cpu_model & 64) && (cpu_model & 4);
}

f()/g():
mov eax, dword ptr [rip + cpu_model]
and eax, 68
cmp eax, 68
seteal
ret

GCC generates this code only for g(). For f() GCC generates less optimal code:

f():
mov edx, DWORD PTR __cpu_model[rip+12]
mov eax, edx
shr eax, 6
and eax, 1
and edx, 4
mov edx, 0
cmove   eax, edx
ret

I believe it would be great if GCC were able to generate the same code for f()
too.
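
The transformation behind the and/cmp sequence is that testing two single-bit
flags separately is the same as masking with both bits and comparing against
the combined mask (64 | 4 == 68). A small sketch that checks this over all
8-bit values; the function names are made up for this illustration.

```cpp
#include <cassert>

// Two separate bit tests joined by a logical AND, as in g().
bool both_bits_set_separately(unsigned model)
{
    return (model & 64) && (model & 4);
}

// Single mask-and-compare form, matching the and/cmp sequence.
bool both_bits_set_combined(unsigned model)
{
    return (model & 68) == 68;
}

// Returns true if both forms agree for every value in [0, 255].
bool bit_test_forms_agree()
{
    for (unsigned model = 0; model < 256; ++model)
        if (both_bits_set_separately(model) != both_bits_set_combined(model))
            return false;
    return true;
}
```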

[Bug middle-end/90345] too pessimistic check whether pointer may alias a local variable

2019-05-07 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90345

--- Comment #4 from Ivan Sorokin  ---
Making points-to analysis aware of SESE regions will definitely help here and
is a nice thing to have.

There is one more option. In my reduced test case the body of 'push_back' is
unavailable, but when it is available it can be analysed, and an attribute can
be added recording that 'push_back' only uses the received reference internally
and does not let it escape.

From my experiments this is what clang does: even when the body of 'push_back'
is not inlined it generates different code for 'operator*=' depending on
whether push_back escapes the received reference or not:

void push_back(uint32_t const&) __attribute__((noinline));

void big_integer::push_back(uint32_t const& a)
{
__asm__("" : : : "memory");
//__asm__("" : : "g"(&a) : "memory");
}

I guess with LTO enabled this type of analysis is quite powerful, as many
'const&' and 'this' parameters in C++ don't really escape.

[Bug middle-end/90345] New: too pessimistic check whether pointer may alias local variable

2019-05-04 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90345

Bug ID: 90345
   Summary: too pessimistic check whether pointer may alias local
variable
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider the following example (reduced from a real program):

#include <stddef.h>
#include <stdint.h>

struct big_integer
{
void push_back(uint32_t const&);
size_t size;
uint32_t* digits;
};

big_integer& operator*=(big_integer& a, uint32_t b)
{
uint64_t const BASE = 1ull << 32;

uint32_t carry = 0;
for (size_t i = 0; i != a.size; i++)
{
uint64_t sum = 1ull * a.digits[i] * b + carry;
carry = static_cast<uint32_t>(sum / BASE);
a.digits[i] = static_cast<uint32_t>(sum % BASE);
}

if (carry)
{
a.push_back(carry);
//a.push_back(uint32_t(carry));
}

return a;
}

GCC 9.1 compiles the inner loop to this:
.L9:
mov esi, DWORD PTR [rsp+12]  ; load carry
.L5:
mov edx, DWORD PTR [rcx]
add rcx, 4
imul rdx, r8
add rdx, rsi
mov rsi, rdx
shr rsi, 32
mov DWORD PTR [rsp+12], esi  ; store carry
mov DWORD PTR [rcx-4], edx
cmp r9, rcx
jne .L9

As one can see, carry is spilled to the stack and is loaded and stored on each
iteration of the loop. Loading and storing carry on each iteration is not
needed: it is a local variable and its address is not taken.

My guess is that GCC believes that it escapes because of the push_back after
the loop. At least if I make a copy of carry before push_back'ing it (as shown
in the comment) the problem goes away.

I think that alias analysis can be improved here: carry may not alias
a.digits[i] because it escapes only after the loop.
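
For reference, a self-contained sketch of the workaround mentioned in the
comment: passing a temporary copy to push_back means no reference to carry
itself is ever handed out. The body of push_back and the free function name
multiply_in_place are hypothetical additions so that the sketch can be built
and run; in the reduced test case the body of push_back is not visible to the
compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct big_integer
{
    void push_back(std::uint32_t const& value);
    std::size_t size;
    std::uint32_t* digits; // caller-provided storage with spare capacity
};

// Hypothetical definition, only here to make the sketch runnable.
void big_integer::push_back(std::uint32_t const& value)
{
    digits[size++] = value;
}

big_integer& multiply_in_place(big_integer& a, std::uint32_t b)
{
    std::uint64_t const BASE = 1ull << 32;

    std::uint32_t carry = 0;
    for (std::size_t i = 0; i != a.size; i++)
    {
        std::uint64_t sum = 1ull * a.digits[i] * b + carry;
        carry = static_cast<std::uint32_t>(sum / BASE);
        a.digits[i] = static_cast<std::uint32_t>(sum % BASE);
    }

    if (carry)
    {
        // Bind the reference parameter to a temporary copy, not to 'carry'.
        a.push_back(std::uint32_t(carry));
    }

    return a;
}
```

Whether this actually keeps carry in a register depends on the compiler
version; the report only states that it did so with GCC 9.1.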

[Bug c++/86346] New: internal compiler error related to duduction guides

2018-06-28 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86346

Bug ID: 86346
   Summary: internal compiler error related to duduction guides
   Product: gcc
   Version: 8.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Here is the code:

template
struct bool_constant {};

using true_type = bool_constant;

template
constexpr bool is_same_v = false;

template
constexpr bool is_same_v = true;

template
struct vector
{
template
static constexpr bool v = is_same_v || false;

vector(bool_constant>, T); // (1)
};

template<>
struct vector // (2)
{
vector(true_type, bool);
};

vector v { true_type{}, false }; // (3)

The problem is that the deduction guide generated from the constructor (1)
refers to a member v of the primary template. At (3) T is then deduced to be
bool, but the substitution T->bool should never be used for the primary
template, as there is an explicit specialization vector (2).

Clang gives an error on this code "note: candidate template ignored:
substitution failure [with T = bool]: cannot reference member of primary
template because deduced class template specialization 'vector' is an
explicit specialization".

I believe GCC should provide a similar message in this case instead of
crashing.

[Bug c++/82910] New: marking data members private affects code generation of copying

2017-11-08 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82910

Bug ID: 82910
   Summary: marking data members private affects code generation
of copying
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vanyacpp at gmail dot com
  Target Milestone: ---

Consider the following piece of code:

struct pair
{
private:
void* first;
unsigned second;
};

struct other
{
pair get() const;
};

struct my
{
pair get(other const& other);

pair current;
pair* target;
};

pair my::get(other const& other)
{
*target = other.get();
return current;
}

For the function my::get() GCC generates the following (quite inefficient)
code:

my::get(other const&):
  pushq %rbx
  movq %rdi, %rbx
  movq %rsi, %rdi
  subq $16, %rsp
  call other::get() const
  movq 16(%rbx), %rcx
  movq %rax, (%rsp)
  movq %rdx, 8(%rsp)
  movq %rax, (%rcx)
  movl 8(%rsp), %eax
  movl %eax, 8(%rcx)
  movq (%rbx), %rax
  movq 8(%rbx), %rdx
  addq $16, %rsp
  popq %rbx
  ret

The expected generated code is:

my::get(other const&):
  pushq %rbp
  pushq %rbx
  movq %rdi, %rbx
  subq $8, %rsp
  movq 16(%rdi), %rbp
  movq %rsi, %rdi
  call other::get() const
  movq %rax, 0(%rbp) # just storing to *my::target...
  movq %rdx, 8(%rbp)
  movq (%rbx), %rax  # ... and then loading my::current
  movq 8(%rbx), %rdx
  addq $8, %rsp
  popq %rbx
  popq %rbp
  ret

The issue can be worked around. One way to do this is to make the data members
of pair public. Another way is changing the type of pair::second to unsigned
long (to match the size of a pointer).

It would be great if GCC generated the second version irrespective of the
private-ness or the size of pair::second.

[Bug target/82693] gcc/clang calling convension mismatch

2017-10-23 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82693

Ivan Sorokin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #8 from Ivan Sorokin  ---
Yep. It is. And the proposed resolution is exactly what I would like to see.
Closing as duplicate. Thank you.

*** This bug has been marked as a duplicate of bug 60336 ***

[Bug c++/60336] empty struct value is passed differently in C and C++

2017-10-23 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60336

Ivan Sorokin  changed:

   What|Removed |Added

 CC||vanyacpp at gmail dot com

--- Comment #49 from Ivan Sorokin  ---
*** Bug 82693 has been marked as a duplicate of this bug. ***

[Bug target/82693] gcc/clang calling convension mismatch

2017-10-23 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82693

--- Comment #6 from Ivan Sorokin  ---
I added files to reproduce the issue: caller.cpp and callee.cpp are the files
that need to be compiled with different compilers. empty.h is a common header.
build.sh is a shell script that compiles and runs all four caller/callee
gcc/clang combinations.

[Bug target/82693] gcc/clang calling convension mismatch

2017-10-23 Thread vanyacpp at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82693

Ivan Sorokin  changed:

   What|Removed |Added

  Attachment #42451|0   |1
is obsolete||

--- Comment #5 from Ivan Sorokin  ---
Created attachment 42454
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42454&action=edit
caller.cpp
