[Bug c/110654] inconsistent error message in presence of unexpected encoded characters

2023-07-14 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110654

--- Comment #2 from Ulrich Drepper  ---
(In reply to Andrew Pinski from comment #1)
> So this is due to differences in the languages themselves rather than due to
> C vs C++ front-end ...

This is certainly true for the validation.

But the standard never says anything about how an error should be reported.  I
don't think there is a reason to make this more obscure than necessary.

[Bug c++/110655] New: incorrect position of indicator in error message

2023-07-13 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110655

Bug ID: 110655
   Summary: incorrect position of indicator in error message
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take this source and run it through a trunk version or earlier of the C++
frontend.  

#include <stdio.h>
int main() {
  puts(“hello world”);
  return 0;
}

This is the same code as in BZ 110654 but shows a different, frontend-specific
problem.

The output is:

u.c:3:8: error: extended character “ is not valid in an identifier
3 |   puts(“hello world”);
  |^
u.c:3:15: error: extended character ” is not valid in an identifier
3 |   puts(“hello world”);
  |   ^
u.c: In function ‘int main()’:
u.c:3:8: error: ‘“hello’ was not declared in this scope
3 |   puts(“hello world”);
  |^~


The problem is the second error message.  It should report the U201d character
at what would be the end of the string but its column indicator points to the
second word in the string.

If you want to go further down the rathole of this example, the first error
message says that U201c is not valid in an identifier which is of course
correct.  But then gcc nevertheless adds it in front of the supposed identifier
which extends to the end of the first word of the string.  See the last error
message which says that “hello is not a valid identifier.  This is
inconsistent.

[Bug c/110654] New: inconsistent error message in presence of unexpected encoded characters

2023-07-13 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110654

Bug ID: 110654
   Summary: inconsistent error message in presence of unexpected
encoded characters
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take this code which in a similar form was taken from a text document where the
smart quote handling used the U201c and U201d characters instead of simple
ASCII double quotes.  Note, this text should be encoded in UTF-8 and the
environment of the compiler must use UTF-8 as well.

#include <stdio.h>
int main() {
  puts(“hello world”);
  return 0;
}

Compiling this with a recent gcc 13 or older versions leads to these error
messages:

u.c: In function ‘main’:
u.c:3:8: error: stray ‘\342’ in program
3 |   puts(<U+201C>hello world<U+201D>);
  |^~~~
u.c:3:9: error: ‘hello’ undeclared (first use in this function); did you mean
‘ftello’?
3 |   puts(“hello world”);
  | ^
  | ftello
u.c:3:9: note: each undeclared identifier is reported only once for each
function it appears in
u.c:3:14: error: expected ‘)’ before ‘world’
3 |   puts(“hello world”);
  |   ~  ^~
  |  )
u.c:3:20: error: stray ‘\342’ in program
3 |   puts(<U+201C>hello world<U+201D>);
  |   ^~~~

The problem is the initial message about "stray ‘\342’" and the notation used
in the context line.  In the latter the byte is correctly recognized as being
part of a UTF-8 character.
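To make the connection between the two notations concrete, here is a minimal
sketch (not GCC code; the helper names octal_escape and decode3 are made up for
this illustration).  It shows that the first UTF-8 byte of U201c is 0xE2, which
is 342 in octal, and that the three bytes together decode to the code point.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// U+201C (left double quotation mark) encoded in UTF-8 is the three
// bytes E2 80 9C.  The bytes are spelled out so that the example does
// not depend on the encoding of this source file.
static const unsigned char left_quote[] = { 0xE2, 0x80, 0x9C };

// Format one byte as a backslash followed by its octal value, which is
// the notation seen in the "stray" message.
std::string octal_escape(unsigned char byte)
{
  char buf[8];
  std::snprintf(buf, sizeof buf, "\\%o", static_cast<unsigned>(byte));
  return buf;
}

// Decode a three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx)
// into its code point.  No validation is done; this is only a sketch.
unsigned decode3(const unsigned char* s)
{
  return ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
}
```

So the message names only the first byte of the sequence while the context
line refers to the whole character.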

Note that this is in contrast to the C++ frontend which handles this correctly.
 It shows:

u.c:3:8: error: extended character “ is not valid in an identifier
3 |   puts(“hello world”);
  |^

The C frontend should adopt the same code as the C++ frontend.

[Bug tree-optimization/109045] New: assume attribute and std::optional do not mix

2023-03-06 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109045

Bug ID: 109045
   Summary: assume attribute and std::optional do not mix
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

The assume attribute is meant to help expressing more complex assumptions which
involve function calls.  Given that interfaces should use std::optional when
the semantics matches this should mean code like this should be optimized:

#include <optional>

std::optional<long> g(long);

long f(long a)
{
  auto r = g(a);
  [[assume(!r || *r > 0)]];
  return r.value_or(0) / 2;
}


The generated code should use the unsigned divide by two method but it does
not.  With today's gcc trunk version:

 <_Z1fl>:
   0:   48 83 ec 18             sub    $0x18,%rsp
   4:   e8 00 00 00 00          call   9 <_Z1fl+0x9>
   9:   48 89 04 24             mov    %rax,(%rsp)
   d:   48 89 54 24 08          mov    %rdx,0x8(%rsp)
  12:   31 c0                   xor    %eax,%eax
  14:   80 7c 24 08 00          cmpb   $0x0,0x8(%rsp)
  19:   74 11                   je     2c <_Z1fl+0x2c>
  1b:   48 8b 14 24             mov    (%rsp),%rdx
  1f:   48 89 d0                mov    %rdx,%rax
  22:   48 c1 e8 3f             shr    $0x3f,%rax
  26:   48 01 d0                add    %rdx,%rax
  29:   48 d1 f8                sar    %rax
  2c:   48 83 c4 18             add    $0x18,%rsp
  30:   c3                      ret

The instructions from 1f to 28 inclusive are not needed (and the initial load
at 1b would have to be adjusted).
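As a minimal sketch of why the extra instructions exist (the function names
below are made up for this illustration): a signed division by two has to round
toward zero, which needs the sign fix-up seen at 1f to 28, while a value known
to be positive can simply be shifted.

```cpp
#include <cassert>

// What has to be computed when the sign of v is unknown: the result of
// the division must round toward zero.
long half_signed(long v)
{
  return v / 2;
}

// What suffices once v is known to be non-negative: a plain logical
// shift.  Calling this with a negative v gives a different result.
long half_nonnegative(long v)
{
  return static_cast<long>(static_cast<unsigned long>(v) >> 1);
}
```

Both functions agree for every non-negative argument, which is the range the
assume attribute is meant to communicate here.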

[Bug tree-optimization/107972] New: backward propagation of finite property not performed

2022-12-05 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107972

Bug ID: 107972
   Summary: backward propagation of finite property not performed
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Here is another example where with the help of the FP ranger capabilities the
compiler should generate better code than it does today (trunk):

double f(double a, double b)
{
  if (!__builtin_isfinite(a))
    return -1.0;

  double res = a + b;
  if (! __builtin_isfinite(res))
    __builtin_unreachable();
  return res;
}

The condition guaranteed by the __builtin_unreachable implies that both a and
b must be finite.  Hence the initial comparison can be elided.

The same is true for - and * and also for the first operand of /.
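The implication can be checked with a small sketch (sum_is_finite is a made-up
helper, and this assumes IEEE arithmetic without -ffast-math): whenever one
operand is infinite or NaN the sum is not finite, so a finite sum proves that
both operands were finite.  The converse does not hold because two finite
values can overflow.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True if the sum of a and b is a finite value.
bool sum_is_finite(double a, double b)
{
  return std::isfinite(a + b);
}
```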

[Bug debug/107414] dwarf 5 C macro support

2022-10-26 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107414

--- Comment #4 from Ulrich Drepper  ---
Actually, Jakub was right.  This is a gdb issue.  The gdb maintainers pointed
me to the trunk version and this indeed works with this simple code sequence. 
There might have been a bug as in 107012 but even after that fix gdb didn't
handle the dwarf data correctly before a recent commit.

[Bug debug/107414] dwarf 5 C macro support

2022-10-26 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107414

--- Comment #2 from Ulrich Drepper  ---
OK, I submitted:

https://sourceware.org/bugzilla/show_bug.cgi?id=29725

Let's see what they say.

[Bug debug/107414] New: dwarf 5 C macro support

2022-10-26 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107414

Bug ID: 107414
   Summary: dwarf 5 C macro support
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take this code:

union node {
  struct {
    int a;
  } l;
} x;

#define memory l.a

int main()
{
  return x.memory;
}

When compiled with -gdwarf-4 -g3 and run in gdb, it is possible to use

Breakpoint 1, main () at u.c:11
11        return x.memory;
(gdb) p x.memory
$1 = 0

When instead -gdwarf-5 -g3 is used 'memory' is not known to be a macro and one
gets

Breakpoint 1, main () at u.c:11
11        return x.memory;
(gdb) p x.memory
There is no member named memory.

Shouldn't the Dwarf 5 data be a superset of what Dwarf 4 provides?

This is not new in the trunk/13 version.  12.1 fails as well and likely prior
versions, too.

[Bug tree-optimization/107043] range information not used in popcount

2022-09-27 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107043

--- Comment #2 from Ulrich Drepper  ---
My original example and Andrew's g0 are handled by Aldy's patches

2022-09-26  Aldy Hernandez  

PR tree-optimization/107009
* range-op.cc (operator_bitwise_and::op1_range): Optimize 0 = x & MASK.
(range_op_bitwise_and_tests): New test.

2022-09-26  Aldy Hernandez  

PR tree-optimization/107009
* tree-ssa-dom.cc
(dom_opt_dom_walker::set_global_ranges_from_unreachable_edges):
Iterate over exports.


The g1 test case isn't handled, yet.

[Bug tree-optimization/107043] New: range information not used in popcount

2022-09-26 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107043

Bug ID: 107043
   Summary: range information not used in popcount
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

This code could be compiled to a simple return of the value 1 but it isn't
because the range information for n does not survive long enough.

int g(int n)
{
  n &= 0x8000;
  if (n == 0)
    return 1;
  return __builtin_popcount(n);
}

The code generated today is lengthy:

   0:   81 e7 00 80 00 00       and    $0x8000,%edi
   6:   ba 01 00 00 00          mov    $0x1,%edx
   b:   89 f8                   mov    %edi,%eax
   d:   c1 e8 0f                shr    $0xf,%eax
  10:   85 ff                   test   %edi,%edi
  12:   0f 44 c2                cmove  %edx,%eax
  15:   c3                      ret
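A minimal sketch of the reasoning, using the function from the report next to
the folded form one would hope for (g_folded is a made-up name): after the mask
n is either 0 or 0x8000, and 0x8000 has exactly one bit set, so both paths
return 1.

```cpp
#include <cassert>

// The function from the report.  __builtin_popcount is a GCC/Clang builtin.
int g(int n)
{
  n &= 0x8000;
  if (n == 0)
    return 1;
  return __builtin_popcount(n);
}

// After the mask, n is either 0 or 0x8000.  The first case returns 1
// explicitly and the popcount of 0x8000 is 1, so the result is constant.
int g_folded(int)
{
  return 1;
}
```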

[Bug tree-optimization/107009] New: massive unnecessary code blowup in vectorizer

2022-09-22 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107009

Bug ID: 107009
   Summary: massive unnecessary code blowup in vectorizer
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Given an annotated saxpy function:

#include 

void saxpy(size_t n, float* __restrict res, float a, const float* __restrict x,
const float* __restrict y)
{
  if (n == 0 || n % 8 != 0)
    __builtin_unreachable();
  res = (float*)__builtin_assume_aligned(res, 32);
  x = (const float*) __builtin_assume_aligned(x, 32);
  y = (const float*) __builtin_assume_aligned(y, 32);
  for (size_t i = 0; i < n; ++i)
    res[i] = a * x[i] + y[i];
}


Compiling this with the gcc 12.2.1 version from Fedora 36 leads to the
expected, guided result (although the shrq isn't necessary…) with -O3:

_Z5saxpymPffPKfS1_:
.cfi_startproc
shrq    $3, %rdi
vbroadcastss    %xmm0, %ymm0
xorl    %eax, %eax
salq    $5, %rdi
.p2align 4
.p2align 3
.L2:
vmovaps (%rdx,%rax), %ymm1
vfmadd213ps (%rcx,%rax), %ymm0, %ymm1
vmovaps %ymm1, (%rsi,%rax)
addq    $32, %rax
cmpq    %rdi, %rax
jne .L2
vzeroupper
ret

With the current trunk gcc the result is massively bigger and given the guidance
in the sources none of the extra code is necessary.

_Z5saxpymPffPKfS1_:
.LFB22:
.cfi_startproc
movq    %rdi, %r8
movq    %rdx, %rdi
movq    %rcx, %rdx
leaq    -1(%r8), %rax
cmpq    $6, %rax
jbe .L7
movq    %r8, %rcx
vbroadcastss    %xmm0, %ymm2
xorl    %eax, %eax
shrq    $3, %rcx
salq    $5, %rcx
.p2align 4
.p2align 3
.L3:
vmovaps (%rdi,%rax), %ymm1
vfmadd213ps (%rdx,%rax), %ymm2, %ymm1
vmovaps %ymm1, (%rsi,%rax)
addq    $32, %rax
cmpq    %rcx, %rax
jne .L3
movq    %r8, %rax
andq    $-8, %rax
testb   $7, %r8b
je  .L18
vzeroupper
.L2:
movq    %r8, %rcx
subq    %rax, %rcx
leaq    -1(%rcx), %r9
cmpq    $2, %r9
jbe .L5
vmovaps (%rdx,%rax,4), %xmm3
vshufps $0, %xmm0, %xmm0, %xmm1
movq    %rcx, %r9
vfmadd132ps (%rdi,%rax,4), %xmm3, %xmm1
andq    $-4, %r9
vmovaps %xmm1, (%rsi,%rax,4)
addq    %r9, %rax
andl    $3, %ecx
je  .L16
.L5:
vmovss  (%rdi,%rax,4), %xmm1
leaq    0(,%rax,4), %rcx
leaq    1(%rax), %r9
vfmadd213ss (%rdx,%rax,4), %xmm0, %xmm1
vmovss  %xmm1, (%rsi,%rcx)
cmpq    %r8, %r9
jnb .L16
vmovss  4(%rdi,%rcx), %xmm1
addq    $2, %rax
vfmadd213ss 4(%rdx,%rcx), %xmm0, %xmm1
vmovss  %xmm1, 4(%rsi,%rcx)
cmpq    %r8, %rax
jnb .L16
vmovss  8(%rdx,%rcx), %xmm4
vfmadd132ss 8(%rdi,%rcx), %xmm4, %xmm0
vmovss  %xmm0, 8(%rsi,%rcx)
.L16:
ret
.p2align 4
.p2align 3
.L18:
vzeroupper
ret
.p2align 4
.p2align 3
.L7:
xorl    %eax, %eax
jmp .L2

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-05 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

--- Comment #9 from Ulrich Drepper  ---
Created attachment 53419
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53419&action=edit
diff -y of current and proposed output

To compare the results more easily.

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-05 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

Ulrich Drepper  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #53410|0                           |1
        is obsolete|                            |

--- Comment #8 from Ulrich Drepper  ---
Created attachment 53418
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53418&action=edit
Update pretty printers for all? containers

I spent some more time on this and should now have covered all the containers. 
The changes for the other containers are in line with what the previous patch
produced.  The output should now be as consistent as possible across all
containers.

Along the line some bugs have been fixed, too.

To illustrate the change I'll also attach to this bug a diff -y output of the
current and the proposed code.

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-04 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

--- Comment #6 from Ulrich Drepper  ---
Created attachment 53410
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53410&action=edit
consistent pretty printing of contains

How about this patch?

I used the attached test case.  With the current code 'info locals' at the end
of the function prints:

t1 = empty std::tuple
t2 = std::tuple containing = {[0] = 0}
t3 = std::tuple containing = {[0] = 0, [1] = 0}
v1 = std::vector of length 0, capacity 0
v2 = std::vector of length 1, capacity 1 = {0}
v3 = std::vector of length 2, capacity 2 = {0, 0}
va1 = std::vector of length 0, capacity 0
va2 = std::vector of length 1, capacity 1 = {0}
va3 = std::vector of length 2, capacity 2 = {0, 0}
a1 = {_M_elems = {}}
a2 = {_M_elems = {0}}
a3 = {_M_elems = {0, 0}}
p1 = {first = 0, second = 0}
b1 = std::bitset
b2 = std::bitset
b3 = std::bitset
b4 = std::bitset = {[0] = 1}
b5 = std::bitset = {[70] = 1}

With the patch it prints:

t1 = std::tuple<> = {}
t2 = std::tuple = {[0] = 0}
t3 = std::tuple = {[0] = 0, [1] = 0}
v1 = std::vector of length 0, capacity 0 = {}
v2 = std::vector of length 1, capacity 1 = {0}
v3 = std::vector of length 2, capacity 2 = {0, 0}
va1 = std::vector > of length 0, capacity 0 = {}
va2 = std::vector > of length 1, capacity 1 = {0}
va3 = std::vector > of length 2, capacity 2 = {0, 0}
a1 = std::array = {}
a2 = std::array = {0}
a3 = std::array = {0, 0}
p1 = std::pair = {[0] = 0, [1] = 0}
b1 = std::bitset<0> = {}
b2 = std::bitset<1> = {}
b3 = std::bitset<2> = {}
b4 = std::bitset<2> = {[0] = 1}
b5 = std::bitset<72> = {[70] = 1}

This is quite a change from before but I think quite consistent.  NB: also
tested with _GLIBCXX_DEBUG.

There is one point I'm not sure about: should the std::vector printer
explicitly show the length (capacity is no question)?  This is the one
remaining inconsistency.  The tuple printer does not explicitly list the number
of elements.  On the other hand, to avoid making the output too long the
std::vector printer does not show the indices and therefore the number of
elements cannot be seen right away.  So, maybe leave it as is?

BTW: notice that I added a pretty printer for std::array and I also added code
to recognize a standard allocator template argument to std::vector (which in
this case is not shown).

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-04 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

--- Comment #5 from Ulrich Drepper  ---
Or should the std::pair output even be

p1 = std::pair = {[0] = 0, [1] = 0}

??

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-04 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

--- Comment #4 from Ulrich Drepper  ---
Ugh, this one is a pasto:

v1 = std::vector of length 0, capacity 0 = { }

instead of

v1 = std::vector of length 0, capacity 0

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-04 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

--- Comment #3 from Ulrich Drepper  ---
Actually, I think for the std::pair definition I'd like to see

p1 = {[0] = 0, [1] = 0}

instead of

p1 = {first = 0, second = 0}

Again, more uniform and I'd say it should be encouraged to use std::get instead
of .first / .second because it's compatible with std::tuple.
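A minimal sketch of what that compatibility buys (sum_first_two is a made-up
helper, written for C++14 or later): code that uses std::get works unchanged
for std::pair and std::tuple, while .first / .second only works for std::pair.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Add the first two elements of any object that supports std::get<0>
// and std::get<1>, such as std::pair or std::tuple.
template<typename T>
auto sum_first_two(const T& t)
{
  return std::get<0>(t) + std::get<1>(t);
}
```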

[Bug libstdc++/65230] pretty-print inconsistent output for similar objects

2022-08-04 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65230

Ulrich Drepper  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drepper.fsp+rhbz at gmail dot com

--- Comment #2 from Ulrich Drepper  ---
Let's go through the details.  I like the idea of a common format but there
also shouldn't be any information lost.

Here's some test code, similar to what Martin has:

#include 
#include 
#include 
#include 
#include 

int main()
{
  std::tuple<> t1;
  std::tuple t2;
  std::tuple t3;
  std::vector v1;
  std::vector v2 { 0 };
  std::vector v3 { 0, 0 };
  std::array a1;
  std::array a2 { 0 };
  std::array a3 { 0, 0 };
  std::pair p1;
  std::bitset<0> b1;
  std::bitset<1> b2("0");
  std::bitset<2> b3("00");
  return 0;
}


The output as of gcc-12.1.1 (and gdb-12.1, what a coincidence) is:

(gdb) info locals 
t1 = empty std::tuple
t2 = std::tuple containing = {[0] = 0}
t3 = std::tuple containing = {[0] = 0, [1] = 0}
v1 = std::vector of length 0, capacity 0
v2 = std::vector of length 1, capacity 1 = {0}
v3 = std::vector of length 2, capacity 2 = {0, 0}
a1 = {_M_elems = {}}
a2 = {_M_elems = {0}}
a3 = {_M_elems = {0, 0}}
p1 = {first = 0, second = 0}
b1 = std::bitset
b2 = std::bitset
b3 = std::bitset

This is just my opinion, but I would like to see the following output (NB, this
already uses the tuple pretty printer change I committed):

(gdb) info locals 
t1 = std::tuple<> = { }
t2 = std::tuple = {[0] = 0}
t3 = std::tuple = {[0] = 0, [1] = 0}
v1 = std::vector of length 0, capacity 0
v2 = std::vector of length 1, capacity 1 = {0}
v3 = std::vector of length 2, capacity 2 = {0, 0}
a1 = std::array = { }
a2 = std::array = {0}
a3 = std::array = {0, 0}
p1 = {first = 0, second = 0}
b1 = std::bitset<0> = { }
b2 = std::bitset<1> = {0}
b3 = std::bitset<2> = {0, 0}

This means several changes but it corrects the rather ad-hoc nature of the
current output to be more uniform.

[Bug c++/105626] -Wformat should accept u8"" strings

2022-07-05 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105626

--- Comment #6 from Ulrich Drepper  ---
(In reply to Marek Polacek from comment #5)
> Fixed for GCC 13.  I could probably backport to GCC 12, if desirable.

Thanks.  And I certainly would appreciate a backport since this is an annoying
warning in some of my projects.

[Bug c++/105626] -Wformat should accept u8"" strings

2022-07-04 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105626

--- Comment #2 from Ulrich Drepper  ---
Could something like this be added?  It seems to have little chance, if any, of
disrupting any meaningful diagnostic while handling this specific case.

[Bug c++/105626] New: -Wformat should accept u8"" strings

2022-05-17 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105626

Bug ID: 105626
   Summary: -Wformat should accept u8"" strings
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

The discussion about this topic on gcc@
(https://gcc.gnu.org/pipermail/gcc/2022-May/238673.html) ended with the
conclusion that gcc should not disregard the cast in code like this:

#include 

int main()
{
  printf((const char*) u8"test %d\n", 1);
  return 0;
}

With -Wformat this code produces with gcc 12:

t.cc: In function ‘int main()’:
t.cc:5:24: warning: format string is not an array of type ‘char’ [-Wformat=]
5 |   printf((const char*) u8"test %d\n", 1);
  |^

Since

a) there are no I/O functions for u8 strings in C++20
b) using u8 strings is necessary in reliable code
c) it is safe to perform the analysis -Wformat does on u8 strings

I suggest that u8 strings are allowed when testing for -Wformat.
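For illustration, a minimal sketch of the pattern in use (format_test is a
made-up helper).  It is written to compile both before C++20, where a u8
literal is an array of char, and in C++20, where it is an array of char8_t and
the cast is what makes the call possible at all.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format an int using a u8 literal as the format string.  The cast to
// const char* is needed in C++20 because the literal has type
// const char8_t[N] there.
std::string format_test(int value)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, (const char*) u8"test %d\n", value);
  return buf;
}
```

The format string contains only ASCII characters, so the bytes passed to
snprintf are the same as for an ordinary literal.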

[Bug target/104781] [12 regression] SEGV in _Unwind_GetGR during i386 Ada bootstrap

2022-03-15 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104781

Ulrich Drepper  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drepper.fsp+rhbz at gmail dot com

--- Comment #14 from Ulrich Drepper  ---
My standard build now fails in stage 1 with this patch when compiling
unwind-dw2.c for x86:

.../gcc-builds/20220315/gcc/include/cetintrin.h: In function
‘_Unwind_RaiseException’:
.../gcc-builds/20220315/gcc/include/cetintrin.h:47:1: error: inlining failed in
call to ‘always_inline’ ‘_get_ssp’: target specific option mismatch
   47 | _get_ssp (void)
  | ^~~~
.../gnu/gcc/libgcc/config/i386/shadow-stack-unwind.h:32:26: note: called from
here
   32 |   _Unwind_Word ssp = _get_ssp ();   \
  |  ^~~
.../gnu/gcc/libgcc/unwind-dw2.c:1654:7: note: in expansion of macro
‘_Unwind_Frames_Extra’
 1654 |   _Unwind_Frames_Extra (FRAMES);   
\
  |   ^~~~
.../gnu/gcc/libgcc/unwind.inc:140:3: note: in expansion of macro
‘uw_install_context’
  140 |   uw_install_context (_context, _context, frames);


and so on for the other CET functions.  When I comment out the
LIBGCC2_UNWIND_ATTRIBUTE definition in config/i386/i386.h it works.

The compiler command line that is used is

.../gcc-builds/20220315/./gcc/xgcc -B.../gcc-builds/20220315/./gcc/
-B/usr/x86_64-redhat-linux/bin/ -B/usr/x86_64-redhat-linux/lib/ -isystem
/usr/x86_64-redhat-linux/include -isystem /usr/x86_64-redhat-linux/sys-include 
 -fno-checking -g -O2 -m32 -O2  -g -O2 -DIN_GCC-W -Wall -Wno-narrowing
-Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition  -isystem ./include  -fpic
-mlong-double-80 -DUSE_ELF_SYMVER -fcf-protection -mshstk -g -DIN_LIBGCC2
-fbuilding-libgcc -fno-stack-protector  -fpic -mlong-double-80 -DUSE_ELF_SYMVER
-fcf-protection -mshstk -I. -I. -I../../.././gcc -I.../gcc/libgcc
-I.../gcc/libgcc/. -I.../gcc/libgcc/../gcc -I.../gcc/libgcc/../include
-I.../gcc/libgcc/config/libbid -DENABLE_DECIMAL_BID_FORMAT -DHAVE_CC_TLS 
-DUSE_TLS  -o unwind-dw2.o -MT unwind-dw2.o -MD -MP -MF unwind-dw2.dep
-fexceptions -c .../gcc/libgcc/unwind-dw2.c -fvisibility=hidden -DHIDE_EXPORTS

This is on an up-to-date Fedora 35 system with gcc 11.2.1-9

[Bug middle-end/104486] New: if constexpr versus -Wtype-limits

2022-02-10 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104486

Bug ID: 104486
   Summary: if constexpr versus -Wtype-limits
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

One use of 'if constexpr' is to handle different type sizes.  This is a simple
example:

#include 

using some_type = int;

some_type f();

bool foo()
{
  if constexpr (sizeof(some_type) == sizeof(int))
    return f() == INT_MAX;
  else if constexpr (sizeof(some_type) == sizeof(long))
    return f() == LONG_MAX;
  else
    return f() == LONG_LONG_MAX;
}


Compiling this code with -Wtype-limits produces warnings:

g++ -std=gnu++20 -c u.cc -Wtype-limits
u.cc: In function ‘bool foo()’:
u.cc:12:16: warning: comparison is always false due to limited range of data
type [-Wtype-limits]
   12 | return f() == LONG_MAX;
  |^
u.cc:14:16: warning: comparison is always false due to limited range of data
type [-Wtype-limits]
   14 | return f() == LONG_LONG_MAX;
  |^

These statements are guarded by 'if constexpr' which causes the code in the
respective branch to not be active.  The warnings should be disabled for those
code blocks.  Again, the whole point of writing code like this is to avoid
overflow errors and the like.
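Until the diagnostic is changed, one way to sidestep the warning for this
particular pattern is to let the type select the limit.  This is only a sketch
under the assumption that the comparison is always against the maximum of
some_type; is_max is a made-up helper and this does not help when the branches
differ in more than the constant.

```cpp
#include <cassert>
#include <limits>

using some_type = int;

// Compare against the maximum of whatever type some_type currently
// names, without spelling out one branch per possible size.
bool is_max(some_type v)
{
  return v == std::numeric_limits<some_type>::max();
}
```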

[Bug tree-optimization/103857] implement ternary without jump (and comparison)

2021-12-29 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103857

--- Comment #2 from Ulrich Drepper  ---
(In reply to Jakub Jelinek from comment #1)
> I don't think that's equivalent.

You're right, I tried to generalize the code and failed.  In my actual case this
was a single variable the compiler saw the assignments of.  Adding

  if (a[i] != 42 && a[i] != 10)
    __builtin_unreachable();

etc could simulate this.

[Bug tree-optimization/103857] New: implement ternary without jump (and comparison)

2021-12-29 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103857

Bug ID: 103857
   Summary: implement ternary without jump (and comparison)
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

This test case is derived from actual code and I expect it to be not uncommon
in general.  Maybe not exactly in this form but perhaps the matcher can catch a
few more cases.

Take this code:

extern void g(int);
void f1(int* a, int b, int c)
{
  for (unsigned i = 0; i < 100; ++i)
    g(a[i] == b ? c : b);
}
void f2(int* a)
{
  for (unsigned i = 0; i < 100; ++i)
    g(a[i] == 42 ? 10 : 42);
}


The function 'g' is called in each iteration with one of two values, namely the
opposite of the one that is matched in the condition of the ternary operation.
This allows the respective loops to be rewritten as

  for (unsigned i = 0; i < 100; ++i)
    g(a[i] ^ c ^ b);

and

  for (unsigned i = 0; i < 100; ++i)
    g(a[i] ^ 10 ^ 42);

In the former case the c ^ b operation can be hoisted.

This should be faster and smaller in pretty much all situations and on all
platforms.
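The identity behind the rewrite can be checked with a small sketch (the two
function names are made up).  Note that it only holds when the tested value is
known to be one of the two constants, which is the restriction acknowledged in
the follow-up comment on this bug.

```cpp
#include <cassert>

// The ternary form: yields c if a equals b, otherwise b.
int select_other(int a, int b, int c)
{
  return a == b ? c : b;
}

// The branch-free form.  If a == b the b terms cancel and c remains;
// if a == c the c terms cancel and b remains.  For any other a the
// result differs from the ternary form.
int select_other_xor(int a, int b, int c)
{
  return a ^ c ^ b;
}
```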

[Bug c++/103749] Misleading error message on template/non-template conflict

2021-12-16 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103749

--- Comment #4 from Ulrich Drepper  ---
(In reply to Marek Polacek from comment #3)
> Hopefully that's a bit better.

This indeed looks as good as one can hope for.  Thanks.

[Bug c++/103749] New: Misleading error message on template/non-template conflict

2021-12-16 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103749

Bug ID: 103749
   Summary: Misleading error message on template/non-template
conflict
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

This problem isn't new in the trunk version, it exists in all versions I
tested.

This is the code in question:

~
struct foo {
  template
  friend struct bar;
};

struct bar {
  int baz;
};

bar var;
~

This is obviously buggy, the actual definition of 'bar' is not a template
class.  This is exactly what clang tells me:

~
$ clang++ -c u.cc
u.cc:6:8: error: redefinition of 'bar' as different kind of symbol
struct bar {
   ^
u.cc:3:17: note: previous definition is here
  friend struct bar;
^
u.cc:10:1: error: unknown type name 'bar'
bar var;
^
2 errors generated.
~

With g++ the error messages are misleading and it also generates a lot more
unnecessary text:

~
$ g++ -c u.cc
u.cc:6:8: error: template argument required for ‘struct bar’
6 | struct bar {
  |^~~
u.cc:10:5: error: class template argument deduction failed:
   10 | bar var;
  | ^~~
u.cc:10:5: error: no matching function for call to ‘bar()’
u.cc:3:17: note: candidate: ‘template bar()-> bar<
 >’
3 |   friend struct bar;
  | ^~~
u.cc:3:17: note:   template argument deduction/substitution failed:
u.cc:10:5: note:   couldn’t deduce template parameter
‘’
   10 | bar var;
  | ^~~
u.cc:3:17: note: candidate: ‘template bar(bar< 
>)-> bar<  >’
3 |   friend struct bar;
  | ^~~
u.cc:3:17: note:   template argument deduction/substitution failed:
u.cc:10:5: note:   candidate expects 1 argument, 0 provided
   10 | bar var;
  | ^~~
~

[Bug c++/98864] New: Warning for unnecessary final keyword

2021-01-28 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98864

Bug ID: 98864
   Summary: Warning for unnecessary final keyword
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compile the following code:

struct foo {
  virtual void f();
};

struct bar final : foo {
  void f() final override;
};

It is correct and should compile but the function bar::f is annotated with
'final' even though the entire class is also annotated with 'final'.  This adds
nothing and might be an indication of misunderstanding or leftovers from
previous versions of the code.

Perhaps a warning can be added to point out the issue.

[Bug target/98737] Atomic operation on x86 no optimized to use flags

2021-01-26 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98737

--- Comment #8 from Ulrich Drepper  ---
(In reply to Jakub Jelinek from comment #7)
> The sub fix won't be the same as would add, perhaps xor/or/and can be
> handled by the same peephole2, but even for that I'm not sure.

Just a proposal, but I can see myself using code like this.


> Though e.g.
> trying __atomic_or_fetch (&a, b, ...) == 0 doesn't seem to be something
> people would use.

I can see this being valid.  If b is a variable of some type (e.g.,
representing a flag), a could be the set of all flags set, and the result of the
or operation being non-zero could mean some work based on the flags needs to be
done.

The alternative is

  if ((b != 0 && __atomic_or_fetch(&a, b, ...)) || a != 0)
    ...

Unconditionally performing the or is likely faster than the additional tests
and jumps.

[Bug target/98737] Atomic operation on x86 no optimized to use flags

2021-01-26 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98737

--- Comment #6 from Ulrich Drepper  ---
(In reply to Jakub Jelinek from comment #5)
> Created attachment 50058 [details]
> gcc11-pr98737.patch
> 
> Untested fix.

This only handles sub?

The same applies to add, or, and, xor.  Maybe nand?  Can this patch be
generalized?

[Bug target/98737] New: Atomic operation on x86 no optimized to use flags

2021-01-18 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98737

Bug ID: 98737
   Summary: Atomic operation on x86 no optimized to use flags
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Consider the following code:

long a;

_Bool f(long b)
{
  return __atomic_sub_fetch(&a, b, __ATOMIC_RELEASE) == 0;
}

_Bool g(long b)
{
  return (a -= b) == 0;
}


When compiling for x86-64 with the current HEAD as of 20210118 the resulting
code is:

0000000000000000 <f>:
   0:   48 f7 df                neg    %rdi
   3:   48 89 f8                mov    %rdi,%rax
   6:   f0 48 0f c1 05 00 00    lock xadd %rax,0x0(%rip)        # f <f+0xf>
   d:   00 00
   f:   48 01 f8                add    %rdi,%rax
  12:   0f 94 c0                sete   %al
  15:   c3                      retq
  16:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  1d:   00 00 00

0000000000000020 <g>:
  20:   48 29 3d 00 00 00 00    sub    %rdi,0x0(%rip)        # 27 <g+0x7>
  27:   0f 94 c0                sete   %al
  2a:   c3                      retq

The code for f is far too complicated.  All that needs to be different from the
code in g is that the lock prefix must be used for sub.

Probably all __atomic_* builtins have problems with using flags when possible.

This is not an esoteric problem.  I was specifically looking at optimizing the
std::latch implementation for C++20 and this is what would be needed.  Without
a fix a special version would be needed or the current, much worse code is
used.
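
To make the use case concrete, here is a minimal sketch of a countdown
primitive built on the pattern from f above.  It is an illustration only:
tiny_latch and count_down are made-up names and this is not the libstdc++
std::latch implementation.

```cpp
// Hypothetical minimal countdown latch.
struct tiny_latch {
  long count;

  // Returns true only for the caller whose decrement brings the counter to
  // zero.  Ideally this compiles to a single "lock sub" followed by "sete".
  bool count_down(long n = 1)
  {
    return __atomic_sub_fetch(&count, n, __ATOMIC_RELEASE) == 0;
  }
};
```

The thread that sees true would be the one responsible for waking up any
waiters.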

[Bug target/97397] Unnecessary mov instruction

2020-10-13 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97397

Ulrich Drepper  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Ulrich Drepper  ---
Actually, the instruction clears the top 32 bits.
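
A small sketch of why the zero extension matters, assuming (as I understand
the x86-64 psABI) that the upper 32 bits of a register holding a 32-bit
argument are not guaranteed to be zero.  The helper name element_offset is
made up; it computes the same byte offset that the 64-bit addressing mode in
the report computes for s[x][1].

```cpp
#include <cstdint>

// Hypothetical helper: byte offset of s[x][1] for rows of 3 chars.
std::uint64_t element_offset(unsigned x)
{
  // The conversion to 64 bits must zero-extend x before the multiplication.
  // In the generated code this is the job of "mov %edi,%edi".
  return static_cast<std::uint64_t>(x) * 3 + 1;
}
```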

[Bug target/97397] New: Unnecessary mov instruction

2020-10-13 Thread drepper.fsp+rhbz at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97397

Bug ID: 97397
   Summary: Unnecessary mov instruction
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

This simple code

const char s[][3] = { "aa", "bb", "cc", "dd", "ee" };

unsigned f(unsigned x)
{
  return s[x][1];
}


translates with gcc 10.1 and the current 11.0 trunk version to

   0:   89 ff                   mov    %edi,%edi
   2:   0f be 84 7f 00 00 00    movsbl 0x0(%rdi,%rdi,2),%eax
   9:   00
   a:   c3                      retq


Obviously, the initial mov instruction is completely unnecessary.

[Bug tree-optimization/92867] Use ERF_RETURNS_ARG in more places

2019-12-09 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92867

Ulrich Drepper  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drepper.fsp+rhbz at gmail dot com

--- Comment #4 from Ulrich Drepper  ---
This BZ came out of a discussion around C++ function call chaining along the
line of:

void f1(std::string& s, int a)
{
  std::cout << "hello " << s;
  if (a != 0)
    std::cout << a;
  std::cout << '\n';
}

The 'if' prevents a single series of calls through operator<< from being used
and the compiler has to reload std::cout from memory every time.  There are
work-arounds in the source to get the desired behaviour but they are ugly, and
there is lots of existing code out there.  This should happen automatically.

One way would be to expose a way to specify one of the arguments is returned. 
Jakub mentioned that there is already internally a way to use the "fn spec"
attribute.  How about exposing this explicitly as a function attribute?

Jakub also raised the question of how this should be applied to member
functions.  I suggest that the parameter for the attribute is really a number
(not a parameter name) and that argument 1 (or 0, if you want the count to
start at zero) refers to 'this' in case of member functions.

How about this?
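
For illustration, one possible source-level work-around of the kind mentioned
above might look like the following sketch.  It is an assumption about what
such code could look like, not something taken from the discussion; the name
f1_chained and the extra stream parameter are made up so the function can be
exercised with any std::ostream.

```cpp
#include <sstream>   // also provides std::ostream; used by the usage example
#include <string>

// Hypothetical variant of f1 that keeps the reference returned by operator<<.
void f1_chained(std::ostream& out, const std::string& s, int a)
{
  // operator<< returns its left operand, so the stream reference only has to
  // be obtained once and can be reused across the 'if'.
  std::ostream& os = out << "hello " << s;
  if (a != 0)
    os << a;
  os << '\n';
}
```

Calling f1_chained(std::cout, name, 0) would behave like f1 with the same
arguments.  The point of the request is that the compiler should be able to
do this without the extra local reference.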

[Bug rtl-optimization/92549] New: Use x86 xchg instruction more

2019-11-17 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92549

Bug ID: 92549
   Summary: Use x86 xchg instruction more
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take this code

__attribute__((noinline))
int f(int a, int b)
{
  return b - a + 5;
}
int foo(int a, int b)
{
  return 1 + f(b, a);
}
int main()
{
  return foo(39, 3);
}

gcc 9.2.1 generates for foo on x86-64 this code:

        movl    %edi, %r8d
        movl    %esi, %edi
        movl    %r8d, %esi
        call    f
        addl    $1, %eax
        ret

This could be better:

        xchgl   %edi, %esi
        call    f
        addl    $1, %eax
        ret

Switching parameter locations is not an uncommon pattern.

If regparm is used on x86-32 the same likely applies there.

[Bug tree-optimization/91789] New: Value ranges determined from comparisons not used transitively

2019-09-16 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91789

Bug ID: 91789
   Summary: Value ranges determined from comparisons not used
transitively
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take the following code:

int foo(int a, int b)
{
  if (b < a)
    __builtin_unreachable();
  if (a < 0)
    return -1;
  if (b < 0)
    return 0;
  return 1;
}

The compiler should be able to determine that the b < 0 can never be true.  At
that point in the code a >= 0 and b >= a, therefore transitively b >= 0.


The problem is not tied to __builtin_unreachable as can be seen by changing the
code slightly:

int foo(int a, int b)
{
  if (b < a)
    return 2;
  if (a < 0)
    return -1;
  if (b < 0)
    return 0;
  return 1;
}

After the initial test b < a is handled there is still a three-way comparison.

The problem can be seen with 9.2.1 as well as the current trunk version.  clang
8.0.0 generates pretty much the same code as gcc.
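
As a sketch of the expected result, the second variant can be compared with a
hand-simplified form in which the b < 0 test is removed.  The names
foo_checked and foo_simplified are made up for this illustration.

```cpp
// The second variant from the report, renamed.
int foo_checked(int a, int b)
{
  if (b < a)
    return 2;
  if (a < 0)
    return -1;
  if (b < 0)   // never true here: a >= 0 and b >= a imply b >= 0
    return 0;
  return 1;
}

// What the optimizer could reduce the function to.
int foo_simplified(int a, int b)
{
  if (b < a)
    return 2;
  return a < 0 ? -1 : 1;
}
```

Both functions return the same value for every pair of arguments, which is
why only a two-way decision should remain after the first test.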

[Bug target/90094] New: better handling of x == LONG_MIN on x86-64

2019-04-15 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90094

Bug ID: 90094
   Summary: better handling of x == LONG_MIN on x86-64
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compile the following on x86-64:

unsigned f(long a)
{
  return a == LONG_MIN;
}

The result for -O3 is:

f:  movabs $0x8000000000000000,%rax
    cmp    %rax,%rdi
    sete   %al
    movzbl %al,%eax
    retq

With -Os it looks like this:

f:  mov    $0x1,%eax
    shl    $0x3f,%rax
    cmp    %rax,%rdi
    sete   %al
    movzbl %al,%eax
    retq

I think for both optimization directions the code should be compiled as if for
this:

unsigned f(long a)
{
  long r;
  return __builtin_sub_overflow(a, 1, &r);
}

This compiles to

f:  xor    %eax,%eax
    add    $0xffffffffffffffff,%rdi
    seto   %al
    retq

This should be faster and is definitely shorter than even the -Os version.

For 32-bit x86 the problem doesn't exist in this form, I think.  But it might
apply to some RISC targets as well.
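
The equivalence the proposal relies on can be sketched as follows: a - 1
overflows a signed long exactly when a is LONG_MIN.  The function names are
made up for this illustration.

```cpp
#include <climits>

// Direct comparison, as in the original test case.
unsigned is_long_min_cmp(long a)
{
  return a == LONG_MIN;
}

// Overflow-based formulation: subtracting 1 overflows only for LONG_MIN.
unsigned is_long_min_ovf(long a)
{
  long r;
  return __builtin_sub_overflow(a, 1, &r);
}
```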

[Bug c++/89923] New: printf format check and char8_t

2019-04-02 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89923

Bug ID: 89923
   Summary: printf format check and char8_t
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

With the introduction of char8_t there is a new error case in the printf format
checks:

#include <cstdio>

int main() {
  auto s = u8"hello world";
#pragma GCC diagnostic error "-Wformat"
  printf("%s\n", s);
}

Compiling this with C++2a results in the following output:

$ g++ -c u.cc -std=gnu++2a
u.cc: In function ‘int main()’:
u.cc:6:12: error: format ‘%s’ expects argument of type ‘char*’, but argument 2
has type ‘const char8_t*’ [-Werror=format=]
    6 |   printf("%s\n", s);
      |           ~^     ~
      |            |     |
      |            char* const char8_t*
      |           %hhn

I think char8_t* should be added to the allowed types for the %s parameter.  It
is arguably more likely to succeed than a char* argument since the latter's
encoding is determined by the compiler.  At least with u8 strings the code can
make sure the locale used at runtime uses UTF-8.
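
Until the format check accepts char8_t*, user code presumably has to cast.  A
minimal sketch of such a work-around follows; as_char_ptr and print_u8 are
made-up names and this is not an existing library facility.

```cpp
#include <cstdio>

// Hypothetical helper: view a (possibly char8_t) string as const char*.
// Reading any object through a char pointer is permitted, so the cast is
// valid; whether the bytes print correctly depends on the runtime locale.
template<typename CharT>
const char* as_char_ptr(const CharT* s)
{
  return reinterpret_cast<const char*>(s);
}

// Prints the u8 literal and returns printf's result (characters written).
int print_u8()
{
  auto s = u8"hello world";
  return std::printf("%s\n", as_char_ptr(s));
}
```

The template also accepts a plain const char*, so the same code compiles
before and after the introduction of char8_t.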

[Bug tree-optimization/88972] New: popcnt of limited 128-bit number with unnecessary zeroing

2019-01-22 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88972

Bug ID: 88972
   Summary: popcnt of limited 128-bit number with unnecessary
zeroing
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compile the following code on x86-64 with -Ofast -march=haswell:

int f(__uint128_t m)
{
  if (m < 64000)
    return __builtin_popcount(m);
  return -1;
}


The generated code with the trunk gcc looks like this:

   0:   b8 ff f9 00 00          mov    $0xf9ff,%eax
   5:   48 39 f8                cmp    %rdi,%rax
   8:   b8 00 00 00 00          mov    $0x0,%eax
   d:   48 19 f0                sbb    %rsi,%rax
  10:   72 0e                   jb     20
  12:   31 c0                   xor    %eax,%eax
  14:   f3 0f b8 c7             popcnt %edi,%eax
  18:   c3                      retq
  19:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  20:   b8 ff ff ff ff          mov    $0xffffffff,%eax
  25:   c3                      retq


The instruction at offset 12 is unnecessary.  I guess this is a left-over from
the popcnt of the upper half which is recognized to be unnecessary and left
out.  There is no addition anymore but somehow the register clearing survived.

[Bug libstdc++/88738] treat shared_ptr and unique_ptr more like plain old pointers

2019-01-12 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88738

--- Comment #6 from Ulrich Drepper  ---
(In reply to Martin Sebor from comment #5)
> If it did we could have GCC apply it implicitly to
> all such functions or operators defined in namespace std, and provide a new
> attribute to disable it in the cases where it might not be appropriate (say
> no_warn_unused_result).

I looked at what clang does and it's very simplistic.  The code below produces
the output shown below.  I.e., they use the warning for all comparison
operators regardless of

- namespace
- return value
- member function or not
- const-ness of operators
- visible side effects

Nevertheless people use the compiler without complaining, at least visibly.

$ clang++ -c -O -Wall u.cc
u.cc:26:5: warning: equality comparison result unused [-Wunused-comparison]
  l == r;
  ~~^~~~
u.cc:26:5: note: use '=' to turn this equality comparison into an assignment
  l == r;
    ^~
    =
u.cc:31:5: warning: inequality comparison result unused [-Wunused-comparison]
  l != r;
  ~~^~~~
u.cc:31:5: note: use '|=' to turn this inequality comparison into an
or-assignment
  l != r;
    ^~
    |=
u.cc:36:5: warning: relational comparison result unused [-Wunused-comparison]
  l <= r;
  ~~^~~~
u.cc:41:5: warning: relational comparison result unused [-Wunused-comparison]
  l >= r;
  ~~^~~~
u.cc:46:5: warning: relational comparison result unused [-Wunused-comparison]
  l < r;
  ~~^~~
5 warnings generated.



~~
struct foo {
  int a;
  foo(int) : a(42) {}
  auto operator==(const foo& o) const { return a == o.a; }
  auto operator!=(foo& o) { return a == o.a; }
};

auto operator<=(const foo& l, const foo& r)
{
  return l.a == r.a;
}

auto operator>=(foo& l, foo& r)
{
  l.a |= 1;
  return l.a == r.a;
}

int operator<(foo& l, foo& r)
{
  return l.a == r.a ? 0 : l.a < r.a ? -1 : 1;
}

auto f1(foo& l, foo& r)
{
  l == r;
}

auto f2(foo& l, foo& r)
{
  l != r;
}

auto f3(foo& l, foo& r)
{
  l <= r;
}

auto f4(foo& l, foo& r)
{
  l >= r;
}

auto f5(foo& l, foo& r)
{
  l < r;
}

[Bug libstdc++/88738] treat shared_ptr and unique_ptr more like plain old pointers

2019-01-12 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88738

--- Comment #3 from Ulrich Drepper  ---
Created attachment 45416
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45416&action=edit
Add nodiscard support

As Martin suggested, we could indeed use existing attributes in library code to
warn about some of the problems.  The code from comment #0 is real, this
happened in a project of mine where I mistyped an assignment.  The warning
would have pointed to the problem.

How about the following patch for a start?  This compiles cleanly on x86-64.  I
haven't run the test suite to see whether it breaks some regression tests.

Also, this approach should be extended beyond shared_ptr and unique_ptr,
probably to at least every single bool operatorXX(...) const.  Or even every
single const member function which then of course raises the question whether
the compiler should learn about this…
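
As an illustration of the idea (not the attached patch), this is roughly what
a nodiscard annotation on a comparison operator looks like for a user-defined
pointer-like type.  The type name handle is made up.

```cpp
// Hypothetical pointer-like wrapper with an annotated comparison operator.
struct handle {
  int* p = nullptr;

  // With [[nodiscard]], a statement such as "h == nullptr;" whose result is
  // thrown away is expected to produce a warning.
  [[nodiscard]] bool operator==(decltype(nullptr)) const noexcept
  {
    return p == nullptr;
  }
};
```

Normal uses such as "if (h == nullptr)" are unaffected because the result is
consumed.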

[Bug c++/88738] treat shared_ptr and unique_ptr more like plain old pointers

2019-01-07 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88738

--- Comment #1 from Ulrich Drepper  ---
BTW, this also applies to the "unused variable" warning as in the code below
but clang doesn't warn about that either.

#include <memory>

using type = std::shared_ptr<int>;
type g;

int f(int a) {
  auto p = g;
  return a+1;
}

[Bug c++/88738] New: treat shared_ptr and unique_ptr more like plain old pointers

2019-01-07 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88738

Bug ID: 88738
   Summary: treat shared_ptr and unique_ptr more like plain old
pointers
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

The implementations are obviously more complicated but the warning handling the
current implementation allows is less than optimal.  For the test case below
gcc (8.2.1, current trunk) doesn't emit any warning, even with -Wall.  clang on
the other hand reports

$ clang++ -c -O -Wall -g v.cc -std=gnu++17
v.cc:8:9: warning: equality comparison result unused [-Wunused-comparison]
    res == nullptr;
        ^~
v.cc:8:9: note: use '=' to turn this equality comparison into an assignment
    res == nullptr;
        ^~
        =


Test (compile with -std=c++17):

#include <memory>

using type = std::shared_ptr<int>;

type f(int a) {
  auto res = std::make_shared<int>(3);
  if (a == 0)
    res == nullptr;   // <- obviously incorrect
  return res;
}

[Bug c++/88736] New: nullptr_t available without namespace qualification

2019-01-07 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88736

Bug ID: 88736
   Summary: nullptr_t available without namespace qualification
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

I don't think the following should work and clang indeed complains.  gcc 8.2.1
and current trunk both accept it without warning, even with -pedantic.

#include <cstddef>

int f(nullptr_t) {
  return 0;
}


Obviously gcc should recommend using std::nullptr_t.

[Bug driver/88708] New: help-dummy.o file left behind

2019-01-05 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88708

Bug ID: 88708
   Summary: help-dummy.o file left behind
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

When using

  gcc -c -Q -O --help=optimizers

the driver leaves behind the help-dummy.o file.  This happens with gcc trunk
and all prior versions I was able to test.

[Bug tree-optimization/88676] New: missed opportunity is integer conditional

2019-01-03 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88676

Bug ID: 88676
   Summary: missed opportunity is integer conditional
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take the following code:

int f(unsigned b)
{
  int r;
  if (b >= 2)
    __builtin_unreachable();
  switch (b) {
  case 0:
    r = 1;
    break;
  case 1:
    r = 2;
    break;
  default:
    r = 0;
    break;
  }
  return r;
}

Compiled using the current trunk gcc and gcc 8.2.1 with -O3 on x86_64 the
following code is produced:

 :
   0:   31 c0                   xor    %eax,%eax
   2:   83 ff 01                cmp    $0x1,%edi
   5:   0f 94 c0                sete   %al
   8:   ff c0                   inc    %eax
   a:   c3                      retq

This is quite good but it should be something like

leal 1(%edi),%eax
ret

The first three instructions test for 0 or 1 and load into %eax the values 0 or
1 respectively.  This should be just a move.
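
The claimed simplification can be sketched at the source level.  For the only
two values the function may legally receive, the switch computes b + 1.  The
names f_switch and f_direct are made up for this illustration.

```cpp
// The function from the report, renamed.
int f_switch(unsigned b)
{
  int r;
  if (b >= 2)
    __builtin_unreachable();
  switch (b) {
  case 0:
    r = 1;
    break;
  case 1:
    r = 2;
    break;
  default:
    r = 0;
    break;
  }
  return r;
}

// What it reduces to given b < 2; this corresponds to the single lea.
int f_direct(unsigned b)
{
  return static_cast<int>(b) + 1;
}
```

Calling f_switch with b >= 2 is undefined, so the comparison only makes sense
for 0 and 1.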

[Bug rtl-optimization/87664] New: invariant in loop after optimization

2018-10-20 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87664

Bug ID: 87664
   Summary: invariant in loop after optimization
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compile the following code with the current trunk or gcc 8.2.1:

#include <array>
#include <numeric>

int rs(int s) {
  std::array<int, 100> ar;
  std::iota(ar.begin(), ar.end(), s);
  return std::accumulate(ar.begin(), ar.end(), 0);
}


With -O2 this leads on x86-64 to the following code:


 :
   0:   48 81 ec 20 01 00 00    sub    $0x120,%rsp
   7:   48 8d 44 24 88          lea    -0x78(%rsp),%rax
   c:   0f 1f 40 00             nopl   0x0(%rax)
  10:   89 38                   mov    %edi,(%rax)
  12:   48 8d 8c 24 18 01 00    lea    0x118(%rsp),%rcx
  19:   00
  1a:   48 83 c0 04             add    $0x4,%rax
  1e:   ff c7                   inc    %edi
  20:   48 39 c8                cmp    %rcx,%rax
  23:   75 eb                   jne    10
  25:   48 8d 54 24 88          lea    -0x78(%rsp),%rdx
  2a:   31 c0                   xor    %eax,%eax
  2c:   0f 1f 40 00             nopl   0x0(%rax)
  30:   03 02                   add    (%rdx),%eax
  32:   48 8d b4 24 18 01 00    lea    0x118(%rsp),%rsi
  39:   00
  3a:   48 83 c2 04             add    $0x4,%rdx
  3e:   48 39 f2                cmp    %rsi,%rdx
  41:   75 ed                   jne    30
  43:   48 81 c4 20 01 00 00    add    $0x120,%rsp
  4a:   c3                      retq


The relevant parts are the loops starting at offsets 10 and 30.  The respective
lea instructions at offsets 12 and 32, which compute the end address of each
loop, are invariant and should be hoisted out of the loops.

[Bug c++/85159] New: if constexpr error about goto

2018-04-02 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85159

Bug ID: 85159
   Summary: if constexpr error about goto
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

This is actually something I deliberately used if constexpr for, to avoid
warnings, but gcc creates a new kind of error.  Take this silly code:

template<bool B>
int foo(int a, int b)
{
  if constexpr (B) {
    a += 1;
    goto L;
  } else
    b += 1;
  if constexpr (B) {
  L:
    return a - b;
  } else
    return b - a;
}

int main()
{
  return foo(3, 4);
}

The definition and use of the label are in if constexpr blocks controlled by
the same condition.  gcc issues this error (this is mainline gcc as of last
week or so):

u.cc: In function ‘int foo(int, int)’:
u.cc:10:3: error: jump to label ‘L’
   L:
   ^
u.cc:6:10: note:   from here
     goto L;
          ^
u.cc:9:16: note:   enters constexpr if statement
   if constexpr (B) {
                ^


I think gcc should look at the block conditions and avoid the error.

[Bug c++/84695] New: Missed opportunity to issue warning about override

2018-03-03 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84695

Bug ID: 84695
   Summary: Missed opportunity to issue warning about override
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Stroustrup in [1] §20.3.4 writes that compilers should issue warnings about
inconsistent use of explicit override controls.

struct foo {
  virtual int one(int) { return 1; }
  virtual int two(int) { return 2; }
};
struct bar : foo {
  int one(int) override { return 101; }
  int two(int) { return 102; }
};
bar b;

In this case the compiler should issue a warning about bar::two.

This is different from -Wsuggest-override.  That command line option causes a
warning to be issued, but the fact that bar::one uses override should
automatically turn on this type of warning for the rest of the class.  I would
even say there is no need for a further command line option; it should just
be done.

[1] http://www.stroustrup.com/4th.html

[Bug c++/84360] New: unnecessary aka in error message

2018-02-13 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84360

Bug ID: 84360
   Summary: unnecessary aka in error message
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

This applies to gcc 7.3.1; I tested the current gcc 8 top of the tree as
well.  Compile with g++:

#include 
using T = std::tuple<int,int,int>;
T g() { return std::make_tuple(1,2,3); }

The error messages include:

v.cc:3:5: error: return type ‘using T = class std::tuple<int, int, int>’ {aka
‘class std::tuple<int, int, int>’} is incomplete
 T g() { return std::make_tuple(1,2,3); }
     ^

Note the superfluous aka.

[Bug c++/82373] New: syntax error in error message

2017-09-30 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82373

Bug ID: 82373
   Summary: syntax error in error message
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compiling the invalid code at the bottom produces the following message:

y.cc: In function ‘auto bar::foo(int))(int)’:
y.cc:7:12: error: inconsistent deduction for auto return type: ‘int (*)(int)’
and then ‘std::nullptr_t’
     return nullptr;
            ^~~~~~~

The function name is wrong.  The parameter list of the inferred return type
should not be printed.

   auto bar::foo(int)

The correct form would have (* before the function name:

   auto (*bar::foo(int))(int)
        ^^

But then the 'auto' must be replaced as well.  But this is wrong since it does
not correspond to a function in the source.



namespace bar {
  int(*fp)(int);
  auto foo(int a)
  {
    if (a)
      return fp;
    return nullptr;
  }
}

[Bug middle-end/81376] New: unnecessary cast before comparison

2017-07-10 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81376

Bug ID: 81376
   Summary: unnecessary cast before comparison
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take the following code:

typedef double c_t;
typedef int a_t;
int f(a_t a1, a_t a2) {
  return (c_t) a1 < (c_t) a2;
}

With IEEE 754 double we have a 52-bit mantissa, which is wide enough to
represent all of the 'int' values exactly.  There is no possibility of
imprecision and especially not of Inf or NaN.

Still gcc (trunk as of today but likely older versions as well) generates the
code for the conversion.  This is for x86-64 with -O2:

f:
        vxorpd  %xmm0, %xmm0, %xmm0
        vxorpd  %xmm1, %xmm1, %xmm1
        xorl    %eax, %eax
        vcvtsi2sd       %edi, %xmm0, %xmm0
        vcvtsi2sd       %esi, %xmm1, %xmm1
        vucomisd        %xmm0, %xmm1
        seta    %al
        ret

A simple

        xorl    %eax, %eax
        cmpl    %esi, %edi
        setl    %al
        ret

is sufficient, just as if c_t above were defined as 'int'.
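
A quick sketch of the equivalence, assuming a 32-bit int and an IEEE 754
binary64 double.  The function names are made up; the first mirrors the
test case and the second is the comparison the compiler could emit instead.

```cpp
#include <climits>

// Comparison after conversion to double, as in the report.
int less_via_double(int a1, int a2)
{
  return (double) a1 < (double) a2;
}

// Plain integer comparison; same result because every int converts exactly.
int less_direct(int a1, int a2)
{
  return a1 < a2;
}
```

Since the conversion is exact and order-preserving, both functions agree for
all inputs, including the extremes INT_MIN and INT_MAX.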

[Bug rtl-optimization/80917] New: missed bit information propagation

2017-05-30 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80917

Bug ID: 80917
   Summary: missed bit information propagation
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Take the following code:

int f(unsigned a)
{
  if ((a & 2) == 0)
    return 0;

  a += 4;

  return (a & 2) != 0;
}

The addition clearly cannot affect the repeat of the test of the second bit. 
Still, with the current trunk compiler I get with -O3 on x86-64:

        xorl    %eax, %eax
        testb   $2, %dil
        je      .L1
        shrl    %edi
        movl    %edi, %eax
        andl    $1, %eax
.L1:
        ret


It might quickly get expensive but the bit set/unset information could be
tracked past the test and through arithmetic operations like additions.
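
To spell out the expected result: adding 4 only changes bit 2 and above, so
bit 1 is the same before and after the addition and the second test repeats
the first.  The sketch below compares the function with the form it could be
reduced to; bit1_orig and bit1_expected are made-up names.

```cpp
// The function from the report, renamed.
int bit1_orig(unsigned a)
{
  if ((a & 2) == 0)
    return 0;

  a += 4;   // cannot change bit 1, even when the addition wraps around

  return (a & 2) != 0;
}

// Equivalent result: simply the value of bit 1.
int bit1_expected(unsigned a)
{
  return (a >> 1) & 1;
}
```

Note that the generated code in the report already computes this value with
the shift and mask; what should go away is the preceding test and branch.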

[Bug tree-optimization/80758] New: isnan/isfinite/isinf value propagation

2017-05-15 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80758

Bug ID: 80758
   Summary: isnan/isfinite/isinf value propagation
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Consider the following code:

#define isnan(x) __builtin_isnan(x)
#define isfinite(x) __builtin_isfinite(x)

int f(double a, double b)
{
  if (!isfinite(a) || !isfinite(b))
    return 0;
  double c = a + b;
  return isnan(c) ? 0 : 1;
}

For x86-64 with the current trunk version (and probably all previous versions)
the generated code looks something like this:

        .cfi_startproc
        vmovq   .LC0(%rip), %xmm2
        vmovapd %xmm0, %xmm4
        vmovsd  .LC1(%rip), %xmm3
        xorl    %eax, %eax
        vandpd  %xmm2, %xmm4, %xmm4
        vucomisd        %xmm4, %xmm3
        jb      .L5
        vandpd  %xmm1, %xmm2, %xmm2
        vucomisd        %xmm2, %xmm3
        jb      .L5
        vaddsd  %xmm1, %xmm0, %xmm0
        xorl    %eax, %eax
        vucomisd        %xmm0, %xmm0
        setnp   %al
.L5:
        ret
        .cfi_endproc

The issue here is that the sum of two finite values will never be NaN.  It can
be ±Inf but not NaN.  VRP should have the necessary information, and it should
be used in the __builtin_isnan code generation.
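
A sketch of what the function reduces to once that fact is used, assuming
IEEE 754 arithmetic without -ffast-math.  The names sum_ok_orig and
sum_ok_expected are made up for this illustration.

```cpp
#include <cfloat>

// The function from the report, renamed.
int sum_ok_orig(double a, double b)
{
  if (!__builtin_isfinite(a) || !__builtin_isfinite(b))
    return 0;
  double c = a + b;
  return __builtin_isnan(c) ? 0 : 1;
}

// With the isnan test folded away only the two finiteness checks remain.
int sum_ok_expected(double a, double b)
{
  return __builtin_isfinite(a) && __builtin_isfinite(b);
}
```

DBL_MAX + DBL_MAX is a useful test input: the sum overflows to +Inf, which is
not NaN, so both functions return 1.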

[Bug c++/80577] Avoid using adj in member function pointers

2017-05-08 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80577

--- Comment #3 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at 
gmail dot com> ---
(In reply to drepper.fsp+r...@gmail.com from comment #2)
> final isn't necessary in this case.  An object is used and the type is known.

Ignore this comment, wrong bug.

[Bug c++/80577] Avoid using adj in member function pointers

2017-05-08 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80577

--- Comment #2 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at 
gmail dot com> ---
final isn't necessary in this case.  An object is used and the type is known.

[Bug c++/80660] Member function pointer optimization affected by incompatible virtual function

2017-05-08 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80660

--- Comment #2 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at 
gmail dot com> ---
final shouldn't be needed in this case.  It's an object that is used, the type
is known.

[Bug c++/80660] New: Member function pointer optimization affected by incompatible virtual function

2017-05-07 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80660

Bug ID: 80660
   Summary: Member function pointer optimization affected by
incompatible virtual function
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Consider the following code:

struct foo final {
  int a = 0;
  int b = 0;
  void set_a(int p) { a = p; }
  void set_b(int p) { b = p; }
#ifdef VIRT
  virtual int get_a() const { return a; }
#endif
};

void (foo::*set)(int);

foo fobj1;

void bar1(int a) {
  (fobj1.*set)(a);
}

When compiling with optimization and VIRT not defined, the code generated for
bar1 correctly elides the test for a virtual function, which saves code size
and execution time.

Adding any virtual function (such as by defining VIRT) changes this.  All of
a sudden the entire member function pointer call sequence is emitted.

This is unnecessary, though, since the present virtual function is incompatible
with the member function pointer 'set'.  Therefore the generated code should be
the same, with or without get_a defined.

[Bug c++/80577] New: Avoid using adj in member function pointers

2017-04-30 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80577

Bug ID: 80577
   Summary: Avoid using adj in member function pointers
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Consider the following code:

struct foo final {
  int a = 0, b = 0;
  int get1() const { return a; }
  int get2() const { return a + b; }
};

foo f;
int (foo::*mfp)() const = &foo::get1;

int get()
{
  return (f.*mfp)();
}

When compiled get() looks on x86-64 like this:

        movq    mfp+8(%rip), %rax
        leaq    f(%rax), %rdi
        jmp     *mfp(%rip)

The compiler knows the type 'foo'.  It can determine that there is no multiple
inheritance.  This means that the adj field in the member function pointer will
always be zero.  Hence the generated code should be

        movl    $f, %edi
        jmp     *mfp(%rip)
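
For contrast, here is a sketch of a case where the adjustment is actually
needed: with multiple inheritance, a member function of the second base
expects 'this' to point at that base subobject.  All names are made up for
this illustration.

```cpp
struct base1 { int x = 1; };
struct base2 {
  int y = 2;
  int get_y() const { return y; }
};

// Two non-empty bases: base2 does not start at offset 0 within 'derived'.
struct derived final : base1, base2 {};

derived d;

// Converting the pointer from base2 to derived records the adjustment that
// has to be applied to 'this' at the call site.
int (derived::*pm)() const = &derived::get_y;

int call_pm()
{
  return (d.*pm)();
}
```

The type foo in the report has no base classes at all, so no such adjustment
can ever be required for it.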

[Bug c++/80575] New: unnecessary virtual function table support in member function call

2017-04-30 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80575

Bug ID: 80575
   Summary: unnecessary virtual function table support in member
function call
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Created attachment 41288
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41288&action=edit
example for ineffectiveness of final with member function pointers

The support for virtual functions requires more complex code at the call site
through a member function pointer.  gcc has some support to elide the handling
of virtual functions.  Take the following code:

struct foo {
  int a = 0, b =0;
  int get1() const { return a; }
  int get2() const { return a + b; }
};

foo f;
int (foo::*mfp)() const = &foo::get1;

int get()
{
  return (f.*mfp)();
}

The generated code (on x86-64) for get is:

        movq    mfp+8(%rip), %rax
        leaq    f(%rax), %rdi
        jmp     *mfp(%rip)

Perfectly fine.  If now the variable 'f' is changed to a pointer the code looks
like this:

        movq    mfp(%rip), %rax
        movq    mfp+8(%rip), %rdi
        addq    f(%rip), %rdi
        testb   $1, %al
        je      .L4
        movq    (%rdi), %rdx
        movq    -1(%rdx,%rax), %rax
.L4:
        jmp     *%rax

This is due to the fact that other classes can be derived from 'foo' and those
could have virtual functions.

To prevent this it should be possible to mark 'foo' as final.  If you do this
nothing changes, though.

--- u.cc-old    2017-04-30 16:30:50.704469153 +0200
+++ u.cc        2017-04-30 16:24:56.619672469 +0200
@@ -1,4 +1,4 @@
-struct foo {
+struct foo final {
   int a = 0, b =0;
   int get1() const { return a; }
   int get2() const { return a + b; }
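
For reference, the pointer variant with 'final' applied might be written as in
the following sketch.  The object name 'obj' is an assumption made only so the
example is complete; the report merely says that 'f' becomes a pointer.  Since
'foo' is final and has no virtual functions, no class reachable through 'f' can
have a virtual 'get1' or 'get2', which is why the test of the low bit of the
pointer should not be needed.

```cpp
// Sketch of the pointer variant from the report, with 'final' applied.
struct foo final {
  int a = 0, b = 0;
  int get1() const { return a; }
  int get2() const { return a + b; }
};

foo obj;       // hypothetical object for 'f' to point at
foo* f = &obj; // 'f' changed from an object to a pointer
int (foo::*mfp)() const = &foo::get1;

int get()
{
  // Call through the member function pointer; foo has no virtual functions
  // and, being final, no derived class can introduce one.
  return (f->*mfp)();
}
```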

[Bug c++/78923] New: bad error message about missing template argument

2016-12-24 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78923

Bug ID: 78923
   Summary: bad error message about missing template argument
   Product: gcc
   Version: 6.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Here is another case where the error message emitted by gcc is unhelpful
(albeit with 6.2.1; I don't have a more recent version handy):

~
template<typename T>
struct A;

template<typename T>
struct B {
  B(A<T>*);
};

template<typename T>
B<T>::B(A*) { // <-- should be:  B<T>::B(A<T>*) {
}
~

Notice the missing template argument.  The error message emitted is:

u.cc:10:8: error: expected constructor, destructor, or type conversion before
‘(’ token
 B<T>::B(A*) {
        ^


In comparison, clang++ emits:

u.cc:10:9: error: use of class template 'A' requires template arguments
B<T>::B(A*) {
        ^
u.cc:2:8: note: template is declared here
struct A;
       ^
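
For comparison, a complete corrected version might look like the sketch below.
The template parameter name T is an assumption, chosen only so that the example
compiles on its own.

```cpp
// A forward declaration is enough: B only takes a pointer to A<T>.
template<typename T>
struct A;

template<typename T>
struct B {
  B(A<T>*);
};

// Out-of-class constructor definition: both B and A need their template
// argument list here, which is what the faulty line left out for A.
template<typename T>
B<T>::B(A<T>*) {
}
```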

[Bug tree-optimization/77877] New: missed optimization in switch of modulus value

2016-10-05 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77877

Bug ID: 77877
   Summary: missed optimization in switch of modulus value
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Compile this code with a recent trunk gcc:

~~
int zero;
int one;
int two;

extern unsigned compute_mod(unsigned);

void cnt(unsigned n) {
#ifdef HIDE
  n = compute_mod(n);
#else
  n %= 3;
#endif
  switch (n) {
  case 0: ++zero; break;
  case 1: ++one; break;
  case 2: ++two; break;
#ifdef OPT
  default: __builtin_unreachable();
#endif
  }
}
~~

On x86-64, without HIDE defined, the code compiles with -O3 to:

   0:   89 f8                   mov    %edi,%eax
   2:   ba ab aa aa aa          mov    $0xaaaaaaab,%edx
   7:   f7 e2                   mul    %edx
   9:   d1 ea                   shr    %edx
   b:   8d 04 52                lea    (%rdx,%rdx,2),%eax
   e:   29 c7                   sub    %eax,%edi
  10:   83 ff 01                cmp    $0x1,%edi
  13:   74 1b                   je     30 <_Z3cntj+0x30>
  15:   83 ff 02                cmp    $0x2,%edi
  18:   75 0e                   jne    28 <_Z3cntj+0x28>
  1a:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 21 <_Z3cntj+0x21>
  21:   c3                      retq
  22:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  28:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 2f <_Z3cntj+0x2f>
  2f:   c3                      retq
  30:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 37 <_Z3cntj+0x37>
  37:   c3                      retq


This is good, the compiler knows there are only three possible values and does
not emit any code for a default case.

If I make sure the compiler doesn't know anything about the arithmetic
operation by calling a function but telling the compiler there is no other case
by using __builtin_unreachable() then the generated code is even better:

   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <_Z3cntj+0x9>
   9:   83 f8 01                cmp    $0x1,%eax
   c:   74 22                   je     30 <_Z3cntj+0x30>
   e:   72 10                   jb     20 <_Z3cntj+0x20>
  10:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 17 <_Z3cntj+0x17>
  17:   48 83 c4 08             add    $0x8,%rsp
  1b:   c3                      retq
  1c:   0f 1f 40 00             nopl   0x0(%rax)
  20:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 27 <_Z3cntj+0x27>
  27:   48 83 c4 08             add    $0x8,%rsp
  2b:   c3                      retq
  2c:   0f 1f 40 00             nopl   0x0(%rax)
  30:   83 05 00 00 00 00 01    addl   $0x1,0x0(%rip)        # 37 <_Z3cntj+0x37>
  37:   48 83 c4 08             add    $0x8,%rsp
  3b:   c3                      retq

There is only one compare instruction.  This is what even the first case
should look like.

Even more interesting: just defining the OPT macro does not change anything.
So, there is currently no way to get the optimal behaviour.  We certainly don't
want to artificially introduce function calls.

[Bug libstdc++/77760] New: get_time needs to set tm_wday and tm_yday

2016-09-27 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77760

Bug ID: 77760
   Summary: get_time needs to set tm_wday and tm_yday
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

Created attachment 39701
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39701&action=edit
Test case to show get_time vs strptime

Currently the get_time function sets the tm_wday member of tm only when
explicitly asked and doesn't set tm_yday at all.  This does not match
strptime() which this function should emulate.

Basically, every time the parsing has enough information to compute that data
it should be done.  In the simplest case, if day of the month, month, and year
are known the two fields can be computed.
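
To illustrate that simplest case, here is a minimal sketch of how both fields
follow from tm_year, tm_mon and tm_mday alone.  The helper names
days_from_civil and fill_wday_yday are hypothetical, and this is neither the
glibc nor the libstdc++ implementation; it only uses the well-known
days-from-civil arithmetic for the proleptic Gregorian calendar.

```cpp
#include <ctime>

// Days since 1970-01-01 for a Gregorian date (m in 1..12, d in 1..31).
// Hypothetical helper, based on the widely used days-from-civil formula.
long days_from_civil(int y, unsigned m, unsigned d)
{
  y -= m <= 2;  // treat January and February as part of the previous year
  const long era = (y >= 0 ? y : y - 399) / 400;
  const unsigned yoe = static_cast<unsigned>(y - era * 400);
  const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
  const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
  return era * 146097 + static_cast<long>(doe) - 719468;
}

// Fill tm_wday and tm_yday from tm_year, tm_mon and tm_mday.
void fill_wday_yday(std::tm& t)
{
  const int year = t.tm_year + 1900;
  const long days = days_from_civil(year, t.tm_mon + 1, t.tm_mday);
  t.tm_yday = static_cast<int>(days - days_from_civil(year, 1, 1));
  // 1970-01-01 was a Thursday, i.e. tm_wday == 4.
  t.tm_wday = static_cast<int>(((days % 7) + 11) % 7);
}
```

For the date in the test case (27 September 2016) this yields tm_wday == 2,
i.e. Tuesday.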

The glibc strptime function does all that.  It gets complicated, true, but
users of the functions will expect this behaviour since otherwise the results
are strange.  I know the standard does not explicitly say this.  See the
attached code for a test case.  I see

Sun Sep 27 03:11:41 2016 vs Tue Sep 27 03:11:41 2016 FAIL

because tm_wday == 0 means Sunday.

[Bug c/77577] New: missing warnings about too few array elements

2016-09-13 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77577

Bug ID: 77577
   Summary: missing warnings about too few array elements
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

With a declaration of 'f' as in the following code the function implementation
can assume that at least the given number of elements are available in the
array.  According to ISO C:

If the keyword static also appears within the [ and ] of the array type
derivation, then for each call to the function, the value of the corresponding
actual argument shall provide access to the first element of an array with at
least as many elements as specified by the size expression.


Given the following code gcc (in trunk and previous versions) does not emit a
warning.  It should be possible to emit one, especially with the recent changes
which make __builtin_object_size usable even without optimization.


int f(int ss[static 5]);

int g() {
  int ar[2];
  return f(ar);
}

[Bug driver/77475] New: unnecessary or misleading context in reporting command line problems

2016-09-04 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77475

Bug ID: 77475
   Summary: unnecessary or misleading context in reporting command
line problems
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

When specifying a wrong command line option gcc reports the error as being on
the first line of the file and also prints that line.  This can be confusing.

$ cat u.c
this is garbage
$ local-gcc -c -mmemset-strategy=wrong u.c
u.c:1:0: error: wrong arg wrong to option -mmemset_strategy=
 this is garbage

$ local-gcc -c -march=wrong u.c
u.c:1:0: error: bad value (wrong) for -march= switch
 this is garbage


The message should ideally not include any file information, and in particular
no line from the input file should be printed; it has nothing to do with the
error.

[Bug c++/60994] gcc does not recognize hidden/shadowed enumeration as valid nested-name-specifier

2016-08-31 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60994

drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drepper.fsp+rhbz at gmail dot com

--- Comment #11 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at 
gmail dot com> ---
There is one problem remaining, it seems, and that's Jonathan's in comment #7. 
I got here because I re-tested some of the compiler bugs which hit me.  Those
are fixed.

[Bug tree-optimization/73285] New: perhaps avoid duplication?

2016-08-10 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73285

Bug ID: 73285
   Summary: perhaps avoid duplication?
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

For some architectures functions with different interfaces generate identical
code.  For instance, on x86-64 (not the old x86 ABI):

double f1(int a, double b)
{
  return a + b;
}
double f2(double a, int b)
{
  return a + b;
}

Obviously the reason is that the parameters are stored in the same registers
for both functions.

Perhaps this could be recognized and the compiler could avoid generating the
code twice?

[Bug c++/71912] flexible array in union

2016-07-18 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71912

--- Comment #2 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at 
gmail dot com> ---
If it is accepted that this code should work (as I also expect) then this bug
should also be marked as a regression to 5.x.  6.1 at least is broken, I don't
know when it stopped working.

[Bug c++/71912] New: flexible array in union

2016-07-17 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71912

Bug ID: 71912
   Summary: flexible array in union
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com
  Target Milestone: ---

I haven't researched in detail what the accepted wisdom about this code is but
at the very least it is completely unnecessary to reject it, as the code shows.
Code like this comes from an actual project of mine.

Take the code below.  gcc 6.1 and also the current trunk reject the code
because:

v.cc:22:14: error: flexible array member ‘xyyzy::s’ not at end of ‘struct xyyzy’
   char s[];
  ^
v.cc:25:14: note: next member ‘double xyyzy::a’ declared here
   double d;
  ^
v.cc:18:8: note: in the definition of ‘struct xyyzy’
 struct xyyzy {
^


Clearly, the array 's' is not followed by 'a' in anything but a syntactic way.
The compiler does not reject the use of flexible arrays like this when the
types are defined separately, as the example of type 'baz' shows.

If the rejection is done deliberately, at the very least the message must be
fixed, but I would also like to see a justification.

NB: the same code compiles fine in C.  This is why I added all the unnecessary
'struct'.


struct foo {
  int a;
  char s[];
};

struct bar {
  double d;
  char t[];
};

struct baz {
  union {
    struct foo f;
    struct bar b;
  } u;
};

struct xyyzy {
  union {
    struct {
      int a;
      char s[];
    } f;
    struct {
      double d;
      char t[];
    } b;
  } u;
};

struct baz b;
struct xyyzy x;

[Bug jit/64206] fake.so is unlinked too early for some users

2015-04-13 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64206

drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at
gmail dot com> ---
(In reply to David Malcolm from comment #11)
> (In reply to David Malcolm from comment #10)
> > You need something
> something like, I meant to say

Sorry, David, I missed that.  At least nothing crashes with this extension.


[Bug jit/64206] fake.so is unlinked too early for some users

2015-04-13 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64206

--- Comment #9 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at
gmail dot com> ---
Created attachment 35307
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35307&action=edit
Little hello world

Probably copied from the documentation, nothing special.


[Bug jit/64206] fake.so is unlinked too early for some users

2015-04-13 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64206

drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---

--- Comment #8 from drepper.fsp+rhbz at gmail dot com <drepper.fsp+rhbz at
gmail dot com> ---
(In reply to David Malcolm from comment #5)
> Does this fix the symptoms you're seeing?

Sorry for the delay, I'm terribly behind.

I just tried it and I don't see any improvement.  This is on Fedora 21 with a
mainline gcc.  By the time the call to the loaded function is made the entire
directory the jit uses is gone.  This is before the call to
gcc_jit_result_release.  Since I don't have a fixed gdb (or more correctly:
BFD) I still see the gdb crash.

I haven't looked at the logic of your patch.  From my perspective the right
solution is still to allow, on request, delaying the removal of all the files
until the call to gcc_jit_result_release.  This interface is already
available, let's use it for one more thing.  For production runs we probably
want the current behavior.

I don't think you need any code to reproduce, I see the problem with a trivial
hello world like the one below.  Just put a breakpoint on line 55 (the call to
some_fn) and step into the function.  I immediately get

Can't read data for section '.eh_frame' in file '/tmp/libgccjit-a07Nh7/fake.so'

and upon issuing p $pc gdb will crash (gdb 7.8.2-38.fc21).


[Bug c++/65011] New: misleading error message for target attribute

2015-02-10 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65011

Bug ID: 65011
   Summary: misleading error message for target attribute
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com

The target attribute named default is handled specially and this causes a
problem in error reporting.  If I have code like this:

__attribute__((target(sse2,avx2)))
void foo(double *__restrict r, const double *__restrict a, const double
*__restrict b, double f)
{
  for (unsigned i = 0; i < 128; ++i)
    r[i] = a[i] * f + b[i];
}

the compiler doesn't complain even though it ignores one of the parameters of
the target attribute.  That's a question for another day.

The problem is that replacing one of the strings with default causes a
problem:

$ cat u.cc
__attribute__((target(default,avx2)))
void foo(double *__restrict r, const double *__restrict a, const double
*__restrict b, double f)
{
  for (unsigned i = 0; i < 128; ++i)
    r[i] = a[i] * f + b[i];
}
$ local-gcc -c -O3 u.cc
u.cc:10:96: error: attribute(target(default)) is unknown
 void foo(double *__restrict r, const double *__restrict a, const double
*__restrict b, double f)
   
^

Of course default is known.  But it is not parsed in
ix86_valid_target_attribute_inner_p.  It seems (haven't verified it) that the
caller checks for default being the entire string and if this is not the case
defers parsing to ix86_valid_target_attribute_inner_p.


[Bug jit/64206] New: fake.so is unlinked too early for some users

2014-12-05 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64206

Bug ID: 64206
   Summary: fake.so is unlinked too early for some users
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: jit
  Assignee: dmalcolm at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com

Some users of the DSO created by the JIT (probably mostly debuggers) might have
a hard time getting to the file before it gets unlinked.  For some gdb
versions, for instance, this is fatal.  Try gdb 7.8.1, for instance, see

https://bugzilla.redhat.com/show_bug.cgi?id=1170861

That is certainly gdb's fault but it's an example of the type of problems
that can appear.

It certainly is useful to unlink the file as quickly as possible so that in
case of a crash nothing is left behind.  But there should at least be the
possibility to prevent the early unlink.  Dave suggested tying this to the
enabling of debuginfo generation in libgccjit.  I'm actually not entirely sure
that's the best possibility since even without debuginfo the debugger can use
the ELF symbols to place breakpoints etc.  Maybe a boolean option?

As a solution it should be quite easy to transfer ownership of the file and
directory from playback::context to result.


[Bug tree-optimization/62220] New: missed optimization wrt modulo for loop variable

2014-08-21 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62220

Bug ID: 62220
   Summary: missed optimization wrt modulo for loop variable
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com

Created attachment 33376
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33376&action=edit
program to measure difference of the original and proposed optimized code

I've optimized some prominent code bases by hand to achieve significant speedup
but this optimization could have been performed by gcc.  The problem is that
some program authors underestimate the price of integer modulo operations.  I
am attaching a test program (x86-64); the time difference I see is about
1500%.

The general pattern is:

   loop index variable I with upper limit L

      for (I = ?; I < L; ++I)

   inside the loop use I % M where M is loop-invariant

      e.g.: var[I % M]

This could be optimized to

   compute LL = L - L % M

   loop index variable I with upper limit LL; nested second loop

     J = ? % M
     for (I = ?; I < LL; ) {
       for (; J < M; ++I, ++J) {

          ... loop body ...

          ... e.g., var[J]
          ... instead of var[I % M]
       }
       J = 0;
     }
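
As a concrete illustration of the pattern, the following sketch spells out both
forms for a loop that sums var[I % M].  The function names are hypothetical and
M is taken to be var.size(), which is assumed to be non-zero.  The tail loop
for the remaining L % M iterations is not part of the outline above; it is
added here so that both functions cover exactly the same range.

```cpp
#include <cstddef>
#include <vector>

// Baseline: one integer modulo per iteration.
long sum_with_modulo(const std::vector<int>& var, std::size_t start,
                     std::size_t limit)
{
  const std::size_t m = var.size();  // M, loop-invariant, must be non-zero
  long total = 0;
  for (std::size_t i = start; i < limit; ++i)
    total += var[i % m];
  return total;
}

// Hand-transformed: j always equals i % m, so the body needs no division.
long sum_nested(const std::vector<int>& var, std::size_t start,
                std::size_t limit)
{
  const std::size_t m = var.size();
  const std::size_t ll = limit - limit % m;  // LL = L - L % M
  long total = 0;
  std::size_t i = start;
  std::size_t j = start % m;
  while (i < ll) {
    for (; j < m; ++i, ++j)
      total += var[j];
    j = 0;
  }
  // Tail: the remaining iterations up to the original limit.
  for (; i < limit; ++i, ++j)
    total += var[j];
  return total;
}
```

The two modulo operations that remain are executed once, outside the loops.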


[Bug c++/61719] New: misleading error message

2014-07-05 Thread drepper.fsp+rhbz at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61719

Bug ID: 61719
   Summary: misleading error message
   Product: gcc
   Version: 4.10.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com

This happens with older versions as well and the problem is worse in more
complicated situations.  This is the boiled-down version.  Take this source:

struct c {
  c(int a) : aa(a {}
  int aa;
};

c v(1);


There clearly is a typo in the constructor call: the closing parenthesis is
missing.  This causes the scanner to read the remainder of the file looking for
the end of the initializer of the call.  The error messages you get are:

u.cc: In constructor ‘c::c(int)’:
u.cc:2:14: error: class ‘c’ does not have any field named ‘aa’
   c(int a) : aa(a {}
  ^
u.cc:2:19: error: expected ‘)’ before ‘{’ token
   c(int a) : aa(a {}
   ^
u.cc:3:7: error: expected ‘{’ at end of input
   int aa;
   ^

Yes, the second error points in the right direction but in more complicated
situations there can be even more messages between the first message and the
one in second place here.

It seems that despite an error token being returned when looking for the end of
the initializer for aa the compiler first performs a lookup of the member which
of course makes no sense in this case since the remainder of the class is not
parsed.

I think something better can be done, maybe just skip looking up the member to
be initialized if there is a syntax error in the initializer call itself.

[Bug c++/59565] ICE on valid code in DWARF generation

2014-01-29 Thread drepper.fsp+rhbz at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59565

Ulrich Drepper <drepper.fsp+rhbz at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at redhat dot com

--- Comment #2 from Ulrich Drepper <drepper.fsp+rhbz at gmail dot com> ---
Jason,

I saw your commit to bug #53756 and thought about this bug.  I built the
current trunk version and the bug is now fixed.

Should this bug be closed as duplicate of 53756 or something else?


[Bug c++/59565] New: ICE on valid code in DWARF generation

2013-12-19 Thread drepper.fsp+rhbz at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59565

Bug ID: 59565
   Summary: ICE on valid code in DWARF generation
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drepper.fsp+rhbz at gmail dot com

I see an ICE with both 4.8.2 and the trunk version on the following code when
compiled like this:

  g++ -std=gnu++1y -c bug.cc -g

The backtrace is seen below.  The code is simple:


extern double f1(double);
struct s1
{
  double v;
};
inline auto f1(const s1 p) { return s1 { f1(p.v) }; }
struct s2
{
  double v;
  auto operator-(const s2 r) const { return s2 { v - r.v }; }
};
struct s3
{
  double v;
  s1 operator/(const s3 r) const { return s1 { v / r.v }; }
};
void f2(s3 p1, s3 p2) { auto d1 = f1(p1 / p2); }


Notice that s2 is actually not used, yet if you remove its definition or just
the inline function the code compiles.


NN.cc:10:8: internal compiler error: in gen_type_die_with_usage, at
dwarf2out.c:19823
 struct s2
^
0x8fd836 gen_type_die_with_usage
../../gcc/dwarf2out.c:19823
0x8fa3cd gen_decl_die
../../gcc/dwarf2out.c:20320
0x8fc4dc gen_member_die
../../gcc/dwarf2out.c:19384
0x8fc4dc gen_struct_or_union_type_die
../../gcc/dwarf2out.c:19456
0x8fc4dc gen_tagged_type_die
../../gcc/dwarf2out.c:19646
0x8fd7c5 gen_type_die_with_usage
../../gcc/dwarf2out.c:19793
0x8f9d8f gen_decl_die
../../gcc/dwarf2out.c:20359
0xb02f72 rest_of_type_compilation(tree_node*, int)
../../gcc/passes.c:280
0x680ddd finish_struct_1(tree_node*)
../../gcc/cp/class.c:6588
0x6825f4 finish_struct(tree_node*, tree_node*)
../../gcc/cp/class.c:6753
0x6b48c2 cp_parser_class_specifier_1
../../gcc/cp/parser.c:19182
0x6b48c2 cp_parser_class_specifier
../../gcc/cp/parser.c:19401
0x6b48c2 cp_parser_type_specifier
../../gcc/cp/parser.c:14292
0x6cd4c1 cp_parser_decl_specifier_seq
../../gcc/cp/parser.c:11537
0x6d4019 cp_parser_simple_declaration
../../gcc/cp/parser.c:11127
0x6b7dc3 cp_parser_block_declaration
../../gcc/cp/parser.c:11076
0x6de7e3 cp_parser_declaration
../../gcc/cp/parser.c:10973
0x6dd4d8 cp_parser_declaration_seq_opt
../../gcc/cp/parser.c:10859
0x6dedcb cp_parser_translation_unit
../../gcc/cp/parser.c:4018
0x6dedcb c_parse_file()
../../gcc/cp/parser.c:31326
Please submit a full bug report,


[Bug target/54087] __atomic_fetch_add does not use xadd instruction

2013-11-19 Thread drepper.fsp+rhbz at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54087

Ulrich Drepper <drepper.fsp+rhbz at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drepper.fsp+rhbz at gmail dot com

--- Comment #11 from Ulrich Drepper <drepper.fsp+rhbz at gmail dot com> ---
Yes, although there still is an oddity.  The code from comment #3 is compiled
as:

0000000000000000 <a>:
   0:   ba fb ff ff ff          mov    $0xfffffffb,%edx
   5:   89 d0                   mov    %edx,%eax
   7:   f0 0f c1 05 00 00 00    lock xadd %eax,0x0(%rip)        # f <a+0xf>
   e:   00
                        b: R_X86_64_PC32        v-0x4
   f:   01 d0                   add    %edx,%eax
  11:   c3                      retq
  12:   66 66 66 66 66 2e 0f    data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
  19:   1f 84 00 00 00 00 00

0000000000000020 <b>:
  20:   b8 fb ff ff ff          mov    $0xfffffffb,%eax
  25:   f0 0f c1 05 00 00 00    lock xadd %eax,0x0(%rip)        # 2d <b+0xd>
  2c:   00
                        29: R_X86_64_PC32       v-0x4
  2d:   83 e8 05                sub    $0x5,%eax
  30:   c3                      retq



There is no reason for the difference.  In both cases the latter sequence
should be generated.
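
The code from comment #3 is not quoted in this message.  Judging only from the
disassembly (a constant of -5 added back in 'a', a constant of 5 subtracted in
'b', both operating on a variable 'v'), a pair of functions along the following
lines would be expected to produce such output.  This is an assumption made for
illustration, not the original test case; the use of int and of
__ATOMIC_SEQ_CST is likewise a guess.

```cpp
// Hypothetical reconstruction; 'v', 'a' and 'b' are the symbol names visible
// in the disassembly, everything else is assumed.
int v;

int a()
{
  // Atomically add -5 and return the new value.
  return __atomic_add_fetch(&v, -5, __ATOMIC_SEQ_CST);
}

int b()
{
  // Atomically subtract 5 and return the new value.
  return __atomic_sub_fetch(&v, 5, __ATOMIC_SEQ_CST);
}
```

Both functions compute the same result, which is why identical code would be
expected for them.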