[Bug c++/90287] New: [concepts] bogus error on overload failure inside requires-expression

2019-04-29 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90287

Bug ID: 90287
   Summary: [concepts] bogus error on overload failure inside
requires-expression
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/30Cf6s

#include 

template 
constexpr inline bool isAddable = requires(const T& lhs, const U& rhs) {
 lhs + rhs;
};

auto x = isAddable;

: In instantiation of 'constexpr const bool isAddable >':
:13:10:   required from here
:10:10: error: no match for 'operator+' (operand types are 'const int'
and 'const std::__cxx11::basic_string')
   10 |  lhs + rhs ;
  |  ^

[snip rest of wall-o-errors]

If I make isAddable a concept, it correctly evaluates to false, but I would
expect requires-expressions to also work when assigned to a constexpr bool.

[Bug c++/90033] New: [concepts] ICE segfault evaluating a requires clause that transitively depends on itself

2019-04-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90033

Bug ID: 90033
   Summary: [concepts] ICE segfault evaluating a requires clause
that transitively depends on itself
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/SEfFol

This is a creduce'd example that tripped a segfault in our Real World Code
implementation of unique_function. The RWC version includes a requirement that
X != Y so this can never be a copy or move constructor, but that was removed in
the reduction. FWIW, The clang concepts fork compiles this successfully.

template 
struct bool_constant { static constexpr bool value = B; };
template 
struct is_constructible : bool_constant<__is_constructible(T, Args...)> {};
template 
T&& move(T&);

struct X {
  template 
  requires(is_constructible::value)
  X(OtherFunc &&);

  X() = default;
};

X source;
X dest = move(source);

-

: In substitution of 'template  requires 
is_constructible::value X::X(OtherFunc&&) [with OtherFunc
= X]':
:16:21:   required from here
:16:21: internal compiler error: Segmentation fault
   16 | X dest = move(source);
  | ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.


I know concepts are still experimental, but if the fix turns out to be simple,
we'd appreciate a backport to gcc8.

[Bug c++/90031] New: Bogus parse error trying to explicitly specialize a template variable inside class scope

2019-04-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90031

Bug ID: 90031
   Summary: Bogus parse error trying to explicitly specialize a
template variable inside class scope
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/y5GQZd

struct Struct {
template 
constexpr static bool use_cond = false;
template 
constexpr static bool use_cond = true;
};

:5:27: error: explicit template argument list not allowed
5 | constexpr static bool use_cond = true;
  |   ^

[Bug c++/90019] New: [8 regression] Bogus ambiguous overload error for NTTP pack of disjoint enable_ifs unless there is an unsupplied default argument

2019-04-08 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90019

Bug ID: 90019
   Summary: [8 regression] Bogus ambiguous overload error for NTTP
pack of disjoint enable_ifs unless there is an
unsupplied default argument
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

This code compiles fine with 7 and trunk but fails with gcc8:
https://godbolt.org/z/v1HN8B

#include 

// gcc8 thinks these are ambiguous for <0>
template ...> void foo(){}
template ...> void foo(){}

// but somehow these arn't for <0>, but are for <0,0> !?
template ...> void bar(){}
template ...> void bar(){}

void test() {
bar<0>(); // works
bar<0,0>(); // boom
foo<0>(); // boom
}

[Bug c++/89997] New: Garbled expression in error message with -fconcepts

2019-04-06 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89997

Bug ID: 89997
   Summary: Garbled expression in error message with -fconcepts
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

Usually -fconcepts delivers excellent error messages, but this one is pretty
bad. It looks like this goes back to 6.2, when it first started to show the
expression.

https://godbolt.org/z/m9DlOZ

struct Y;

struct X {
Y operator<< (const char*);
};

struct Y {
X operator<< (void*);
};

template 
void check() requires requires (X x, T val) { x << "hello" << val; } {}

void test() {
check(); // no error
check(); // mangled error
}

-

: In function 'void test()':
:16:16: error: cannot call function 'void check() requires (> [with T =
int]'
   16 | check(); // mangled error
  |^
:12:6: note:   constraints not satisfied
   12 | void check() requires requires (X x, T val) { x << "hello" << val; } {}
  |  ^
:12:6: note: with 'X x'
:12:6: note: with 'int val'
:12:6: note: the required expression '("hello"->x.X::operator<<() <<
val)' would be ill-formed


What is that expression? How did it end up applying -> to a string literal!?

[Bug c++/89781] Misleading error messages when initializing a static data member in-class

2019-03-22 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89781

--- Comment #3 from Mathias Stearn  ---
Yeah, my point wasn't that this code should be accepted, it was that the error
messages should be improved. Ideally there would even be fixits suggesting
adding constexpr where it would be valid, otherwise suggesting inline.

Sorry if that wasn't clear.

[Bug c++/89781] New: Misleading error messages when initializing a static data member in-class

2019-03-20 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89781

Bug ID: 89781
   Summary: Misleading error messages when initializing a static
data member in-class
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

The error messages make sense in a pre-c++17 world before inline variables
existed. Now they are lies!

struct X{};
struct Y { static X x = {}; };

:2:21: error: 'constexpr' needed for in-class initialization of static
data member 'X Y::x' of non-integral type [-fpermissive]
 struct Y { static X x = {}; };
 ^

The error is even worse when X can't be a constexpr variable:


struct X{ X(); };
struct Y { static X x = {}; };

:2:21: error: in-class initialization of static data member 'X Y::x' of
non-literal type
 struct Y { static X x = {}; };
 ^
:2:26: error: temporary of non-literal type 'X' in a constant
expression
 struct Y { static X x = {}; };
  ^
:1:8: note: 'X' is not literal because:
 struct X{ X(); };
^
:1:8: note:   'X' is not an aggregate, does not have a trivial default
constructor, and has no 'constexpr' constructor that is not a copy or move
constructor



This compiles just fine:

struct X{ X(); };
struct Y { static inline X x = {}; };

All examples were passing -std=c++17.

[Bug c++/89780] New: -Wpessimizing-move is too agressive with templates and recommends pessimization

2019-03-20 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89780

Bug ID: 89780
   Summary: -Wpessimizing-move is too agressive with templates and
recommends pessimization
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/JA-0Gq

#include 

struct Dest {
Dest() = default;
Dest(Dest&&);
Dest(const Dest&);
};
struct Source : Dest {};

template 
Dest withMove() {
T x;
return std::move(x);
}


template 
Dest noMove() {
T x;
return x;
}


template Dest withMove();
template Dest withMove();

template Dest noMove();
template Dest noMove();

> g++ -O3 -Wall -std=c++17

: In instantiation of 'Dest withMove() [with T = Dest]':
:24:30:   required from here
:13:23: warning: moving a local object in a return statement prevents
copy elision [-Wpessimizing-move]
   13 | return std::move(x);
  |   ^
:13:23: note: remove 'std::move' call


Basically, gcc9 recommends changing withMove, where both Source and Dest are
moved from, in to noMove, where Dest is copy-elided but Source is copied. While
this is a minor optimization for the Dest instantiation, it is a potentially
significant pesimization for the Source one.

Amusingly, this code trips a warning in clang that recommends doing the
opposite, adding the std::move to turn noMove into withMove:
https://godbolt.org/z/WONlMN

 :20:12: warning: local variable 'x' will be copied despite being
returned by name [-Wreturn-std-move]
return x;
   ^
:28:15: note: in instantiation of function template specialization
'noMove' requested here
template Dest noMove();
  ^
:20:12: note: call 'std::move' explicitly to avoid copying
return x;
   ^
   std::move(x)

There is just no winning!

[Bug middle-end/89739] New: pessimizing vectorization at -O3 to load two u64 objects

2019-03-16 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89739

Bug ID: 89739
   Summary: pessimizing vectorization at -O3 to load two u64
objects
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/8vIGZ3

using u64 = unsigned long long;
struct u128 {u64 a, b;};

inline u64 load8(void* ptr) {
u64 out;
__builtin_memcpy(, ptr, 8);
return out;
}

u128 load(char* basep, u64 n) {
return {load8(basep), load8(basep+n-8)};
}

At -O2 this emits ideal asm:
mov rax, QWORD PTR [rdi]
mov rdx, QWORD PTR [rdi-8+rsi]
ret


At -O3 it is comical:
movqxmm0, QWORD PTR [rdi]
movhps  xmm0, QWORD PTR [rdi-8+rsi]
movaps  XMMWORD PTR [rsp-24], xmm0
mov rax, QWORD PTR [rsp-24]
mov rdx, QWORD PTR [rsp-16]
ret

This seems to have been introduced in gcc7

[Bug c++/89706] New: -Wstringop-truncation strncpy message is confusing and has psuedo-false-positives

2019-03-13 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89706

Bug ID: 89706
   Summary: -Wstringop-truncation strncpy message is confusing and
has psuedo-false-positives
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/BAeXQB

"strncpy output truncated copying between 0 and 8 bytes from a string of length
8" doesn't make it obvious what the problem with the is or how to solve it.
Once I decoded it[1], I realized it was warning about something the code was
already set up to handle correctly. Specifically, the last line of func
re-establishes the nul byte termination regardless of how many bytes are
copied. (Yes I know that this code is a dumb use of strncpy and we should
probably just be using memcpy, but that isn't what this warning is warning
about.)


[1] I was only able to decode it by tweaking the code until I got the "output
truncated before terminating nul copying 8 bytes from a string of the same
length" form of the warning. This happens if you comment out the if-block in
this code.


#include 

extern size_t len;
extern char *buf;

inline void func(const char* s) {
size_t sz = strlen(s);
if (sz > len - 1)
sz = len - 1;
strncpy(buf, s, sz);
buf[sz] = '\0';
}

void test() {
const char* p = "Progress";
func(p);
}

> g++ -Wstringop-truncation -O2

In function 'void func(const char*)',
inlined from 'void test()' at :16:9:
:10:12: warning: 'char* strncpy(char*, const char*, size_t)' output
truncated copying between 0 and 8 bytes from a string of length 8
[-Wstringop-truncation]
   10 | strncpy(buf, s, sz);
  | ~~~^~~~

[Bug c++/89688] New: -Wstringop-overflow confused by 2D array of char

2019-03-12 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89688

Bug ID: 89688
   Summary: -Wstringop-overflow confused by 2D array of char
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

Also, for some reason it repeats the warning 3 times.

https://godbolt.org/z/SStUsl

// Fine
extern const char a[2] = {'1'};
auto z = __builtin_strlen(a);

// Warns
extern const char aa[1][2] = {{'2'}};
auto zz = __builtin_strlen(aa[0]);



> ~/opensource/gcc/prefix/bin/g++ -Wall str.cpp  -fsyntax-only
str.cpp:7:27: warning: ‘strlen’ argument missing terminating nul
[-Wstringop-overflow=]
7 | auto zz = __builtin_strlen(aa[0]);
  |   ^~~
str.cpp:6:19: note: referenced argument declared here
6 | extern const char aa[1][2] = {{'2'}};
  |   ^~
str.cpp:7:27: warning: ‘strlen’ argument missing terminating nul
[-Wstringop-overflow=]
7 | auto zz = __builtin_strlen(aa[0]);
  |   ^~~
str.cpp:6:19: note: referenced argument declared here
6 | extern const char aa[1][2] = {{'2'}};
  |   ^~
str.cpp:7:32: warning: ‘strlen’ argument missing terminating nul
[-Wstringop-overflow=]
7 | auto zz = __builtin_strlen(aa[0]);
  |^
str.cpp:6:19: note: referenced argument declared here
6 | extern const char aa[1][2] = {{'2'}};
  |   ^~


Reduced example from real code at:
https://github.com/boostorg/date_time/blob/b0437e2999a65668dc4178dbb817a89a382ece94/include/boost/date_time/special_values_formatter.hpp#L89-L92
+
https://github.com/boostorg/date_time/blob/b0437e2999a65668dc4178dbb817a89a382ece94/include/boost/date_time/special_values_formatter.hpp#L43-L45

[Bug c++/89682] New: g++9 incorrectly disallows using private static method as default arg to ctor of template type

2019-03-12 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89682

Bug ID: 89682
   Summary: g++9 incorrectly disallows using private static method
as default arg to ctor of template type
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

New in GCC9. No other compiler errors on this:

https://godbolt.org/z/480KfZ

template 
class C {
class TagType {};
public:
C(int, TagType = makeTag());
private:
static TagType makeTag();
};

void test() {
C(1);
}


> g++ -fsyntax-only./creduce_repro/test.ii
./creduce_repro/test.ii: In function ‘void test()’:
./creduce_repro/test.ii:5:29: error: ‘static C::TagType C::makeTag()
[with T = int]’ is private within this context
5 | C(int, TagType = makeTag());
  |  ~~~^~
./creduce_repro/test.ii:7:20: note: declared private here
7 | static TagType makeTag();
  |^~~

[Bug c++/89640] [9 Regression] g++ chokes on lambda with __attribute__

2019-03-11 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89640

--- Comment #4 from Mathias Stearn  ---
@Jakub, This code doesn't have either mutable or noexcept, so the "wrong place
in the grammer" issue doesn't apply. That part of the fix seems correct and
useful.

While it seems correct to fix what the c++11-syle [[attribute]] appertains to
to match the standard, it would be unfortunate to do the same to
__attribute__(()) style attributes which are already outside of the standard.
Until the standard provides some way to have an attribute that appertains to
the lambda function, this part of the "fix" is breaking useful functionality
that has existed since GCC-5 without offering any replacement.

[Bug c++/89640] [9 Regression] g++ chokes on lambda with __attribute__

2019-03-11 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89640

--- Comment #2 from Mathias Stearn  ---
Unfortunately the c++ attributes syntax applies to the lambda type rather than
the function, so the warning is correct. The old style __attribute__ syntax
seems to be the only way to annotate the lambda function, which is why we use
it here. We use something like this in a macro around code that builds error
messages on our error paths, which is why it needs to be on a lambda. It made a
notable shrink to the size of our primary .text section moving that stuff out
to the .text.cold section.

[Bug c++/89640] New: g++ chokes on lambda with __attribute__

2019-03-08 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89640

Bug ID: 89640
   Summary: g++ chokes on lambda with __attribute__
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

Regression in gcc 9 vs 8.

https://godbolt.org/z/MGL0H4

void test() {
[]() __attribute__((noinline,cold)) {
asm volatile("");
}();
}

: In function 'void test()':
:2:41: error: expected identifier before '{' token
2 | []() __attribute__((noinline,cold)) {
  | ^
:2:41: error: type-specifier invalid in lambda
Compiler returned: 1

[Bug c++/88865] New: [[no_unique_address]] leads to sizeof(T) == 0, which cannot be

2019-01-15 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88865

Bug ID: 88865
   Summary: [[no_unique_address]] leads to sizeof(T) == 0, which
cannot be
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/rJT5X9

struct B {};
struct A {
[[no_unique_address]] B a;
[[no_unique_address]] B b;
[[no_unique_address]] B c;
[[no_unique_address]] B d;
};

int f() {
return sizeof(A);
}

f():
pushrbp
mov rbp, rsp
mov eax, 0
pop rbp
ret

In addition to the major issue that sizeof(A) must not be 0, it also must not
be 1 either. It must be (at least) 4.
http://eel.is/c++draft/intro.object#9.sentence-2 is very clear that
[[no_unique_address]] (which clauses 7 and 8 define to mean a "subobject of
zero size") only allows members of *different types* to overlap. a,b,c, and d
are all distinct objects of the same type B, and therefore must have distinct
addresses.

> Two objects with overlapping lifetimes that are not bit-fields may have the 
> same address if one is nested within the other, or if at least one is a 
> subobject of zero size and they are of different types; otherwise, they have 
> distinct addresses and occupy disjoint bytes of storage.


https://godbolt.org/z/160XGN shows that some parts of gcc seem to understand
this rule, some something very strange must be going on.

[Bug c++/87312] New: statics in lambdas should be weak not local symbols

2018-09-14 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87312

Bug ID: 87312
   Summary: statics in lambdas should be weak not local symbols
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/oSuiQO

// IN HEADER:
inline auto lambda = [] () -> int* {
static int foo;
return 
};

inline int* func() {
static int foo;
return 
};

// NOT IN HEADER:
int* lambda_addr() {
return lambda();
}

int* func_addr() {
return func();
}

Both of the "foo" objects should have exactly one address in the whole program.
The "foo" in func() will work correctly, but the "foo" in the lambda will
incorrectly have one address in each TU where it is used.

[Bug c++/67012] decltype(auto) with trailing return type

2018-08-14 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67012

Mathias Stearn  changed:

   What|Removed |Added

 CC||redbeard0531 at gmail dot com

--- Comment #1 from Mathias Stearn  ---
Still repros with latest trunk builds, even with -pedantic:
https://godbolt.org/g/vMDcwg

[Bug c++/86276] New: Poor codegen when returning a std::vector

2018-06-21 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86276

Bug ID: 86276
   Summary: Poor codegen when returning a std::vector
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/g/aCiuAy

While I'm a bit sad that good() isn't just "ret", I think that the current
rules for allocation elision require something like that codegen.

I'd expect bad() to codegen to roughly the same as good(), but then rather than
calling operator delete, it should store [rax, rax + 1, rax + 22] to the three
qwords starting at the out pointer. Instead, it does things like checking if
the vector pointer it just zeroed is still zero to see if it needs to be
deleted. It also seems to register an exception landing pad (I'm guessing to
cover the operator new call) to handle freeing the vector's memory, even though
it should know it isn't holding any. 

If I had to guess, I'd say the problem is that it thinks the hidden return out
pointer has escaped when it hasn't really until a successful return. I'm pretty
sure nothing in the language allows any way to access the return value before a
function returns like that, but I'm now really curious if I'm wrong. If there
is, is there any way to tell gcc that I promise I'm not doing anything quite
that stupid?

PS- is it helpful to include the code and asm here in addition to the godbolt
links?

#include 
#include 

auto good() {
std::vector something;
something.reserve(22);
something = {0x02};
//return something; Only difference from bad
}


auto bad() {
std::vector something;
something.reserve(22);
something = {0x02};
return something;
}

good():
sub rsp, 8
mov edi, 22
calloperator new(unsigned long)
mov BYTE PTR [rax], 2
mov rdi, rax
add rsp, 8
jmp operator delete(void*)
bad():
pushrbp
pxorxmm0, xmm0
pushrbx
mov rbx, rdi
sub rsp, 24
mov QWORD PTR [rdi+16], 0
movups  XMMWORD PTR [rdi], xmm0
mov edi, 22
calloperator new(unsigned long)
mov rdi, QWORD PTR [rbx]
testrdi, rdi
je  .L5
mov QWORD PTR [rsp+8], rax
calloperator delete(void*)
mov rax, QWORD PTR [rsp+8]
.L5:
mov BYTE PTR [rax], 2
lea rdx, [rax+22]
mov QWORD PTR [rbx], rax
add rax, 1
mov QWORD PTR [rbx+8], rax
mov rax, rbx
mov QWORD PTR [rbx+16], rdx
add rsp, 24
pop rbx
pop rbp
ret
mov rbp, rax
jmp .L6
bad() [clone .cold.25]:
.L6:
mov rdi, QWORD PTR [rbx]
testrdi, rdi
je  .L7
calloperator delete(void*)
.L7:
mov rdi, rbp
call_Unwind_Resume

[Bug middle-end/86140] New: constprop clones with identical bodies

2018-06-13 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86140

Bug ID: 86140
   Summary: constprop clones with identical bodies
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

This was tweaked example code to demonstrate something totally unrelated, but I
noticed it was doing the constprop clone thing for functions with identical
instructions, which seemed odd. While this exact case doesn't matter, it seemed
like it might be a sign of a deeper bug.

https://godbolt.org/g/Zwpi7J

[[gnu::noinline]] void f(const int*){
asm volatile("");
}

void good() {
constexpr static int arr[1000] = {};
f(arr);
}

void bad() {
constexpr  int arr[1000] = {};
f(arr);
}


f(int const*) [clone .constprop.0]:
  ret
f(int const*) [clone .constprop.1]:
  ret
f(int const*):
  ret
good():
  jmp f(int const*) [clone .constprop.0]
bad():
  jmp f(int const*) [clone .constprop.1]

[Bug c++/86072] New: Poor codegen with atomics

2018-06-06 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86072

Bug ID: 86072
   Summary: Poor codegen with atomics
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/g/aEFhq8

#include 

std::atomic progress{-1};

void combine_writes1() {
// These should be reduced to a single store(0,release),
// At least as long as release-sequnce includes same-thread
// relaxed stores. If that is removed, this should just be
// a single relaxed store.
progress.store(0, std::memory_order_relaxed);
progress.store(0, std::memory_order_relaxed);
progress.store(0, std::memory_order_release);
progress.store(0, std::memory_order_release);
progress.store(0, std::memory_order_relaxed);
progress.store(0, std::memory_order_relaxed);
}

void combine_writes2() {
// Ditto above, but should store 5.
progress.store(0, std::memory_order_relaxed);
progress.store(1, std::memory_order_relaxed);
progress.store(2, std::memory_order_release);
progress.store(3, std::memory_order_release);
progress.store(4, std::memory_order_relaxed);
progress.store(5, std::memory_order_relaxed);
}

void combine_loads() {
// These should be reduced to either a single acquire-load
// or an acquire fence. 
progress.load(std::memory_order_relaxed);
progress.load(std::memory_order_relaxed);
progress.load(std::memory_order_acquire);
progress.load(std::memory_order_acquire);
progress.load(std::memory_order_relaxed);
progress.load(std::memory_order_relaxed);
}

combine_writes1():
  xor eax, eax
  mov DWORD PTR progress[rip], eax
  mov DWORD PTR progress[rip], eax
  mov DWORD PTR progress[rip], eax
  mov DWORD PTR progress[rip], eax
  mov DWORD PTR progress[rip], eax
  mov DWORD PTR progress[rip], eax
  ret
combine_writes2():
  mov DWORD PTR progress[rip], 0
  mov DWORD PTR progress[rip], 1
  mov DWORD PTR progress[rip], 2
  mov DWORD PTR progress[rip], 3
  mov DWORD PTR progress[rip], 4
  mov DWORD PTR progress[rip], 5
  ret
combine_loads():
  mov eax, DWORD PTR progress[rip]
  mov eax, DWORD PTR progress[rip]
  mov eax, DWORD PTR progress[rip]
  mov eax, DWORD PTR progress[rip]
  mov eax, DWORD PTR progress[rip]
  mov eax, DWORD PTR progress[rip]
  ret
progress:
  .long -1

This example seems to ICE segfaulting gcc trunk: https://godbolt.org/g/M4ZQGS


Possibly related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86056

[Bug middle-end/86056] Codegen can result in multiple sequential mfence instructions

2018-06-05 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86056

--- Comment #2 from Mathias Stearn  ---
Oh, I agree that this optimization must be permitted. I was using this example
to prove this to someone else who didn't believe that. I only mentioned that to
explain how that weird source code came to be.

My point of this bug was that the code gen can be further optimized because
there is never a good reason to have multiple mfence instructions back to back
like that.

[Bug target/86056] New: Codegen can result in multiple sequential mfence instructions

2018-06-05 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86056

Bug ID: 86056
   Summary: Codegen can result in multiple sequential mfence
instructions
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

This is from an example to prove that atomic_thread_fence does not prevent the
compiler from optimizing non-escaped memory.

https://godbolt.org/g/288mTC

#include 
#include 
struct Type {
Type(Type&&)=default;
int i;
};

Type func(Type t) {
auto out = Type(Type(std::move(t)));
std::atomic_thread_fence(std::memory_order_seq_cst);
return out;
}

auto func2(Type t) { return func(func(func(func(std::move(t); }

func(Type):
  movl (%rsi), %edx
  movq %rdi, %rax
  movl %edx, (%rdi)
  mfence
  ret
func2(Type):
  movl (%rsi), %edx
  movq %rdi, %rax
  mfence
  mfence
  mfence
  movl %edx, (%rdi)
  mfence
  ret

[Bug libstdc++/85813] make_exception_ptr should support __cxa_init_primary_exception path even with -fno-exceptions

2018-05-17 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85813

--- Comment #7 from Mathias Stearn  ---
> Simply returning an empty exception_ptr is what happened before the PR 64241
> change, so what we do now retains that behaviour. That might be the main
> reason for it.

Silently dropping errors always skeeves me out. I'm not sure if anyone is well
served by the current behavior. If it were up to me (and I know it isn't) I'd
make that case call std::terminate() or similar rather than returning the "no
error" value. That seems consistent with the behavior of the __throw*
functions, but it is a breaking change. Or even better, since gcc seems fine
throwing through functions marked noexcept in -fno-exceptions TUs, maybe in the
(very rare) case where copying/moving an exception throws inside an
-fno-exceptions TU, just let it bubble out.

> ::new (__e) _Ex(std::forward<_Ex>(__ex));

Should that be std::move(__ex)? I don't know why, but make_exception_ptr takes
the exception by value rather than forwarding ref. I guess forward<_Ex> is fine
either way, and will stay correct if that defect gets fixed. It just seemed odd
to forward a value so I thought I'd mention it.

> Mix-and-match works if the function gets inlined.

Do you think this case warrants a [[gnu::always_inline]]? Given the sensitive
nature of error handling, it seems pretty bad if a correct usage of
make_exception_ptr() could be silently transformed to return the "no error"
value just by linking in a bad library. Even if that library never dynamically
executes make_exception_ptr(). I know I'd hate to be called in to debug that
issue...

I'm going to go make sure no code we link with uses -fno-exceptions!

[Bug libstdc++/85813] make_exception_ptr should support __cxa_init_primary_exception path even with -fno-exceptions

2018-05-17 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85813

--- Comment #3 from Mathias Stearn  ---
My assumption was that if E(...) throws and it can't be caught, it should be
treated as any other case when an -fno-exceptions TU calls a throwing function.
In this case that would mean calling terminate() due to the noexcept, which
seems better than returning a null exception_ptr.

However, while testing the behavior of mixing -fno-exceptions TUs with normal
ones, I realized there may be a bigger problem due to ODR violations. In
particular, if you link these TUs without optimizations, one of the asserts
will trip depending on the link order:

// g++ -fno-exceptions -c
#include 
#include 
void no_exceptions() {
assert(!std::make_exception_ptr(1));
}

// g++ -fexceptions
#include 
#include 
void no_exceptions();
int main() {
assert(std::make_exception_ptr(1));
no_exceptions();
}

Is that just accepted, implying the the whole program must be compiled with
either -fexceptions or -fno-exeptions, rather than allowing mix-and-match? If
so, I guess this whole point is moot.

[Bug libstdc++/85813] New: make_exception_ptr should support __cxa_init_primary_exception path even with -fno-exceptions

2018-05-16 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85813

Bug ID: 85813
   Summary: make_exception_ptr should support
__cxa_init_primary_exception path even with
-fno-exceptions
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

It looks like nothing in that path actually needs exception support. This would
allow make_exception_ptr to produce a valid exception_ptr even in TUs that
don't support exceptions. While that may not sound valuable, the exception_ptr
could be handed to TUs that do support exceptions where it can be rethrown (a
misnomer in this case...) and caught.

Also, if my paper http://wg21.link/p1066 is accepted, that would provide full
support for handling the exception without ever throwing it. And if this ticket
is impossible to implement for some reason, I'd love to know that before
Rapperswil ;)

The nullptr return was added in response to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64241, but it looks like that was
done before the __cxa_init_primary_exception path existed. When that was added
in
https://github.com/gcc-mirror/gcc/commit/cce8e59224e18858749a2324bce583bcfd160d6c#diff-e1b5856b8cc940de58c12ef252d63c34R188,
I can't tell if it was put in the #if __cpp_exceptions block for a good reason
or just by default.

[Bug libstdc++/85812] New: make_exception_ptr can leak the allocated exception if construction throws

2018-05-16 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85812

Bug ID: 85812
   Summary: make_exception_ptr can leak the allocated exception if
construction throws
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

I think if this line throws the exception allocated on line 185 will leak:
https://github.com/gcc-mirror/gcc/blob/3474beffb1fd23da44a752763e648d5d1ffa4d0f/libstdc%2B%2B-v3/libsupc%2B%2B/exception_ptr.h#L189.

[Bug c++/85799] __builin_expect doesn't propagate through inlined functions

2018-05-15 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85799

--- Comment #1 from Mathias Stearn  ---
LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=37476

[Bug c++/85799] New: __builin_expect doesn't propagate through inlined functions

2018-05-15 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85799

Bug ID: 85799
   Summary: __builin_expect doesn't propagate through inlined
functions
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

It seems like __builtin_expect doesn't propagate to conditions when inlined.
This is unfortunate because in some cases it would be nice to be able to put
the expectation into methods so that every user doesn't need to do their own
hinting. Currently we need to use macros to achieve this.

See https://godbolt.org/g/MuPM77 for full example

inline bool expect_false(bool b) {
return __builtin_expect(b, false);
}

void inline_func_hint(bool b) {
if (expect_false(b)) {
unlikely(); // treated as more likely!!!
} else {
likely();
}
}


inline_func_hint(bool):
testdil, dil
je  .L11
jmp unlikely()
.L11:
jmp likely()

[Bug c++/83400] g++ -O1 doesn't execute any code in destructors with a throw statement if it sees another throw

2018-05-15 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83400

--- Comment #3 from Mathias Stearn  ---
This is related to https://wg21.link/CWG2219

[Bug c++/85736] New: Support warn_unused or warn_unused_result on specific constructors

2018-05-10 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85736

Bug ID: 85736
   Summary: Support warn_unused or warn_unused_result on specific
constructors
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

It would be nice to get the benefits of those attributes on a per-constructor
basis, rather than requiring them on the whole type. The particular use case I
have in mind is for unique_lock's default constructor (or at least on our
wrapper around it). I recently did a code review where someone typed:

std::unique_lock myMutex;

where they meant to use:

std::unique_lock lk(myMutex);

There is currently no warning for this at -Wall -Wextra, although thankfully it
is at least caught when myMutex has parentheses around it, which is the more
common mistake. Clearly, it wouldn't make sense to put warn_unused on the whole
unique_lock since the second line is fine.

It would probably make sense on almost all default constructors actually, since
with the exception of a few specific types that alter global or thread-local
state, why are you declaring a default constructed variable then not using it
at all? But on a few types like unique_lock it seems actively dangerous rather
than just simply wasteful.

[Bug middle-end/85721] bad codegen for looped copy of primitives at -O2 and -O3 (differently bad)

2018-05-10 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85721

--- Comment #4 from Mathias Stearn  ---
Marc Glisse pointed out at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720#c3 that my I missed an
aliasing case when I created this ticket. memmove isn't a valid replacement if
out is in the range (in, in + n). I did some benchmarking to see what the best
solution is and how much this matters. This seems to do the best on
sandybridge, haswell, and an Opteron 6344 Piledriver:

[[gnu::noinline, gnu::optimize("s")]] void copy0(char* out, const char* in,
size_t n) {
if (n >= 8 &&__builtin_expect(out >= in + n || out + n <= in, 1)) {
memcpy(out, in, n);
return;
}
for (size_t i = 0; i < n; i++){
out[i] = in[i];
}
}

copy0(char*, char const*, unsigned long):
cmp rdx, 7
jbe .L7
lea rax, [rsi+rdx]
cmp rdi, rax
jnb .L3
lea rax, [rdi+rdx]
cmp rsi, rax
jb  .L7
.L3:
jmp memcpy
.L7:
xor eax, eax
.L5:
cmp rax, rdx
je  .L1
mov cl, BYTE PTR [rsi+rax]
mov BYTE PTR [rdi+rax], cl
inc rax
jmp .L5
.L1:
ret

With char, it is substantially faster than the current codegen for the orignal
loop at -O2  and moderately faster than -O3, while being about 10% the size.
With a TriviallyCopiable type with a non-trivial default ctor, even -O3 does
byte-by-byte, so it is a substantial win there as well.

Let me know if you'd like me to post the benchmark I was using.

[Bug middle-end/85720] bad codegen for looped assignment of primitives at -O2

2018-05-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720

--- Comment #4 from Mathias Stearn  ---
(In reply to Marc Glisse from comment #3)
> Again, you are ignoring aliasing issues (just like in your other PR the
> function copy isn't equivalent to memmove). Does adding __restrict change
> the result? Also, B[i]=B[i]+1 doesn't look like a memset...

Sorry, I typoed. It was supposed to be B[i] = A[i] + 1. That still does
basically the same thing though: https://godbolt.org/g/dtmU5t. Good point about
aliasing though. I guess the right code gen in that case would actually be
something that detected the overlap and did the right calls to memset to only
set each byte once. Or just do the simple thing:

if (b > a && b < a + n) {
  memset(b, 1, n);
  memset(a, 0, n);
} else {
  memset(a, 0, n);
  memset(b, 1, n);
}

Yes, __restrict helps, but that isn't part of standard c++, and it seems like
it never will be.

[Bug middle-end/85721] bad codegen for looped copy of primitives at -O2 and -O3 (differently bad)

2018-05-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85721

--- Comment #3 from Mathias Stearn  ---
@Jonathan Wakely, that is because std::copy cheats and calls memmove directly.
A slight modification of the type that shouldn't matter defeats that
optimization and causes both forms to degrade to byte-by-byte:
https://godbolt.org/g/Z4fWNT. 

I actually consider that the fact that the explicit optimization in std::copy
is necessary should be considered an optimizer bug. Why isn't the optimizer
noticing that the simple implementation is the same as a memmove and do the
transformation for you? It generally seems unfortunate if similar code that
clearly means the same thing results in very different performance. Ideally,
even providing user-defined inline copy operations should result in calling
memmove if they are equivalent since the compiler should be able to "see
through" the non-triviality and optimize it all away.

I'm filing these tickets based on what Martin Sebor said: "As an aside, using
assignment and copy constructors should be at least as efficient as calling
memset and memcpy, and ideally more, and they should always be preferred over
the raw memory functions.  If/where they aren't as efficient please open bugs
for missing optimizations." Do you disagree with that statement?

[Bug middle-end/85720] bad codegen for looped assignment of primitives at -O2

2018-05-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720

--- Comment #2 from Mathias Stearn  ---
Hmm. Taking the example from the -ftree-loop-distribute-patterns documentation,
it still seems to generate poor code, this time at both -O2 and -O3:
https://godbolt.org/g/EsQDj8

Why isn't that transformed to memset(A, 0, N); memset(B, 1, N); ? This feels
similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85721. Should I make a
new ticket with this example?

[Bug c++/85721] New: bad codegen for looped copy of primitives at -O2 and -O3 (differently bad)

2018-05-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85721

Bug ID: 85721
   Summary: bad codegen for looped copy of primitives at -O2 and
-O3 (differently bad)
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/g/Gg9fFt

Related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720, but filed
separately because this also affects -O3. Similarly, while this affects types
other than char, char is most egregious.

using SIZE_T = decltype(sizeof(0));
void copy(char* out, const char* in, SIZE_T n) {
for (SIZE_T i = 0; i < n; i++){
out[i] = in[i];
}
}

This should probably just be compiled to check size then jmp memmove. At O2 it
copies byte-by-byte:

copy(char*, char const*, unsigned long):
testrdx, rdx
je  .L1
xor eax, eax
.L3:
movzx   ecx, BYTE PTR [rsi+rax]
mov BYTE PTR [rdi+rax], cl
add rax, 1
cmp rdx, rax
jne .L3
.L1:
ret


At O3 it generates a TON of code:

copy(char*, char const*, unsigned long):
  test rdx, rdx
  je .L1
  lea rax, [rsi+16]
  cmp rdi, rax
  lea rax, [rdi+16]
  setnb cl
  cmp rsi, rax
  setnb al
  or cl, al
  je .L7
  lea rax, [rdx-1]
  cmp rax, 14
  jbe .L7
  mov rcx, rdx
  xor eax, eax
  and rcx, -16
.L4:
  movdqu xmm0, XMMWORD PTR [rsi+rax]
  movups XMMWORD PTR [rdi+rax], xmm0
  add rax, 16
  cmp rax, rcx
  jne .L4
  mov rax, rdx
  and rax, -16
  cmp rdx, rax
  je .L1
  movzx ecx, BYTE PTR [rsi+rax]
  mov BYTE PTR [rdi+rax], cl
  lea rcx, [rax+1]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+1+rax]
  mov BYTE PTR [rdi+1+rax], cl
  lea rcx, [rax+2]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+2+rax]
  mov BYTE PTR [rdi+2+rax], cl
  lea rcx, [rax+3]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+3+rax]
  mov BYTE PTR [rdi+3+rax], cl
  lea rcx, [rax+4]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+4+rax]
  mov BYTE PTR [rdi+4+rax], cl
  lea rcx, [rax+5]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+5+rax]
  mov BYTE PTR [rdi+5+rax], cl
  lea rcx, [rax+6]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+6+rax]
  mov BYTE PTR [rdi+6+rax], cl
  lea rcx, [rax+7]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+7+rax]
  mov BYTE PTR [rdi+7+rax], cl
  lea rcx, [rax+8]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+8+rax]
  mov BYTE PTR [rdi+8+rax], cl
  lea rcx, [rax+9]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+9+rax]
  mov BYTE PTR [rdi+9+rax], cl
  lea rcx, [rax+10]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+10+rax]
  mov BYTE PTR [rdi+10+rax], cl
  lea rcx, [rax+11]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+11+rax]
  mov BYTE PTR [rdi+11+rax], cl
  lea rcx, [rax+12]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+12+rax]
  mov BYTE PTR [rdi+12+rax], cl
  lea rcx, [rax+13]
  cmp rdx, rcx
  jbe .L1
  movzx ecx, BYTE PTR [rsi+13+rax]
  mov BYTE PTR [rdi+13+rax], cl
  lea rcx, [rax+14]
  cmp rdx, rcx
  jbe .L1
  movzx edx, BYTE PTR [rsi+14+rax]
  mov BYTE PTR [rdi+14+rax], dl
  ret
.L7:
  xor eax, eax
.L3:
  movzx ecx, BYTE PTR [rsi+rax]
  mov BYTE PTR [rdi+rax], cl
  add rax, 1
  cmp rdx, rax
  jne .L3
.L1:
  ret

A) This should probably just call memmove which has a tuned implementation for
many architectures and uses ifunc dispatch to choose the right one based on the
runtime CPU rather than the compile-time settings. Also, all functions like
this for all types would all just jump to a single function, there should be I$
advantages.

B) If you really want to emit code for this rather than calling into libc, it
is probably best to use their technique of overlapped reads and writes for the
last vector rather than going into an unrolled byte-by-byte loop:
https://github.molgen.mpg.de/git-mirror/glibc/blob/20003c49884422da7ffbc459cdeee768a6fee07b/sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S#L331-L335

[Bug c++/85720] New: bad codegen for looped assignment of primitives at -O2

2018-05-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85720

Bug ID: 85720
   Summary: bad codegen for looped assignment of primitives at -O2
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/g/qp19Cv

using SIZE_T = decltype(sizeof(0));
void fill(char* p, SIZE_T n) {
for (SIZE_T i = 0; i < n; i++){
p[i] = -1;
}
}

fill(char*, unsigned long):
testrsi, rsi
je  .L1
add rsi, rdi
.L3:
mov BYTE PTR [rdi], -1
add rdi, 1
cmp rdi, rsi
jne .L3
.L1:
ret

At -O3 it basically just tail-calls memset.

Also applies to other types than char, but char is most egregious.

This ticket is spun out of from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85707

[Bug c++/85707] -Wclass-memaccess should excempt safe usage inside of a class and its friends

2018-05-09 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85707

--- Comment #8 from Mathias Stearn  ---
>  If the constructor had other side-effects (e.g., count
>  the number of objects of the class) bypassing it could
>  violate the invariant.

I thought one of the points of friendship was to allow friends to violate a
class's invariants temporarily as long as they promise that the class is left
in a valid state in the end. Since presumably a class and its friends are
maintained by the same people, they are aware of what the actual requirements
of the class are, even if they can't be stated precisely in the language today.

[Bug c++/85707] -Wclass-memaccess should excempt safe usage inside of a class and its friends

2018-05-08 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85707

--- Comment #2 from Mathias Stearn  ---
Here is a boiled down example of some of our code that trips this warning:
https://godbolt.org/g/ChLrch. It also shows why we do this, since the codegen
is *substantially* better for init_table_memset than init_table_assignment, at
least at -O2. Even if you improve the codegen for that case tomorrow, we'd
still need to keep using memset for a while until we stop supporting older
compilers.

This is reduced from
https://github.com/mongodb/mongo/blob/11a3d5ccb1216da0e84d941fd48e486f72455ba4/src/mongo/db/pipeline/document_internal.h.
The actual key type is stored as a variable-lenth string stored directly in the
buffer and the Key type in the interface is our pre-17 string_view equivalent.
The value is actually a type called Value, that holds an internal 16-byte type
called ValueStorage as its only member. ValueStorage also triggers the warning
in its lifetime methods:
https://github.com/mongodb/mongo/blob/11a3d5ccb1216da0e84d941fd48e486f72455ba4/src/mongo/db/pipeline/value_internal.h#L165-L221
(the DEV macro expands to "if (DEBUG_BUILD)" so you can ignore anything on
those lines). If necessary I can try to boil down that type too, but it is much
more complex, so I'm not sure how small I can make it.

This is all to implement what is essentially a dynamically-typed JSON object
which is something we need to be *REALLY* fast. A lot of effort went into
micro-optimizing these types so that the business logic code doesn't need to
worry about any of this and can write very natural looking, modern c++ code.
All of this memory-munging is hidden in internal types in _internal.h files.
The user facing types are not supposed to expose any of this, except to the
implementations which are all friendly which each other.

The third_party code that is tripping this is in S2. It tries to memcpy
https://github.com/mongodb/mongo/blob/11a3d5ccb1216da0e84d941fd48e486f72455ba4/src/mongo/db/pipeline/value_internal.h#L165-L221
an array of S2Points (typedef for Vector3)
https://github.com/mongodb/mongo/blob/11a3d5ccb1216da0e84d941fd48e486f72455ba4/src/third_party/s2/util/math/vector3.h#L30.
This doesn't meet the self-or-friend condition described in the ticket, so I'm
not sure what the solution there is, but it is an example of code that is
correct (I think, I haven't reviewed it *too* closely) but still triggers this
warning.

[Bug c++/85707] New: -Wclass-memaccess should excempt safe usage inside of a class and its friends

2018-05-08 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85707

Bug ID: 85707
   Summary: -Wclass-memaccess should excempt safe usage inside of
a class and its friends
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

I get the point of the warning and would like to be able to turn it on. However
we have some types that have been specifically designed to be memset and
memcpy/memmove-able as long as certain rules are followed. This is an
implementation detail of the type so non-friend users are expected to use the
default/copy/move constructors, rather than directly manipulating the bytes.
However classes (and their friends) are always allowed to violate their own
invariants, even if external users aren't. That is why I think the warning
should be suppressed* inside of contexts that have access to internals.

*Ideally it would still trip if the class's members were not mem-accessible by
it, but it seems more important (to me) to avoid the false-positives than to
avoid these kinds of false negatives if only one is possible.

I already know about the void*-cast to suppress the warning. I tried doing that
in our code base and it was required in too many places, all of which were
correct (as in no actual misuse was found). Additionally, this trips in
third-party code that we'd rather not alter unless there is an actual bug.

As a potential alternative, adding an attribute like [[gnu::memaccessable]]
that can be put on a type to suppress the warning for uses of that type might
also work.

[Bug c++/85680] Missed optimization for value-init of variable-sized allocation

2018-05-07 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85680

--- Comment #3 from Mathias Stearn  ---
MSVC and ICC both also handle this poorly: https://godbolt.org/g/i4wMYa

https://developercommunity.visualstudio.com/content/problem/246786/poor-codegen-for-value-init-followed-by-explicit-i.html

[Bug c++/85680] Missed optimization for value-init of variable-sized allocation

2018-05-07 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85680

--- Comment #2 from Mathias Stearn  ---
FYI, I don't think this is a signed/unsigned thing since it also repros with
unsigned long https://godbolt.org/g/LTmrpi

My initial repo actually used size_t, but I (incorrectly) changed it to long
rather than unsigned long based on past experience that compiler authors prefer
repros that don't have any #includes and the (IMO) sad fact that size_t still
isn't actually part of the language.

[Bug c++/85680] New: Missed optimization for value-init of variable-sized allocation

2018-05-07 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85680

Bug ID: 85680
   Summary: Missed optimization for value-init of variable-sized
allocation
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/g/6yZEKf (compiled with -O3)


char* variableSize(long n) {
auto p = new char[n]();
for (int i = 0; i < n; i++) {
p[i] = 0xff;
}
return p;
}

char* small() {
return variableSize(8);
}

char* large() {
return variableSize(10'000);
}


The variableSize() case generates two calls two memset with the both being
conditional on n > 0 if I'm reading this right), but the two checks are done in
different ways. That may be a second optimizer bug that it missed a
jump-threading opportunity.

variableSize(long):
pushrbx
mov rbx, rdi
calloperator new[](unsigned long)
mov rcx, rax
mov rax, rbx
sub rax, 1
js  .L5
lea rax, [rbx-2]
mov edx, 1
mov rdi, rcx
cmp rax, -1
cmovge  rdx, rbx
xor esi, esi
callmemset
mov rcx, rax
.L5:
testrbx, rbx
jle .L1
mov rdi, rcx
mov rdx, rbx
mov esi, 255
callmemset
mov rcx, rax
.L1:
mov rax, rcx
pop rbx
ret

Note that when g++ can see the allocation size it *does* elide the initial
memset:


small(): # @small()
  push rax
  mov edi, 8
  call operator new[](unsigned long)
  mov qword ptr [rax], -1
  pop rcx
  ret
large(): # @large()
  push rbx
  mov edi, 1
  call operator new[](unsigned long)
  mov rbx, rax
  mov esi, 255
  mov edx, 1
  mov rdi, rax
  call memset
  mov rax, rbx
  pop rbx
  ret



Related clang bug: https://bugs.llvm.org/show_bug.cgi?id=37351

[Bug c++/83400] New: g++ -O1 doesn't execute any code in destructors with a throw statement if it sees another throw

2017-12-12 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83400

Bug ID: 83400
   Summary: g++ -O1 doesn't execute any code in destructors with a
throw statement if it sees another throw
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

Repro code below doesn't print the "in dtor" line when compiled with -O1 or
higher. It does with g++ -O0 and with clang++ at any optimization level. Note
that even with the raise() call enabled, which would prevent the second
exception from firing, it still won't print "in dtor".

I've reproed with g++ 4.8.2, 5.4.0, 7.2.0, and 7.2.1, all on linux. No warnings
are output even with -Wall -Wextra.


#include 
#include 
#include 
#include 

struct ThrowInDtor{
~ThrowInDtor() noexcept(false) {
std::cout << "in dtor"  << std::endl; // Doesn't print.
// raise(SIGINT); // Still repros with this commented in.
throw std::exception();
}
};

int main() {
try {
ThrowInDtor throws;
throw std::exception();
} catch (std::exception&) {
// unreachable
}
}

[Bug tree-optimization/77943] [5/6 Regression] Optimization incorrectly commons noexcept calls with non-noexcept calls

2016-10-17 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77943

--- Comment #12 from Mathias Stearn  ---
We were hoping to replace many places in our code that does the following with
noexcept attributes on functions:

> try {doStuffThatShouldntThrow();} catch(...) {std::terminate();}

We wanted to take advantage of noexcept preventing exceptions from propagating
in a way that causes the terminate handler to be invoked at the throw site
rather than the catch site. We've avoided doing this so far because of the bugs
that we've found, leaving us with low confidence that we can rely on it. Can
you think of any similar cases that are likely to cause issues? Where do you
think it makes sense to focus our testing?

The scariest issues we've seen were a combination of failures to enforce
noexcept (like this bug) combined with noexcept causing try/catch blocks to be
optimized out. That lead to an exception thrown from a noexcept function
bypassing the surrounding catch block and escaping two layers of protection and
reaching code that really should never see them. Are there any compiler flags
we can use to tell it not to eliminate the catches based on noexcept
annotations?

We've also noticed that 'throw()' annotations have fewer issues, but have
avoided them because they are deprecated. As compiler writers, would you trust
'throw()' more or suggest we stick with noexcept?

[Bug c++/77943] Optimization incorrectly commons noexcept calls with non-noexcept calls

2016-10-11 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77943

--- Comment #3 from Mathias Stearn  ---
> Not being a C++ expect, but following spec:
> http://en.cppreference.com/w/cpp/language/noexcept_spec
> 
> "If a search for a matching exception handler leaves a function marked
> noexcept or noexcept(true), std::terminate is called immediately."
> 
> Which is probably what happens in fatal() function

The problem is that it merges the calls to fatal() and notFatal() into a single
call with no branch, so there is no way to only call std::terminate() if the
call to throws() was through fatal(). (In reply to Martin Liška from comment
#2)

PS - I realized I uploaded an old version of the repro that shows the alternate
incorrect behavior of always calling std::terminate even when it should go
through nonFatal(). Sorry about that. If you reverse the branches of the
if/else you'll get the behavior I describe in the initial report.

[Bug c++/77943] Optimization incorrectly commons noexcept calls with non-noexcept calls

2016-10-11 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77943

--- Comment #1 from Mathias Stearn  ---
Created attachment 39787
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39787=edit
Reproducer

[Bug c++/77943] New: Optimization incorrectly commons noexcept calls with non-noexcept calls

2016-10-11 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77943

Bug ID: 77943
   Summary: Optimization incorrectly commons noexcept calls with
non-noexcept calls
   Product: gcc
   Version: 6.2.1
Status: UNCONFIRMED
  Severity: critical
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

When compiled with -O0 and -01 the program behaves correctly and only calls
std::terminate() when passed an argument. With -O2 and -O3 it doesn't call
std::terminate() which means that it allowed an exception to escape from a
noexcept function in violation of the standard. 

> g++ -O0 noexcept_test.cpp && ./a.out ; ./a.out 1
should die: 0
should die: 1
terminate called after throwing an instance of 'int'
Aborted (core dumped)

> g++ -O1 noexcept_test.cpp && ./a.out ; ./a.out 1
should die: 0
should die: 1
terminate called after throwing an instance of 'int'
Aborted (core dumped)

> g++ -O2 noexcept_test.cpp && ./a.out ; ./a.out 1
should die: 0
should die: 1

> g++ -O3 noexcept_test.cpp && ./a.out ; ./a.out 1
should die: 0
should die: 1

The actual behavior with -O2 and -O3 depends on which way the if block is
written. The attached version is the scariest, but if you swap the if and else
blocks and remove the ! from the condition, the executable will always call
std::terminate() even when it shouldn't.

[Bug c++/69300] New: g++ segfault on silly noexcept case

2016-01-15 Thread redbeard0531 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69300

Bug ID: 69300
   Summary: g++ segfault on silly noexcept case
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

Created attachment 37359
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37359=edit
Preprocessed test case

This is a reduced example from a ridiculous code contest. It compiles with
clang++ but segfaults during compilation on g++. It isn't something that I
actually need to work, but I thought I'd report it since it probably shouldn't
be crashing g++.

Code:

template
struct F {
template
void f() && noexcept(::template f) {}
};

int main (int argc, char ** argv) {
F().f();
}

g++ output:

> g++ -v -save-temps -std=c++14 bug.cpp
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: /build/gcc/src/gcc-5.2.0/configure --prefix=/usr
--libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/
--enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared
--enable-threads=posix --enable-libmpx --with-system-zlib --with-isl
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu
--disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object
--enable-linker-build-id --enable-lto --enable-plugin
--enable-install-libiberty --with-linker-hash-style=gnu
--enable-gnu-indirect-function --disable-multilib --disable-werror
--enable-checking=release --with-default-libstdcxx-abi=gcc4-compatible
Thread model: posix
gcc version 5.2.0 (GCC) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++14' '-shared-libgcc'
'-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/cc1plus -E -quiet -v -D_GNU_SOURCE
bug.cpp -mtune=generic -march=x86-64 -std=c++14 -fpch-preprocess -o bug.ii
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/../../../../x86_64-unknown-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/../../../../include/c++/5.2.0

/usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/../../../../include/c++/5.2.0/x86_64-unknown-linux-gnu

/usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/../../../../include/c++/5.2.0/backward
 /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include
 /usr/local/include
 /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include-fixed
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++14' '-shared-libgcc'
'-mtune=generic' '-march=x86-64'
 /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/cc1plus -fpreprocessed bug.ii
-quiet -dumpbase bug.cpp -mtune=generic -march=x86-64 -auxbase bug -std=c++14
-version -o bug.s
GNU C++14 (GCC) version 5.2.0 (x86_64-unknown-linux-gnu)
compiled by GNU C version 5.2.0, GMP version 6.0.0, MPFR version
3.1.3-p4, MPC version 1.0.3
warning: GMP header version 6.0.0 differs from library version 6.1.0.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++14 (GCC) version 5.2.0 (x86_64-unknown-linux-gnu)
compiled by GNU C version 5.2.0, GMP version 6.0.0, MPFR version
3.1.3-p4, MPC version 1.0.3
warning: GMP header version 6.0.0 differs from library version 6.1.0.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 9ed1d81099b98de89457560501a9ea0c
g++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://bugs.archlinux.org/> for instructions.