[Bug c++/88504] New: Inconsistent error message notes when using forward-declared type as value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88504

Bug ID: 88504
Summary: Inconsistent error message notes when using forward-declared type as value
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

struct Foo;

struct Bar
{
	Bar(Foo f_) :
		m_foo(f_)
	{
	}

	Foo m_foo;
};

Foo baz1()
{
}

void baz2(Foo f_)
{
}

void baz3()
{
	Foo foo;
}

Foo g_foo;

$ g++-9.0.0 -Wall -Wextra -c 20181214-fwddecl_value.cpp
20181214-fwddecl_value.cpp:10:6: error: field ‘m_foo’ has incomplete type ‘Foo’
   10 |  Foo m_foo;
      |      ^
20181214-fwddecl_value.cpp:1:8: note: forward declaration of ‘struct Foo’
    1 | struct Foo;
      |        ^~~
20181214-fwddecl_value.cpp:5:10: error: ‘f_’ has incomplete type
    5 |  Bar(Foo f_) :
      |          ^~
20181214-fwddecl_value.cpp:1:8: note: forward declaration of ‘struct Foo’
    1 | struct Foo;
      |        ^~~
20181214-fwddecl_value.cpp:13:10: error: return type ‘struct Foo’ is incomplete
   13 | Foo baz1()
      |          ^
20181214-fwddecl_value.cpp:17:15: error: ‘f_’ has incomplete type
   17 | void baz2(Foo f_)
      |               ^~
20181214-fwddecl_value.cpp:1:8: note: forward declaration of ‘struct Foo’
    1 | struct Foo;
      |        ^~~
20181214-fwddecl_value.cpp: In function ‘void baz2(Foo)’:
20181214-fwddecl_value.cpp:17:15: warning: unused parameter ‘f_’ [-Wunused-parameter]
   17 | void baz2(Foo f_)
      |               ^~
20181214-fwddecl_value.cpp: In function ‘void baz3()’:
20181214-fwddecl_value.cpp:23:6: error: aggregate ‘Foo foo’ has incomplete type and cannot be defined
   23 |  Foo foo;
      |      ^~~
20181214-fwddecl_value.cpp: At global scope:
20181214-fwddecl_value.cpp:26:5: error: aggregate ‘Foo g_foo’ has incomplete type and cannot be defined
   26 | Foo g_foo;
      |     ^

Most messages contain the note about where the forward declaration occurred, but some don't:
- returning from baz1()
- local variable 'foo' in baz3()
- global variable 'g_foo'

Mentioning it everywhere would be helpful.
Quite a minor issue, but the wording of the error messages also varies somewhat:
- inside the class: field ‘m_foo’ has incomplete type ‘Foo’
- as fn param, the name of the type is omitted: ‘f_’ has incomplete type
- when returning, 'struct Foo' is mentioned: return type ‘struct Foo’ is incomplete
- but when defining a variable, Foo is not a struct, but an aggregate: aggregate ‘Foo foo’ has incomplete type and cannot be defined
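
For contrast with the rejected cases, here is a minimal sketch of which uses of a forward-declared type are accepted and where the complete type becomes necessary. The names make_foo, read_foo and round_trip are hypothetical helpers and not part of the reported test case.

```cpp
#include <cassert>

struct Foo;                     // forward declaration: Foo is incomplete from here on

// Pointers (and references) to an incomplete type are fine as members.
struct Bar
{
	explicit Bar(Foo* f_) : m_foo(f_) { }
	Foo* m_foo;
};

// Function *declarations* may mention Foo by value ...
Foo make_foo(int v_);
int read_foo(Foo f_);

// ... but by-value members, function definitions and variables need the
// complete type, so the definition has to be visible first.
struct Foo { int value; };

Foo make_foo(int v_) { return Foo{v_}; }
int read_foo(Foo f_) { return f_.value; }

// Hypothetical helper exercising everything above.
int round_trip(int v_)
{
	Foo foo = make_foo(v_);     // OK: Foo is complete here
	Bar bar(&foo);
	assert(bar.m_foo == &foo);
	return read_foo(*bar.m_foo);
}
```

Calling round_trip(7) should simply give back 7; the point is only that every by-value use sits after the definition of Foo.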
[Bug c++/88503] New: 'invalid static_cast' error message could be more helpful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88503

Bug ID: 88503
Summary: 'invalid static_cast' error message could be more helpful
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

class Parent;
class Derived;

Derived* foo(Parent* p)
{
	return static_cast<Derived*>(p);
}

$ g++-9.0.0 -Wall -Wextra -c 20181214-fwddecl_vs_static_cast.cpp
20181214-fwddecl_vs_static_cast.cpp: In function ‘Derived* foo(Parent*)’:
20181214-fwddecl_vs_static_cast.cpp:6:32: error: invalid static_cast from type ‘Parent*’ to type ‘Derived*’
    6 |  return static_cast<Derived*>(p);
      |                                ^

Now imagine that your project is somewhat more complex than the above example, and Derived derives from Parent. You know it, you are used to it. However, some code needs only the forward declarations, while at other places the full definition is needed.

When I bumped into this error, it cost me quite some time to figure out what was going on. I double/triple/quad checked the names of the classes: no typo, yet the error. It turned out, of course, that only the forward declarations were available in that TU.

It would be really useful if the error message were a bit more verbose, mentioning when either type is only forward declared, so that no parent-child relationship info is available.
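
To make the cause concrete, here is a small sketch of the same cast once both class definitions are visible in the TU. The function names downcast and demo, and the tag member, are made up for illustration.

```cpp
#include <cassert>

class Parent;
class Derived;

// With only the forward declarations visible, this would be rejected,
// because the compiler cannot know that Derived derives from Parent:
//   Derived* bad(Parent* p) { return static_cast<Derived*>(p); }

Derived* downcast(Parent* p);   // declaring the function is still fine

class Parent
{
public:
	virtual ~Parent() = default;
};

class Derived : public Parent
{
public:
	int tag = 42;
};

// Both definitions are visible now, so the inheritance relationship is
// known and the static_cast is accepted.
Derived* downcast(Parent* p)
{
	return static_cast<Derived*>(p);
}

// Hypothetical helper: round-trips a Derived through a Parent*.
int demo()
{
	Derived d;
	Parent* p = &d;
	Derived* back = downcast(p);
	assert(back == &d);
	return back->tag;
}
```

The only difference between the rejected and the accepted version is whether the class definitions precede the cast, which is exactly what the diagnostic does not mention.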
[Bug c++/88493] New: Inconsistent error messages for unknown types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88493

Bug ID: 88493
Summary: Inconsistent error messages for unknown types
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

When compiling:

8<8<8<
struct Foo;

struct Bar
{
	explicit Bar(Fooo* foo_) :
		m_foo{foo_}
	{
	}

	void Baz(Fooo* foo_);

	Fooo* m_foo;
};
8<8<8<

Foo is forward declared, Fooo is not declared/defined, simulating a typo.

$ g++-8.2.1 -Wall -c 20181213-bad_type_name_diag.cpp
20181213-bad_type_name_diag.cpp:5:19: error: expected ‘)’ before ‘*’ token
  explicit Bar(Fooo* foo_) :
              ~    ^
                   )
20181213-bad_type_name_diag.cpp:7:2: error: expected unqualified-id before ‘{’ token
  {
  ^
20181213-bad_type_name_diag.cpp:10:11: error: ‘Fooo’ has not been declared
  void Baz(Fooo* foo_);
           ^~~~
20181213-bad_type_name_diag.cpp:12:2: error: ‘Fooo’ does not name a type; did you mean ‘Foo’?
  Fooo* m_foo;
  ^~~~
  Foo

The first error is not really helpful, and the second one is a cascade from that. The third one is OK, but there is no suggestion. The fourth one is the most useful. For the three cases of the Foo vs Fooo mismatch, three different error messages were printed.

If I follow the suggestion of the first error msg, put a ): after Fooo and comment out the rest of the line:

$ g++-8.2.1 -Wall -c 20181213-bad_type_name_diag.cpp
20181213-bad_type_name_diag.cpp:5:19: error: function definition does not declare parameters
  explicit Bar(Fooo) : //* foo_) :
                   ^

However, mentioning only the type of the parameter and not naming it should be accepted. GCC should then report an error because Fooo is unknown, possibly suggesting Foo instead.

GCC 6/7/8/9 report the same errors, with minor formatting/coloring differences.

$ g++-8.2.1 -v
Using built-in specs.
COLLECT_GCC=g++-8.2.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/8.2.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure CC=gcc-8.2.1 CXX=g++-8.2.1 --enable-languages=c,c++ --disable-multilib --program-suffix=-8.2.1 --disable-bootstrap CFLAGS='-O2 -march=native -mtune=native' CXXFLAGS='-O2 -march=native -mtune=native' Thread model: posix gcc version 8.2.1 20181213 (GCC) $ g++-9.0.0 -v Using built-in specs. COLLECT_GCC=g++-9.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/9.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure CC=gcc-8.2.1 CXX=g++-8.2.1 --enable-languages=c,c++ --disable-multilib --program-suffix=-9.0.0 --disable-bootstrap CFLAGS='-O2 -march=native -mtune=native' CXXFLAGS='-O2 -march=native -mtune=native' Thread model: posix gcc version 9.0.0 20181213 (experimental) (GCC)
[Bug c++/87531] [8/9 Regression] assignment operator does nothing if performed as a call via operator=
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87531

petschy at gmail dot com changed:

What |Removed |Added
CC   |        |petschy at gmail dot com

--- Comment #7 from petschy at gmail dot com ---
After this fix, the following code doesn't compile:

template<typename T>
struct Ptr
{
	Ptr& operator=(T* p_)
	{
		return operator=<T>(p_);
	}

	template<typename U>
	Ptr& operator=(U* p_)
	{
		ptr = p_;
		return *this;
	}

	T* ptr = nullptr;
};

$ g++-8.2.1 -Wall -std=c++11 -c 20181204-templated_opeq.cpp
20181204-templated_opeq.cpp: In member function ‘Foo& Foo::operator=(T*)’:
20181204-templated_opeq.cpp:6:21: error: expected primary-expression before ‘>’ token
   return operator=<T>(p_);
                     ^

On the gcc-8-branch, the commit before the fix (a9a931e4) is OK. 7.3.1 (4c925b84) is OK.

Tested on Debian Stretch, AMD64.

Is the above code invalid?
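
Independent of whether the explicit operator=<T>(p_) call is valid, one way to sidestep the question is to route both assignment operators through an ordinary named member template. This is only a sketch of a possible workaround: the helper assign, the Base/Child types and demo_assign are hypothetical and not part of the original code.

```cpp
#include <cassert>

template<typename T>
struct Ptr
{
	// Non-template overload: forwards to the named helper instead of
	// spelling out operator=<T>(p_).
	Ptr& operator=(T* p_)
	{
		return assign(p_);
	}

	// Template overload for convertible pointer types.
	template<typename U>
	Ptr& operator=(U* p_)
	{
		return assign(p_);
	}

	T* ptr = nullptr;

private:
	// Hypothetical helper doing the actual work.
	template<typename U>
	Ptr& assign(U* p_)
	{
		ptr = p_;
		return *this;
	}
};

struct Base { };
struct Child : Base { };

// Exercises both overloads; returns true if the stored pointers match.
bool demo_assign()
{
	Base b;
	Child c;
	Ptr<Base> p;
	p = &b;                      // picks the non-template overload
	bool ok = (p.ptr == &b);
	p = &c;                      // picks the template overload with U = Child
	ok = ok && (p.ptr == static_cast<Base*>(&c));
	return ok;
}
```

This avoids the template-id on an operator name altogether, so it should behave the same before and after the fix mentioned above.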
[Bug ipa/86436] IPA-ICF: miissed optimization at class template member functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86436

--- Comment #2 from petschy at gmail dot com ---
Folding instantiations of member functions of class templates is low-hanging fruit IMHO. So if they are not handled ATM, then consider this ticket a feature request rather than a bug.
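
The attached test case is not reproduced in this thread, so the following is only a guess at its shape, based on the description and the disassembly in the original report. NonFoldable bakes the template parameter into the code of Bar(), while Foldable reads it from a const member, so its instantiations differ only in data. The sanity_check function is a hypothetical addition.

```cpp
#include <cassert>

// The template parameter ends up as an immediate in the generated code,
// so different instantiations of Bar() really are different functions.
template<int N>
struct NonFoldable
{
	int Foo(int x_) { return Bar(x_); }
	int Bar(int x_) { return x_ + N; }
};

// The template parameter is stored in a member; Bar() only reads the
// member, so the code of every instantiation can be identical.
template<int N>
struct Foldable
{
	Foldable() : m_n(N) { }
	int Foo(int x_) { return Bar(x_); }
	int Bar(int x_) { return m_n + x_; }
	const int m_n;
};

// Free-standing callers, named after the symbols in the disassembly.
int nf_foo_0(NonFoldable<0>& o_, int x_)   { return o_.Foo(x_); }
int nf_foo_42(NonFoldable<42>& o_, int x_) { return o_.Foo(x_); }
int f_foo_0(Foldable<0>& o_, int x_)       { return o_.Foo(x_); }
int f_foo_42(Foldable<42>& o_, int x_)     { return o_.Foo(x_); }

// Hypothetical check that both variants compute the same values.
void sanity_check()
{
	NonFoldable<42> nf;
	Foldable<42> f;
	assert(nf_foo_42(nf, 1) == f_foo_42(f, 1));
}
```

Whether a given GCC version actually folds Foldable<0>::Foo with Foldable<42>::Foo for this sketch would have to be checked in the generated assembly, for example with -O2 and objdump.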
[Bug c++/86436] New: IPA-ICF: miissed optimization at class template member functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86436

Bug ID: 86436
Summary: IPA-ICF: miissed optimization at class template member functions
Product: gcc
Version: 8.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

Created attachment 44363
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44363&action=edit
test case

The attached source has two class templates: NonFoldable and Foldable. The nonfoldable version uses the int template param in Bar() (a leaf function in the call graph) to do a computation, so different template params result in different Bar()'s. They can't be folded, and because of that neither can their callers. So far so good.

Foldable is very similar, but puts the template param into a const member variable in the ctor, and Bar() uses that member. So now the value is different, but the code is exactly the same.

Some asm dump of x86-64 code:

7.3.1 (9bcef54ae6c7df97b276a7fa8da4c90d2452333c):

Dump of assembler code for function nf_foo_0(NonFoldable<0>&, int):
0x004004d0 <+0>: mov %esi,%edi
0x004004d2 <+2>: jmp 0x4004a0 <NonFoldable<0>::Foo(int)>

Dump of assembler code for function nf_bar_0(NonFoldable<0>&, int):
0x004004e0 <+0>: mov %esi,%edi
0x004004e2 <+2>: jmp 0x400490 <NonFoldable<0>::Bar(int)>

Dump of assembler code for function NonFoldable<0>::Foo(int):
0x004004a0 <+0>: jmp 0x400490 <NonFoldable<0>::Bar(int)>

Dump of assembler code for function NonFoldable<0>::Bar(int):
0x00400490 <+0>: mov %edi,%eax
0x00400492 <+2>: retq

Dump of assembler code for function nf_foo_42(NonFoldable<42>&, int):
0x004004f0 <+0>: mov %esi,%edi
0x004004f2 <+2>: jmp 0x4004c0 <NonFoldable<42>::Foo(int)>

Dump of assembler code for function nf_bar_42(NonFoldable<42>&, int):
0x00400500 <+0>: mov %esi,%edi
0x00400502 <+2>: jmp 0x4004b0 <NonFoldable<42>::Bar(int)>

Dump of assembler code for function NonFoldable<42>::Foo(int):
0x004004c0 <+0>: jmp 0x4004b0 <NonFoldable<42>::Bar(int)>

Dump of assembler code for function NonFoldable<42>::Bar(int):
0x004004b0 <+0>: lea 0x2a(%rdi),%eax
0x004004b3 <+3>: retq

Dump of assembler code for function f_foo_0(Foldable<0>&, int):
0x00400510 <+0>: jmpq 0x400560 <Foldable<0>::Foo(int)>

Dump of assembler code for function f_bar_0(Foldable<0>&, int):
0x00400520 <+0>: jmpq 0x400550 <Foldable<0>::Bar(int)>

Dump of assembler code for function Foldable<0>::Foo(int):
0x00400560 <+0>: jmpq 0x400550 <Foldable<0>::Bar(int)>

Dump of assembler code for function Foldable<0>::Bar(int):
0x00400550 <+0>: mov (%rdi),%eax
0x00400552 <+2>: add %esi,%eax
0x00400554 <+4>: retq

Dump of assembler code for function f_foo_42(Foldable<42>&, int):
0x00400530 <+0>: jmpq 0x400580 <Foldable<42>::Foo(int)>

Dump of assembler code for function f_bar_42(Foldable<42>&, int):
0x00400540 <+0>: jmpq 0x400570 <Foldable<42>::Bar(int)>

Dump of assembler code for function Foldable<42>::Foo(int):
0x00400580 <+0>: jmpq 0x400570 <Foldable<42>::Bar(int)>

Dump of assembler code for function Foldable<42>::Bar(int):
0x00400570 <+0>: mov (%rdi),%eax
0x00400572 <+2>: add %esi,%eax
0x00400574 <+4>: retq

Under 7.3.1 no identical code folding happens at all.

8.1.1 & 9.0.0 only fold Foldable<0>::Bar() with Foldable<42>::Bar(). Foo() and the free-standing functions calling these members are not recognized as foldable.

I haven't thought really hard about the folding rules. Checking each fn against each fn globally by default is probably waaay too much work. However, instantiations of class template members are easy candidates. Or, on a wider scale, functions where the number and type of the args and the return type are the same, OR at least the same size. The scope should be configurable, eg compilation unit or shared lib / executable (LTO).

My quick (and probably incomplete) rules would be:
- if the functions are leaf OR call the same or foldable functions only
- the accessed global variables (incl static members) are the same
- the accessed member variables' offsets are the same, and the types are the same, OR at least they have the same size and the computations have the same results bitwise.
Eg signed vs unsigned ints of the same size: reading a me
[Bug middle-end/85637] Unneeded store of member variables in inner loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85637

--- Comment #5 from petschy at gmail dot com ---
Thanks, in this specific case __restrict works indeed.

On a side note, is it possible to achieve the same when a char is stored through a char* member, and also incremented? eg:

if (m_cur < m_end) *m_cur = val;
++m_cur;

Since char* aliases everything, m_cur and m_end won't be kept in registers properly, as the compiler assumes that the store through *m_cur might have changed them. No amount of __restrict pepper helped with this.

Is it far-fetched to request an extension which can be turned on via a cmdline flag and causes 'char* __restrict p' to behave like any other restricted ptr, ie not aliasing any other char*'s, let alone other types?

Any serialization code that uses classes to store the ptrs would benefit from this, as no more in-loop re-load/store would be needed for the members.
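
Until something like that exists, the usual manual workaround is to copy the pointer members into locals for the duration of the loop and write them back once. This is not something proposed in the ticket, just a sketch; the ByteWriter class and its Fill() method are hypothetical. Since the address of the locals is never taken, a store through a char* cannot alias them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical serializer-style class keeping its cursor in members.
class ByteWriter
{
public:
	ByteWriter(char* buf_, std::size_t size_) :
		m_cur(buf_),
		m_end(buf_ + size_)
	{
	}

	// Writes up to n_ copies of val_ and returns how many were written.
	std::size_t Fill(char val_, std::size_t n_)
	{
		// Work on local copies: stores through 'cur' cannot modify
		// these locals, so they can stay in registers in the loop.
		char* cur = m_cur;
		char* const end = m_end;
		std::size_t written = 0;
		while (written < n_ && cur < end) {
			*cur++ = val_;
			++written;
		}
		m_cur = cur;             // single write-back after the loop
		return written;
	}

	std::size_t Remaining() const
	{
		return static_cast<std::size_t>(m_end - m_cur);
	}

private:
	char* m_cur;
	char* m_end;
};

// Hypothetical helper: fills a 4-byte buffer and checks the result.
bool demo_fill()
{
	char buf[4] = { 0, 0, 0, 0 };
	ByteWriter w(buf, sizeof buf);
	const std::size_t n = w.Fill('x', 6);
	assert(w.Remaining() == 0);
	return n == 4 && buf[0] == 'x' && buf[3] == 'x';
}
```

The downside is exactly what the comment complains about: this has to be done by hand in every hot loop instead of being expressed once on the member.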
[Bug middle-end/85637] Unneeded store of member variables in inner loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85637

--- Comment #2 from petschy at gmail dot com ---
Thanks. For non-char types, one can use __restrict on ptrs, but for chars it doesn't work, unfortunately (strict aliasing rules). Is there a way to tell the compiler that a char ptr doesn't alias anything in the function?

The current behaviour pessimizes any code that does byte I/O with classes, if I understand the rules correctly:
- for const char* it assumes that members might be read through the ptr, so it stores them back after an update
- for char* it assumes that after a write, any members in registers must be re-loaded as the write might have changed them.
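
For the read-only case, the same effect can be had by hand by accumulating in locals. The attached Adler32 source is not shown here, so the class below is a reconstruction under assumptions: the member names m_s1 and m_s2 and the constants 0x15b0 (5552) and 0xfff1 (65521) are taken from the annotated disassembly in the original report, while the initial values, the Value() accessor and everything else are guesses.

```cpp
#include <cstdint>

// Hypothetical reconstruction of the checksum class from the report.
class Adler32
{
public:
	void Update(const void* buf_, unsigned len_)
	{
		const unsigned kBlock = 5552;          // 0x15b0: bytes per inner loop
		const std::uint32_t kMod = 65521;      // 0xfff1: Adler-32 modulus

		const unsigned char* p = static_cast<const unsigned char*>(buf_);

		// Local accumulators: reads through 'p' cannot force these
		// to be stored back inside the loop.
		std::uint32_t s1 = m_s1;
		std::uint32_t s2 = m_s2;

		while (len_ > 0) {
			unsigned n = len_ < kBlock ? len_ : kBlock;
			len_ -= n;
			for (; n > 0; --n) {
				s1 += *p++;
				s2 += s1;
			}
			// The modulo is amortized over a whole block.
			s1 %= kMod;
			s2 %= kMod;
		}

		// Single store of the members right before returning.
		m_s1 = s1;
		m_s2 = s2;
	}

	std::uint32_t Value() const { return (m_s2 << 16) | m_s1; }

private:
	std::uint32_t m_s1 = 1;   // assumed standard Adler-32 start values
	std::uint32_t m_s2 = 0;
};
```

With the standard start values, feeding the nine bytes of "Wikipedia" gives the well-known checksum 0x11E60398, and splitting the input across several Update() calls does not change the result.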
[Bug c++/85640] New: Code size regression vs 7.3.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85640 Bug ID: 85640 Summary: Code size regression vs 7.3.1 Product: gcc Version: 8.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- Created attachment 44062 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44062=edit source Attached the source of a simple Adler32 checksum class. The Update() fn is 32 bytes longer compared to the code generated with 7.3.1. Dump of assembler code for function Adler32::Update(void const*, unsigned int): 7.3.1 0x00400500 <+0>: test %edx,%edx 7.3.1 0x00400502 <+2>: je 0x400578 <Adler32::Update(void const*, unsigned int)+120> 7.3.1 0x00400504 <+4>: mov(%rdi),%ecx 7.3.1 0x00400506 <+6>: mov0x4(%rdi),%r8d 7.3.1 0x0040050a <+10>:mov$0x80078071,%r10d 7.3.1 0x00400510 <+16>:xor%r9d,%r9d 7.3.1 0x00400513 <+19>:cmp$0x15af,%edx 7.3.1 0x00400519 <+25>:jbe0x400527 <Adler32::Update(void const*, unsigned int)+39> 7.3.1 0x0040051b <+27>:lea-0x15b0(%rdx),%r9d 7.3.1 0x00400522 <+34>:mov$0x15b0,%edx 7.3.1 0x00400527 <+39>:lea-0x1(%rdx),%eax 7.3.1 0x0040052a <+42>:lea0x1(%rsi,%rax,1),%rdx 7.3.1 0x0040052f <+47>:nop 7.3.1 0x00400530 <+48>:add$0x1,%rsi 7.3.1 0x00400534 <+52>:movzbl -0x1(%rsi),%eax 7.3.1 0x00400538 <+56>:add%eax,%ecx 7.3.1 0x0040053a <+58>:add%ecx,%r8d 7.3.1 0x0040053d <+61>:cmp%rdx,%rsi 7.3.1 0x00400540 <+64>:mov%ecx,(%rdi) 7.3.1 0x00400542 <+66>:mov%r8d,0x4(%rdi) 7.3.1 0x00400546 <+70>:jne0x400530 <Adler32::Update(void const*, unsigned int)+48> 7.3.1 0x00400548 <+72>:mov%ecx,%eax 7.3.1 0x0040054a <+74>:mul%r10d 7.3.1 0x0040054d <+77>:mov%r8d,%eax 7.3.1 0x00400550 <+80>:shr$0xf,%edx 7.3.1 0x00400553 <+83>:imul $0xfff1,%edx,%edx 7.3.1 0x00400559 <+89>:sub%edx,%ecx 7.3.1 0x0040055b <+91>:mul%r10d 7.3.1 0x0040055e <+94>:mov%ecx,(%rdi) 7.3.1 0x00400560 <+96>:shr$0xf,%edx 7.3.1 0x00400563 <+99>:imul $0xfff1,%edx,%edx 7.3.1 0x00400569 <+105>: sub%edx,%r8d 7.3.1 0x0040056c <+108>: test 
%r9d,%r9d 7.3.1 0x0040056f <+111>: mov%r9d,%edx 7.3.1 0x00400572 <+114>: mov%r8d,0x4(%rdi) 7.3.1 0x00400576 <+118>: jne0x400510 <Adler32::Update(void const*, unsigned int)+16> 7.3.1 0x00400578 <+120>: repz retq Dump of assembler code for function Adler32::Update(void const*, unsigned int): 8.1.1 0x00400500 <+0>: test %edx,%edx 8.1.1 0x00400502 <+2>: je 0x400598 <Adler32::Update(void const*, unsigned int)+152> 8.1.1 0x00400508 <+8>: mov(%rdi),%ecx 8.1.1 0x0040050a <+10>:mov0x4(%rdi),%r8d 8.1.1 0x0040050e <+14>:push %rbx 8.1.1 0x0040050f <+15>:mov$0x80078071,%ebx 8.1.1 0x00400514 <+20>:nopl 0x0(%rax) 8.1.1 0x00400518 <+24>:xor%r11d,%r11d 8.1.1 0x0040051b <+27>:cmp$0x15af,%edx 8.1.1 0x00400521 <+33>:jbe0x40052f <Adler32::Update(void const*, unsigned int)+47> 8.1.1 0x00400523 <+35>:lea-0x15b0(%rdx),%r11d 8.1.1 0x0040052a <+42>:mov$0x15b0,%edx 8.1.1 0x0040052f <+47>:mov%edx,%r10d 8.1.1 0x00400532 <+50>:mov%rsi,%rax 8.1.1 0x00400535 <+53>:add%rsi,%r10 8.1.1 0x00400538 <+56>:nopl 0x0(%rax,%rax,1) 8.1.1 0x00400540 <+64>:add$0x1,%rax 8.1.1 0x00400544 <+68>:movzbl -0x1(%rax),%r9d 8.1.1 0x00400549 <+73>:add%r9d,%ecx 8.1.1 0x0040054c <+76>:add%ecx,%r8d 8.1.1 0x0040054f <+79>:mov%ecx,(%rdi) 8.1.1 0x00400551 <+81>:mov%r8d,0x4(%rdi) 8.1.1 0x00400555 <+85>:cmp%r10,%rax 8.1.1 0x00400558 <+88>:jne0x400540 <Adler32::Update(void const*, unsigned int)+64> 8.1.1 0x0040055a <+90>:lea-0x1(%rdx),%eax 8.1.1 0x0040055d <+93>:lea0x1(%rsi,%rax,1),%rsi 8.1.1 0x00400562 <+98>:mov
[Bug c++/85637] New: Unneeded store of member variables in inner loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85637

Bug ID: 85637
Summary: Unneeded store of member variables in inner loop
Product: gcc
Version: 7.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

Created attachment 44058
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44058&action=edit
source

Attached a simple Adler32 checksum class. When updating with an array of bytes, the inner loop just accumulates the two sums, then the modulo is done in the outer loop. This way the cost of the two modulos is amortized.

At the start the two member variables are loaded into registers; however, they are stored back to memory in each inner loop iteration. They are stored again after the modulo, before the end of the outer loop. There is only one exit from the function. Why not store the registers back just once, right before the ret?

Dump of assembler code for function Adler32::Update(void const*, unsigned int):
0x00400500 <+0>: test %edx,%edx
0x00400502 <+2>: je 0x400578 <Adler32::Update(void const*, unsigned int)+120>
0x00400504 <+4>: mov (%rdi),%ecx ; ecx is m_s1
0x00400506 <+6>: mov 0x4(%rdi),%r8d ; r8d is m_s2
0x0040050a <+10>: mov $0x80078071,%r10d
0x00400510 <+16>: xor %r9d,%r9d
0x00400513 <+19>: cmp $0x15af,%edx
0x00400519 <+25>: jbe 0x400527 <Adler32::Update(void const*, unsigned int)+39>
0x0040051b <+27>: lea -0x15b0(%rdx),%r9d
0x00400522 <+34>: mov $0x15b0,%edx
0x00400527 <+39>: lea -0x1(%rdx),%eax
0x0040052a <+42>: lea 0x1(%rsi,%rax,1),%rdx
0x0040052f <+47>: nop
0x00400530 <+48>: add $0x1,%rsi
0x00400534 <+52>: movzbl -0x1(%rsi),%eax
0x00400538 <+56>: add %eax,%ecx ; m_s1 += *buf
0x0040053a <+58>: add %ecx,%r8d ; m_s2 += m_s1
0x0040053d <+61>: cmp %rdx,%rsi
0x00400540 <+64>: mov %ecx,(%rdi) ; !!! unneeded store
0x00400542 <+66>: mov %r8d,0x4(%rdi) ; !!! ditto
0x00400546 <+70>: jne 0x400530 <Adler32::Update(void const*, unsigned int)+48>
0x00400548 <+72>: mov %ecx,%eax
0x0040054a <+74>: mul %r10d
0x0040054d <+77>: mov %r8d,%eax
0x00400550 <+80>: shr $0xf,%edx
0x00400553 <+83>: imul $0xfff1,%edx,%edx
0x00400559 <+89>: sub %edx,%ecx
0x0040055b <+91>: mul %r10d
0x0040055e <+94>: mov %ecx,(%rdi) ; !!! this could be done after the jne at +118
0x00400560 <+96>: shr $0xf,%edx
0x00400563 <+99>: imul $0xfff1,%edx,%edx
0x00400569 <+105>: sub %edx,%r8d
0x0040056c <+108>: test %r9d,%r9d
0x0040056f <+111>: mov %r9d,%edx
0x00400572 <+114>: mov %r8d,0x4(%rdi) ; !!! ditto
0x00400576 <+118>: jne 0x400510 <Adler32::Update(void const*, unsigned int)+16>
0x00400578 <+120>: repz retq

The above code is generated with 7.3.1; 6.3.1 generates the exact same code. 8.0.1 and 8.1.1 generate somewhat different code, longer by 32 bytes, but the placement of the stores is the same. The size difference is odd, but I'll open another bug for that.

Platform: AMD64 (FX-8150), Debian 9.4

$ g++-6.3.1 -v
Using built-in specs.
COLLECT_GCC=g++-6.3.1
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.3.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-6.3.1 --disable-bootstrap --enable-checking=release CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native'
Thread model: posix
gcc version 6.3.1 20170120 (GCC)

$ g++-7.3.1 -v
Using built-in specs.
COLLECT_GCC=g++-7.3.1
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.3.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.3.1 --disable-bootstrap CFLAGS='-O2 -march=native -mtune=native' CXXFLAGS='-O2 -march=native -mtune=native'
Thread model: posix
gcc version 7.3.1 20180429 (GCC)

$ g++-8.0.1 -v
Using built-in specs.
COLLECT_GCC=g++-8.0.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-langu
[Bug c++/84378] New: Misleading diagnostics when using ambiguous names for ptr to memfun args in member fn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84378

Bug ID: 84378
Summary: Misleading diagnostics when using ambiguous names for ptr to memfun args in member fn
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

Created attachment 43414
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43414&action=edit
source to reproduce the bug

An enum member and a using typedef clash. When defining a member function pointer, GCC gives the correct reason if it is defined in a free-standing function; however, the reasoning is omitted if it is defined in a class member function. Tried with 7.3.1, 8.0.0 and 8.0.1, all the same. In both cases there is some misleading 'noise' in the diagnostics.

Errors for the free-standing function:

20180214-pmf.cpp: In function ‘void foo()’:
20180214-pmf.cpp:25:19: error: cannot declare pointer to ‘void’ member
  void(Bar::*mfpf)(MyFoo, bool, const char*);
                   ^

There is NO 'void' member.

20180214-pmf.cpp:25:19: error: reference to ‘MyFoo’ is ambiguous
20180214-pmf.cpp:19:19: note: candidates are: ‘Enum MyFoo’
 enum Enum : int { MyFoo, MyBar, MyBaz };
                   ^
20180214-pmf.cpp:9:44: note: ‘using MyFoo = struct ns1::Foo<int, ns2::MyFooTag>’
 using MyFoo = ns1::Foo<int, class MyFooTag>;
                                            ^

This is all I needed!

20180214-pmf.cpp:25:26: error: expected primary-expression before ‘bool’
  void(Bar::*mfpf)(MyFoo, bool, const char*);
                          ^~~~
20180214-pmf.cpp:25:32: error: expected primary-expression before ‘const’
  void(Bar::*mfpf)(MyFoo, bool, const char*);
                                ^
20180214-pmf.cpp:25:43: error: expression list treated as compound expression in initializer [-fpermissive]
  void(Bar::*mfpf)(MyFoo, bool, const char*);
                                           ^
20180214-pmf.cpp:27:15: error: cannot convert ‘void (Bar::*)(ns2::MyFoo, bool, const char*)’ {aka ‘void (Bar::*)(ns1::Foo<int, ns2::MyFooTag>, bool, const char*)’} to ‘void Bar::*’ in assignment
  mfpf = &Bar::Baz;
               ^~~

This isn't too helpful, either.

Errors for the member function. The most helpful message about the ambiguity is omitted, unfortunately:

20180214-pmf.cpp: In member function ‘void Class::MemFun()’:
20180214-pmf.cpp:38:20: error: cannot declare pointer to ‘void’ member
   void(Bar::*mfpf)(MyFoo2, bool, const char*);
                    ^~
20180214-pmf.cpp:38:28: error: expected primary-expression before ‘bool’
   void(Bar::*mfpf)(MyFoo2, bool, const char*);
                            ^~~~
20180214-pmf.cpp:38:34: error: expected primary-expression before ‘const’
   void(Bar::*mfpf)(MyFoo2, bool, const char*);
                                  ^
20180214-pmf.cpp:38:45: error: expression list treated as compound expression in initializer [-fpermissive]
   void(Bar::*mfpf)(MyFoo2, bool, const char*);
                                             ^
20180214-pmf.cpp:40:16: error: cannot convert ‘void (Bar::*)(ns2::MyFoo2, bool, const char*)’ {aka ‘void (Bar::*)(ns1::Foo<int, ns2::MyFoo2Tag>, bool, const char*)’} to ‘void Bar::*’ in assignment
   mfpf = &Bar::Baz2;
                ^~~~

$ g++-7.3.1 -v
Using built-in specs.
COLLECT_GCC=g++-7.3.1
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.3.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.3.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native'
Thread model: posix
gcc version 7.3.1 20180214 (GCC)

$ g++-8.0.1 -v
Using built-in specs.
COLLECT_GCC=g++-8.0.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/8.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-8.0.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 8.0.1 20180214 (experimental) (GCC)
[Bug c++/77620] Generic compile time regression of 7.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77620 --- Comment #6 from petschy at gmail dot com --- Would it be sensible to put an extra line to the output of 'gcc/g++ -v' if the slow checks are enabled, which just states this fact / warns about (possibly mentioning the use of --enable-checking=release at configure)? Future tickets like this might be avoided this way.
[Bug c++/79172] -Wunused-but-set-parameter gone nuts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79172 --- Comment #3 from petschy at gmail dot com --- $ g++-7.0.1 -pthread -Werror -Wall -Wextra 20170123-Wunused-but-set-parameter.cpp 20170123-Wunused-but-set-parameter.cpp: In constructor ‘CRSARC4Base::CRSARC4Base(unsigned int, unsigned int)’: 20170123-Wunused-but-set-parameter.cpp:100:39: error: parameter ‘msends_’ set but not used [-Werror=unused-but-set-parameter] CRSARC4Base::CRSARC4Base(unsigned int msends_, unsigned int mrecvs_) : ^~~ 20170123-Wunused-but-set-parameter.cpp:100:61: error: parameter ‘mrecvs_’ set but not used [-Werror=unused-but-set-parameter] CRSARC4Base::CRSARC4Base(unsigned int msends_, unsigned int mrecvs_) : ^~~ $ g++-7.0.1 -v Using built-in specs. COLLECT_GCC=g++-7.0.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.0.1 --disable-bootstrap --enable-checking=release CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 7.0.1 20170120 (experimental) (GCC) Tested on 64bit Debian Jessie, CPU is AMD FX-8150.
[Bug c++/79172] -Wunused-but-set-parameter gone nuts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79172

--- Comment #2 from petschy at gmail dot com ---
Created attachment 40563
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40563&action=edit
preprocessed source of the reduced test case
[Bug c++/79172] New: -Wunused-but-set-parameter gone nuts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79172 Bug ID: 79172 Summary: -Wunused-but-set-parameter gone nuts Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- I made a fresh build from the gcc 7 master branch (6f0a524c). Now my code doesn't compile: common/src/mgsnetCRSARC4Base.cpp: In constructor ‘mgs::net::CRSARC4Base::CRSARC4Base(uint32_t, uint32_t)’: common/src/mgsnetCRSARC4Base.cpp:46:45: error: parameter ‘msends_’ set but not used [-Werror=unused-but-set-parameter] mgs::net::CRSARC4Base::CRSARC4Base(uint32_t msends_, uint32_t mrecvs_) : ^~~ common/src/mgsnetCRSARC4Base.cpp:46:63: error: parameter ‘mrecvs_’ set but not used [-Werror=unused-but-set-parameter] mgs::net::CRSARC4Base::CRSARC4Base(uint32_t msends_, uint32_t mrecvs_) : ^~~ This is a rather simple ctor: 46: mgs::net::CRSARC4Base::CRSARC4Base(uint32_t msends_, uint32_t mrecvs_) : CTCPIPBase(msends_, mrecvs_), m_rsa_padding(mgs::crypto::CRSAEngine::PADDING_8000) { m_rng.Seed(); } The two params are forwarded to the ctor of the base class, so they are definitely used; the implementation of the base ctor is in another cpp. It does this at -O0 and -O3, too. An earlier version of 7.0 had no such problem (probably a few months earlier I guess), and current 6.2.1, 6.3.1 compiles the code fine. Tried to make a reduced test case, but simply cloning the inheritance structure and the ctor signatures didn't trigger the warning, unfortunately. I will have another go with this if I have a little free time. There are lots of other ctors that forward to the base class, but only this single instance is picked out. Could it be because of virtual inheritance? This is the only distinguishing attribute I can think of atm. 
Bisected, hope this helps: commit 4076953ad7a3e1f54c5caf6c7c23fd8878702001 Author: nathan <nathan@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Fri Oct 7 20:01:17 2016 + cp/ PR c++/64433 DR1658, DR1611 * init.c (emit_mem_initializers): Don't construct vbases of abstract classes. (push_base_cleanups): Don't push vbase cleanups for abstract class when in C++14 mode. * method.c (synthethesized_method_walk): Don't walk vbases of abstract classes when in C++14 mode. testsuite/ PR c++/66443 * g++.dg/cpp0x/pr66443-cxx11.C: New. * g++.dg/cpp0x/pr66443-cxx11-2.C: New. * g++.dg/cpp1y/pr66443-cxx14.C: New * g++.dg/cpp1y/pr66443-cxx14-2.C: New. * g++.dg/cpp1y/pr66443-cxx14-3.C: New. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@240874 138bc75d-0d04-0410-961f-82ee72b054a4
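
The bisected commit stops constructing virtual bases of abstract classes, which fits the virtual inheritance guess. The sketch below shows only the language rule involved: the mem-initializer for a virtual base in a class that is not the most derived one is ignored. VBase, Middle and Leaf are hypothetical stand-ins, not the real class hierarchy, and whether this exact shape triggers the warning on the affected revision is an untested assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for the virtual base.
struct VBase
{
	VBase(unsigned sends_, unsigned recvs_) :
		sends(sends_),
		recvs(recvs_)
	{
	}
	virtual ~VBase() = default;

	unsigned sends;
	unsigned recvs;
};

// Abstract class forwarding its ctor params to the virtual base.
// Middle can never be the most derived class, so this VBase
// initializer is never actually executed.
struct Middle : virtual VBase
{
	Middle(unsigned sends_, unsigned recvs_) :
		VBase(sends_, recvs_)
	{
	}
	virtual void Run() = 0;
};

// The most derived class is the one that initializes VBase.
struct Leaf : Middle
{
	Leaf() :
		VBase(1, 2),
		Middle(3, 4)
	{
	}
	void Run() override { }
};

// Hypothetical helper: shows which initializer took effect.
bool vbase_initialized_by_leaf()
{
	Leaf leaf;
	assert(leaf.sends == 1);
	return leaf.sends == 1 && leaf.recvs == 2;
}
```

If the compiler drops the VBase(sends_, recvs_) call in Middle's constructor, the parameters are only ever "set" from its point of view, which would explain the diagnostic even though the source clearly uses them.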
[Bug c++/77787] segfault in mangle.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77787 --- Comment #1 from petschy at gmail dot com --- That last function in json.hpp was gutted: //template int foo(int div_) { ASSERT(div_ == 0); return 0; } Removed the assertions from all the template functions, as this moved the crash location upward, then section type conflicts started to appear. Commented out the assertions from back to front until the point where the code compiled. Lots of assertions remained that didn't cause trouble. Now I have an assert() in a plain member function of a plain class. Then lots of assertions in inline functions and static functions of fully specialized class templates. At the end I have the foo fn as above. If the template line is commented out, 'section type conflict' is the result, no crash. If I comment out the assert in foo() or in the plain member, it compiles. If both assertions are active, and I uncomment the //template line, it crashes. Unfortunately, the test still can't reproduce the behaviour, though I put in there a class, and a class template specialization, too. It also turned out that function templates ignore the section attribute on the static variable inside, gotcha. That's why I didn't get conflicts in the first version of the test. Moreover, the 'inline' on the Helper::Get() function is ignored, so the section of the static variable inside will not be flagged as "aG" always, but as "a" or as "aG" depending on the inline status of the outside function using the assert(). That's why I did get conflicts at all.
[Bug c++/77787] New: segfault in mangle.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77787

Bug ID: 77787
Summary: segfault in mangle.c
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

Background: while trying to put cold data into a dedicated section I managed to crash the compiler. The idea was that rarely used strings, eg src file names and expressions of assertions, will be separated from normal data, resulting in better dcache utilization:

void prn(const char*);
void foo() {
	__attribute__((section(".cold_rodata")))
	static const char x[] = "foo";
	prn(x);
}
inline void bar() {
	__attribute__((section(".cold_rodata")))
	static const char x[] = "bar";
	prn(x);
}
int main() {
	foo();
	bar();
}

Let's pretend that the x[] arrays were put there by ASSERT() macros. Unfortunately, this won't compile:

$ g++ -c 20160928-section_type_conflict.cpp
20160928-section_type_conflict.cpp:9:20: error: x causes a section type conflict with x
  static const char x[] = "bar";
                    ^
20160928-section_type_conflict.cpp:4:20: note: ‘x’ was declared here
  static const char x[] = "foo";
                    ^

I know, I know: I shouldn't be using __attribute__ on non-globals. Unfortunately, there is no hot/cold attribute for data, only for code. For my specific case, a 'cold' data attribute would partly solve the problem, but that's probably another discussion.

The explanation can be found here, along with proposed solutions:
http://stackoverflow.com/questions/35091862/inline-static-data-causes-a-section-type-conflict

I thought that rather than messing with __asm__, I go full ret^H^H^Hcrafty: instead of the static x[], I define a local struct with a static inline fn, which has the static x[] in the desired section. This way x[] is always in an inline function, so the conflict goes away.

The result: gcc segfaults. 5.4.1, 6.2.1 and 7.0.0 all do it.
I built a -g -O0 from 7.0, this is the output: common/src/dbgmac.h:264:23: internal compiler error: in operator[], at vec.h:732 DBG_STRING_DEF(expr_str, #expr); \ ^ common/src/platform.h:97:29: note: in definition of macro ‘CONCAT_’ #define CONCAT_(a_, b_) a_ ## b_ ^~ common/src/platform.h:99:33: note: in expansion of macro ‘CONCAT2’ #define CONCAT3(a_, b_, c_) CONCAT2(CONCAT2(a_, b_), c_) ^~ common/src/platform.h:98:29: note: in expansion of macro ‘CONCAT_’ #define CONCAT2(a_, b_) CONCAT_(a_, b_) ^~ common/src/platform.h:99:44: note: in expansion of macro ‘CONCAT2’ #define CONCAT3(a_, b_, c_) CONCAT2(CONCAT2(a_, b_), c_) ^~ common/src/dbgmac.h:73:34: note: in expansion of macro ‘CONCAT3’ #define UNIQ_VAR_NAME(n_)CONCAT3(n_, _, __LINE__) ^~ common/src/dbgmac.h:86:22: note: in expansion of macro ‘UNIQ_VAR_NAME’ static const char UNIQ_VAR_NAME(var_) [] = val_; \ ^ common/src/dbgmac.h:264:4: note: in expansion of macro ‘DBG_STRING_DEF’ DBG_STRING_DEF(expr_str, #expr); \ ^~ common/src/dbgmac.h:208:35: note: in expansion of macro ‘HARD_ASSERT’ #define ASSERT(expr, args...) 
HARD_ASSERT(expr, ## args); ^~~ common/src/json.hpp:1493:2: note: in expansion of macro ‘ASSERT’ ASSERT(div_ != 0); ^~ 0x7c190a vec<tree_node*, va_gc, vl_embed>::operator[](unsigned int) ../../gcc/vec.h:732 0xa2f260 local_class_index ../../gcc/cp/mangle.c:1845 0xa2f9ea discriminator_for_local_entity ../../gcc/cp/mangle.c:1881 0xa2fd6f write_local_name ../../gcc/cp/mangle.c:1972 0xa299c0 write_name ../../gcc/cp/mangle.c:917 0xa291f0 write_encoding ../../gcc/cp/mangle.c:779 0xa2fb5d write_local_name ../../gcc/cp/mangle.c:1941 0xa299c0 write_name ../../gcc/cp/mangle.c:917 0xa291f0 write_encoding ../../gcc/cp/mangle.c:779 0xa28c89 write_mangled_name ../../gcc/cp/mangle.c:744 0xa39b28 mangle_decl_string ../../gcc/cp/mangle.c:3709 0xa39b6e get_mangled_id ../../gcc/cp/mangle.c:3731 0xa3a032 mangle_decl(tree_node*) ../../gcc/cp/mangle.c:3801 0x145f393 decl_assembler_name(tree_node*) ../../gcc/tree.c:669 0xbd4c72 symbol_table::insert_to_assembler_name_hash(symtab_node*, bool) ../../gcc/symtab.c:171 0xbd4fef symbol_table::symtab_initialize_asm_name_hash() ../../gcc/symtab.c:263 0xbd71dc symtab_node::get_for_asmname(tree_node const*) ../../gcc/symtab.c:930 0xbf366d handle_alias_pairs ../../gcc/
[Bug c++/77620] Generic compile time regression of 7.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77620

--- Comment #2 from petschy at gmail dot com ---
Thanks, --enable-checking=release did the trick. Those unreleased checks definitely have some runtime cost :)

My project was built in 3m35 with -O3, and the gcc master branch:

      7.0.0
-O0   7m30
-O1   7m00
-O2   8m23
-O3   9m02

These are effectively the same as with 6.2.1. Sorry for the noise.
[Bug c++/77620] New: Generic compile time regression of 7.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77620

Bug ID: 77620
Summary: Generic compile time regression of 7.0
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

I noticed that compiling my C++ project with 7.0 at -O3 took twice as much time as before with earlier major versions. First I thought that it might be that 7.0 used more optimizations, but a significant time difference remained even with -O0:

      5.4.1   6.2.1   7.0.0
-O0   2m10    2m18    3m43
-O1   2m45    2m51    5m38
-O2   3m24    3m26    6m46
-O3   3m25    3m34    7m05

Also did measurements with the gcc source, compiling the master branch (b55f1f4) with

for o in 0 1 2 3; do for v in 4.9.2 5.4.1 6.2.1 7.0.0; do rm -Rf *; ../configure CC=gcc-$v CXX=g++-$v --enable-languages=c,c++ --disable-multilib --program-suffix=-`cat ../gcc/BASE-VER` --disable-bootstrap CFLAGS="-O$o -march=native" CXXFLAGS="-O$o -march=native"; time make -j8; echo "-O$o $v"; echo hit enter to continue; read x; done; done

      4.9.2   5.4.1   6.2.1   7.0.0
-O0   7m15    7m12    7m28    8m18
-O1   7m11    7m13    7m02    9m11
-O2   8m19    8m17    8m24    10m57
-O3   8m53    8m57    9m05    12m03

The tests were done on a PC with Debian Jessie 64bit, AMD FX-8150 @ 4GHz, 16GB RAM, XFS on SSD.

$ gcc-4.9.2 -v
Using built-in specs.
COLLECT_GCC=gcc-4.9.2
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --enable-languages=c,c++ --disable-multilib --program-suffix=-4.9.2
Thread model: posix
gcc version 4.9.2 20140808 (prerelease) (GCC)

$ gcc-5.4.1 -v
Using built-in specs.
COLLECT_GCC=gcc-5.4.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.4.1/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.4.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 5.4.1 20160829 (GCC) $ gcc-6.2.1 -v Using built-in specs. COLLECT_GCC=gcc-6.2.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.2.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-6.2.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 6.2.1 20160831 (GCC) $ gcc-7.0.0 -v Using built-in specs. COLLECT_GCC=gcc-7.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.0.0 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 7.0.0 20160831 (experimental) (GCC)
[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513

--- Comment #5 from petschy at gmail dot com ---
Some more details, hope this helps.

Preprocessed one of the oddly behaving files with 5.4.1, 6.2.1 and 7.0.0, then tried to compile each preprocessed file with each compiler version. 5.4.1 warned for all preprocessed files, 6.2.1 and 7.0.0 didn't warn for any of them. This doesn't mean that 6.2.1 and 7.0.0 are not affected, just that for this particular file they didn't warn.

Diffing the files revealed that there is no difference in the user code, only in the system headers included, eg <string> comes from /usr/local/include/c++/5.4.1/string for 5.4.1, etc.

All of the problematic lines are assignments of the form

var = NULL;

or

var1 = var2 = NULL;

ALL of the NULL assignments in the file are reported. NONE of the == or != tests are reported.

Eg: this code warns:

AbstractImage::AbstractImage()
{
 pixels=
# 124 "common/src/AbstractImage.cpp" 3 4
 __null
# 124 "common/src/AbstractImage.cpp"
 ;
 channels_n=0; mode=255; width=-1; height=-1;
}

But if I replace 'pixels =' with 'bool b = pixels ==' above, the warning disappears, which is strange, I think. The flags are the same for the != and == tests, ie # "common/src/AbstractImage.cpp" 3 4 is everywhere.
[Bug c++/77513] -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513 --- Comment #2 from petschy at gmail dot com --- I don't want to enable them. The problem is not with too little but too many warnings. A snippet from one of the problematic files: { NULL, NULL, false, false } is preprocessed to { # 62 "AdsPlugin.cpp" 3 4 __null # 62 "AdsPlugin.cpp" , # 62 "AdsPlugin.cpp" 3 4 __null # 62 "AdsPlugin.cpp" , false, false } }; Here I see the same flags, yet for these two NULLs gcc warns.
[Bug c++/77513] New: -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77513 Bug ID: 77513 Summary: -Wzero-as-null-pointer-constant vs 0, nullptr, NULL and __null Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- Created attachment 39580 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39580=edit Preprocessed source, generated with g++-7.0.0 -std=c++14 -Wzero-as-null-pointer-constant 20160907-null.cpp -E > 20160907-null.ii Yesterday I switched on the warning for a ~250kloc codebase to clean it up. Used 7.0, it was tedious but it was done. I had to replace NULLs also, not just 0s, but at that time I wasn't suspecting anything, though it seemed a bit strange. Then, tried to build on another machine with 5.4.1, and to my surprise, tons of warnings appeared. Then tried to build on my machine with 5.4.1, the same results. It turned out that NULLs are frowned upon, quite inconsistently. 5.4.1 has problems with 7 cpp files, 6.2.1 and 7.0 with just a single one. Did a grep for NULL, and as expected for a large and aging codebase, there were lots of them, but they are not treated equally. All files are c++, and compiled with the same flags. Preprocessed 2 problematic files with all three gcc versions mentioned. Diffing them revealed that there is no difference in the actual code, only what gets included due to the differing gcc versions. All NULLs were replaced with __null's by the preprocessor, which is defined in the gcc version specific stddef.h include. 
Crafted a test case: #include char* a = 0; char* b = nullptr; char* c = __null; char* d = NULL; int main() { } $ g++-5.4.1 -std=c++14 -Wzero-as-null-pointer-constant 20160907-null.cpp 20160907-null.cpp:2:11: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant] char* a = 0; ^ 20160907-null.cpp:4:11: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant] char* c = __null; ^ 6.2.1 and 7.0 print exactly the same warnings. So NULL is ok, but __null is not? The end of the preprocessed source looks like this: # 2 "20160907-null.cpp" char* a = 0; char* b = nullptr; char* c = __null; char* d = # 5 "20160907-null.cpp" 3 4 __null # 5 "20160907-null.cpp" ; int main() { } c and d initialized the same except for whitespace and the two "'# 6" lines around d's __null. I naively thought that these are only to communicate line info to the compiler, but if I delete the first one: $ g++-7.0.0 -std=c++14 -Wzero-as-null-pointer-constant 20160907-null.ii 20160907-null.cpp:2:11: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant] char* a = 0; ^ 20160907-null.cpp:4:11: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant] char* c = __null; ^~ 20160907-null.cpp:6:10: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant] int main() ^ The interpretation of __null at d changed for some reason. What is going on? It seems that the interpretation can change unpredictably, and in the problematic source files __null's are misdiagnosed even when the "# ..." lines are around them. For c++11 and later code, why is NULL defined as __null, rather than nullptr? I put a fast bandaid on my code by redefining NULL to be nullptr after the last include in the problematic files, but since the number of problematic files seems to change from gcc version to gcc version, this is rather fragile, let alone unelegant. Platform is Debian Jessie AMD64, the gcc versions: $ g++-5.4.1 -v Using built-in specs. 
COLLECT_GCC=g++-5.4.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.4.1/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.4.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 5.4.1 20160829 (GCC) $ g++-6.2.1 -v Using built-in specs. COLLECT_GCC=g++-6.2.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.2.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-6.2.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 6.2.1 20160831 (GCC) $ g++-7.0.0 -v Using built-in specs. COLLECT_GCC=g++-7.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.0.0 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 7.0.0 20160831 (experimental) (GCC)
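A side note on the bandaid of redefining NULL to nullptr mentioned above: nullptr is not an integer, so the swap can change which overload a call selects. The sketch below only illustrates that language rule; the `which` overloads are hypothetical and are not taken from the report or from GCC's headers.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical overload pair: one takes an integer, the other a pointer.
constexpr int which(int)   { return 1; }
constexpr int which(char*) { return 2; }

// A literal 0 is an exact match for int, so the integer overload wins.
static_assert(which(0) == 1, "0 selects the int overload");

// nullptr converts to any pointer type but not to int.
static_assert(which(nullptr) == 2, "nullptr selects the pointer overload");

// nullptr has its own distinct type.
static_assert(std::is_same<decltype(nullptr), std::nullptr_t>::value,
              "decltype(nullptr) is std::nullptr_t");
```

Code that passes NULL to an overloaded function may therefore behave differently once NULL expands to nullptr, which is one more reason the workaround is fragile.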
[Bug c++/77502] -Wzero-as-null-pointer-constant : misleading/imprecise messages
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77502

--- Comment #1 from petschy at gmail dot com ---
I found another case: initializing an array of structs:

struct X
{
    const char* p;
    int i;
};

X x[] = {
    { "hello", 0 },
    { 0, 0 }, // !
    { 0, 0 }, // !
    { 0, 0 }  // !
}; // all 3 errors marked here
[Bug c++/77502] New: -Wzero-as-null-pointer-constant : misleading/imprecise messages
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77502 Bug ID: 77502 Summary: -Wzero-as-null-pointer-constant : misleading/imprecise messages Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- Created attachment 39575 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39575=edit C++ source g++-7.0.0 -Werror -Wall -Wextra -Wzero-as-null-pointer-constant 20160906-Wzero-as-null-pointer-constant.cpp 20160906-Wzero-as-null-pointer-constant.cpp: In constructor ‘Foo::Foo()’: 20160906-Wzero-as-null-pointer-constant.cpp:10:6: error: zero as null pointer constant [-Werror=zero-as-null-pointer-constant] c(0) // 1 ^ 20160906-Wzero-as-null-pointer-constant.cpp: At global scope: 20160906-Wzero-as-null-pointer-constant.cpp:17:20: error: zero as null pointer constant [-Werror=zero-as-null-pointer-constant] void Fn(char* p = 0); // 2, OK ^ 20160906-Wzero-as-null-pointer-constant.cpp: In member function ‘void TBar::Fn(char*) [with T = int]’: 20160906-Wzero-as-null-pointer-constant.cpp:34:9: error: zero as null pointer constant [-Werror=zero-as-null-pointer-constant] br->Fn(); // 3 ^ 20160906-Wzero-as-null-pointer-constant.cpp: In member function ‘void TBaz::Fn(char*) [with T = int]’: 20160906-Wzero-as-null-pointer-constant.cpp:35:9: error: zero as null pointer constant [-Werror=zero-as-null-pointer-constant] bz->Fn(); // 4 ^ The issues: 1) 0 is used instead of nullptr when initializing Foo.a ("a(0)"), but the error marker is at the end of the initializer list, not at the line of the actual error. 2) The default argument of Fn() in the non-template Bar class is diagnosed OK 3,4) The default argument of Fn() in the template classes TBar and TBaz is diagnosed, but the error marker is put at the call site, not at the declaration. 
This caused me quite some pondering on the production codebase, as for a largish templated class hierarchy it's not obvious just by looking at the call site which fn will be actually called. Debian Jessie, AMD64 $ g++-7.0.0 -v Using built-in specs. COLLECT_GCC=g++-7.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.0.0 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 7.0.0 20160831 (experimental) (GCC)
[Bug tree-optimization/77485] Missed dead store elimination of aggregate store followed by partial stores
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77485

--- Comment #2 from petschy at gmail dot com ---
I agree that the generic case can become quite complicated: if after the memset, the individual values are written with gaps between them, or multiple contiguous chunks with gaps between them, it's not easy to tell whether having a single memset + overwrites is better than having multiple memsets with distinct regions + the individual byte writes, or anything in-between. It all depends on the actual pattern.

However, a simplified approach I can think of is keeping track of contiguous regions that are written, then trimming the regions based on the order and overlap, or merging them if they are adjacent.

In this particular case this would mean that

[0,199] : memset 0 + [0,31] : const init

could be converted to

[32,199] : memset 0 + [0,31] : const init

knowing that the const init comes later. A similar adjustment can be made if a second const init region overlaps with the end of the memset region. Adjacent or overlapping const init regions can be merged.

But then, of course, comes the devil with the details: if the trivial merging and trimming of the intervals is done,
- at what length is it worth having the memset merged into the const init regions, if a short memset is stuck between two const init regions?
- and vice versa, at what length is it worth having a single memset with an overwriting const init region in the middle vs memset + const init + memset as disjoint regions?
- at what point is it worth storing the whole data in .rodata and just memcpy it to the target?
- how to integrate regions of runtime calculated values into the above?

For my particular case, I can work around this inefficiency by setting the buffer to the exact size. I have no idea how a simple region based approach like the above would perform in general and whether it would be worth the development effort.
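The trimming step described in this comment can be sketched roughly as follows. This is only an illustration of the interval bookkeeping, not how GCC's dead store elimination is implemented. `Region`, `StoreKind` and `trim_earlier_stores` are hypothetical names, and the ranges are half-open byte offsets [begin, end), so the [0,199] memset from the comment becomes [0,200).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of one store into a buffer, covering bytes [begin, end).
enum class StoreKind { Memset, ConstInit };

struct Region {
    std::size_t begin;
    std::size_t end;
    StoreKind kind;
};

// Takes stores in program order and trims every earlier store by each later
// store that overwrites its head or its tail. Fully overwritten stores are
// dropped. A later store that lands strictly inside an earlier one is left
// alone, since splitting is exactly the cost question raised in the comment.
std::vector<Region> trim_earlier_stores(const std::vector<Region>& stores)
{
    std::vector<Region> out;
    for (std::size_t i = 0; i < stores.size(); ++i) {
        Region r = stores[i];
        for (std::size_t j = i + 1; j < stores.size() && r.begin < r.end; ++j) {
            const Region& later = stores[j];
            if (later.begin <= r.begin && later.end > r.begin) {
                // The later store covers the head of r.
                r.begin = later.end < r.end ? later.end : r.end;
            } else if (later.end >= r.end && later.begin < r.end) {
                // The later store covers the tail of r.
                r.end = later.begin > r.begin ? later.begin : r.begin;
            }
        }
        if (r.begin < r.end)
            out.push_back(r);
    }
    return out;
}
```

Feeding it a memset over [0,200) followed by a const init over [0,32) leaves a memset over [32,200) plus the unchanged const init, which matches the conversion written out in the comment. Merging adjacent regions and the profitability thresholds from the bullet list are deliberately left out.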
[Bug c++/77456] Suboptimal code when returning a string generated with a constexpr fn at compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77456

--- Comment #8 from petschy at gmail dot com ---
I created two other bugs (bug 77482 for the segfault and bug 77485 for the DSE issue).

As I noted in the latter, I'm a bit confused about the store merging, and what change Kyrill's patch will make, as the version compiled with gcc 7.0 somewhat merges the stores using xmm0, so the problem is not that no merging occurs, but it occurs inconsistently.

Furthermore, there must be a threshold at the amount of data above which the codegen should decide that it's more efficient to store the bytes in .rodata and memcpy to the destination than to store with multiple insns, even if merged. This logic kicks in at baz_sized(), but does not in baz(). Interestingly, in the latter no xmm0 was used, every single byte is movb'd after the memset, whereas foo() and bar() with smaller data used xmm0, too.

Dump of assembler code for function baz():
   0x00400800 <+0>:      sub    $0x8,%rsp
   0x00400804 <+4>:      mov    $0x4e20,%edx
   0x00400809 <+9>:      xor    %esi,%esi
   0x0040080b <+11>:     callq  0x4004f0 <memset@plt>
   0x00400810 <+16>:     movb   $0x30,(%rax)
   0x00400813 <+19>:     movb   $0x20,0x1(%rax)
   0x00400817 <+23>:     movb   $0x31,0x2(%rax)
   0x0040081b <+27>:     movb   $0x20,0x3(%rax)
   0x0040081f <+31>:     movb   $0x32,0x4(%rax)
   0x00422674 <+138868>: movb   $0x32,0x4db3(%rax)
   0x0042267b <+138875>: movb   $0x30,0x4db4(%rax)
   0x00422682 <+138882>: movb   $0x30,0x4db5(%rax)
   0x00422689 <+138889>: add    $0x8,%rsp
   0x0042268d <+138893>: retq

Even if the byte stores were merged into 64bit stores, the function still would be huge, and a memcpy instead would be way better.
[Bug c++/77485] New: Missed dead store elimination when returning constexpr generated data
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77485 Bug ID: 77485 Summary: Missed dead store elimination when returning constexpr generated data Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- Created attachment 39563 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39563=edit C++ source A string is generated at compile time with constexpr function, then it is returned by value, in a struct having a char array. There are two versions of the generator: foo() generates into a larger array, foo_sized() first calculates the needed size, then generates into an array with the exactly needed size. The char array must be initialized due to constexpr, and these zero stores are not all eliminated when overwritten with the actual generated characters. BUT, only if the array in the struct is larger than the generated data: foo_sized(): Dump of assembler code for function foo_sized(): 0x004005e0 <+0>: movdqa 0xa8(%rip),%xmm0# 0x4006b0 0x004005e8 <+8>: mov%rdi,%rax 0x004005eb <+11>:movups %xmm0,(%rdi) 0x004005ee <+14>:movdqa 0xaa(%rip),%xmm0# 0x4006c0 0x004005f6 <+22>:movups %xmm0,0x10(%rdi) 0x004005fa <+26>:retq 0x4006b0 : "0 1 2 3 4 5 6 7 8 9 10 11 12 13" gcc 6.2.1 and 7.0.0 generate the exact same code. Since the size of the buffer equals to the number of generated characters (32), two 16 byte load+stores are used. 
foo() with gcc 7.0: Dump of assembler code for function foo(): 0x00400560 <+0>: mov%rdi,%rdx; rdi = rdx = 0x7fffe1b0 0x00400563 <+3>: movq $0x0,0xc0(%rdi) ; zero out the last 8 bytes of the buffer 0x0040056e <+14>:lea0x8(%rdi),%rdi ; rdi += 8 => 0x7fffe1b8 0x00400572 <+18>:mov%rdx,%rcx; rcx = 0x7fffe1b0 0x00400575 <+21>:movdqa 0x113(%rip),%xmm0# 0x4006b0 0x0040057d <+29>:and$0xfff8,%rdi ; already aligned 0x00400581 <+33>:xor%eax,%eax 0x00400583 <+35>:sub%rdi,%rcx; rcx = -8 0x00400586 <+38>:add$0xc8,%ecx ; rcx = 0xc0, 0xc8 is the size of the buffer, ie 200 0x0040058c <+44>:shr$0x3,%ecx; rcx = 0x18 0x0040058f <+47>:rep stos %rax,%es:(%rdi); rdi is buf+8, ecx is (size-8)/8, so will zero out the char buffer from index 8 to 0xbf 0x00400592 <+50>:movups %xmm0,(%rdx) 0x00400595 <+53>:movb $0x38,0x10(%rdx) 0x00400599 <+57>:movb $0x20,0x11(%rdx) 0x0040059d <+61>:mov%rdx,%rax 0x004005a0 <+64>:movb $0x39,0x12(%rdx) 0x004005a4 <+68>:movb $0x20,0x13(%rdx) 0x004005a8 <+72>:movb $0x31,0x14(%rdx) 0x004005ac <+76>:movb $0x30,0x15(%rdx) 0x004005b0 <+80>:movb $0x20,0x16(%rdx) 0x004005b4 <+84>:movb $0x31,0x17(%rdx) 0x004005b8 <+88>:movb $0x31,0x18(%rdx) 0x004005bc <+92>:movb $0x20,0x19(%rdx) 0x004005c0 <+96>:movb $0x31,0x1a(%rdx) 0x004005c4 <+100>: movb $0x32,0x1b(%rdx) 0x004005c8 <+104>: movb $0x20,0x1c(%rdx) 0x004005cc <+108>: movb $0x31,0x1d(%rdx) 0x004005d0 <+112>: movb $0x33,0x1e(%rdx) 0x004005d4 <+116>: retq 0x4006b0 : "0 1 2 3 4 5 6 7 8 9 10 11 12 13" This is the same data used in foo_sized(). I wrote the register values at the end of each line of interest. The issues I found: 0) Not all byte stores are merged. The first 16 bytes are copied w/ xmm0, but the rest initialized byte-wise. I was told in bug 77456 that there is work in progress solving that (see [1]), but I'm a bit confused, since it seems that some merging does occur here already using xmm0, but the second 16 bytes are not merged for some reason. 
The last byte of the second 16 byte pack is zero, so it might be that that zero write is eliminated (no "movb 0x00, 0x1f(%rdx)" in the disasm), due to the previous zero fill, and the remaining 15 bytes written byte wise as it's not an exact fit for xmm0, but I'm just speculating here. Code generated with gcc 6.2.1 does not use xmm0, all bytes are stored with movb; the zero fill is the same, and the zero store to 0x1f is missing, too. 1) The zero fill of the buffer
[Bug c++/77482] New: Segfault when compiling ill-formed constexpr code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77482 Bug ID: 77482 Summary: Segfault when compiling ill-formed constexpr code Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- Created attachment 39561 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39561=edit C++ source In fixbuf(), the return statement is commented out at line 76. This makes the program ill-formed since the return type will be deduced to void, but the caller expects a value returned. Instead of an error message, 7.0 segfaults and 6.2.1 gets confused. g++-7.0.0 -O3 -Wall -Wextra -g 20160905-constexpr-segfault.cpp ‘ In function ‘auto foo()’: Segmentation fault constexpr auto x = fixbuf<13, 200>(); 6.2.1 doesn't segfault, but something is definitely amiss: g++-6.2.1 -O3 -Wall -Wextra -g 20160905-constexpr-segfault.cpp ‘ 20160905-constexpr-segfault.cpp:81: confused by earlier errors, bailing out In a terminal window with black bg and gray font, the single quote is gray, then the error message on the next line is bold white, and it stays so, so anything I type after this will be bold white. 
6.0 seems to be OK: g++-6.0.0 -O3 -Wall -Wextra -g 20160905-constexpr-segfault.cpp 20160905-constexpr-segfault.cpp: In function ‘auto foo()’: 20160905-constexpr-segfault.cpp:81:37: error: ‘constexpr const void x’ has incomplete type constexpr auto x = fixbuf<13, 200>(); ^ 20160905-constexpr-segfault.cpp:82:9: error: unable to deduce ‘auto’ from ‘x’ return x; ^ 20160905-constexpr-segfault.cpp: In function ‘auto foo_sized()’: 20160905-constexpr-segfault.cpp:87:38: error: ‘constexpr const void s’ has incomplete type constexpr auto s = fixbuf<13, 0, 1>(); ^ 20160905-constexpr-segfault.cpp:88:35: error: no matching function for call to ‘fixbuf()’ constexpr auto x = fixbuf<13, s>(); ^ 20160905-constexpr-segfault.cpp:69:6: note: candidate: template constexpr auto fixbuf() auto fixbuf() ^~ 20160905-constexpr-segfault.cpp:69:6: note: template argument deduction/substitution failed: 20160905-constexpr-segfault.cpp:88:35: error: could not convert template argument ‘s’ to ‘unsigned int’ constexpr auto x = fixbuf<13, s>(); ^ 20160905-constexpr-segfault.cpp:89:9: error: unable to deduce ‘auto’ from ‘x’ return x; ^ Tested on Debian Jessie AMD64, the detailed gcc versions: $ g++-7.0.0 -v Using built-in specs. COLLECT_GCC=g++-7.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/7.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-7.0.0 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 7.0.0 20160831 (experimental) (GCC) $ g++-6.2.1 -v Using built-in specs. 
COLLECT_GCC=g++-6.2.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.2.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-6.2.1 --disable-bootstrap CFLAGS='-O2 -march=native' CXXFLAGS='-O2 -march=native' Thread model: posix gcc version 6.2.1 20160831 (GCC) $ g++-6.0.0 -v Using built-in specs. COLLECT_GCC=g++-6.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-6.0.0 Thread model: posix gcc version 6.0.0 20160302 (experimental) (GCC)
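The language rule behind this report can be shown in isolation: a function with a deduced return type and no return statement deduces void, so its result cannot initialize a variable. The following is a reduced sketch under that assumption, not the attached test case; `no_return` and `with_return` are hypothetical stand-ins for fixbuf() with and without its return statement.

```cpp
#include <type_traits>

// No return statement: the return type is deduced as void.
constexpr auto no_return() { }

// With a return statement the type is deduced from the returned expression.
constexpr auto with_return() { return 13; }

static_assert(std::is_void<decltype(no_return())>::value,
              "a body without a return statement deduces void");
static_assert(std::is_same<decltype(with_return()), int>::value,
              "a body returning an int literal deduces int");

// Ill-formed, as in the report, because a variable cannot have type void:
// constexpr auto x = no_return();
```

A compiler is expected to reject the commented-out line with an ordinary error, which is what 6.0.0 does in the output quoted above.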
[Bug c++/77456] Suboptimal code when returning a string generated with a constexpr fn at compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77456 --- Comment #5 from petschy at gmail dot com --- Sorry. Should I open dedicated bugs for them, or can you work from this single one? Though the example code would be the same. Probably I would have picked a more descriptive title mentioning the DSE issue, the bogus error message formatting and the segfault.
[Bug c++/77456] Suboptimal code when returning a string generated with a constexpr fn at compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77456 --- Comment #2 from petschy at gmail dot com --- #22141 does not mention a DSE issue, nor a segfault of the compiler, so hardly an exact duplicate.
[Bug c++/77456] New: Suboptimal code when returning a string generated with a constexpr fn at compile time
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77456

Bug ID: 77456
Summary: Suboptimal code when returning a string generated with a constexpr fn at compile time
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: petschy at gmail dot com
Target Milestone: ---

Created attachment 39541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39541&action=edit
C++ source

I ran into this when converting expression trees to strings at compile time. Though it's surely a rare application, the fix might have a positive impact on a wider range of scenarios.

The attached code converts the integers [0..N] to a string at compile time. There are several conversions with differing N's. Also, some conversions calculate the exact size of the resulting strings, others just use a large enough buffer.

Platform is Debian Jessie, x86-64. Tested w/ 6.x and 7.0.

To compile:

g++ -std=c++14 -Wall -Wextra -O3 20160831-constexpr.cpp

Please be patient, this takes almost 30 secs on my machine (AMD FX 8150 @ 4GHz), due to lots of compile-time constexpr work.

foo(): [0..13] w/ a 200 byte buffer.
It seems that the initial zero fill of the buffer is not considered in dead-store elimination, so the 200 bytes are rep stos'd, then the actual characters are copied via xmm0 and bytewise literal stores: Dump of assembler code for function _Z3foov: 0x00400620 <+0>: mov%rdi,%rdx 0x00400623 <+3>: movq $0x0,0xc0(%rdi) 0x0040062e <+14>:lea0x8(%rdi),%rdi 0x00400632 <+18>:mov%rdx,%rcx 0x00400635 <+21>:movdqa 0x27033(%rip),%xmm0# 0x427670 0x0040063d <+29>:and$0xfff8,%rdi 0x00400641 <+33>:xor%eax,%eax 0x00400643 <+35>:sub%rdi,%rcx 0x00400646 <+38>:add$0xc8,%ecx 0x0040064c <+44>:shr$0x3,%ecx 0x0040064f <+47>:rep stos %rax,%es:(%rdi) 0x00400652 <+50>:movups %xmm0,(%rdx) 0x00400655 <+53>:movb $0x38,0x10(%rdx) 0x00400659 <+57>:movb $0x20,0x11(%rdx) 0x0040065d <+61>:mov%rdx,%rax 0x00400660 <+64>:movb $0x39,0x12(%rdx) 0x00400664 <+68>:movb $0x20,0x13(%rdx) 0x00400668 <+72>:movb $0x31,0x14(%rdx) 0x0040066c <+76>:movb $0x30,0x15(%rdx) 0x00400670 <+80>:movb $0x20,0x16(%rdx) 0x00400674 <+84>:movb $0x31,0x17(%rdx) 0x00400678 <+88>:movb $0x31,0x18(%rdx) 0x0040067c <+92>:movb $0x20,0x19(%rdx) 0x00400680 <+96>:movb $0x31,0x1a(%rdx) 0x00400684 <+100>: movb $0x32,0x1b(%rdx) 0x00400688 <+104>: movb $0x20,0x1c(%rdx) 0x0040068c <+108>: movb $0x31,0x1d(%rdx) 0x00400690 <+112>: movb $0x33,0x1e(%rdx) 0x00400694 <+116>: retq Since the buffer is larger, all the movb's could have been converted to another xmm0 load+store. Though an explicit zero byte is written in the C++ code after the last digit, this is missing in the disassembly above, so there is no "movb $0x00, 0x1f(%rdx)" at the end, meaning that the compiler eliminated this store, instead of merging all the 16 byte stores into a single xmm0 operation, and skipping the first 32 bytes in the rep stos. foo_sized() generates the same string, but first it calculates the needed size. 
There is no zero fill here in the asm, so it was successfully eliminated, and the characters are initialized via two xmm0 loads/stores, as expected: Dump of assembler code for function _Z9foo_sizedv: 0x004006a0 <+0>: movdqa 0x26fc8(%rip),%xmm0# 0x427670 0x004006a8 <+8>: mov%rdi,%rax 0x004006ab <+11>:movups %xmm0,(%rdi) 0x004006ae <+14>:movdqa 0x26fca(%rip),%xmm0# 0x427680 0x004006b6 <+22>:movups %xmm0,0x10(%rdi) 0x004006ba <+26>:retq bar/bar_sized/bar_static/bar_sized_static(): the same as foo, but the range is [0..42], and the static versions use a static constexpr, and return the buffer pointer, not the buffer by value. bar() zero fills and then copies over with xmm0 and byte literals. bar_sized() lacks the zero fill, but initializes the characters the same way. The static versions just return a pointer as expected. baz_sized() works as expected: since the memory to copy is large, it calls memcpy instead of doing the above xmm0 + literal bytes stuff. The problem is with baz(). The range
[Bug c++/69673] New: Can't pass members in lambda capture list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69673 Bug ID: 69673 Summary: Can't pass members in lambda capture list Product: gcc Version: 5.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Target Milestone: --- struct A { int i; }; void foo(A& a) { auto l = []() { }; } void bar(A& a) { auto& i = a.i; auto l = []() { }; } void baz() { A a; auto l = []() { }; } ---8<---8<---8<--- Only bar() compiles, the other two functions fail : g++-5.3.1 -std=c++11 20160204-lambda.cpp error: expected ‘,’ before ‘.’ token auto l = []() { }; ^ Every gcc version I tried failed: 4.7.4, 4.8.4, 4.9.2, 5.3.1, 6.0.0 g++-5.3.1 -v Using built-in specs. COLLECT_GCC=g++-5.3.1 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.3.1/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.3.1 --with-build-config=my-bootstrap-O3-native Thread model: posix gcc version 5.3.1 20160204 (GCC) Regards, Peter
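Judging by the bug title and the "expected ‘,’ before ‘.’ token" error, the failing functions put a member access such as a.i directly into the capture list. A simple capture has to be a plain identifier (or this), so that form is rejected. Below is a minimal sketch of two ways to get the member into the lambda; the function names are made up for illustration, and the second form needs C++14.

```cpp
struct A { int i; };

// C++11: bind the member to a local reference first, then capture that name.
// This is the pattern that bar() in the report uses.
int read_via_local_ref(A& a)
{
    auto& i = a.i;
    auto l = [&i]() { return i; };
    return l();
}

// C++14: an init-capture may use an arbitrary initializer, including a.i.
int read_via_init_capture(A& a)
{
    auto l = [&i = a.i]() { return i; };
    return l();
}
```

Both lambdas refer to the original member by reference, so later changes to a.i are visible inside them.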
[Bug c++/69673] Can't pass members in lambda capture list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69673 --- Comment #2 from petschy at gmail dot com --- Is this an accidental omission in the std, or allowing member access would cause some trouble? Thanks, Peter
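For context on the question above: in C++11 a capture list entry is a plain identifier (optionally prefixed with `&`) or `this`, so a member access expression cannot be captured directly. C++14 added init-captures, which introduce a new closure-local name initialized from an arbitrary expression. The following is a minimal sketch of that alternative, assuming a C++14 compiler; `read_copy`, `bump_ref` and `demo` are hypothetical names, not part of the report.

```cpp
#include <cassert>

struct A { int i; };

// C++14 init-capture: copy the member into a closure-local variable.
int read_copy(A& a)
{
    auto l = [i = a.i]() { return i; };
    a.i = 100;              // does not affect the copy held by the lambda
    return l();
}

// C++14 init-capture by reference: the closure refers to the member itself.
int bump_ref(A& a)
{
    auto l = [&i = a.i]() { return ++i; };
    return l();
}

// Small usage example.
void demo()
{
    A a{1};
    assert(read_copy(a) == 1);   // the lambda saw the old value
    assert(a.i == 100);
    assert(bump_ref(a) == 101);  // the lambda modified the member
    assert(a.i == 101);
}
```

With C++11 only, the approach used in bar() in the report (bind the member to a local reference first, then capture that local) remains the way to go.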
[Bug c++/64596] Friendship not recognized and template param deduction error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64596 --- Comment #3 from petschy at gmail dot com --- (In reply to Daniel Krügler from comment #2)

1) If the friend declaration in line 15 is invalid, then g++ should tell me so, shouldn't it? At the moment it compiles, and the surprise comes later, when the private member can't be accessed despite the friend declaration. Should I file a separate ticket about this?

Is there any detail on why this restriction on friend and alias templates is present in the standard?

template<typename, bool> friend class Txn;

This would work, of course, but this form would give friend access to any Txn on Str, even with different E template params. That might lead to subtle bugs where invalid usage is not caught at compile time.

Thanks, Peter
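To illustrate the concern in the comment above, here is a minimal sketch of the "befriend every Txn" form. It assumes simplified stand-ins for the Str and Txn classes of the attached code (which is not shown here); `bump`, `bump_other`, `value` and `demo` are hypothetical names.

```cpp
#include <cassert>

template<typename E, bool R> class Txn;   // helper, declared first

template<typename E>
class Str {
    int i = 0;                            // private state
    // Befriends every Txn<X, Y>, not only those with the same E.
    template<typename, bool> friend class Txn;
public:
    int value() const { return i; }
};

template<typename E, bool R>
class Txn {
public:
    // Intended use: same element type E as the stream.
    static void bump(Str<E>& s) { ++s.i; }
    // Also compiles: a Txn with a different E reaches into Str<long>.
    static void bump_other(Str<long>& s) { ++s.i; }
};

void demo()
{
    Str<int> s;
    Txn<int, false>::bump(s);
    assert(s.value() == 1);

    Str<long> t;
    Txn<int, true>::bump_other(t);        // unrelated E, still allowed
    assert(t.value() == 1);
}
```

The second call is exactly the kind of unintended access that the narrower friend declaration was meant to rule out at compile time.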
[Bug debug/53770] Regression: incorrect line numbers in debug info since 4.5+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53770 --- Comment #3 from petschy at gmail dot com --- Tried now w/ g++ 4.9.1 (Debian 4.9.1-19), and single stepping is still wrong. The only difference is that it doesn't stop on ++f in line 30 after breaking out from the loop. However, all the other bugs are still there. Also tested w/ g++-5.0 (8fe6ab3): - in do_print(), the outer loop iteration is OK, it doesn't stop now on the last printf line - it still doesn't step on the break; on line 26 - but then continues correctly on line 36 after breaking out (was: line 30 ++f) - the difference is still present between the templated and non-templated versions: do_print2(), the non-templated version jumps from line 85 to 91 (the break is still skipped), which is the closing brace of the loop. The templated version jumps to the first statement after the loop, correctly. $ g++-5.0.0 -v Using built-in specs. COLLECT_GCC=g++-5.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.0.0 Thread model: posix gcc version 5.0.0 20150203 (experimental) (GCC)
[Bug debug/53770] Regression: incorrect line numbers in debug info since 4.5+
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53770 --- Comment #4 from petschy at gmail dot com --- Clarification: I double checked now, and the templated and the non-templated versions (do_print vs do_print2) are the same, with the same disassembly, only the addresses differing. This is true for 4.9.1 and for 5.0, too. So my previous statement that these two versions of the functions behave differently doesn't hold.

So I sum up again:

4.9.1:
- the printf (line 62/122) is stepped on in each loop iteration. This is the jmp insn in the disassembly that jumps to the beginning of the loop; the insns for the printf start on the next insn.
- if the condition is true in line 25/85, the break; in line 26/86 is skipped and it stops on line 36/96, which is the next statement after the break;
- upon loop exit, after stepping over the final printf (line 62/122), it jumps back to the while() in line 11/71, and only prints if this while() is stepped over, then goes to the closing brace in line 63/123

5.0:
- the printf (62/122) is not stepped on in each iteration, OK
- break; is not stepped on in line 25/85, as above; however, if the condition is true, it jumps to line 31/91, the closing brace of the loop, and then to line 36/96, the next statement. I think it should go immediately to the next statement. The exception would be to run dtors upon loop exit, but this is not the case now, as the variables are plain chars.
- the loop exit is OK, it doesn't jump back to the while() after the printf, and prints when the printf is stepped over.

If I put a breakpoint on line 26/86 in the debugger (the break; that is skipped), gdb says that it placed the breakpoints on the same lines. However, when single-stepping, it will break on line 28/88 if the condition is false. This is the statement after the if() w/ the break; If the condition is true, it won't stop on the breakpoint. This is true for both gcc versions tested.
[Bug c++/64615] Access level check error: g++ thinks the non default ctor is protected while its public
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64615 --- Comment #1 from petschy at gmail dot com --- Created attachment 34457 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34457&action=edit complete test case
[Bug c++/64615] New: Access level check error: g++ thinks the non default ctor is protected while its public
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64615 Bug ID: 64615 Summary: Access level check error: g++ thinks the non default ctor is protected while its public Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

The access level is changed from protected (base class) to public (derived class) via using-declarations. The default ctor and the two overloaded member functions are accessible, but the non-default ctor is not: g++ complains that it's protected. g++ 4.9 and 5.0 (20150115) give the same error messages.

cmdline: g++-5.0.0 -Wall -std=c++11 20150115-using_base_ctor.cpp

8<---8<---8<---8<---
class B
{
protected:
    B() { }
    B(int) { }
    void Foo() { }
    void Foo(int) { }
};

class D : public B
{
public:
    using B::B;
    using B::Foo;
};

void d_ctor()
{
    D d;
}

void d_ctor2()
{
    D d(0); // !
}

void d_foo(D* d)
{
    d->Foo();
}

void d_foo2(D* d)
{
    d->Foo(0);
}
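A note on the behaviour reported above: as far as I can tell from the C++11 rules on inheriting constructors, a using-declaration that names the base class constructors does not change their access; an inherited constructor keeps the access it has in the base class, whereas a using-declaration for an ordinary member function does take on the access of the place where it appears. `D d;` still works because D gets its own implicit default constructor. The following is a minimal sketch of a workaround under that assumption; the data member `v`, the return values and the forwarding constructor are hypothetical additions for illustration.

```cpp
#include <cassert>

class B {
protected:
    B() : v(0) { }
    B(int x) : v(x) { }
    int Foo() { return v; }
    int Foo(int d) { return v + d; }
private:
    int v;
};

class D : public B {
public:
    using B::Foo;                 // both overloads become public in D
    D() = default;                // D's own ctor; B() is reachable from inside D
    explicit D(int x) : B(x) { }  // public forwarding ctor instead of `using B::B`
};

void demo()
{
    D d0;
    D d1(5);                      // accepted: this ctor is declared public in D
    assert(d0.Foo() == 0);
    assert(d1.Foo(2) == 7);
}
```

The cost is one forwarding constructor per base constructor that should become public.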
[Bug c++/64596] Friendship not recognized and template param deduction error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64596 --- Comment #1 from petschy at gmail dot com --- Created attachment 3 -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=3action=edit code
[Bug c++/64596] New: Friendship not recognized and template param deduction error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64596 Bug ID: 64596 Summary: Friendship not recognized and template param deduction error Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

Explanation of the attached code: Str is a stream class, Txn is its helper, and thus needs access to Str's private members. There is also a Glue class that is supposed to glue together a stream implementation and the streamed type. This way the streamed type needs to grant friend access only to Glue, and need not / should not know about the streams. The stream might also grant friendship to Glue. For a specific stream type and streamed type, Glue should be specialized and perform the actual streaming there, optionally using the private members of the stream and the streamed type.

While implementing this I ran into two errors, and though I have a workaround, it's not clear to me whether the errors are due to some strange interaction of the language rules or to bugs in the compiler.

Plan A: the Txn class is separate from Str. Str wants to grant friendship to Txn, but with the same E type only.

template<bool R> friend class Txn<E,R>;

won't work, as the compiler sees this as a specialization. To work around this I used a templated using directive:

template<bool R> using Tx = Txn<E,R>;

and granted friendship to Tx. Unfortunately this doesn't work: when Txn tries to access the private member i, an error is reported.

Plan B: move Txn inside StrB. This solves the access problem, but then there's a new error: the Glue specialization won't compile, telling me that the template params cannot be deduced. This is strange, as the syntax is the same as in Plan A.

Plan C: move Txn inside StrC, but create TxnC, a simple forwarder class, outside, too. This way the access works, since Txn is inside, and the Glue specialization works since we glue the outside class, TxnC. However, this is tedious with real code.
Tested with 4.9.1 and 5.0 (20141222), both give the same errors: g++ -std=c++11 -Wall 20150114-friend.cpp 20150114-friend.cpp:81:8: error: template parameters not deducible in partial specialization: struct GlueStrBTxnE, R, int // error: template parameters not deducible in partial specialization ^ 20150114-friend.cpp:81:8: note: ‘E’ 20150114-friend.cpp: In instantiation of ‘TxnE, R::Txn(TxnE, R::S) [with E = int; bool R = false; TxnE, R::S = Strint]’: 20150114-friend.cpp:45:38: required from ‘static void GlueTxnE, R, int::Foo() [with E = int; bool R = false]’ 20150114-friend.cpp:52:30: required from here 20150114-friend.cpp:17:6: error: ‘int Strint::i’ is private int i; ^ 20150114-friend.cpp:28:3: error: within this context ++s.i; // error: ‘int Strint::i’ is private ^
[Bug c++/64446] New: Misleading error message when inheriting from a template class w/o the template params
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64446 Bug ID: 64446 Summary: Misleading error message when inheriting from a template class w/o the template params Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

When compiling the code below, g++ gives a misleading error message:

$ g++-5.0.0 -Wall 20141230-templ_base.cpp
20141230-templ_base.cpp:7:1: error: expected class-name before ‘{’ token

'Base' is definitely a valid class name. The problem is that the name given is a class, but it's a template and the template argument is missing. However, in the second case, when inheriting from Base2 and only one of the two template arguments is given, the error message is OK: wrong number of template arguments (1, should be 2). Something similar would be desirable in the first case, so as not to waste time staring at the screen, searching for a typo in the class name and finding none. Like 'classname is a template but no template arguments are given'.

$ g++-5.0.0 -v
Using built-in specs.
COLLECT_GCC=g++-5.0.0
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.0.0
Thread model: posix
gcc version 5.0.0 20141222 (experimental) (GCC)

4.9 and 4.8 give the same misleading error message.

8<---8<---8<---8<---
template<typename T>
struct Base
{
};

struct Foo : Base
{ // error: expected class-name before ‘{’ token
};

template<typename T, typename U>
struct Base2
{
};

struct Foo2 : Base2<int> // OK: wrong number of template arguments (1, should be 2)
{
};
[Bug c++/64446] Misleading error message when inheriting from a template class w/o the template params
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64446 --- Comment #1 from petschy at gmail dot com --- One subtlety:

template<typename T=void>
struct Base3
{
};

struct Foo3 : Base3
{
};

In this case complaining about missing template params is probably inappropriate, since Base3<> is perfectly valid. So on second thought, the error should be about the missing <> after the class name.
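For reference, a minimal sketch of the accepted spellings for the cases discussed above: a base that is a class template has to be named with an argument list, which may be empty when every parameter has a default. The static_asserts are only there to make the sketch self-checking.

```cpp
#include <type_traits>

template<typename T>        struct Base  { };
template<typename T = void> struct Base3 { };

struct Foo  : Base<int> { };   // argument list supplied explicitly
struct Foo3 : Base3<>   { };   // empty list selects the default argument

static_assert(std::is_base_of<Base<int>, Foo>::value,
              "Foo derives from Base<int>");
static_assert(std::is_base_of<Base3<void>, Foo3>::value,
              "Base3<> names the same type as Base3<void>");
```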
[Bug c++/64380] New: Missed optimization: smarter dead store elimination in dtors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64380 Bug ID: 64380 Summary: Missed optimization: smarter dead store elimination in dtors Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Some of the stores are eliminated in dtors already, if the values are not used later. But if there are function calls after the stores, then they are not eliminated. I see the reason for this, but some functions are special, eg free(), operator delete and others with the same semantics: they won't crawl back and access these variables, so if the vars are not used locally, and no other functions are called, the stores could be eliminated. This would be useful eg for classes where there is a user callable function that releases some/all resources, while keeping the instance alive, and the dtor calls the same function to release all resources. In this latter case, stores that are otherwise needed to have a proper state can be omitted since the instance is being destroyed, anyway. This is a minor issue probably, since the program shouldn't spend most of its time running dtors. However, some function attribute symmetric in spirit to 'malloc' would be nice: eg 'free': this would mean that if called, it won't reach back to the variables of the calling scope, either through its arguments or through global variables, so those stores could be safely eliminated that are otherwise dead. g++-5.0.0 -v Using built-in specs. 
COLLECT_GCC=g++-5.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.0.0 Thread model: posix gcc version 5.0.0 20141222 (experimental) (GCC) Compiled with g++-5.0.0 -g -O3 -Wall 20141222-dtor-deadstore.cpp Dump of assembler code for function test_ra(Foo*): 0x004005f0 +0: push %rbp 0x004005f1 +1: push %rbx 0x004005f2 +2: mov%rdi,%rbp 0x004005f5 +5: sub$0x8,%rsp # two stores before the loop 0x004005f9 +9: movl $0x1,0x10(%rdi) 0x00400600 +16:movl $0x2,0x14(%rdi) 0x00400607 +23:mov0x8(%rdi),%rdi 0x0040060b +27:test %rdi,%rdi 0x0040060e +30:je 0x400620 test_ra(Foo*)+48 0x00400610 +32:mov(%rdi),%rbx 0x00400613 +35:callq 0x4004d0 _ZdlPv@plt 0x00400618 +40:test %rbx,%rbx 0x0040061b +43:mov%rbx,%rdi 0x0040061e +46:jne0x400610 test_ra(Foo*)+32 # two stores after the loop, so far so good 0x00400620 +48:movq $0x0,0x8(%rbp) 0x00400628 +56:movl $0x3,0x18(%rbp) 0x0040062f +63:add$0x8,%rsp 0x00400633 +67:pop%rbx 0x00400634 +68:pop%rbp 0x00400635 +69:retq Dump of assembler code for function test_dtor(Foo*): # two stores before the loop in the dtor. 
these won't ever be read again # could be eliminated 0x00400640 +0: movl $0x1,0x10(%rdi) 0x00400647 +7: movl $0x2,0x14(%rdi) 0x0040064e +14:mov0x8(%rdi),%rdi 0x00400652 +18:test %rdi,%rdi 0x00400655 +21:je 0x400671 test_dtor(Foo*)+49 0x00400657 +23:push %rbx 0x00400658 +24:nopl 0x0(%rax,%rax,1) 0x00400660 +32:mov(%rdi),%rbx 0x00400663 +35:callq 0x4004d0 _ZdlPv@plt 0x00400668 +40:test %rbx,%rbx 0x0040066b +43:mov%rbx,%rdi 0x0040066e +46:jne0x400660 test_dtor(Foo*)+32 # no stores here, the ones after 'delete' were eliminated successfully 0x00400670 +48:pop%rbx 0x00400671 +49:repz retq Dump of assembler code for function test_dtor2(Foo*): 0x00400680 +0: push %rbp 0x00400681 +1: push %rbx 0x00400682 +2: mov%rdi,%rbp 0x00400685 +5: sub$0x8,%rsp # 4 dead stores in the src, the one to the same addr is eliminated 0x00400689 +9: movl $0xc0,(%rdi) 0x0040068f +15:movl $0x1,0x10(%rdi) 0x00400696 +22:movl $0x2,0x14(%rdi) 0x0040069d +29:mov0x8(%rdi),%rdi 0x004006a1 +33:test %rdi,%rdi 0x004006a4 +36:je 0x4006c0 test_dtor2(Foo*)+64 0x004006a6 +38:nopw %cs:0x0(%rax,%rax,1) 0x004006b0 +48:mov(%rdi),%rbx 0x004006b3 +51:callq 0x4004d0 _ZdlPv@plt 0x004006b8 +56:test %rbx,%rbx 0x004006bb +59:mov%rbx,%rdi 0x004006be +62:jne
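The pattern described in the report can be sketched as follows. This is not the attached test case (which is not shown here) but a hypothetical reconstruction guided by the disassembly: a user-callable Release() that must leave the object in a valid state, and a destructor that calls the same function, at which point the state updates are dead stores. All member names are assumptions.

```cpp
#include <cassert>

struct Node { Node* next; };

struct Foo {
    Node* head = nullptr;
    int a = 0, b = 0, c = 0;

    Foo() = default;
    Foo(const Foo&) = delete;            // owning type, no copies
    Foo& operator=(const Foo&) = delete;

    ~Foo() { Release(); }                // every store to a, b, c, head is dead here

    void Push() { head = new Node{head}; }

    // User-callable: frees the nodes but keeps the object usable.
    void Release()
    {
        a = 1;                           // needed when the object lives on
        b = 2;
        for (Node* n = head; n; ) {
            Node* next = n->next;
            delete n;                    // operator delete does not read a, b, c
            n = next;
        }
        head = nullptr;
        c = 3;
    }
};

void demo()
{
    Foo f;
    f.Push();
    f.Push();
    f.Release();                         // stores are observable afterwards
    assert(f.head == nullptr && f.a == 1 && f.b == 2 && f.c == 3);
    f.Push();                            // remaining node is freed by ~Foo()
}
```

In the destructor case only the stores before the `delete` loop survive in the generated code above, because the compiler cannot prove that the deallocation function leaves those members alone.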
[Bug c++/64191] New: -march=native messes up dead code elimination in loop calling dtor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64191 Bug ID: 64191 Summary: -march=native messes up dead code elimination in loop calling dtor Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Without -march=native, the loops in the 3 fns are eliminated as expected, resulting in single retq's. With -march=native, the loop which calls the defined, but empty dtor is compiled into something rather weird. However, the other empty Nop() call is optimized away as expected. g++-5.0.0 -g -O3 -Wall -Wextra -c 20141205-dtor_loop.cpp g++-5.0.0 -g -O3 -Wall -Wextra -o 20141205-dtor_loop 20141205-dtor_loop.o Dump of assembler code for function foo_dtor_loop(Foo*, unsigned int): 0x00400570 +0:repz retq Dump of assembler code for function bar_dtor_loop(Bar*, unsigned int): 0x00400580 +0:repz retq Dump of assembler code for function bar_nop_loop(Bar*, unsigned int): 0x00400590 +0:repz retq So far so good. 
g++-5.0.0 -g -O3 -march=native -Wall -Wextra -c 20141205-dtor_loop.cpp g++-5.0.0 -g -O3 -march=native -Wall -Wextra -o 20141205-dtor_loop 20141205-dtor_loop.o Dump of assembler code for function foo_dtor_loop(Foo*, unsigned int): 0x00400570 +0:retq Dump of assembler code for function bar_dtor_loop(Bar*, unsigned int): 0x00400578 +0: test %rdi,%rdi 0x0040057b +3: je 0x4005b8 bar_dtor_loop(Bar*, unsigned int)+64 0x0040057d +5: mov%esi,%esi 0x0040057f +7: lea(%rdi,%rsi,4),%rax 0x00400583 +11:cmp%rax,%rdi 0x00400586 +14:jae0x4005b8 bar_dtor_loop(Bar*, unsigned int)+64 0x00400588 +16:mov$0x3,%edx 0x0040058d +21:lea-0x4(%rax),%rsi 0x00400591 +25:sub%rdi,%rdx 0x00400594 +28:add%rsi,%rdx 0x00400597 +31:mov%rdx,%rcx 0x0040059a +34:shr$0x2,%rcx 0x0040059e +38:lea0x1(%rcx),%r8 0x004005a2 +42:dec%rcx 0x004005a5 +45:shr%rcx 0x004005a8 +48:lea0x2(%rcx,%rcx,1),%rcx 0x004005ad +53:cmp$0x2f,%rdx 0x004005b1 +57:jbe0x4005b8 bar_dtor_loop(Bar*, unsigned int)+64 0x004005b3 +59:cmp%rcx,%r8 0x004005b6 +62:je 0x4005b8 bar_dtor_loop(Bar*, unsigned int)+64 0x004005b8 +64:retq Dump of assembler code for function bar_nop_loop(Bar*, unsigned int): 0x004005c0 +0: retq The bar_dtor_loop() fn is clearly a mess, unfortunately I can't follow the computation. The bar_inc_loop() does a single int increment on each object, to see what loop code is generated if not empty fns are called. It is as expected: the loop is unrolled 16x times, and the residual part is executed in a tight loop: 0x00400648 +120:sub$0x4,%rdx 0x0040064c +124:incl (%rdx) 0x0040064e +126:cmp%rdx,%rdi 0x00400651 +129:jb 0x400648 bar_inc_loop(Bar*, unsigned int)+120 g++-5.0.0 -v Using built-in specs. 
COLLECT_GCC=g++-5.0.0 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ./configure --enable-languages=c,c++ --disable-multilib --program-suffix=-5.0.0 Thread model: posix gcc version 5.0.0 20141027 (experimental) (GCC) cat /proc/cpuinfo processor: 0 vendor_id: AuthenticAMD cpu family: 21 model: 1 model name: AMD FX(tm)-8150 Eight-Core Processor stepping: 2 microcode: 0x6000626 cpu MHz: 1400.000 cache size: 2048 KB physical id: 0 siblings: 8 core id: 0 cpu cores: 4 apicid: 16 initial apicid: 0 fpu: yes fpu_exception: yes cpuid level: 13 wp: yes flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bugs: fxsave_leak bogomips: 7624.63 TLB size: 1536 4K pages clflush size: 64 cache_alignment: 64 address sizes: 48 bits physical, 48 bits virtual power management: ts ttp tm 100mhzsteps hwpstate cpb Unfortunately, I couldn't test with the latest version since the build fails
[Bug c++/63657] [4.9/5 regression] -Wunused-variable: warning supressed by virtual dtor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63657 petschy at gmail dot com changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|DUPLICATE |--- --- Comment #5 from petschy at gmail dot com --- To further prove my case, here is the disassembly of foo() and bar(). As expected, they are identical, no ctors/dtors are ever called, so there can be no side effects. Dump of assembler code for function foo(): 0x00400766 +0: push %rbp 0x00400767 +1: mov%rsp,%rbp 0x0040076a +4: sub$0x10,%rsp 0x0040076e +8: callq 0x400797 getfoo() 0x00400773 +13:mov%rax,-0x8(%rbp) 0x00400777 +17:leaveq 0x00400778 +18:retq End of assembler dump. Dump of assembler code for function bar(): 0x00400779 +0: push %rbp 0x0040077a +1: mov%rsp,%rbp 0x0040077d +4: sub$0x10,%rsp 0x00400781 +8: callq 0x4007a2 getbar() 0x00400786 +13:mov%rax,-0x8(%rbp) 0x0040078a +17:leaveq 0x0040078b +18:retq End of assembler dump.
[Bug c++/63657] [4.9/5 regression] -Wunused-variable: warning supressed by virtual dtor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63657 --- Comment #4 from petschy at gmail dot com --- Sorry, but this is definitely not the same issue. Bug 38958 is about returning by value and binding to a reference. This issue is about returning a REFERENCE and binding it to a reference. No class ctor/dtor/copy is involved, references are just syntactic sugar on pointers: a single pointer is returned and stored locally. This is 100% sure, since in the production code the bug appeared with a singleton which is not constructible/destructible/copyable to the outside world. Please reopen the issue.
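The distinction drawn above can be checked with a small sketch. It assumes a singleton-style class similar to the one mentioned for the production code; the construction counter and the `instance()` and `demo()` names are hypothetical and only serve to show that binding a returned reference creates no object.

```cpp
#include <cassert>

class Bar {
public:
    static Bar& instance()
    {
        static Bar b;              // constructed once, on first use
        return b;
    }
    static int constructed;        // counts constructor runs
    virtual ~Bar() { }
    Bar(const Bar&) = delete;      // not copyable from the outside
    Bar& operator=(const Bar&) = delete;
private:
    Bar() { ++constructed; }
};
int Bar::constructed = 0;

Bar& getbar() { return Bar::instance(); }

void demo()
{
    Bar& b1 = getbar();            // binds to the singleton, no temporary involved
    int after_first = Bar::constructed;
    Bar& b2 = getbar();            // again only an address is handed over
    assert(&b1 == &b2);
    assert(Bar::constructed == after_first);
    assert(after_first == 1);
}
```

Since no constructor or destructor runs for `b1` or `b2`, a reference like this that is never used has no side effects, even when the class has a virtual destructor.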
[Bug c++/63657] [4.9 regression] -Wunused-variable: warning supressed by virtual dtor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63657 --- Comment #1 from petschy at gmail dot com --- Bisected the regression: commit a8b52ce38f3056e464457ba1e95efa25a8f08d07 Author: paolo paolo@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Wed Jun 12 21:36:36 2013 + /cp 2013-06-12 Paolo Carlini paolo.carl...@oracle.com PR c++/38958 * decl.c (poplevel): For the benefit of -Wunused-variable see through references. /testsuite 2013-06-12 Paolo Carlini paolo.carl...@oracle.com PR c++/38958 * g++.dg/warn/Wunused-var-20.C: New. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@200042 138bc75d-0d04-0410-961f-82ee72b054a4
[Bug c++/63657] New: [4.9 regression] -Wunused-variable: warning supressed by virtual dtor fn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63657 Bug ID: 63657 Summary: [4.9 regression] -Wunused-variable: warning supressed by virtual dtor fn Product: gcc Version: 4.9.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

The code below has two unused variables, which are references to classes. We should have two warnings; however, 4.9.1 and 5.0 trunk give just one. 4.7.2 and 4.8.3 are OK. The second warning is suppressed by the virtual dtor in Bar. Only the dtor does the trick: if I comment it out, or define a plain virtual fn instead, the warning appears.

g++-4.8 -Wunused-variable -c 20141022-unused_warn.cpp
20141022-unused_warn.cpp: In function ‘void foo()’:
20141022-unused_warn.cpp:7:7: warning: unused variable ‘f’ [-Wunused-variable]
 Foo& f = getfoo();
      ^
20141022-unused_warn.cpp: In function ‘void bar()’:
20141022-unused_warn.cpp:18:7: warning: unused variable ‘b’ [-Wunused-variable]
 Bar& b = getbar();
      ^

g++-5.0.0 -Wunused-variable -c 20141022-unused_warn.cpp
20141022-unused_warn.cpp: In function ‘void foo()’:
20141022-unused_warn.cpp:7:7: warning: unused variable ‘f’ [-Wunused-variable]
 Foo& f = getfoo();
      ^

8<---8<---8<---
class Foo
{
};

Foo& getfoo();

void foo()
{
    Foo& f = getfoo();
}

class Bar
{
    virtual ~Bar() {}
};

Bar& getbar();

void bar()
{
    Bar& b = getbar();
}
[Bug c/28901] -Wunused-variable ignores unused const initialised variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28901 petschy at gmail dot com changed: What|Removed |Added CC||petschy at gmail dot com --- Comment #7 from petschy at gmail dot com --- For backward compatibility, wouldn't it be better to add a new flag, -Wunused-static-const-variable, instead of changing the behaviour of -Wunused-variable?
[Bug c++/62062] New: Missed optimization: write ptr reloaded in each loop iteration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62062 Bug ID: 62062 Summary: Missed optimization: write ptr reloaded in each loop iteration Product: gcc Version: 4.9.2 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com When developing a binary i/o lib, I ran into some performance degradation in the writer functions. My investigation revealed that the write pointer was loaded/stored in each loop iteration. Although this can be dodged by hand-tuning the code via local variables kept in registers, the resulting code is longer, less clear, harder to maintain, etc. For this report, I recompiled w/ 4.9.2, but earlier versions in 4.x give the same results. The test box is an AMD FX w/ Debian Jessie. gcc compiled from git commit f964b16: g++-4.9.2 -v Using built-in specs. COLLECT_GCC=g++-4.9.2 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ./configure --enable-languages=c,c++ --disable-multilib --program-suffix=-4.9.2 Thread model: posix gcc version 4.9.2 20140808 (prerelease) (GCC) compiler/linker flags used: g++-4.9.2 -c 20140725-reg_vs_mem.cpp -g -std=c++11 -Wall -Wextra -Werror -Wundef -Wshadow -O3 -fno-tree-vectorize g++-4.9.2 -o 20140725-reg_vs_mem 20140725-reg_vs_mem.o -g Tried with less optimization, too, but made no difference. -fno-tree-vectorize was used because otherwise the code generated for the encoder functions used the vector registers, which resulted in serious code size bloat and ~2-3x runtimes, opposed to the ~1.5x runtime increase due to the redundant loads/stores. For the tests, I made two versions for each function: one that expects a ptr ref (baseline version), and another one that expects a buffer object (w/ beg/end ptrs for advanced functionality, which is not used here). As expected, the same code is generated for these two. 
I ruled out the aliasing rules first: I use int ptrs in this test, so that the 'char* aliases everything' rule is dodged. To prove my case: write_run_char_ptr_ref() loads/stores the char ptr in each loop, since the character written invalidates the pointer itself: 0x00400570 +0: sub$0x1,%edx 0x00400573 +3: js 0x40058d write_run_char_ptr_ref(char*, int, int)+29 0x00400575 +5: nopl (%rax) 0x00400578 +8: mov(%rdi),%rax # loop beg, load ptr 0x0040057b +11:sub$0x1,%edx 0x0040057e +14:cmp$0x,%edx 0x00400581 +17:lea0x1(%rax),%rcx 0x00400585 +21:mov%rcx,(%rdi) # store updated ptr 0x00400588 +24:mov%sil,(%rax) 0x0040058b +27:jne0x400578 write_run_char_ptr_ref(char*, int, int)+8 0x0040058d +29:repz retq write_run_char_ptr_ref_unaliased() keeps the ptr in a register, but this was achieved w/ some platform dependent asm trickery, see unaliased_storeb() [btw, some platform independent builtin would be nice for this], I use this in the production code, which writes bytes at the lowest level. 0x00400590 +0: test %edx,%edx 0x00400592 +2: jle0x4005af write_run_char_ptr_ref_unaliased(char*, int, int)+31 0x00400594 +4: mov(%rdi),%rax # load ptr 0x00400597 +7: sub$0x1,%edx 0x0040059a +10:lea0x1(%rax,%rdx,1),%rdx # end ptr 0x0040059f +15:nop 0x004005a0 +16:mov%sil,(%rax) # loop body, no ptr load/store 0x004005a3 +19:add$0x1,%rax 0x004005a7 +23:cmp%rdx,%rax 0x004005aa +26:jne0x4005a0 write_run_char_ptr_ref_unaliased(char*, int, int)+16 0x004005ac +28:mov%rax,(%rdi) # store ptr after the loop 0x004005af +31:repz retq write_run_ptr_ref() uses int ptr, and the ptr is kept in a register for the loop, without any trickery. The disassembly is the same as write_run_char_ptr_ref_unaliased() above, except ints are written. write_run_buf() is exactly the same. So far so good. The next step is a variable length encoder, encode_ptr_ref(). This is not a real working encoding, just for demonstration. 
Since the probabilities are implied from the encoding lengths, the if conditions are peppered with builtin_expect's. Again, the ptr ref and buf object versions are exactly the same. However, I noticed that the ptr is written back multiple times, depending on which 'if' becomes true. For the most likely case, it doesn't matter: only one write is performed, but before the conditional jump. For the last two less likely cases, two redundant writes are performed. Moreover, for the 3rd 'if' block the actual useful ptr write
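The hand-tuning mentioned in the report (keeping the write pointer in a local variable) can be sketched as below. This is an illustration under assumptions, not the attached source: the signature of write_run_char_ptr_ref() is inferred from the disassembly, and write_run_char_local() is a hypothetical name.

```cpp
#include <cassert>

// Baseline: the pointer is updated through the reference on every iteration.
// With char stores the compiler may have to assume the store can modify the
// pointer object itself, hence the load/store per iteration seen above.
void write_run_char_ptr_ref(char*& p, int v, int n)
{
    while (n-- > 0)
        *p++ = static_cast<char>(v);
}

// Hand-tuned: work on a local copy and write the pointer back once.
void write_run_char_local(char*& p, int v, int n)
{
    char* q = p;                   // local copy, can stay in a register
    for (int i = 0; i < n; ++i)
        *q++ = static_cast<char>(v);
    p = q;                         // single store after the loop
}

void demo()
{
    char buf[8] = {};
    char* p = buf;
    write_run_char_ptr_ref(p, 'x', 3);
    assert(p == buf + 3);
    write_run_char_local(p, 'y', 2);
    assert(p == buf + 5);
    assert(buf[0] == 'x' && buf[2] == 'x');
    assert(buf[3] == 'y' && buf[4] == 'y' && buf[5] == 0);
}
```

Both functions produce the same bytes; the second form just spells out by hand what the report would like the optimizer to do wherever aliasing rules permit it.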
[Bug c++/62062] Missed optimization: write ptr reloaded in each loop iteration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62062 --- Comment #1 from petschy at gmail dot com --- Created attachment 33274 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33274&action=edit source
[Bug c++/62062] Missed optimization: write ptr reloaded in each loop iteration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62062 --- Comment #3 from petschy at gmail dot com --- (In reply to Andrew Pinski from comment #2) Your inline asm is broken. How? In unaliased_storeb()? That's for only proving that the redundant loads/stores are not an artifact of aliasing, and is not used by the functions that write to int*'s.
[Bug c++/60625] New: attributes on template member function definition inside class definition not supported
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60625 Bug ID: 60625 Summary: attributes on template member function definition inside class definition not supported Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

struct Foo
{
    template<int U>
    // error: attributes are not allowed on a function-definition
    static int Bar() __attribute__((always_inline))
    {
        return U;
    }

    // no error, although this is a fn definition, too
    static int Baz(bool x) __attribute__((always_inline))
    {
        return x ? Bar<5>() : Bar<42>();
    }
};

g++ -c 20140323-force_inline.cpp
20140323-force_inline.cpp:5:20: error: attributes are not allowed on a function-definition
 static int Bar() __attribute__((always_inline))

Tried with all minor versions from 4.4 to 4.9, same results.

Supporting attributes on in-class defined template members would be really useful, or rather, convenient. Otherwise, one has to declare the function with the attribute inside the class, then define it outside the class. For short and simple functions this is rather tedious.

Regards, Peter
[Bug c++/60625] attributes on template member function definition inside class definition not supported
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60625 --- Comment #3 from petschy at gmail dot com --- Thanks. It's then an inconsistency, right? Because the non-template fn def didn't trigger the error while the template version did so. Moreover, the error message is misleading, because it said attributes were not allowed, while they are allowed, just not at the end.
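Following up on the point that attributes are allowed, just not at the end of a definition: a minimal sketch of the two placements that should be accepted, assuming GCC's attribute syntax where an attribute may appear among the declaration specifiers. `Qux`, `pick` and `demo` are hypothetical names added for the example.

```cpp
#include <cassert>

struct Foo {
    // Attribute placed before the declarator: usable on an in-class
    // definition of a member template.
    template<int U>
    __attribute__((always_inline)) static int Bar() { return U; }

    // Alternative from the report: trailing attribute on a declaration,
    // with the definition given outside the class.
    template<int U>
    static int Qux() __attribute__((always_inline));
};

template<int U>
inline int Foo::Qux() { return U; }

int pick(bool x) { return x ? Foo::Bar<5>() : Foo::Qux<42>(); }

void demo()
{
    assert(pick(true) == 5);
    assert(pick(false) == 42);
}
```

The first form keeps short functions in one place, which is what the report asks for; the second is the more verbose route it describes as tedious.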
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #7 from petschy at gmail dot com --- Is it a plausible assumption that if a function is not marked as 'noreturn' and the loop doesn't change the program's state then the loop could be optimized away? Cases: - the loop terminates, but the state is not changed, NOP - the loop does not terminate (in this case a cycle of the Node's), but the function should return (no noreturn attr), so this is probably a bug in the prg I can't think of any cases right now for the second point where that would be the desired behaviour of the program, instead of a bug. Please comment on this.
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #10 from petschy at gmail dot com --- Thanks for the explanation. The multithreaded argument is sound, but then, on second thought, even in single-threaded programs the state might be altered by a signal handler, or by another process if the memory is shared, so the optimization might break the program. The bottom line is that the compiler doesn't have enough information, so it must be conservative, hence the loop stays in.

Adding a new fn attribute probably wouldn't be enough, since in general there might be more than one potentially infinite loop inside the fn, with different semantics, so optimizing all of them away could still be improper. Hence the attribute should apply to a given loop only ('finite'), but implementing it is probably too much trouble for this rare case, and the compiler still needs to eliminate the recursion, too, which might be more complex, eg multiple functions calling each other in the general case.

For my specific case, I can solve the problem by providing a trait for the allocator which says 'free() is NOP, so don't bother'; then the top-level function can decide what to do: traverse and free, or do nothing.

Mikael: could you please provide some info on the ptrace() wizardry you mentioned, if it's not confidential? I got curious.

Based on the discussion so far, do you think that clang is overly smart in this case, producing potentially broken code? free_all2() was compiled into a single ret, and the other two functions lack the recursion and only have the node traversal of the current level, which seems to be an error to me, because if there is an infinite loop on one of the levels, the program's behaviour will be different when compiled with optimizations. If I set n_->down to null before the recursive call, it generated the expected code, which is interesting, since then the loop is more likely on the 'finite side'.

Thanks, Peter
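The allocator trait idea mentioned above can be sketched as follows. This is a minimal illustration, not the code from the attached test case: the Node layout, the two allocator types, the `free_is_nop` constant and the helper names are all assumptions.

```cpp
#include <cassert>
#include <cstdlib>

struct Node { Node* next; Node* down; };

// Hypothetical allocators: one frees each node, the other releases its
// whole pool at once, so individual frees do nothing.
struct MallocAlloc {
    static constexpr bool free_is_nop = false;
    int freed = 0;
    void free(Node* n) { std::free(n); ++freed; }
};
struct PoolAlloc {
    static constexpr bool free_is_nop = true;
    void free(Node*) { }
};

// Recursive traversal: siblings via next, children via down.
template<typename Alloc>
void free_nodes(Alloc& a, Node* n)
{
    while (n) {
        Node* next = n->next;
        free_nodes(a, n->down);
        a.free(n);
        n = next;
    }
}

// Top-level entry: skip the traversal entirely when freeing is a no-op.
template<typename Alloc>
void free_all(Alloc& a, Node* root)
{
    if (!Alloc::free_is_nop)
        free_nodes(a, root);
}

Node* make(Node* next, Node* down)
{
    Node* n = static_cast<Node*>(std::malloc(sizeof(Node)));
    assert(n != nullptr);
    n->next = next;
    n->down = down;
    return n;
}

void demo()
{
    MallocAlloc m;
    Node* root = make(make(nullptr, nullptr), make(nullptr, nullptr));
    free_all(m, root);
    assert(m.freed == 3);

    PoolAlloc p;
    Node cyc;                      // a cycle: traversing it would never end
    cyc.next = &cyc;
    cyc.down = nullptr;
    free_all(p, &cyc);             // returns at once, no traversal
}
```

Making the decision explicit in the source sidesteps the question of whether the compiler may remove a possibly non-terminating traversal on its own.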
[Bug tree-optimization/57723] New: Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 Bug ID: 57723 Summary: Missed optimization: recursion around empty function Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com Background: freeing nodes of a tree allocated with custom allocators. One of the allocators can't free individual pointers, so free() is NOP in that case (the whole pool will be freed at once when the allocator is destroyed). With this allocator, the whole recursive traversal can be eliminated in theory. Examining the disasm of the generated code revealed that gcc unfolds the recursion many levels, just to do the unneeded node traversal; the actual call to the empty free() fn is eliminated. In the test case, loop() does a simple linear traversal of the linked nodes. The pointers are not volatile, and are only read, so there should not be any side effects. Why can't the compiler optimize away the whole loop? Clang does a somewhat better job, the recursion is optimized away, and one function is completely reduced to NOP (free_all2()), but the others still have the node traversal loop. Tried with gcc 4.6, 4.7.3, 4.9.0 with the same results. g++-4.9.0 -v: Using built-in specs. COLLECT_GCC=g++-4.9.0 COLLECT_LTO_WRAPPER=/home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ./configure --enable-languages=c,c++ --program-suffix=-4.9.0 Thread model: posix gcc version 4.9.0 20130626 (experimental) (GCC) commit 944f42fc29289812416f34d7b0c497ee79065396 command line: g++-4.9.0 -std=c++11 -O3 -Wall 20130626-free_node.cpp Regards, Peter
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #1 from petschy at gmail dot com --- Created attachment 30365 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30365&action=edit test case source
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #2 from petschy at gmail dot com --- Created attachment 30366 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30366&action=edit gcc amd64 disassembly
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #3 from petschy at gmail dot com --- Created attachment 30367 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30367&action=edit clang amd64 disassembly
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #4 from petschy at gmail dot com --- Ooops, the test case won't perform the freeing completely, since the recursive call is not inside the 'down' traversal loop, so only the first node on the given level would be recursively freed, but this doesn't affect the missed optimization.
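The mistake described in this comment can be illustrated with a small sketch. The attached sources are not part of this archive, so the Node layout and both functions below are hypothetical; they only count visited nodes instead of freeing them, to make the difference observable.

```cpp
#include <cassert>

// Hypothetical layout: 'next' links the nodes of one level,
// 'down' points to the first node of the level below.
struct Node {
    Node* next = nullptr;
    Node* down = nullptr;
};

// Recursion placed outside the per-level loop: only the subtree of the
// first node on each level is visited.
int count_buggy(const Node* level) {
    int visited = 0;
    for (const Node* n = level; n; n = n->next) ++visited;
    if (level && level->down) visited += count_buggy(level->down);
    return visited;
}

// Recursion placed inside the loop: every node's subtree is visited.
int count_fixed(const Node* level) {
    int visited = 0;
    for (const Node* n = level; n; n = n->next) {
        ++visited;
        visited += count_fixed(n->down);
    }
    return visited;
}
```

For a level with two nodes that each have one child, the first variant reaches three nodes and the second reaches all four.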
[Bug tree-optimization/57723] Missed optimization: recursion around empty function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57723 --- Comment #5 from petschy at gmail dot com --- Created attachment 30368 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30368&action=edit fixed test case (correct recursive traversal)
[Bug tree-optimization/57236] Missed optimization: weird pointer update after the loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57236 --- Comment #5 from petschy at gmail dot com ---
I spotted a minor size inefficiency in the code:

0x00400720 <+32>:   jle    0x4007c5 <_Z6write2R6Streamj+197>
0x00400726 <+38>:   mov    %rdx,%rsi
0x00400729 <+41>:   test   %r12b,%r12b
...
0x004007c5 <+197>:  mov    (%rbx),%rsi
0x004007c8 <+200>:  mov    %rbx,%rdi
0x004007cb <+203>:  callq  0x4005d0 <_ZN6Stream5WriteEPhS0_>
0x004007d0 <+208>:  mov    (%rbx),%rdx
0x004007d3 <+211>:  mov    %rdx,0x8(%rbx)
0x004007d7 <+215>:  mov    %rdx,%rsi
0x004007da <+218>:  jmpq   0x400729 <_Z6write2R6Streamj+41>

The mov at +215 would not be needed if the jump went to +38 instead of +41.
[Bug tree-optimization/57236] New: Missed optimization: weird pointer update after the loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57236 Bug ID: 57236 Summary: Missed optimization: weird pointer update after the loop Product: gcc Version: 4.8.1 Status: UNCONFIRMED Severity: minor Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

In short: In a loop, I write to and increment a pointer in each iteration. Then, after the loop, I write to the pointer once more. I noticed that after the loop, instead of using the pointer and writing to it, the generated code calculates the resulting pointer from the loop count and the value of the pointer before the loop. This is clearly unneeded work.

In detail: The attached code was reduced from a variable-length integer I/O lib, using binary streams. Bytes go into a buffer in the stream class, and get flushed when the buffer is full. Stream::WriteU8() writes a single byte: it checks whether there is room in the buffer, flushes if needed and stores the byte. Calling this fn from loops is inefficient, as the flush check is performed on each call. To work around this, there is the Txn class which allows the flush check to be amortized: it checks only once in the ctor, and the stream buffer ptr is updated in the dtor with the number of bytes written.

write1() uses Stream::WriteU8() directly, the buffer pointer gets loaded/stored on each iteration. I tried to Ensure() the needed number of bytes before the loop, but that didn't eliminate the loads/stores. I guess that this is too hard to track in the optimizer, though I'd be interested to read comments about this.

write2() is the same, except Stream::Txn is used. The pointer is loaded once, written to and incremented, so far so good. But when the loop exits, comes the weird part.
The generated code looks like:

0x0040078d <+141>:  mov    %rsi,%rdi
0x00400790 <+144>:  mov    %r12d,%ecx
0x00400793 <+147>:  mov    %ebp,%r8d
0x00400796 <+150>:  add    $0x1,%rdi
0x0040079a <+154>:  shr    %cl,%r8d
0x0040079d <+157>:  and    $0x7f,%r8d
0x004007a1 <+161>:  mov    %r8b,-0x1(%rdi)
0x004007a5 <+165>:  sub    $0x7,%cl
0x004007a8 <+168>:  jne    0x400793 <_Z6write2R6Streamj+147>

loop ends here, when cl is zero, then

0x004007aa <+170>:  mov    $0xffb7,%ecx
0x004007af <+175>:  mov    %r12d,%eax
0x004007b2 <+178>:  imul   %ecx,%eax
0x004007b5 <+181>:  sub    $0x1,%eax
0x004007b8 <+184>:  movzbl %al,%eax
0x004007bb <+187>:  lea    0x1(%rsi,%rax,1),%rsi

jump to the last WriteU8() after the loop

0x004007c0 <+192>:  jmpq   0x40072e <_Z6write2R6Streamj+46>

The source is:

while (UNLIKELY(b)) {
    txn.WriteU8((v >> b) & 0x7f);
    b -= 7;
}
txn.WriteU8(v | 0x80);

If I understand correctly, that imul calculates the pointer increment from the loop count, which is added to rsi (the pointer value before the loop). However, the pointer is readily available in rdi.

clang 3.4 generates the expected code: after the loop, it just jumps and uses rax for the final write:

0x00400723 <+179>:  mov    %rdx,%rax
0x00400726 <+182>:  mov    %bl,%cl
0x00400728 <+184>:  mov    %ebp,%esi
0x0040072a <+186>:  shr    %cl,%esi
0x0040072c <+188>:  and    $0x7f,%sil
0x00400730 <+192>:  mov    %sil,(%rax)
0x00400733 <+195>:  inc    %rax
0x00400736 <+198>:  add    $0xf9,%bl
0x00400739 <+201>:  jne    0x400726 <write2(Stream&, unsigned int)+182>
0x0040073b <+203>:  jmpq   0x40069f <write2(Stream&, unsigned int)+47>

For the full disasm dumps, see the attached files. I'm just guessing that this is a tree-optimization issue, please change the component if needed.

On a side note: are there any benefits of pre-incrementing the pointer and then writing to offset -1? The insn is 4 bytes instead of 3, does the better scheduling (?) justify the code size increment?

Versions tested:

g++-4.8.1 -v
Using built-in specs.
COLLECT_GCC=g++-4.8.1 COLLECT_LTO_WRAPPER=/home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.8.1/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ./configure --enable-languages=c,c++ --program-suffix=-4.8.1 Thread model: posix gcc version 4.8.1 20130427 (prerelease) (GCC) g++ -v Using built-in specs. COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-5' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
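For readers without the attachments, the Stream/Txn arrangement described in this report might be sketched as follows. This is a reconstruction under assumptions, not the attached source: the member names, the in-memory "flush" target, the way the initial shift b is computed and the five-byte reservation are all invented for illustration, and the UNLIKELY() hint is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class Stream {
public:
    explicit Stream(std::size_t capacity)
        : buf_(capacity), cur_(buf_.data()), end_(buf_.data() + capacity) {}
    Stream(const Stream&) = delete;             // holds pointers into buf_
    Stream& operator=(const Stream&) = delete;

    // Guarantee n free bytes, flushing first if needed (assumes n <= capacity).
    void Ensure(std::size_t n) {
        if (static_cast<std::size_t>(end_ - cur_) < n) Flush();
    }
    // Per-byte write: pays for the flush check on every call.
    void WriteU8(std::uint8_t b) { Ensure(1); *cur_++ = b; }
    // "Flush" just appends to an in-memory vector in this sketch.
    void Flush() {
        out_.insert(out_.end(), buf_.data(), cur_);
        cur_ = buf_.data();
    }
    const std::vector<std::uint8_t>& Output() const { return out_; }

    // Amortizes the flush check: one Ensure() in the ctor, and the stream's
    // write pointer is updated once in the dtor.
    class Txn {
    public:
        Txn(Stream& s, std::size_t max_bytes) : s_(s), p_(nullptr) {
            s_.Ensure(max_bytes);
            p_ = s_.cur_;
        }
        ~Txn() { s_.cur_ = p_; }
        void WriteU8(std::uint8_t b) { *p_++ = b; }
    private:
        Stream& s_;
        std::uint8_t* p_;
    };

private:
    std::vector<std::uint8_t> buf_;
    std::uint8_t* cur_;
    std::uint8_t* end_;
    std::vector<std::uint8_t> out_;
};

// Variable-length write: 7-bit groups, most significant first; the last
// byte carries the 0x80 terminator bit, as in the quoted source fragment.
void write2(Stream& s, std::uint32_t v) {
    Stream::Txn txn(s, 5);                      // a 32-bit value needs at most 5 bytes
    unsigned b = 0;                             // assumed: shift of the top non-zero group
    while (b < 28 && (v >> (b + 7)) != 0) b += 7;
    while (b) {
        txn.WriteU8(static_cast<std::uint8_t>((v >> b) & 0x7f));
        b -= 7;
    }
    txn.WriteU8(static_cast<std::uint8_t>(v | 0x80));
}
```

The point of the design is that the loop body touches only the local pointer held by Txn, which is what makes the recomputation after the loop in the gcc output surprising.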
[Bug tree-optimization/57236] Missed optimization: weird pointer update after the loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57236 --- Comment #1 from petschy at gmail dot com --- Created attachment 30089 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30089&action=edit preprocessed source
[Bug tree-optimization/57236] Missed optimization: weird pointer update after the loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57236 --- Comment #2 from petschy at gmail dot com --- Created attachment 30090 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30090&action=edit original source
[Bug tree-optimization/57236] Missed optimization: weird pointer update after the loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57236 --- Comment #3 from petschy at gmail dot com --- Created attachment 30091 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30091&action=edit 4.8.1 generated code of write2()
[Bug tree-optimization/57236] Missed optimization: weird pointer update after the loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57236 --- Comment #4 from petschy at gmail dot com --- Created attachment 30092 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30092&action=edit clang 3.4 generated code of write2()
[Bug tree-optimization/57244] New: Missed optimization: dead register move before noreturn fn call unnecessary store/load or reg
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57244 Bug ID: 57244 Summary: Missed optimization: dead register move before noreturn fn call unnecessary store/load or reg Product: gcc Version: 4.8.1 Status: UNCONFIRMED Severity: minor Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: petschy at gmail dot com

These are two separate issues; however, both occurred in the same function, so I think it's simpler to report them together. The reduced test case is the reader equivalent of the writer code I posted earlier today in #57236. The workings are very similar: using a helper class to amortize the number of buffer refills.

The compiler unrolled the loop, with five iterations. The read pointer is kept in a register (rbx), not incremented, but used with increasing offsets. The potential end pointer is kept in r12, updated in each iteration to the value corresponding to the bytes read (ebx + N), and stored back to the memory at the end. However, I noticed a third issue now when looking through the code:

0) Just before the exit, the store of the end pointer looks like this:

0x0040090a <+58>:   sub    %ebx,%r12d
0x0040090d <+61>:   add    %r12,%rbx
0x00400910 <+64>:   mov    %rbx,0x10(%rbp)

ebx is the initial read pointer, r12 is the new pointer after N bytes were read. r12d-ebx = N, rbx + N = new end = r12d. The sub and the add are unnecessary, a single mov %r12d,0x10(%rbp) would do the very same. +61 and +64 are not branch targets, so I think the code could be optimized more.

1) If there are no bytes at all in the buffer after the refill, we should throw an exception:

0x004008f1 <+33>:   cmp    %rcx,%rbx
0x004008f4 <+36>:   jae    0x4009a1 <_Z5read2R6Stream+209>
...
0x004009a1 <+209>:  mov    %rbx,%r12
0x004009a4 <+212>:  callq  0x400830 <_Z9throw_eofv>

The mov at +209 is unnecessary. r12 holds the new end pointer (which is the same as the start, ebx, since no bytes were read), but it is only useful if the code ever reaches +58 (see above), where it gets stored back to memory. But that won't happen, since throw_eof() throws an exception and doesn't ever return. The other branches that throw jump to +212, so no dead move there.

2) The last iteration of the unrolled loop misses a check at the end; this changes the register assignments and introduces an unnecessary extra store/load to/from rsp.

4th iteration:

0x0040096a <+154>:  movzbl 0x3(%rbx),%edx
0x0040096e <+158>:  shl    $0x7,%eax
0x00400971 <+161>:  lea    0x4(%rbx),%r12
0x00400975 <+165>:  mov    %edx,%esi
0x00400977 <+167>:  and    $0x7f,%esi
0x0040097a <+170>:  or     %esi,%eax
0x0040097c <+172>:  test   %dl,%dl
0x0040097e <+174>:  js     0x40090a <_Z5read2R6Stream+58>

5th iteration:

0x00400985 <+181>:  movzbl 0x4(%rbx),%esi
0x00400989 <+185>:  shl    $0x7,%eax
0x0040098c <+188>:  lea    0x5(%rbx),%r12
0x00400990 <+192>:  mov    %sil,(%rsp)
0x00400994 <+196>:  mov    (%rsp),%edx
0x00400997 <+199>:  and    $0x7f,%edx
0x0040099a <+202>:  or     %edx,%eax
0x0040099c <+204>:  jmpq   0x40090a <_Z5read2R6Stream+58>

This is probably a regression, as 4.7 generates code w/o the store/load, and the byte is loaded into %edx at the beginning, not %esi, just like in the earlier iterations. 4.8.1 and 4.9.0 generate the above suboptimal code.

The tested gcc versions and flags are the very same as in #57236.

Regards, Peter
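A much simplified reader for the same byte format, given only to make the control flow in the listings easier to follow. It is a sketch under assumptions: the attached read2() works on a Stream with a refill helper, which is replaced here by a plain pointer range, and read_varint() and its signature are hypothetical. Only the name throw_eof() and the shift/mask/sign-test pattern come from the report.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Never returns: anything computed only for the code after the call is dead.
[[noreturn]] void throw_eof() {
    throw std::runtime_error("unexpected end of stream");
}

// Reads one value: up to five bytes, 7 payload bits each, most significant
// group first; a byte with bit 7 set terminates the value.
// 'cur' is written back only on success, like the end pointer in the report.
std::uint32_t read_varint(const std::uint8_t*& cur, const std::uint8_t* end) {
    const std::uint8_t* p = cur;
    std::uint32_t v = 0;
    for (int i = 0; i < 5; ++i) {
        if (p == end) throw_eof();          // no bytes left: does not return
        const std::uint8_t byte = *p++;
        v = (v << 7) | (byte & 0x7fu);      // shl $0x7 / and $0x7f / or
        if (byte & 0x80u) break;            // test %dl,%dl ; js
    }
    cur = p;                                // single store of the new position
    return v;
}
```

Because throw_eof() is declared noreturn, the position only needs to be materialized on the path that reaches the final store, which is the observation behind issue 1).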
[Bug tree-optimization/57244] Missed optimization: dead register move before noreturn fn call unnecessary store/load of reg
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57244 --- Comment #1 from petschy at gmail dot com --- Created attachment 30093 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30093&action=edit preprocessed source
[Bug tree-optimization/57244] Missed optimization: dead register move before noreturn fn call unnecessary store/load of reg
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57244 --- Comment #2 from petschy at gmail dot com --- Created attachment 30094 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30094&action=edit original source
[Bug tree-optimization/57244] Missed optimization: dead register move before noreturn fn call unnecessary store/load of reg
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57244 --- Comment #3 from petschy at gmail dot com --- Created attachment 30095 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30095&action=edit disassembly dump
[Bug c++/55097] New: typedef not recognized in templated class
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55097 Bug #: 55097 Summary: typedef not recognized in templated class Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: pets...@gmail.com

gcc 4.7 and 4.8 choke on the following code:

8<8<8<8<8<8<8<

class Id
{
public:
    Id();
    Id(char a, char b);
    explicit Id(int v);
    Id(const char* id_);
};

template<typename ID>
class Foo
{
public:
//  typedef ID IdType;
    void Bar(const ID id_);
    typedef ID IdType;
};

template<typename ID>
void Foo<ID>::Bar(const IdType id_)
//void Foo<ID>::Bar(const ID id_)
{
}

void foo()
{
    Foo<Id> f;
    f.Bar("hello");
}
8<8<8<8<8<8<8<

$ g++ -c gcctypedef.cpp
gcctypedef.cpp: In function ‘void foo()’:
gcctypedef.cpp:29:15: error: no matching function for call to ‘Foo<Id>::Bar(const char [6])’
gcctypedef.cpp:29:15: note: candidate is:
gcctypedef.cpp:21:6: note: void Foo<ID>::Bar(const IdType) [with ID = Id; Foo<ID>::IdType = Id]
gcctypedef.cpp:21:6: note: no known conversion for argument 1 from ‘const char [6]’ to ‘Id’

Earlier versions are ok, clang 3.0 also compiles the code.

If the typedef in class Foo is _before_ the Bar fn declaration, gcc compiles the code. If instead of the typedef (IdType) the original type (ID) is used in the argument of the Bar fn at the definition, gcc compiles the code.

I git bisect'd the commit that introduced the bug:

commit 44f861fca343148a1b0720105ec2b7f14bbcc849
Author: jason <jason@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed Feb 8 09:52:11 2012 +

    PR c++/52035
    * pt.c (tsubst): Strip uninstantiated typedef.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@184000 138bc75d-0d04-0410-961f-82ee72b054a4
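The first workaround named in the report (placing the typedef before the declaration of Bar) can be written out as below. This is an illustrative variant rather than the reporter's file: Id is cut down to the one constructor the call needs, and a len member plus a return value for Bar() are added purely so the result can be checked; both are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class Id {
public:
    Id() : len(0) {}
    Id(const char* id_) : len(std::strlen(id_)) {}  // implicit conversion used by the call
    std::size_t len;                                // hypothetical, for observation only
};

template <typename ID>
class Foo {
public:
    typedef ID IdType;                  // typedef now precedes its first use
    std::size_t Bar(const IdType id_);
};

// Out-of-class definition spelled with the typedef, as in the report.
template <typename ID>
std::size_t Foo<ID>::Bar(const IdType id_) {
    return id_.len;
}
```

A call such as Foo<Id>().Bar("hello") then goes through Id(const char*) as intended.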
[Bug debug/53770] New: Regression: incorrect line numbers in debug info since 4.5+
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53770 Bug #: 53770 Summary: Regression: incorrect line numbers in debug info since 4.5+ Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: debug AssignedTo: unassig...@gcc.gnu.org ReportedBy: pets...@gmail.com Created attachment 27703 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27703 the test case Single stepping the code from the debugger shows bad lines at places, the code is correct however. Command to produce the executable: g++-4.8.0 -o dbginfobug dbginfobug.cpp -save-temps -g -O0 -Wall -Wextra -v Using built-in specs. COLLECT_GCC=g++-4.8.0 COLLECT_LTO_WRAPPER=/home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ./configure --enable-languages=c,c++ --program-suffix=-4.8.0 Thread model: posix gcc version 4.8.0 20120605 (experimental) (GCC) COLLECT_GCC_OPTIONS='-o' 'dbginfobug' '-save-temps' '-g' '-O0' '-Wall' '-Wextra' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/cc1plus -E -quiet -v -iprefix /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/ -D_GNU_SOURCE dbginfobug.cpp -mtune=generic -march=x86-64 -Wall -Wextra -g -fworking-directory -O0 -fpch-preprocess -o dbginfobug.ii ignoring nonexistent directory /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../x86_64-unknown-linux-gnu/include ignoring duplicate directory /home/usr-local/bin/../lib/gcc/../../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../include/c++/4.8.0 ignoring duplicate directory /home/usr-local/bin/../lib/gcc/../../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../include/c++/4.8.0/x86_64-unknown-linux-gnu ignoring duplicate directory /home/usr-local/bin/../lib/gcc/../../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../include/c++/4.8.0/backward ignoring duplicate directory 
/home/usr-local/bin/../lib/gcc/../../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/include ignoring duplicate directory /home/usr-local/bin/../lib/gcc/../../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/include-fixed ignoring nonexistent directory /home/usr-local/bin/../lib/gcc/../../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../x86_64-unknown-linux-gnu/include #include "..." search starts here: #include <...> search starts here: /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../include/c++/4.8.0 /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../include/c++/4.8.0/x86_64-unknown-linux-gnu /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../include/c++/4.8.0/backward /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/include /home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/include-fixed /usr/local/include /usr/include End of search list. COLLECT_GCC_OPTIONS='-o' 'dbginfobug' '-save-temps' '-g' '-O0' '-Wall' '-Wextra' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/cc1plus -fpreprocessed dbginfobug.ii -quiet -dumpbase dbginfobug.cpp -mtune=generic -march=x86-64 -auxbase dbginfobug -g -O0 -Wall -Wextra -version -o dbginfobug.s GNU C++ (GCC) version 4.8.0 20120605 (experimental) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.8.0 20120605 (experimental), GMP version 5.0.5, MPFR version 3.1.0-p10, MPC version 0.9 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 GNU C++ (GCC) version 4.8.0 20120605 (experimental) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.8.0 20120605 (experimental), GMP version 5.0.5, MPFR version 3.1.0-p10, MPC version 0.9 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: c6a0954413719d6fcdda7217d65221de COLLECT_GCC_OPTIONS='-o' 'dbginfobug' '-save-temps' '-g' '-O0' '-Wall' '-Wextra' '-v' '-shared-libgcc' 
'-mtune=generic' '-march=x86-64' as -v --64 -o dbginfobug.o dbginfobug.s GNU assembler version 2.22 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.22 COMPILER_PATH=/home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/:/home/usr-local/bin/../libexec/gcc/ LIBRARY_PATH=/home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/:/home/usr-local/bin/../lib/gcc/:/home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../../lib64/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/home/usr-local/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.8.0/../../../:/lib/:/usr/lib/ COLLECT_GCC_OPTIONS='-o' 'dbginfobug' '-save-temps' '-g' '-O0' '-Wall' '-Wextra' '-v' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /home/usr-local/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.8.0/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker
[Bug debug/53770] Regression: incorrect line numbers in debug info since 4.5+
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53770 --- Comment #1 from petschy at gmail dot com 2012-06-25 20:29:05 UTC ---
I tested on 32bit Debian Wheezy, too, with stock 4.6.3, and everything was ok. Another AMD64 Wheezy box w/ stock 4.6.3 showed the bug, with a minor variation: when the condition in line 25 was true, it stepped on the break at line 26, but then also stepped on line 30 with the ++f. The same is for 4.7.0.

Looking at the disassembly when standing on ++f:

   0x00400a99 <+234>:  nop
=> 0x00400a9a <+235>:  jmp    0x400aa0 <do_printchar(char*, unsigned long, char const*)+241>
   0x00400a9c <+237>:  addl   $0x1,-0x14(%rbp)

The last insn is ++f, and the jump should belong to the break I guess.
[Bug c++/52460] New: Misleading error message with templated c++ code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52460 Bug #: 52460 Summary: Misleading error message with templated c++ code Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: pets...@gmail.com

compiling the following:

---8<---8<---8<---8<---
template<typename T>
struct Base
{
    typename T::Type var;
};
template<typename U>
struct Derived : Base<Derived<U> >
{
    typedef U Type;
};
void foo()
{
    Derived<int> i;
}
---8<---8<---8<---8<---

gives the error

gcctempl.cpp: In instantiation of ‘struct Base<Derived<int> >’:
gcctempl.cpp:7:8: required from ‘struct Derived<int>’
gcctempl.cpp:13:15: required from here
gcctempl.cpp:4:19: error: no type named ‘Type’ in ‘struct Derived<int>’

on all tested gcc versions (4.4, 4.5, 4.6, 4.7).

There is definitely a type called 'Type' in struct 'Derived', hence the error message is misleading. I'm not sure, the above code is probably ill-formed, because it creates a circular dependence between the two types, but then, this should be communicated to the user.
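Some background on the circular dependence suspected above: Base<Derived<U> > is instantiated while Derived<U> is still being defined, so Derived<U> is an incomplete type at that point and its members are not yet visible. Member function bodies, however, are only instantiated when used, by which time Derived<U> is complete. The sketch below relies on that; type_size() is a hypothetical member added for illustration and is not part of the reported code.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
struct Base {
    // typename T::Type var;  // rejected: T is incomplete when Base<T> is instantiated

    // Fine: the body is instantiated only when called, after T is complete.
    std::size_t type_size() const { return sizeof(typename T::Type); }
};

template <typename U>
struct Derived : Base<Derived<U> > {
    typedef U Type;
};
```

This only shows why the data member form is rejected while a deferred use works; it does not address the wording of the diagnostic, which is what the report is about.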
[Bug c++/51640] Misleading error if the type in the catch() is ambiguous
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51640 --- Comment #2 from petschy at gmail dot com 2012-01-04 20:45:00 UTC ---
I pinpointed the commit that introduced the bug, using git bisect:

commit 4272a2e82e431ac85afd0404d49b28043dc480ee
Author: paolo <paolo@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Fri Nov 27 10:44:49 2009 +

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@154698 138bc75d-0d04-0410-961f-82ee72b054a4
[Bug c++/51640] New: Misleading error if the type in the catch() is ambiguous
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51640 Bug #: 51640 Summary: Misleading error if the type in the catch() is ambiguous Classification: Unclassified Product: gcc Version: 4.7.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: pets...@gmail.com Created attachment 26154 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26154 test case that triggers the bug There is a regression from g++ 4.4 in later versions. If the name of the class is ambiguous in a catch(), this fact is not reported. At line 25 when compiling the attached test case with 4.5.x, 4.6.x or 4.7.0, the ambiguity of the type name 'ex2' is not mentioned. 4.4.6 reports the ambiguity at the catch(), though for other ambiguities (at variable definition, class declaration w/ ambiguous parent) it prints the message twice. 4.5 and above versions print the error once in these latter cases. I suspect that this regression might be connected to fixing the duplicated error printing. Tested on an amd64 machine with Debian Wheezy, stock 4.4, 4.5, 4.6 versions (4.4.6-14, 4.5.3-9, 4.6.2-7) and 4.7.0 built from svn trunk r182460. The original cpp file is attached, since it doesn't have any preprocessor directives except for comments. Command line and output: g++-r182460 -v -save-temps -Wall -Wextra -c test_gccexbug2.cpp Using built-in specs. 
COLLECT_GCC=g++-r182460 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ./configure --program-suffix -r182460 Thread model: posix gcc version 4.7.0 20111219 (experimental) (GCC) COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wextra' '-c' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/cc1plus -E -quiet -v -D_GNU_SOURCE test_gccexbug2.cpp -mtune=generic -march=x86-64 -Wall -Wextra -fpch-preprocess -o test_gccexbug2.ii ignoring nonexistent directory /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../x86_64-unknown-linux-gnu/include #include "..." search starts here: #include <...> search starts here: /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../include/c++/4.7.0 /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../include/c++/4.7.0/x86_64-unknown-linux-gnu /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/../../../../include/c++/4.7.0/backward /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/include /usr/local/include /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.7.0/include-fixed /usr/include End of search list. 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-Wall' '-Wextra' '-c' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/cc1plus -fpreprocessed test_gccexbug2.ii -quiet -dumpbase test_gccexbug2.cpp -mtune=generic -march=x86-64 -auxbase test_gccexbug2 -Wall -Wextra -version -o test_gccexbug2.s GNU C++ (GCC) version 4.7.0 20111219 (experimental) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.7.0 20111218 (experimental), GMP version 5.0.2, MPFR version 3.1.0-p3, MPC version 0.9 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 GNU C++ (GCC) version 4.7.0 20111219 (experimental) (x86_64-unknown-linux-gnu) compiled by GNU C version 4.7.0 20111218 (experimental), GMP version 5.0.2, MPFR version 3.1.0-p3, MPC version 0.9 GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096 Compiler executable checksum: fafea236350b995726228b0b21cd1771 test_gccexbug2.cpp: In function ‘void bar()’: test_gccexbug2.cpp:25:11: error: expected type-specifier before ‘ex2’ test_gccexbug2.cpp:25:14: error: expected ‘)’ before ‘’ token test_gccexbug2.cpp:25:14: error: expected ‘{’ before ‘’ token test_gccexbug2.cpp:25:15: error: expected primary-expression before ‘)’ token test_gccexbug2.cpp:25:15: error: expected ‘;’ before ‘)’ token
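The attached test case is not included in this archive. As a rough idea of the kind of code involved, the sketch below makes the name ex2 ambiguous through two using-directives; this is an assumption about how the ambiguity arises, and the namespaces a and b and the function bar_sketch() are hypothetical. The ambiguous handler is left as a comment so that the example compiles.

```cpp
#include <cassert>

namespace a { struct ex2 { int code; }; }
namespace b { struct ex2 { int code; }; }
using namespace a;
using namespace b;

int bar_sketch() {
    try {
        throw a::ex2{1};
    }
    // catch (ex2& e) { ... }   // ill-formed: 'ex2' could be a::ex2 or b::ex2;
    //                          // the report asks that the diagnostic say so
    catch (a::ex2& e) { return e.code; }    // qualified names are unambiguous
    catch (b::ex2& e) { return -e.code; }
    return 0;
}
```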
[Bug c++/51640] Misleading error if the type in the catch() is ambiguous
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51640 --- Comment #1 from petschy at gmail dot com 2011-12-20 16:49:02 UTC --- Created attachment 26155 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26155 a slightly more verbose test case Extended test case with ambiguous type name in variable definition and class declaration.