[Bug c/113255] New: wrong code with -O2 -mtune=k8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113255

Bug ID: 113255
Summary: wrong code with -O2 -mtune=k8
Product: gcc
Version: 13.2.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---
Target: x86_64-pc-linux-gnu

Created attachment 57000
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57000&action=edit
test case

Linux x86_64

$ /usr/local/gcc-13.2.0/bin/gcc -Wall -O2 -o sprite sprite.c && ./sprite
1

$ /usr/local/gcc-13.2.0/bin/gcc -Wall -O2 -mtune=k8 -o sprite sprite.c && ./sprite
0
Aborted

The problem appears to have started around version 10.2.0.
[Bug c++/91678] New: decltype returns wrong type under certain conditions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91678

Bug ID: 91678
Summary: decltype returns wrong type under certain conditions
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---

I'm not totally sure this is a bug, but the following C++ code generates a compilation error("invalid cast of an rvalue expression of type 'float*' to type 'float*&'") in g++ 9.2:

float* test(float* c)
{
 return (decltype(c + 0))(float*)c;
}

whereas this does not:

float* test(float* c)
{
 return (decltype(c + 0))c;
}
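For reference, here is a small sketch of what the language rules give for the two decltype forms involved. The function name cast_via_decltype and the alias T are made up for illustration and are not from the report. Going through an alias only makes the cast easier to read; whether it sidesteps the g++ 9.2 error has not been checked here.

```cpp
#include <type_traits>

float* cast_via_decltype(float* c)
{
    // c + 0 is a prvalue expression, so decltype(c + 0) is float*, with no reference.
    static_assert(std::is_same<decltype(c + 0), float*>::value,
                  "decltype(c + 0) should be float*");

    // (c) is an lvalue expression, so decltype((c)) is float*&.
    // This is the type the error message from the report mentions.
    static_assert(std::is_same<decltype((c)), float*&>::value,
                  "decltype((c)) should be float*&");

    // Same conversion chain as in the report, spelled with a named alias.
    using T = decltype(c + 0);
    return static_cast<T>(static_cast<float*>(c));
}
```

Both static_asserts are expected to hold on a conforming compiler, which is why the cast in the report should be an ordinary float* to float* conversion.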
[Bug c++/85882] Value of local variable changes unintentionally if certain optimization are enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85882

mednafen at sent dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mednafen at sent dot com

--- Comment #4 from mednafen at sent dot com ---
Created attachment 44170
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44170&action=edit
smaller test case
[Bug c/83843] New: [8 Regression] wrong code at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83843

Bug ID: 83843
Summary: [8 Regression] wrong code at -O2
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---

Created attachment 43125
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43125&action=edit
test case

Linux x86_64

$ /usr/local/gcc8-256680/bin/gcc -Wall -O0 -o crcred crcred.c && ./crcred
fe fd

$ /usr/local/gcc8-256680/bin/gcc -Wall -O2 -o crcred crcred.c && ./crcred
01 02
Aborted
[Bug target/81516] Wrong code with -m32 -O2 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81516

--- Comment #5 from mednafen at sent dot com ---
Generated assembly looks like it's grabbing garbage off the stack and writing it to b:

a:
        subl    $20, %esp
        fildl   24(%esp)
        movsd   (%esp), %xmm1
        movsd   %xmm1, b
[...]
[Bug target/81516] Wrong code with -m32 -O2 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81516

mednafen at sent dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mednafen at sent dot com

--- Comment #2 from mednafen at sent dot com ---
Created attachment 41811
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41811&action=edit
alternate, further reduced test case

bug occurs with:
  -O2 -m32 -march=i686 -msse2 -mfpmath=387
but not with:
  -O2 -m32 -march=i686 -msse2 -mfpmath=sse
[Bug c++/81438] New: silent bad code generation with computed goto exit from catch block
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81438

Bug ID: 81438
Summary: silent bad code generation with computed goto exit from catch block
Product: gcc
Version: 7.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---

Created attachment 41755
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41755&action=edit
Test case

Seems to be an old issue. Would expect it to work correctly, or for there to be a compilation-time error.
[Bug target/80402] New: Missed optimization on x86/x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80402

Bug ID: 80402
Summary: Missed optimization on x86/x86_64
Product: gcc
Version: unknown
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---
Target: x86_64

Created attachment 41181
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41181&action=edit
sample code

A statement like "if(!(a & 0xF) || (b & (1U << 6)))" could be compiled to a "test","bt" pair followed by a single conditional branch/move instruction, but gcc currently compiles it to a combination of two tests and conditional branch/move instructions.

From https://software.intel.com/sites/default/files/managed/ad/01/253666-sdm-vol-2a.pdf
Page 3-114, BT: "The ZF flag is unaffected."

Old versions(prior to early 2010, as far as I can tell) of the manual had the flag as being undefined, so it may be prudent to talk to Intel and AMD engineers before implementing this optimization.

Attached is sample code that includes the optimal form via inline assembly(test4() function).
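To make the proposed transformation concrete, here is a rough sketch of the condition in plain C++ next to a model of the flag logic. The names cond and cond_flag_model are invented for this illustration, and this is not the attached sample code or its test4() function. The model assumes "test" sets ZF from a & 0xF and "bt" then copies bit 6 of b into CF while leaving ZF alone, as the quoted manual text says. A single "below or equal" style condition (CF=1 or ZF=1) would then cover both halves.

```cpp
// The condition from the report, written as an ordinary function.
bool cond(unsigned a, unsigned b)
{
    return !(a & 0xF) || (b & (1U << 6));
}

// Model of the flag-based form: ZF from "test", CF from "bt".
bool cond_flag_model(unsigned a, unsigned b)
{
    bool zf = (a & 0xF) == 0;     // what "test $0xF, a" would leave in ZF
    bool cf = ((b >> 6) & 1U) != 0; // what "bt $6, b" would leave in CF
    return cf || zf;              // one branch on (CF=1 or ZF=1)
}
```

The two functions should agree for every input, which is the property the optimization relies on.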
[Bug c/80301] Sub-optimal code with an array of structs offsetted inside a struct global on x86/x86_64 at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80301

--- Comment #1 from mednafen at sent dot com ---
Created attachment 41115
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41115&action=edit
correct test code
[Bug c/80301] New: Sub-optimal code with an array of structs offsetted inside a struct global on x86/x86_64 at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80301

Bug ID: 80301
Summary: Sub-optimal code with an array of structs offsetted inside a struct global on x86/x86_64 at -O2
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---

Created attachment 41114
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41114&action=edit
test code

gcc -fno-asynchronous-unwind-tables -S -O2 -o test.s -c test.c

The displacement should be coded in the memory access instructions, but instead a separate addition instruction is being generated. Assembly is from 4.9.2, but the issue shows up on at least 5.3.0, 6.1.0, and a relatively recent 7.0 build.

[...]
func:
        movl    %edi, %edx
        addq    $2, %rdx
        movl    m(,%rdx,8), %eax
        cmpl    %edi, %eax
        je      .L2
        movl    m+4(,%rdx,8), %eax
.L2:
        rep ret
[...]
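The attachment is not reproduced in this message, so the following is only a guess at the kind of source that could produce assembly of this shape. The names Entry, Table, header, entries, key and value are hypothetical, as is the array size. The layout (16 bytes ahead of an array of 8-byte structs) is inferred from the "addq $2" and the scale factor of 8 in the listing. Only m and func come from the report.

```cpp
// Hypothetical 8-byte element type.
struct Entry {
    unsigned key;
    unsigned value;
};

// Hypothetical global layout: 16 bytes of other data, then the array.
struct Table {
    unsigned header[4];
    Entry entries[64];
};

Table m;

// Loads entries[i].key; if it differs from i, loads entries[i].value instead.
unsigned func(unsigned i)
{
    unsigned r = m.entries[i].key;
    if (r != i)
        r = m.entries[i].value;
    return r;
}
```

With a layout like this, the form the report asks for would fold the constant into the address, along the lines of "movl m+16(,%rdx,8), %eax", instead of adding 2 to the index first.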
[Bug c/79818] New: [7 Regression] wrong code with -fwrapv and -Os/-O1/-O2/-O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79818

Bug ID: 79818
Summary: [7 Regression] wrong code with -fwrapv and -Os/-O1/-O2/-O3
Product: gcc
Version: 7.0.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---

Created attachment 40871
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40871&action=edit
test case

Linux x86_64

$ /usr/local/gcc-0f7b961/bin/gcc --version
gcc (GCC) 7.0.1 20170302 (experimental)

$ /usr/local/gcc-0f7b961/bin/gcc -fwrapv -O1 -o test test.c ; ./test
Aborted

$ /usr/local/gcc-0f7b961/bin/gcc -fwrapv -O2 -o test test.c ; ./test
Aborted
[Bug c/77392] Premature optimization based on return data from thread
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77392

mednafen at sent dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mednafen at sent dot com

--- Comment #1 from mednafen at sent dot com ---
The return type for the thread entry point function is wrong, and the variable whose address/pointer you're passing to pthread_join is of the wrong type, which is probably leading to stack corruption.

These are things the compiler should have warned or errored-out about if not for the explicit void* casts.
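As an illustration of the two points above, a minimal sketch with matching types might look like the following. The names worker and run_worker are invented here and have nothing to do with the original reporter's program. The entry point returns void*, and the object handed to pthread_join is a void*, so no casts are needed at either call.

```cpp
#include <pthread.h>
#include <cstdint>

// Entry point with the signature pthread_create expects: void* (*)(void*).
static void* worker(void* arg)
{
    int* input = static_cast<int*>(arg);
    // Hand a small integer back through the void* return value.
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(*input * 2));
}

// Runs worker on its own thread and returns its result, or -1 on failure.
int run_worker(int value)
{
    pthread_t tid;
    if (pthread_create(&tid, nullptr, worker, &value) != 0)
        return -1;

    // pthread_join takes a void**, so the receiving object must be a void*.
    // Passing the address of a narrower object (for example an int) lets
    // the library write past it.
    void* result = nullptr;
    if (pthread_join(tid, &result) != 0)
        return -1;

    return static_cast<int>(reinterpret_cast<std::intptr_t>(result));
}
```

On older glibc versions this needs -pthread at compile and link time. The -1 error return is a simplification for the sketch and cannot be told apart from a worker that legitimately returns -1.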
[Bug c/71539] incomplete execution of a nested loop for -O2 and -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71539

mednafen at sent dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mednafen at sent dot com

--- Comment #1 from mednafen at sent dot com ---
test.c:9:17: runtime error: signed integer overflow: 2 * 4611686018427387904 cannot be represented in type 'long int'
test.c:9:8: runtime error: signed integer overflow: 2 * -9223372036854775808 cannot be represented in type 'long int'

Refer to the C standard, and use -fwrapv
https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Code-Gen-Options.html#index-fwrapv-2791
if you really need to write such code(or want a safety net against unpredictable future optimizations that take advantage of such undefined behavior).
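For code that has to double a value that may be this large, one option besides -fwrapv is to detect the overflow explicitly. The sketch below uses GCC's __builtin_mul_overflow. The helper name doubling_overflows is made up for this example, and it uses long long instead of the 'long int' in the sanitizer output so that the width is 64 bits regardless of platform.

```cpp
#include <climits>

// Returns true when 2 * x does not fit in long long.
// *wrapped always receives the product reduced to 64 bits (two's complement),
// which is the value -fwrapv semantics would produce.
bool doubling_overflows(long long x, long long* wrapped)
{
    return __builtin_mul_overflow(2LL, x, wrapped);
}
```

Calling it with 4611686018427387904 (the first operand from the sanitizer output) reports an overflow without invoking undefined behavior, so the optimizer has nothing to exploit.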
[Bug c/70646] Corrupt truncated function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

mednafen at sent dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mednafen at sent dot com

--- Comment #7 from mednafen at sent dot com ---
Following code aborts on x86_64 4.9.2 and 5.3.0 at -O2, at least:

#pragma GCC optimize("no-unit-at-a-time")

typedef unsigned char u8;
typedef unsigned long long u64;

static inline __attribute__((always_inline)) u64 __swab64p(const u64 *p)
{
 return (__builtin_constant_p((u64)(*p)) ? ((u64)(
  (((u64)(*p) & (u64)0x00000000000000ffULL) << 56) |
  (((u64)(*p) & (u64)0x000000000000ff00ULL) << 40) |
  (((u64)(*p) & (u64)0x0000000000ff0000ULL) << 24) |
  (((u64)(*p) & (u64)0x00000000ff000000ULL) << 8) |
  (((u64)(*p) & (u64)0x000000ff00000000ULL) >> 8) |
  (((u64)(*p) & (u64)0x0000ff0000000000ULL) >> 24) |
  (((u64)(*p) & (u64)0x00ff000000000000ULL) >> 40) |
  (((u64)(*p) & (u64)0xff00000000000000ULL) >> 56))) : __builtin_bswap64(*p));
}

static inline u64 wwn_to_u64(void *wwn)
{
 return __swab64p(wwn);
}

void __attribute__((noinline,noclone)) broken(u64* shost)
{
 u8 node_name[8] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
 *shost = wwn_to_u64(node_name);
}

void __attribute__((noinline,noclone)) dummy(void)
{
 __builtin_abort();
}

int main(int argc, char* argv[])
{
 u64 v;

 broken(&v);

 if(v != (u64)-1)
  __builtin_abort();

 return 0;
}
[Bug other/66179] New: Sub-optimal code generation with __attribute__((leaf))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66179

Bug ID: 66179
Summary: Sub-optimal code generation with __attribute__((leaf))
Product: gcc
Version: 5.1.0
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: mednafen at sent dot com
Target Milestone: ---

Created attachment 35556
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35556&action=edit
C code which demonstrates the issue.

The attached code, when compiled like(for example):
gcc -fno-asynchronous-unwind-tables -O2 -S -o leaf.s -c leaf.c
(with 4.9.2 or 5.1.0, Linux x86_64) produces bloated, sub-optimal assembly like:

test_call_leaf:
        pushq   %r12
        pushq   %rbp
        pushq   %rbx
        movl    a(%rip), %r12d
        movl    c(%rip), %ebx
        call    function_leaf
        movl    b(%rip), %ebp
        addl    $2, %r12d
        addl    $2, %ebx
        movl    %r12d, a(%rip)
        call    function_leaf
        addl    $2, %ebp
        movl    %ebx, c(%rip)
        movl    %ebp, b(%rip)
        popq    %rbx
        popq    %rbp
        popq    %r12
        ret

compared to a more optimal possibility:

test_call_normal:
        subq    $8, %rsp
        addl    $2, a(%rip)
        call    function_normal
        addl    $2, b(%rip)
        call    function_normal
        addl    $2, c(%rip)
        addq    $8, %rsp
        ret
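The attachment itself is not included in this message. Judging only from the symbol names and the order of operations in the two listings, the source may have looked roughly like the sketch below. Treat it as a reconstruction under that assumption and not as the attached leaf.c. The calls counter and the two stand-in definitions at the bottom are additions so that the sketch links on its own. To compare assembly, those definitions would need to live in a separate file, since GCC documents that leaf has no effect on functions defined in the current compilation unit.

```cpp
int a, b, c;
int calls; // added for this sketch only, counts stand-in calls

// leaf promises the callee will not call back into this translation unit.
__attribute__((leaf)) void function_leaf(void);
void function_normal(void);

void test_call_leaf(void)
{
    a += 2;
    function_leaf();
    b += 2;
    function_leaf();
    c += 2;
}

void test_call_normal(void)
{
    a += 2;
    function_normal();
    b += 2;
    function_normal();
    c += 2;
}

// Stand-in definitions so the sketch is self-contained.
__attribute__((noinline)) void function_leaf(void) { ++calls; }
__attribute__((noinline)) void function_normal(void) { ++calls; }
```

Compiled as C++ the function symbols are mangled, so the listings above correspond to building the equivalent code as C.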