[Bug middle-end/93487] Missed tail-call optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93487 --- Comment #5 from Petr Skocik --- Another case of a missed tailcall which might warrant a separate mention:

    struct big{ long _[10]; };
    void takePtr(void *);
    void takeBigAndPassItsAddress(struct big X){ takePtr(&X); }

This should ideally compile to just `lea 8(%rsp), %rdi; jmp takePtr;`. The compiler might be tempted to treat taking the address of a local as a reason not to tail call, and clang misses this optimization too, probably for this reason, but tailcalling here is fine because this particular local isn't allocated by the function itself but rather by its caller as part of the call. Icc does do this optimization: https://godbolt.org/z/a6coTzPjz
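A minimal sketch of the case above, made self-contained so it can be compiled and run. The `takePtr` body and the `lastSum` variable are hypothetical additions for illustration only; to inspect the tail call itself, `takePtr` would have to stay an external declaration as in the comment, since a visible definition is likely to be inlined.

```c
#include <assert.h>

struct big { long _[10]; };

/* Hypothetical observable effect: the sum of the pointed-to array. */
static long lastSum;

/* Hypothetical definition of takePtr; the report only declares it. */
void takePtr(void *P)
{
    assert(P != 0);
    struct big *B = P;
    long Sum = 0;
    for (int I = 0; I < 10; I++)
        Sum += B->_[I];
    lastSum = Sum;
}

/* On SysV x86-64 a struct this large is passed by value in memory, in
   storage set up by the caller, so &X stays valid for the whole call
   even if takePtr is reached with a jmp instead of a call. */
void takeBigAndPassItsAddress(struct big X)
{
    takePtr(&X);
}
```

Calling `takeBigAndPassItsAddress` with an array holding 1 through 10 leaves 55 in `lastSum`.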
[Bug c/90181] Feature request: provide a way to explicitly select specific named registers in constraints
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90181 Petr Skocik changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #16 from Petr Skocik --- The current way of loading stuff into regs that don't have a specific constraint for them also breaks on gcc (but not on clang) if the variable is marked const. https://godbolt.org/z/1PvYsrqG9
[Bug middle-end/112844] Branches under -Os (unlike -O{1, 2, 3}) do not respect __builtin_expect hints
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112844 --- Comment #2 from Petr Skocik --- (In reply to Jakub Jelinek from comment #1)
> With -Os you ask the code to be small. So, while internally the hint is
> still present in edge probabilities, -Os is considered more important and
> certain code changes based on the probabilities aren't done if they are
> known or expected to result in larger code.

Thanks. I very much like the codegen I get with gcc -Os, often better than what I get with clang. But the sometimes counter-obvious branch layout at -Os is annoying to me, especially considering I've measured it a couple of times as being the source of a slowdown. Sure, you can save a (more often than not 2-byte) jump by conditionally jumping over an unlikely branch instead of conditionally jumping to an unlikely branch placed after ret and having it jump back into the function body (the latter is what all the other compilers do at -Os), but I'd rather have the code spend the extra two bytes and have my happy paths be fall-through as they should be.
[Bug target/114097] Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097 --- Comment #4 from Petr Skocik --- Excellent! Thank you very much. Didn't realize the functionality was already there, but didn't work without an explicit __attribute((noreturn)). Now I can get rid of my most complex assembly function which I stupidly (back then I thought cleverly) wrote. :)
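A small sketch of a never-returning function marked with the explicit attribute, shortened from the shape of the function in the original report. The `NORETURN` macro, the `iterations` counter and the `runNoret` driver are hypothetical names added here so the example can be run; whether the register saves are actually elided depends on the compiler version, which this sketch does not verify.

```c
#include <assert.h>
#include <setjmp.h>

#if defined(__GNUC__)
#  define NORETURN __attribute__((noreturn))
#else
#  define NORETURN /* assumption: attribute unavailable, so it is dropped */
#endif

/* File-scope counter, so its value is well defined after longjmp. */
static unsigned iterations;

/* Never returns: control always leaves through longjmp. */
NORETURN void noret(unsigned A, unsigned B, jmp_buf Jb)
{
    for (; A--;) iterations++;
    for (; B--;) iterations++;
    longjmp(Jb, 1);
}

/* Hypothetical driver: returns how many loop iterations noret executed. */
unsigned runNoret(unsigned A, unsigned B)
{
    jmp_buf Jb;
    iterations = 0;
    if (setjmp(Jb) == 0)
        noret(A, B, Jb); /* control resumes at setjmp, not here */
    /* A and B are not modified after setjmp, so they are still valid. */
    assert(iterations == A + B);
    return iterations;
}
```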
[Bug rtl-optimization/10837] noreturn attribute causes no sibling calling optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10837 Petr Skocik changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #19 from Petr Skocik --- (In reply to Xi Ruoyao from comment #16)
> In practice most _Noreturn functions are abort, exit, ..., i.e. they are
> only executed one time so optimizing against a cold path does not help much.
> I don't think it's a good idea to encourage people to construct some fancy
> code by a recursive _Noreturn function (why not just use a loop?!) And if
> you must write such fancy code anyway IMO musttail attribute (PR83324) will
> be a better solution.

There's also longjmp, which may not be all that super cold and may be executed multiple times. And while yeah, nobody will notice a single call vs jmp time save against a process spawn/exit, for a longjmp wrapper, it'll make it a few % faster (as would utilizing _Noreturn attributes for better register allocation: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097, which would also save a bit of codesize). Tailcalls can also save a bit of codesize if the target is near.
[Bug c/114097] New: Missed register optimization in _Noreturn functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097 Bug ID: 114097 Summary: Missed register optimization in _Noreturn functions Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Consider a never-returning function such as this:

    #include <stdio.h>
    #include <setjmp.h>
    //_Noreturn
    void noret(unsigned A, unsigned B, unsigned C, unsigned D, unsigned E, jmp_buf Jb){
        for(;A--;) puts("A");
        for(;B--;) puts("B");
        for(;C--;) puts("C");
        for(;D--;) puts("D");
        for(;E--;) puts("E");
        longjmp(Jb,1);
    }

https://godbolt.org/z/35YjrhjYq

In its prologue, gcc saves the arguments in call-preserved registers to preserve them around the puts calls, and it does so the usual way: by (1) pushing the old values of the call-preserved registers to the stack and (2) actually moving the arguments into the call-preserved registers.

    pushq %r15
    movq  %r9, %r15
    pushq %r14
    movl  %edi, %r14d
    pushq %r13
    movl  %esi, %r13d
    pushq %r12
    movl  %edx, %r12d
    pushq %rbp
    movl  %ecx, %ebp
    pushq %rbx
    movl  %r8d, %ebx
    pushq %rax
    //...

Since this function demonstrably never returns, step 1 can be entirely elided as the old values of the call-preserved registers won't ever need to be restored (desirably, gcc does not generate the would-be-dead restoration code):

    movq  %r9, %r15
    movl  %edi, %r14d
    movl  %esi, %r13d
    movl  %edx, %r12d
    movl  %ecx, %ebp
    movl  %r8d, %ebx
    pushq %rax
    //...

(Also desirable would be the unrealized tail-call optimization of the longjmp call in this case: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10837)
[Bug c/114011] New: Feature request: __goto__
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114011 Bug ID: 114011 Summary: Feature request: __goto__ Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Gcc has __volatile__. I can only assume the rationale for it is so that inline asm macros can do __asm __volatile__ and not have to worry about user-redefines of the volatile keyword (which while not quite approved by the standard, is sometimes practically useful). While the __asm syntax also allows the goto keyword, there's currently no __goto__ counterpart to __volatile__, which could similarly protect against goto redefines. Adding it is trivial and consistent with the already existing volatile/__volatile__ pair. Would you consider it?

( Why am I redefining goto? I'm basically doing it within the confines of a macro framework to force a static context check on gotos to prevent gotos out of scopes where doing it would be an error. Something like:

    enum { DISALLOW_GOTO_HERE = 0 }; //normally, goto is allowed
    #define goto while(_Generic((int(*)[!DISALLOW_GOTO_HERE])0, int(*)[1]:1)) goto //statically checked goto
    int main(void){
        goto next; next:; //OK, not disallowed in this context
    #if 0 //would fail to compile
        enum {DISALLOW_GOTO_HERE=1}; //disallowed in this context
        goto next2; next2:;
    #endif
    }

While this redefine does not syntactically disturb C, it does disturb `__asm goto()`, which I, unfortunately, have one very frequently used instance of, and since there's no way to suppress an object macro redefine, I'd like to be able to change it to `__asm __goto__` and have it peacefully coexist with the goto redefine. )
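A runnable sketch of the statically checked goto from the report, applied to a small function. The macro and the enum are taken from the report; `countDown` is a hypothetical user added for illustration. Note that redefining a keyword is only safe here because no standard header is included after the `#define`.

```c
#include <assert.h>

enum { DISALLOW_GOTO_HERE = 0 }; /* file scope: goto is allowed by default */
static_assert(DISALLOW_GOTO_HERE == 0, "goto is allowed at file scope");

/* Statically checked goto, as in the report: when DISALLOW_GOTO_HERE is
   nonzero in the current scope, the _Generic has no matching association
   and the goto fails to compile. */
#define goto while(_Generic((int(*)[!DISALLOW_GOTO_HERE])0, int(*)[1]:1)) goto

/* Hypothetical user: counts N down to zero with a goto-based loop. */
int countDown(int N)
{
    int Steps = 0;
again:
    if (N > 0) {
        N--;
        Steps++;
        goto again; /* OK: DISALLOW_GOTO_HERE is 0 in this scope */
    }
    return Steps;
}
```

Shadowing the enumerator with `enum { DISALLOW_GOTO_HERE = 1 };` inside a block would turn any `goto` in that block into a compile-time error, which is the static check the macro is after.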
[Bug c/112844] New: Branches under -Os (unlike -O{1,2,3}) do not respect __builtin_expect hints
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112844 Bug ID: 112844 Summary: Branches under -Os (unlike -O{1,2,3}) do not respect __builtin_expect hints Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

A simple example that demonstrates this is:

    int test(void);
    void yes(void);
    void expect_yes(void){ if (__builtin_expect(test(),1)) yes(); else {} }
    void expect_no(void){ if (__builtin_expect(test(),0)) yes(); else {} }

For an optimized x86-64 output, one should expect:
- a fall-through to a yes() tailcall for the expect_yes() case, preceded by a conditional jump to code doing a plain return
- a fall-through to a plain return for the expect_no() case, preceded by a conditional jump to a yes() tailcall (or even more preferably: a conditional tailcall to yes() with the needed stack adjustment done once before the test instead of being duplicated in each branch after the test)

Indeed, that's how gcc lays it out for -O{1,2,3} (https://godbolt.org/z/rG3P3d6f7) as does clang at -O{1,2,3,s} (https://godbolt.org/z/EcKbrn1b7) and icc at -O{1,2,3,s} (https://godbolt.org/z/Err73eGsb). But gcc at -Os seems to have a very strong preference for falling through to the yes() call even in

    void expect_no(void){ if (__builtin_expect(test(),0)) yes(); else {} }

and even in

    void expect_no2(void){ if (__builtin_expect(!test(),1)){} else yes(); }

essentially completely disregarding any user attempts at controlling the branch layout of the output.
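A self-contained variant of the example above that can be run as well as compiled. The `LIKELY`/`UNLIKELY` wrappers, the stub bodies of `test()` and `yes()`, and the `countYes` driver are hypothetical additions; the hints only influence code layout, so both functions must behave identically at run time.

```c
#include <assert.h>

#if defined(__GNUC__)
#  define LIKELY(X)   __builtin_expect(!!(X), 1)
#  define UNLIKELY(X) __builtin_expect(!!(X), 0)
#else
#  define LIKELY(X)   (!!(X)) /* assumption: no hint available, so it is dropped */
#  define UNLIKELY(X) (!!(X))
#endif

static int testResult; /* hypothetical stand-in for what test() reports */
static int yesCalls;   /* number of times yes() was reached */

int  test(void) { return testResult; }
void yes(void)  { yesCalls++; }

void expect_yes(void) { if (LIKELY(test()))   yes(); }
void expect_no(void)  { if (UNLIKELY(test())) yes(); }

/* Hypothetical driver: runs both variants and reports how often yes() ran. */
int countYes(int Result)
{
    testResult = Result;
    yesCalls = 0;
    expect_yes();
    expect_no();
    /* The hint changes branch layout only, never the outcome. */
    assert(yesCalls == (Result ? 2 : 0));
    return yesCalls;
}
```

To look at the layout difference described in the report, the stubs would have to be moved to another translation unit so that they are not inlined.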
[Bug ipa/106116] Missed optimization: in no_reorder-attributed functions, tail calls to the subsequent function could just be function-to-function fallthrough
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106116 --- Comment #4 from Petr Skocik --- It would be interesting to do this at the assembler level, effectively completely turning what's equivalent to `jmp 1f; 1:` to nothing. This would also be in line with the GNU assembler's apparent philosophy that jmp is a high-level variable-length instruction (either jmp, or jmpq, whichever is possible first => this could become: nothing, jmp, or jmpq). I have a bunch of multiparam functions with supporting functions structured as follows:

    void func_A(int A){ func_AB(A,DEFAULT_B); }
    void func_AB(int A, int B){ func_ABC(A,B,DEFAULT_C); }
    void func_ABC(int A, int B, int C){ func_ABCD(A,B,C,DEFAULT_D); }
    void func_ABCD(int A, int B, int C, int D){ //... }

which could size-wise benefit from eliding the jumps, turning them into fallthrus this way, but yeah, probably not worth the effort (unless somebody knows how to easily hack gas to do it).
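A compilable sketch of that wrapper chain, with each wrapper defined immediately before its target, which is the ordering a fallthrough would need. The default values, the `int` return type and the summing body of `func_ABCD` are assumptions made only so the chain has an observable result.

```c
#include <assert.h>

/* Hypothetical defaults; chosen to be distinct so mix-ups show in the result. */
enum { DEFAULT_B = 20, DEFAULT_C = 300, DEFAULT_D = 4000 };
static_assert(DEFAULT_B != DEFAULT_C && DEFAULT_C != DEFAULT_D,
              "distinct defaults make argument mix-ups visible");

int func_AB(int A, int B);
int func_ABC(int A, int B, int C);
int func_ABCD(int A, int B, int C, int D);

/* Each wrapper only appends one default and tail-calls the next function. */
int func_A(int A)                 { return func_AB(A, DEFAULT_B); }
int func_AB(int A, int B)         { return func_ABC(A, B, DEFAULT_C); }
int func_ABC(int A, int B, int C) { return func_ABCD(A, B, C, DEFAULT_D); }

/* Placeholder body standing in for the real work. */
int func_ABCD(int A, int B, int C, int D) { return A + B + C + D; }
```

With these defaults, `func_A(1)` evaluates to 4321.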
[Bug middle-end/109766] New: Passing doubles through the stack generates a stack adjustment pear each such argument at -Os/-Oz.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109766 Bug ID: 109766 Summary: Passing doubles through the stack generates a stack adjustment pear each such argument at -Os/-Oz. Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

    /*
      Passing doubles through the stack generates a stack adjustment per each such argument at -Os/-Oz.
      These stack adjustments are only coalesced at -O1/-O2/-O3, leaving -Os/-Oz with larger code.
    */
    #define $expr(...) (__extension__({__VA_ARGS__;}))
    #define $regF0 $expr(register double x __asm("xmm0"); x)
    #define $regF1 $expr(register double x __asm("xmm1"); x)
    #define $regF2 $expr(register double x __asm("xmm2"); x)
    #define $regF3 $expr(register double x __asm("xmm3"); x)
    #define $regF4 $expr(register double x __asm("xmm4"); x)
    #define $regF5 $expr(register double x __asm("xmm5"); x)
    #define $regF6 $expr(register double x __asm("xmm6"); x)
    #define $regF7 $expr(register double x __asm("xmm7"); x)

    void func(char const*Fmt, ...);
    void callfunc(char const*Fmt, double D0, double D1, double D2, double D3, double D4, double D5, double D6, double D7){
        func(Fmt,$regF0,$regF1,$regF2,$regF3,$regF4,$regF5,$regF6,$regF7,
             D0,D1,D2,D3,D4,D5,D6,D7);
        /*
        //gcc @ -Os/-Oz :
           0: 50                   push   %rax
           1: b0 08                mov    $0x8,%al
           3: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
           8: 66 0f d6 3c 24       movq   %xmm7,(%rsp)
           d: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          12: 66 0f d6 34 24       movq   %xmm6,(%rsp)
          17: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          1c: 66 0f d6 2c 24       movq   %xmm5,(%rsp)
          21: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          26: 66 0f d6 24 24       movq   %xmm4,(%rsp)
          2b: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          30: 66 0f d6 1c 24       movq   %xmm3,(%rsp)
          35: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          3a: 66 0f d6 14 24       movq   %xmm2,(%rsp)
          3f: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          44: 66 0f d6 0c 24       movq   %xmm1,(%rsp)
          49: 48 8d 64 24 f8       lea    -0x8(%rsp),%rsp
          4e: 66 0f d6 04 24       movq   %xmm0,(%rsp)
          53: e8 00 00 00 00       callq  58
                  54: R_X86_64_PLT32 func-0x4
          58: 48 83 c4 48          add    $0x48,%rsp
          5c: c3                   retq
        $sz(callfunc)=93

        //clang @ -Os/-Oz :
           0: 48 83 ec 48          sub    $0x48,%rsp
           4: f2 0f 11 7c 24 38    movsd  %xmm7,0x38(%rsp)
           a: f2 0f 11 74 24 30    movsd  %xmm6,0x30(%rsp)
          10: f2 0f 11 6c 24 28    movsd  %xmm5,0x28(%rsp)
          16: f2 0f 11 64 24 20    movsd  %xmm4,0x20(%rsp)
          1c: f2 0f 11 5c 24 18    movsd  %xmm3,0x18(%rsp)
          22: f2 0f 11 54 24 10    movsd  %xmm2,0x10(%rsp)
          28: f2 0f 11 4c 24 08    movsd  %xmm1,0x8(%rsp)
          2e: f2 0f 11 04 24       movsd  %xmm0,(%rsp)
          33: b0 08                mov    $0x8,%al
          35: e8 00 00 00 00       callq  3a
                  36: R_X86_64_PLT32 func-0x4
          3a: 48 83 c4 48          add    $0x48,%rsp
          3e: c3                   retq
        $sz(callfunc)=63

        //gcc @ -O1 :
           0: 48 83 ec 48          sub    $0x48,%rsp
           4: f2 0f 11 7c 24 38    movsd  %xmm7,0x38(%rsp)
           a: f2 0f 11 74 24 30    movsd  %xmm6,0x30(%rsp)
          10: f2 0f 11 6c 24 28    movsd  %xmm5,0x28(%rsp)
          16: f2 0f 11 64 24 20    movsd  %xmm4,0x20(%rsp)
          1c: f2 0f 11 5c 24 18    movsd  %xmm3,0x18(%rsp)
          22: f2 0f 11 54 24 10    movsd  %xmm2,0x10(%rsp)
          28: f2 0f 11 4c 24 08    movsd  %xmm1,0x8(%rsp)
          2e: f2 0f 11 04 24       movsd  %xmm0,(%rsp)
          33: b8 08 00 00 00       mov    $0x8,%eax
          38: e8 00 00 00 00       callq  3d
                  39: R_X86_64_PLT32 func-0x4
          3d: 48 83 c4 48          add    $0x48,%rsp
          41: c3                   retq
        $sz(callfunc)=66
        */
    }

https://godbolt.org/z/d8T3hxqWK
[Bug preprocessor/109704] New: #pragma {push,pop}_macro broken for identifiers that contain dollar signs at nonfirst positions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704 Bug ID: 109704 Summary: #pragma {push,pop}_macro broken for identifiers that contain dollar signs at nonfirst positions Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: preprocessor Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

The following dollarsign-less example compiles fine as expected:

    #define MACRO 1
    _Static_assert(MACRO,"");
    #pragma push_macro("MACRO")
    #undef MACRO
    #define MACRO 0
    _Static_assert(!MACRO,"");
    #pragma pop_macro("MACRO")
    _Static_assert(MACRO,""); //OK

Substituting $MACRO for MACRO still works, but with MACRO$ or M$CRO the final assertions fail: https://godbolt.org/z/n1EoGao74
[Bug tree-optimization/93265] memcmp comparisons of structs wrapping a primitive type not as compact/efficient as direct comparisons of the underlying primitive type under -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93265 --- Comment #3 from Petr Skocik --- Here's another example (which may summarize it more nicely):

    struct a{ char _[4]; };
    #include <string.h>
    int cmp(struct a A, struct a B){ return !!memcmp(&A,&B,4); }

Expected x86-64 codegen (✓ for gcc -O2/-O3 and for clang -Os/-O2/-O3):

    xor eax, eax
    cmp edi, esi
    setne al
    ret

gcc -Os codegen:

    subq   $24, %rsp
    movl   $4, %edx
    movl   %edi, 12(%rsp)
    leaq   12(%rsp), %rdi
    movl   %esi, 8(%rsp)
    leaq   8(%rsp), %rsi
    call   memcmp
    testl  %eax, %eax
    setne  %al
    addq   $24, %rsp
    movzbl %al, %eax
    ret

https://godbolt.org/z/G5eE5GYv4
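For comparison, a sketch that puts the memcmp form next to a hand-written comparison of the underlying 32-bit value, which is the codegen the comment expects the compiler to reach on its own. `cmpDirect` is a hypothetical name introduced here; it uses memcpy to read the bytes so that no aliasing rules are violated.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct a { char _[4]; };
static_assert(sizeof(struct a) == sizeof(uint32_t), "struct a wraps exactly 4 bytes");

/* The form from the comment: nonzero iff the two structs differ. */
int cmp(struct a A, struct a B) { return !!memcmp(&A, &B, 4); }

/* Hypothetical hand-written equivalent: load both structs as one
   32-bit integer each and compare those. */
int cmpDirect(struct a A, struct a B)
{
    uint32_t X, Y;
    memcpy(&X, &A, sizeof X);
    memcpy(&Y, &B, sizeof Y);
    return X != Y;
}
```

Both functions return 0 for equal structs and 1 otherwise, so an optimizer is free to compile them to the same instructions.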
[Bug c/94379] Feature request: like clang, support __attribute((__warn_unused_result__)) on structs, unions, and enums
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94379 --- Comment #2 from Petr Skocik --- Excellent! For optional super extra coolness, this might work (and clang doesn't do this) with statement expressions too so that statement expression-based macros could be marked warn_unused_result through it too.

    typedef struct __attribute((__warn_unused_result__)) { int x; } wur_retval_t;
    wur_retval_t foo(void){ int x=41; return (wur_retval_t){x+1}; }
    #define foo_macro() ({ int x=41; (wur_retval_t){x+1}; })
    void use(void){
        foo();       //warn unused result ✓
        foo_macro(); //perhaps should "warn unused result" too?
    }
[Bug c/109567] New: Useless stack adjustment by 16 around calls with odd stack-argument counts on SysV x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109567 Bug ID: 109567 Summary: Useless stack adjustment by 16 around calls with odd stack-argument counts on SysV x86_64 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

For function calls with odd stack argument counts, gcc generates a useless `sub $16, %rsp` at the beginning of the calling function. Example (https://godbolt.org/z/Y4ErE8ee9):

    #include <stdio.h>
    int callprintf_0stk(char const*Fmt){ return printf(Fmt,0,0,0,0,0),0; }
    int callprintf_1stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1),0; } //useless sub $0x10,%rsp
    int callprintf_2stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1,2),0; }
    int callprintf_3stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1,2,3),0; } //useless sub $0x10,%rsp
    int callprintf_4stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1,2,3,4),0; }
    int callprintf_5stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1,2,3,4,5),0; } //useless sub $0x10,%rsp
    int callprintf_6stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1,2,3,4,5,6),0; }
    int callprintf_7stk(char const *Fmt){ return printf(Fmt,0,0,0,0,0, 1,2,3,4,5,6,7),0; } //useless sub $0x10,%rsp
[Bug middle-end/108799] Improper deprecation diagnostic for rsp clobber
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108799 Petr Skocik changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #3 from Petr Skocik --- Very good question. The deprecation of SP clobbers could use some explanation if there are indeed good reasons for it. IMO, if listing the SP as a clobber both (1) forces a frame pointer with frame-pointer-relative addressing of spills (and the frame pointer isn't clobbered too) and (2) avoids the use of the red zone (and it absolutely should continue to do both of these things in my opinion) then gcc shouldn't need to care about redzone clobbers (as in the `pushf;pop` example) or even a wide class of stack pointer changes (assembly-made stack allocation and frees) just as long as no spills made by the compiler are clobbered (or opened to being clobbered from signal handlers) by such head-of-the-stack manipulation. Even with assembly-less standard C that uses VLAs or allocas, gcc cannot count on being in control of the stack pointer anyway, so why be so fussy about it when something as expert-oriented as inline assembly tries to manipulate it?
[Bug c/108194] GCC won't treat two compatible function types as compatible if any of them (or both of them) is declared _Noreturn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108194 --- Comment #6 from Petr Skocik --- (In reply to Petr Skocik from comment #5)
> (In reply to Andrew Pinski from comment #4)
> > Invalid as mentioned in r13-3135-gfa258f6894801a .
>
> I believe it's still a bug for pre-c2x __typeof.
> While it is GCC's prerogative to include _Noreturn/__attribute((noreturn))
> into the type for its own __typeof (which, BTW, I think is better design
> than the standardized semantics), I think two otherwise compatible function
> types should still remain compatible if they both either have or don't have
> _Noreturn/__attribute((noreturn)). But treating `_Noreturn void
> NR_FN_A(void);`
> as INcompatible with `_Noreturn void NR_FN_B(void);` that's just wonky, IMO.

OK, the bug was MINE after all. For bug report archeologists: I was doing what was meant to be a full (qualifiers-including) type comparison wrong. While something like _Generic((__typeof(type0)*)0, __typeof(type1)*:1, default:0) suffices to get around _Generic dropping qualifiers (const/volatile/_Atomic) in its controlling expression, for function pointer types at a single pointer layer of indirection, the _Noreturn attribute will still get dropped in the controlling expression of _Generic (I guess that makes sense because they're much more closely related to functions than another pointer type would be to its target type), and another pointer layer of indirection is required, as in `_Generic((__typeof(type0)**)0, __typeof(type1)**:1, default:0)`. Thank you all very much, especially jos...@codesourcery.com, who pointed me (pun intended) to the right solution over email. :)
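The two-level form described above can be wrapped in a macro. A minimal sketch, assuming a compiler that provides `__typeof` (gcc and clang do); the `SAME_TYPE` name is hypothetical. The noreturn case is deliberately left out of the assertions, since the comments above indicate gcc and clang disagree on it.

```c
#include <assert.h>

/* Hypothetical helper: 1 if the two types are compatible including
   qualifiers, 0 otherwise. Two pointer layers are used so that nothing
   about the compared type is dropped from the controlling expression. */
#define SAME_TYPE(Type0, Type1) \
    _Generic((__typeof(Type0)**)0, __typeof(Type1)**: 1, default: 0)

static_assert(SAME_TYPE(int, int), "identical types match");
static_assert(!SAME_TYPE(int, const int), "qualifiers are not dropped");
static_assert(!SAME_TYPE(int, long), "distinct types do not match");
static_assert(SAME_TYPE(void(void), void(void)), "function types can be compared too");

/* Hypothetical runtime wrapper so the result can also be inspected at run time. */
int sameConstInt(void) { return SAME_TYPE(const int, const int); }
```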
[Bug c/108194] GCC won't treat two compatible function types as compatible if any of them (or both of them) is declared _Noreturn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108194 Petr Skocik changed: What|Removed |Added Resolution|INVALID |FIXED --- Comment #5 from Petr Skocik --- (In reply to Andrew Pinski from comment #4) > Invalid as mentioned in r13-3135-gfa258f6894801a . I believe it's still a bug for pre-c2x __typeof. While it is GCC's prerogative to include _Noreturn/__attribute((noreturn)) into the type for its own __typeof (which, BTW, I think is better design than the standardized semantics), I think two otherwise compatible function types should still remain compatible if they both either have or don't have _Noreturn/__attribute((noreturn)). But treating `_Noreturn void NR_FN_A(void);` as INcompatible with `_Noreturn void NR_FN_B(void);` that's just wonky, IMO.
[Bug c/108194] New: GCC won't treat two compatible function types as compatible if any of them (or both of them) is declared _Noreturn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108194 Bug ID: 108194 Summary: GCC won't treat two compatible function types as compatible if any of them (or both of them) is declared _Noreturn Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

(same with __attribute((noreturn))) Example (https://godbolt.org/z/ePGd95sWz):

    void FN_A(void);
    void FN_B(void);
    _Noreturn void NR_FN_A(void);
    _Noreturn void NR_FN_B(void);
    _Static_assert(_Generic((__typeof(*(FN_A))*){0}, __typeof(*(FN_B))*: 1), ""); //OK ✓
    _Static_assert(_Generic((__typeof(*(NR_FN_A))*){0}, __typeof(*(NR_FN_B))*: 1), ""); //ERROR ✗
    _Static_assert(_Generic((__typeof(*(FN_A))*){0}, __typeof(*(NR_FN_B))*: 1), ""); //ERROR ✗

As you can see from the Compiler Explorer link, clang accepts all three, which is as it should be as per the standard, where _Noreturn is a function specifier (https://port70.net/~nsz/c/c11/n1570.html#6.7.4), which means it shouldn't even go into the type. (Personally, I don't even mind it going into the type just as long as two otherwise identical _Noreturn function declarations are deemed as having the same type.)

Regards, Petr Skocik
[Bug c/107831] Missed optimization: -fclash-stack-protection causes unnecessary code generation for dynamic stack allocations that are clearly less than a page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107831 --- Comment #9 from Petr Skocik --- Regarding the size of alloca/VLA-generated code under -fstack-clash-protection. I've played with this a little bit and while I love the feature, the code size increases seem quite significant and unnecessarily so. Take a simple

    void ALLOCA_C(size_t Sz){ char buf[Sz]; asm volatile ("" : : "r"(&buf[0])); }

    gcc -fno-stack-clash-protection: 17 bytes
    gcc -fstack-clash-protection: 72 bytes

clang manages with less of an increase:

    -fno-stack-clash-protection: 26 bytes
    -fstack-clash-protection: 45 bytes

Still this could be as low as 11 bytes for the -fstack-clash-protection version (less than for the unprotected one!) all by using a simple call to an assembly function, whose code can be no-clobber without much extra effort. Linked in compiler explorer is a crack at the idea along with benchmarks: https://godbolt.org/z/f8rhG1ozs

The performance impact of the call seems negligible (practically less than 1ns, though in the above quick-and-dirty benchmark it fluctuates a tiny bit, sometimes even giving the non-inline version an edge).

I originally suggested popping the return address off the stack and repushing it before returning. Ended up just repushing -- the old return address becomes part of the alloca allocation. The concern that this could mess up the return stack buffer of the CPU seems valid but all the benchmarks indicate it doesn't--not even when the ret address is popped--just as long as the return target address is the same. (When it isn't, the performance penalty is rather significant: I measured a 19-times slowdown for comparison (it's also in the linked benchmarks)).

The (x86-64) assembly function:

    #define STR(...) STR__(__VA_ARGS__) //{{{
    #define STR__(...) #__VA_ARGS__ //}}}
    asm(STR(
    .global safeAllocaAsm;
    safeAllocaAsm: //no clobber, though does expect 16-byte aligned at entry as usual
        push %r10;
        cmp $16, %rdi;
        ja .LsafeAllocaAsm__test32;
        push 8(%rsp);
        ret;
    .LsafeAllocaAsm__test32:
        push %r10;
        push %rdi;
        mov %rsp, %r10;
        sub $17, %rdi;
        and $-16, %rdi; //(-32+15)&(-16) //subtract the 32 and 16-align, rounding up
        jnz .LsafeAllocaAsm__probes;
    .LsafeAllocaAsm__ret:
        lea (3*8)(%r10,%rdi,1), %rdi;
        push (%rdi);
        mov -8(%rdi), %r10;
        mov -16(%rdi), %rdi;
        ret;
    .LsafeAllocaAsm__probes:
        sub %rdi, %r10; //r10 is the desired rsp
    .LsafeAllocaAsm__probedPastDesiredSpEh:
        cmp %rsp, %r10;
        jge .LsafeAllocaAsm__pastDesiredSp;
        orl $0x0,(%rsp);
        sub $0x1000,%rsp;
        jmp .LsafeAllocaAsm__probedPastDesiredSpEh;
    .LsafeAllocaAsm__pastDesiredSp:
        mov %r10, %rsp; //set the desired sp
        jmp .LsafeAllocaAsm__ret;
    .size safeAllocaAsm, .-safeAllocaAsm;
    ));

Cheers, Petr Skocik
[Bug c/107831] Missed optimization: -fclash-stack-protection causes unnecessary code generation for dynamic stack allocations that are clearly less than a page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107831 --- Comment #7 from Petr Skocik --- (In reply to Jakub Jelinek from comment #4)
> Say for
> void bar (char *);
> void
> foo (int x, int y)
> {
>   __attribute__((assume (x < 64)));
>   for (int i = 0; i < y; ++i)
>     bar (__builtin_alloca (x));
> }
> all the alloca calls are known to be small, yet they can quickly cross pages.
> Similarly:
> void
> baz (int x)
> {
>   if (x >= 512) __builtin_unreachable ();
>   char a[x];
>   bar (a);
>   char b[x];
>   bar (b);
>   char c[x];
>   bar (c);
>   char d[x];
>   bar (d);
>   char e[x];
>   bar (e);
>   char f[x];
>   bar (f);
>   char g[x];
>   bar (g);
>   char h[x];
>   bar (h);
>   char i[x];
>   bar (i);
>   char j[x];
>   bar (j);
> }
> All the VLAs here are small, yet together they can cross a page.
> So, we'd need to punt for dynamic allocations in loops and for others
> estimate the maximum size of all the allocations together (+ __builtin_alloca
> overhead + normal frame size).

I think this shouldn't need probes either (unless you tried to coalesce the allocations) on architectures where making a function call touches the stack. Also, allocas of less than or equal to half a page intertwined with writes anywhere to the allocated blocks should always be safe (but I guess I'll just turn stack-clash-protection off in the one file where I'm making such clearly safe dynamic stack allocations).
[Bug c/107831] Missed optimization: -fclash-stack-protection causes unnecessary code generation for dynamic stack allocations that are clearly less than a page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107831 --- Comment #6 from Petr Skocik --- (In reply to Jakub Jelinek from comment #2)
> (In reply to Petr Skocik from comment #1)
> > Sidenote regarding the stack-allocating code for cases when the size is not
> > known to be less than pagesize: the code generated for those cases is quite
> > large. It could be replaced (at least under -Os) with a call to a special
> > assembly function that'd pop the return address (assuming the target machine
> > pushes return addresses to the stack), adjust and allocate the
> > stack size in a piecemeal fashion so as to not skip guard pages, then repush
> > the return address and return to caller with the stacksize expanded.
>
> You certainly don't want to kill the return stack the CPU has, even if it
> results in a few saved bytes for -Os.

That's a very interesting point because I have written x86_64 assembly "functions" that did pop the return address, pushed something to the stack, and then repushed the return address and returned. In a loop, it doesn't seem to perform badly compared to inline code, so I figure it shouldn't be messing with the return stack buffer. After all, even though the return happens through a different place in the callstack, it's still returning to the original caller. The one time I absolutely must have accidentally messed with the return stack buffer was when I wrote a context switching routine and originally tried to "ret" to the new context. It turned out to be very measurably many times slower than `pop %rcx; jmp *%rcx;` (also measured on a loop), so that's why I think popping a return address, allocating on the stack, and then pushing and returning is not really a performance killer (on my Intel CPU anyway). If it was messing with the return stack buffer, I think I would be getting similar slowdowns to what I got with the context switching code trying to `ret`.
[Bug c/107831] Missed optimization: -fclash-stack-protection causes unnecessary code generation for dynamic stack allocations that are clearly less than a page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107831 --- Comment #1 from Petr Skocik --- Sidenote regarding the stack-allocating code for cases when the size is not known to be less than pagesize: the code generated for those cases is quite large. It could be replaced (at least under -Os) with a call to a special assembly function that'd pop the return address (assuming the target machine pushes return addresses to the stack), adjust and allocate the stack size in a piecemeal fashion so as to not skip guard pages, then repush the return address and return to the caller with the stack size expanded.
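The "piecemeal" part can be illustrated with a purely arithmetic model in C. This does not touch the real stack; it only simulates lowering a stack pointer in steps of at most one page and reports the largest step taken, which is what decides whether a guard page could be skipped. The function name and the page size passed in are assumptions for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Models allocating Size bytes by lowering the stack pointer in steps of
   at most Page bytes, with a probe (a write to the new top of the stack)
   after each step. Returns the largest distance covered between two
   consecutive probes. */
size_t maxUnprobedStep(size_t Size, size_t Page)
{
    assert(Page > 0);
    size_t MaxStep = 0;
    while (Size > 0) {
        size_t Step = Size < Page ? Size : Page;
        Size -= Step; /* models lowering the stack pointer by Step bytes */
        /* a real implementation would touch the new top of the stack here */
        if (Step > MaxStep)
            MaxStep = Step;
    }
    return MaxStep;
}
```

Because the result never exceeds `Page`, a guard region at least one page wide cannot be jumped over without one of the probes landing in it.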
[Bug c/107831] New: Missed optimization: -fclash-stack-protection causes unnecessary code generation for dynamic stack allocations that are clearly less than a page
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107831 Bug ID: 107831 Summary: Missed optimization: -fclash-stack-protection causes unnecessary code generation for dynamic stack allocations that are clearly less than a page Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- I'm talking allocations such as char buf [ (uint8_t)size ]; The resulting code for this should ideally be the same with or without -fstack-clash-protection as this can clearly never skip a whole page. But gcc generates a big loop trying to touch every page-sized subpart of that allocation. https://godbolt.org/z/G8EbzbshK
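A runnable sketch of such a bounded allocation. The function name is hypothetical, and the `+ 1` is an addition made for this sketch so that the VLA length is never zero; with it, the allocation is at most 256 bytes, well under the 4096-byte page size assumed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAX_SMALL_VLA = UINT8_MAX + 1 }; /* 256 bytes */

/* Allocates a VLA whose length is provably bounded by the uint8_t cast
   and returns the number of bytes actually allocated. */
size_t smallVlaSize(size_t Size)
{
    /* +1 (not in the report) keeps the length nonzero when (uint8_t)Size is 0. */
    char Buf[(uint8_t)Size + 1];
    memset(Buf, 0, sizeof Buf);
    assert(sizeof Buf <= MAX_SMALL_VLA);
    return sizeof Buf;
}
```

Whatever `Size` is passed, the cast reduces it modulo 256, so `smallVlaSize(300)` allocates 45 bytes.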
[Bug c/106116] New: Missed optimization: in no_reorder-attributed functions, tail calls to the subsequent function could just be function-to-function fallthrough
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106116 Bug ID: 106116 Summary: Missed optimization: in no_reorder-attributed functions, tail calls to the subsequent function could just be function-to-function fallthrough Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Example:

    __attribute((noinline,no_reorder)) int fnWithExplicitArg(int ExplicitArg);
    __attribute((noinline,no_reorder)) int fnWithDefaultArg(void){ return fnWithExplicitArg(42); }
    int fnWithExplicitArg(int ExplicitArg){
        int useArg(int);
        return 12+useArg(ExplicitArg);
    }

Generated fnWithDefaultArg:

    fnWithDefaultArg:
        mov edi, 42
        jmp fnWithExplicitArg
    fnWithExplicitArg:
        //...

Desired fnWithDefaultArg:

    fnWithDefaultArg:
        mov edi, 42
        //fallthru
    fnWithExplicitArg:
        //...

https://gcc.godbolt.org/z/Ph3onxoh9
[Bug target/85927] ud2 instruction generated starting with gcc 8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85927 Petr Skocik changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #5 from Petr Skocik --- I think it'd be more welcome if gcc just put nothing there like clang does.
[Bug c/102096] New: Gcc unnecessarily initializes indeterminate variables passed across function boundaries
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102096 Bug ID: 102096 Summary: Gcc unnecessarily initializes indeterminate variables passed across function boundaries Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Compared to clang, where:

    long ret_unspec(void){ auto long rv; return rv; }
    void take6(long,long,long,long,long,long);
    void call_take6(void)
    {
        //6 unnecessary XORs on GCC
        auto long id0; //indeterminate
        auto long id1; //indeterminate
        auto long id2; //indeterminate
        auto long id3; //indeterminate
        auto long id4; //indeterminate
        auto long id5; //indeterminate
        take6(id0,id1,id2,id3,id4,id5);
    }

yields (x86_64):

    ret_unspec:                 # @ret_unspec2
        retq
    call_take6:                 # @call_take6
        jmp take6

(1+5 bytes), GCC compiles the above to

    ret_unspec2:
        xorl %eax, %eax
        ret
    call_take6:
        xorl %r9d, %r9d
        xorl %r8d, %r8d
        xorl %ecx, %ecx
        xorl %edx, %edx
        xorl %esi, %esi
        xorl %edi, %edi
        jmp take6

(3+19 bytes), unnecessarily 0-initializing the indeterminate return-value/arguments.

Type casting the called function can often be hackishly used to get the same assembly, but doing so is technically UB and not as generic as supporting the passing of unspecified arguments/return values, which can be used to omit argument register initializations not just for arguments at the end of an argument pack but also in the middle.

TL;DR: Allowing indeterminate variables to be passed/returned without generating initializing code for them would be nice. Clang already does it.
[Bug c/98418] Valid integer constant expressions based on expressions that trigger -Wshift-overflow are treated as non-constants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98418 pskocik at gmail dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #3 from pskocik at gmail dot com --- You're right. The bug was in my code. struct foo { unsigned bit: (0xll<<40)!=0; }; is indeed UB due to http://port70.net/~nsz/c/c11/n1570.html#6.5.7p4, but struct foo { unsigned bit: (0xull<<40)!=0; }; isn't and GCC accepts it without complaint. Apologies for the false alarm.
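As a small illustration of the distinction drawn above: a left shift of an unsigned operand is always defined (the bits shifted out are simply discarded), so it stays a valid integer constant expression. The constant and the names `foo_unsigned` and `roundtrip_bit` below are hypothetical stand-ins, not the ones from the report.

```c
#include <assert.h>

/* Unsigned left shift: the result is reduced modulo 2^N (C11 6.5.7p4),
   so the width expression is a valid integer constant expression. */
struct foo_unsigned {
    unsigned bit : (0xffffffffffffffffull << 40) != 0;   /* width evaluates to 1 */
};

/* Hypothetical helper: stores v in the one-bit field and reads it back. */
static unsigned roundtrip_bit(unsigned v)
{
    struct foo_unsigned f = { 0 };
    f.bit = v;               /* only the low bit is kept */
    assert(f.bit <= 1);
    return f.bit;
}
```

With a signed operand of the same magnitude, the shift would overflow and fall under the undefined-behavior clause cited above.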
[Bug c/98418] Valid integer constant expressions based on expressions that trigger -Wshift-overflow are treated as non-constants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98418 --- Comment #2 from pskocik at gmail dot com --- You're right. The bug was in my code. struct foo { unsigned bit: (0xll<<40)!=0; }; is indeed UB due to http://port70.net/~nsz/c/c11/n1570.html#6.5.7p4, but struct foo { unsigned bit: (0xull<<40)!=0; }; isn't and GCC accepts it without complaint. Apologies for the false alarm.
[Bug c/98418] New: Valid integer constant expressions based on expressions that trigger -Wshift-overflow are treated as non-constants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98418 Bug ID: 98418 Summary: Valid integer constant expressions based on expressions that trigger -Wshift-overflow are treated as non-constants Product: gcc Version: 6.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- This causes things like: struct foo { unsigned bit: (0xll<<40)!=0; }; to elicit a -pedantic warning about the bitfield width not being a proper integer constant expression, even though it is. In other contexts, a complete compilation error might ensue: extern int bar[ (0xll<<40)!=0 ]; //seen as an invalid VLA https://gcc.godbolt.org/z/7zfz96 Neither clang nor gcc <= 5 appear to have this bug. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93241 seems related.
[Bug c/96625] New: Unnecessarily large assembly generated when a bit-offsetted higher-end end of a uint64_t-backed bitfield is shifted toward the high end (left) by its bit-offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96625 Bug ID: 96625 Summary: Unnecessarily large assembly generated when a bit-offsetted higher-end end of a uint64_t-backed bitfield is shifted toward the high end (left) by its bit-offset Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

(Bitfields backed by 32-bit unsigneds are handled well.)

My example (https://gcc.godbolt.org/z/Yac38T):

    #include <stdint.h>
    #define FRONTSZ 3
    #define UTYPE uint64_t
    struct s{ union { UTYPE whole; struct { UTYPE front:FRONTSZ, tail:8*sizeof(UTYPE)-FRONTSZ; }; };};
    UTYPE hiShifted_tail(struct s X) { return X.tail<>FRONTSZ<

(14 bytes):

    0:  48 b8 f8 ff ff ff ff ff ff 1f   movabs rax,0x1ffffffffffffff8
    a:  48 21 f8                        and    rax,rdi
    d:  c3                              ret

(8 bytes):

    0:  48 89 f8                        mov    rax,rdi
    3:  48 83 e0 f8                     and    rax,0xfffffffffffffff8
    7:  c3                              ret

The codegen follows the same pattern for other front-sizes. hiShifted_tail() on clang (regardless of whether uint64_t or uint32_t is used as the backing type) and on gcc with uint32_t rather than uint64_t used as the bitfield-backing type follows the smaller codegen pattern of hiShifted_tail{2,3}.
[Bug c/96420] New: -Wsign-extensions warnings are generated from system header macros
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96420 Bug ID: 96420 Summary: -Wsign-extensions warnings are generated from system header macros Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Gcc doesn't silence -Wsign-conversion warnings in the expansion of system-header macros (e.g., in the expansion of Musl's/Cygwin's FD_SET), unlike other warnings in system-header macros. E.g.,

    #include <sys/select.h>
    void f(int X)
    {
        fd_set set;
        FD_ZERO(&set);
        FD_SET(X,&set);
        FD_CLR(X+1,&set);
        (void)FD_ISSET(X+2,&set);
    }

generates -Wsign-conversion warnings when compiled with musl-gcc or with gcc on Cygwin.

Arguably, this should be fixed in the respective C libs, but the treatment of -Wsign-conversion in system-header macro expansion does seem inconsistent with that of other warnings in that context.
[Bug c/95857] New: Silencing an unused label warning with (void)&&label can make gcc segfault
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95857 Bug ID: 95857 Summary: Silencing an unused label warning with (void)&&label can make gcc segfault Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Created attachment 48777 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48777&action=edit
preprocessed reproducer that crashes gcc >= 8.1 at -O2/-O3/-Os

In certain more complex contexts and with optimization on (>= -O2), silencing -Wunused-label warnings with (void)&&label will make gcc segfault. The attached example ( https://gcc.godbolt.org/z/iEhgL2 ), obtained with creduce, crashes gcc >= 8.1 when compiled at -O2/-O3/-Os. I haven't observed the bug in older versions of gcc.
[Bug c/95126] New: Missed opportunity to turn static variables into immediates
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95126 Bug ID: 95126 Summary: Missed opportunity to turn static variables into immediates Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Example: For:

    struct small{ short a,b; signed char c; };
    void call_func(void)
    {
        extern int func(struct small X);
        static struct small const s = { 1,2,0 };
        func(s);
    }

clang renders (x86_64):

    call_func:
       0:  bf 01 00 02 00          mov    edi,0x20001
       5:  e9 00 00 00 00          jmp    a
                6: R_X86_64_PLT32  func-0x4

whereas gcc renders:

    call_func:
       0:  0f b7 3d 00 00 00 00    movzx  edi,WORD PTR [rip+0x0]   # 7
                3: R_X86_64_PC32   .rodata-0x2
       7:  0f b7 05 00 00 00 00    movzx  eax,WORD PTR [rip+0x0]   # e
                a: R_X86_64_PC32   .rodata-0x4
       e:  48 c1 e7 10             shl    rdi,0x10
      12:  48 09 f8                or     rax,rdi
      15:  0f b7 3d 00 00 00 00    movzx  edi,WORD PTR [rip+0x0]   # 1c
               18: R_X86_64_PC32   .rodata
      1c:  48 c1 e7 20             shl    rdi,0x20
      20:  48 09 c7                or     rdi,rax
      23:  e9 00 00 00 00          jmp    28
               24: R_X86_64_PLT32  func-0x4

https://gcc.godbolt.org/z/Qxq6Rh
[Bug middle-end/94703] Small-sized memcpy leading to unnecessary register spillage unless done through a dummy union
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94703 --- Comment #11 from pskocik at gmail dot com ---

Thanks for the shot at a fix, Richard Biener. Since I have reported this, I think I should mention a related suboptimality that should probably get fixed alongside this one (if this one is getting fixed), namely that while

    int64_t zextend_int_to_int64_nospill(int *X)
    {
        union { int64_t _; } r = {0};
        return memcpy(&r._,X,sizeof(*X)),r._;
    }

(and hopefully later even

    int64_t zextend_int_to_int64_spill(int *X)
    {
        int64_t r = {0};
        return memcpy(&r,X,sizeof(*X)),r;
    }

) generates, on x86_64, the optimal

    zextend_int_to_int64_nospill:
        mov eax, DWORD PTR [rdi]
        ret

for zero-extending promotions of sub-int types, an extra xor instruction gets generated, e.g.:

    int64_t zextend_short_to_int64_nospill_but_suboptimal(short *X)
    {
        union { int64_t _; } r ={0};
        return memcpy(&r._,X,sizeof(*X)),r._;
    }

=>

    zextend_short_to_int64_nospill_but_suboptimal:
        xor eax, eax
        mov ax, WORD PTR [rdi]
        ret

which was surprising to me because it doesn't happen with zero-extending memcpy-based promotion from {,u}ints to larger types ({,u}{,l}longs).

https://gcc.godbolt.org/z/ZjXaCw
[Bug c/94703] New: Small-sized memcpy leading to unnecessary register spillage unless done through a dummy union
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94703 Bug ID: 94703 Summary: Small-sized memcpy leading to unnecessary register spillage unless done through a dummy union Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

The problem, demonstrated in code examples below, can be suppressed by memcpying into a union (possibly just a one-member union), but that seems like a silly workaround that shouldn't be required.

Examples:

    #include <stdint.h>
    #include <string.h>

    uint64_t get4_1(void const *X)
    {
        //spills
        uint64_t r = 0; memcpy(&r,X,4); return r;
    }
    uint64_t get4_nospill(void const *X)
    {
        //doesn't spill
        union { uint64_t u64; } u = {0}; memcpy(&u,X,sizeof(uint32_t)); return u.u64;
    }
    uint64_t get2_1(void const *X)
    {
        //spills
        uint64_t r = 0; memcpy(&r,X,2); return r;
    }
    uint64_t get2_nospill(void const *X)
    {
        //doesn't spill
        union { uint64_t u64; } u = {0}; memcpy(&u,X,sizeof(uint16_t)); return u.u64;
    }

    void backend(void const*Src, size_t Sz);
    static inline void valInPtrInl(void *Src, size_t Sz)
    {
        if(Sz<=sizeof(void const*)){
            #if 1 //spills
                void const*inlSrc; memcpy(&inlSrc,Src,Sz); backend(inlSrc,Sz); return;
            #else //doesn't spill
                union{ void const*inlSrc; } u; memcpy(&u,Src,Sz); backend(u.inlSrc,Sz); return;
            #endif
        }
        backend(Src,Sz); return;
    }
    void valInPtr(int X) { valInPtrInl(&X,sizeof(X)); }

GCC 9.3 output on x86_64:

    get4_1:
        mov QWORD PTR [rsp-8], 0
        mov eax, DWORD PTR [rdi]
        mov DWORD PTR [rsp-8], eax
        mov rax, QWORD PTR [rsp-8]
        ret
    get4_nospill:
        mov eax, DWORD PTR [rdi]
        ret
    get2_1:
        mov QWORD PTR [rsp-8], 0
        movzx eax, WORD PTR [rdi]
        mov WORD PTR [rsp-8], ax
        mov rax, QWORD PTR [rsp-8]
        ret
    get2_nospill:
        xor eax, eax
        mov ax, WORD PTR [rdi]
        ret
    valInPtr:
        mov DWORD PTR [rsp-16], edi
        mov rdi, QWORD PTR [rsp-16]
        mov esi, 4
        jmp backend

Clang 3.1 output on x86_64:

    get4_1:                     # @get4_1
        mov EAX, DWORD PTR [RDI]
        ret
    get4_nospill:               # @get4_nospill
        mov EAX, DWORD PTR [RDI]
        ret
    get2_1:                     # @get2_1
        movzx EAX, WORD PTR [RDI]
        ret
    get2_nospill:               # @get2_nospill
        movzx EAX, WORD PTR [RDI]
        ret
    valInPtr:                   # @valInPtr
        mov EDI, EDI
        mov ESI, 4
        jmp backend             # TAILCALL

https://gcc.godbolt.org/z/rwq2UY
[Bug middle-end/93487] Missed tail-call optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93487 --- Comment #3 from pskocik at gmail dot com ---

The gist of this along with https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93540 is "please make trivial aggregates (i.e., aggregates, which are ultimately a native type) a true zero-cost abstraction".

I feel like we shouldn't have to pay for `struct Int { int _; };` (or a union of int with some <= types) over `int`, but on gcc (contrast with clang), you effectively have to:

    int intfunc(void);
    int intfuncwrap(void) { return intfunc(); }
    =>
        jmp 5

    struct Int { int x; };
    struct Int intfuncwrap2(void) { return (struct Int){intfunc()}; }
    =>
        push rax
        call 6
        pop rdx
        ret

Clang has been doing this right since clang 3 (and Compiler Explorer doesn't have an older version): https://gcc.godbolt.org/z/VSUHs_ .

Here's a related, but opposite, example where a trivial (one-member) union gets optimized better than its contained type when used directly: https://gcc.godbolt.org/z/egXRjJ .

These trivial type wrappings shouldn't affect codegen positively or negatively, but they do on gcc.
[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #38 from pskocik at gmail dot com ---

I like this behavior. I use (void) casts to suppress warnings about unused parameters and variables, but I'd rather suppressing WUR weren't that simple because of functions whose return result represents an allocated resource (allocated memory, FILE, file descriptor, etc.), in which case the suppression is erroneous in 99% of cases.

Of course, WUR is also useful as an aid in enforcing consistent error checking, but a codebase using WUR like that might as well define a custom IGNORE macro (which assigns the result to a properly typed temporary and then voids it) and make sure such a macro only works on return values which are truly safe to ignore (e.g., rather than returning plain int, long, etc., you might return struct ignorable_int { int ignorable_retval; };, struct ignorable_long { long ignorable_retval; }, etc. and have your ignore macro try to access the specifically named member).

(An ability to directly attach WUR to such types, which clang has but gcc currently doesn't (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94379), would also go nicely with this un-void-able WURs feature (although WURs are void-able on clang)).
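The IGNORE idea described above could be sketched roughly as follows. This is only one possible shape: the macro body, the temporary `ignored_`, and the function `bump` are hypothetical; only the `ignorable_int`/`ignorable_retval` naming comes from the comment.

```c
#include <assert.h>

/* "Safe to ignore" wrapper type, following the naming suggested above. */
struct ignorable_int { int ignorable_retval; };

/* IGNORE only compiles for expressions that have a member named
   ignorable_retval; the value is copied into a typed temporary,
   which is then voided. */
#define IGNORE(Expr) \
    do { int ignored_ = (Expr).ignorable_retval; (void)ignored_; } while (0)

/* Hypothetical function whose result may legitimately be dropped. */
__attribute__((warn_unused_result))
static struct ignorable_int bump(int *counter)
{
    assert(counter != 0);
    ++*counter;
    return (struct ignorable_int){ *counter };
}
```

With this, `IGNORE(bump(&n));` compiles, whereas applying IGNORE to a function returning a plain int is rejected because int has no member named ignorable_retval.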
[Bug c/94379] New: Feature request: like clang, support __attribute((__warn_unused_result__)) on structs, unions, and enums
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94379 Bug ID: 94379 Summary: Feature request: like clang, support __attribute((__warn_unused_result__)) on structs, unions, and enums Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Clang supports applying the warn_unused_result attribute to enums, structs, and unions, which has the effect that functions returning such an attributed enum/struct/union behave as if they themselves had the warn_unused_result attribute.

Example:

    typedef struct __attribute__((__warn_unused_result__)) aStructType{ int x; } aStructType;
    aStructType getStruct(void);

    typedef union __attribute__((__warn_unused_result__)) aUnionType{ int x; } aUnionType;
    aUnionType getUnion(void);

    typedef enum __attribute__((__warn_unused_result__)) anEnumType{ anEnumarationConstant } anEnumType;
    anEnumType getEnum(void);

    int main()
    {
        getEnum();
        getStruct();
        getUnion();
    }
    // https://gcc.godbolt.org/z/jyHhLx

I find this to be a very useful feature, and it would be nice if gcc had it (along with its current un-void-able warn_unused_result (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425)).
[Bug tree-optimization/87313] attribute malloc not used for alias analysis when it could be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87313 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #4 from pskocik at gmail dot com ---

If (when?) this optimization is implemented, it would also be great if returning `type *restrict`, `struct somestruct { /*...*/ type *restrict p; /*...*/ }`, or an equivalent of these via a pointer (e.g., as in `void my_malloc(void *restrict*Result, size_t Sz);`) resulted in the same optimization being applied (unless I'm mistaken in that `restrict` applied in these contexts implies the same (__attribute((malloc))-like) semantics).
[Bug c/93540] New: Attributes pure and const not working with aggregate return types, even trivial ones
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93540 Bug ID: 93540 Summary: Attributes pure and const not working with aggregate return types, even trivial ones Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Example:

    #define SIMPLE 0
    #include <stdlib.h>
    #if SIMPLE
    typedef int TYPE;
    #else
    typedef struct TYPE { int a; } TYPE;
    #endif

    //__attribute((pure))
    __attribute((const))
    TYPE get(void);

    void TEST(void)
    {
    #if !SIMPLE
        // :( generates repeated calls
        if(get().a==0) abort();
        if(get().a==0) abort();
        if(get().a==0) abort();
        if(get().a==0) abort();
    #else
        //OK, 1 call
        if(get()==0) abort();
        if(get()==0) abort();
        if(get()==0) abort();
        if(get()==0) abort();
    #endif
    }
    /// https://gcc.godbolt.org/z/N79MCx
[Bug c/93487] New: Missed tail-call optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93487 Bug ID: 93487 Summary: Missed tail-call optimizations Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

Given, for example:

    #include <stdint.h>
    typedef union lp_tp { long l; void *p; } lp_tp ;
    typedef union ilp_tp{ int i; long l; void *p; lp_tp lp; } ilp_tp;
    long lcallee(void);
    int icallee(void);
    void *pcallee(void);
    lp_tp lpcallee(void);

    //tail calls on clang but not on gcc
    int l2i(void){ return lcallee(); }
    ilp_tp l_caller(void) { long rc = lcallee(); return (ilp_tp){.l=rc}; }
    ilp_tp p_caller(void) { void *rc= pcallee(); return (ilp_tp){.p=rc}; }
    ilp_tp lp_caller(void) { lp_tp rc = lpcallee(); return (ilp_tp){.lp = rc}; }
    ilp_tp lp_caller2(void) { lp_tp rc = lpcallee(); return (ilp_tp){.p = rc.p}; }

    struct foo* p2p_caller(void) { return pcallee(); } //optimized on both
    uintptr_t p2up_caller(void) { return (uintptr_t)pcallee(); } //optimized on both

    //not optimized by either
    ilp_tp i_caller(void) { int rc = icallee(); ilp_tp r; r.i=rc; return r; }

clang (x86_64) is able to turn all of these calls except the last one into tail calls, but gcc tailcall-optimizes only the pointer-to-pointer conversions.

https://gcc.godbolt.org/z/Lw9-D2
[Bug tree-optimization/93447] New: Value range propagation not working at -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93447 Bug ID: 93447 Summary: Value range propagation not working at -Os Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

I have a lot of cases where I'd like to translate boolean conditions to negated errno codes (possibly wrapped in a struct or a trivial union). If this translation is inlinable, I'd expect that undoing it to get a boolean again (bool => negated errno => bool) would eliminate the roundtrip.

GCC indeed does this at -O2 and -O3, but the optimization fails to kick in at -Os, leading to code size increases. Clang succeeds at eliminating the roundtrip at -Os (and it does this optimization already at -O1).

A simple example that generates unnecessary code at -Os:

    #include <errno.h>
    _Bool addb_simple(int A, int B, int *Rp)
    {
        int ec=0;
        if(__builtin_add_overflow(A,B,Rp)) ec = -ERANGE;
        return !!ec;
    }

https://gcc.godbolt.org/z/tGkbtD

Thanks for looking into it!
[Bug c/93441] New: _Generic selections ought to be treated as parenthesized expressions as far as -Wlogical-not-parentheses is concerned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93441 Bug ID: 93441 Summary: _Generic selections ought to be treated as parenthesized expressions as far as -Wlogical-not-parentheses is concerned Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

    int x [ _Generic(0,int: !0) < 10 ]; //falsely triggers -Wlogical-not-parentheses
    int y [ (_Generic(0,int: !0)) < 10 ]; //OK
[Bug middle-end/26724] __builtin_constant_p fails to recognise function with constant return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26724 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #3 from pskocik at gmail dot com --- I don't know if this is related, but what's been bugging me is that gcc's __builtin_constant_p (clang's also, but in different situations) fails to recognize the constness/non-constness of a `memcmp` call when it's used for equality comparisons. In such situations, I would like to use __builtin_constant_p(!memcmp(...))?!memcmp(...):!my_memcmp(...) to call a custom backend function (one not named `memcmp`) if real `memcmp` would be called, but inline a constant otherwise. Unfortunately there seem to be edge cases where this doesn't work and while the assembly for a !memcmp(...) expression shows it's been folded to a constant, a __builtin_constant_p around such an expression doesn't reflect that. E.g., in functions eq_eh{2,3}_cexprEh in https://gcc.godbolt.org/z/6oefRX (on clang the correspondence breaks in eq_eh1_cexprEh).
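The dispatch described above can be sketched as a macro. The names `MEM_EQ`, `my_memcmp`, and `first3_equal` are hypothetical; `my_memcmp` stands in for whatever custom backend would be used. Either branch yields the same truth value, so the sketch is correct regardless of whether __builtin_constant_p recognizes the folded call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical out-of-line backend (deliberately not named memcmp). */
static int my_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a, *pb = b;
    for (size_t i = 0; i < n; i++)
        if (pa[i] != pb[i])
            return pa[i] < pb[i] ? -1 : 1;
    return 0;
}

/* If the compiler reports !memcmp(...) as a constant, use it directly;
   otherwise call the custom backend. */
#define MEM_EQ(A, B, N) \
    (__builtin_constant_p(!memcmp((A), (B), (N))) \
        ? !memcmp((A), (B), (N)) \
        : !my_memcmp((A), (B), (N)))

/* Hypothetical caller with run-time operands. */
static int first3_equal(const char *a, const char *b)
{
    assert(a != 0 && b != 0);
    return MEM_EQ(a, b, 3);
}
```

The edge cases mentioned above are those where the first branch would have been a folded constant but __builtin_constant_p still returns 0, so the backend call is emitted anyway.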
[Bug target/91298] $ at the beginging causing Error: junk `(%rip)' after expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91298 --- Comment #7 from pskocik at gmail dot com --- (In reply to CVS Commits from comment #6) > The master branch has been updated by Jakub Jelinek : Thank you for the fix!
[Bug target/91298] $ at the beginging causing Error: junk `(%rip)' after expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91298 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #4 from pskocik at gmail dot com ---

Related https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45591 .

I've played with it and this simple patch

diff --git a/gcc/final.c b/gcc/final.c
index fefc4874b24a..ba7425afa667 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4087,11 +4087,20 @@ output_addr_const (FILE *file, rtx x)
     case SYMBOL_REF:
       if (SYMBOL_REF_DECL (x))
         assemble_external (SYMBOL_REF_DECL (x));
-#ifdef ASM_OUTPUT_SYMBOL_REF
-      ASM_OUTPUT_SYMBOL_REF (file, x);
-#else
-      assemble_name (file, XSTR (x, 0));
-#endif
+
+      {
+          bool dollar_eh = XSTR(x,0)[0] == '$';
+          if (dollar_eh) fputc('(',file);
+
+          #ifdef ASM_OUTPUT_SYMBOL_REF
+            ASM_OUTPUT_SYMBOL_REF (file, x);
+          #else
+            assemble_name (file, XSTR (x, 0));
+          #endif
+
+          if (dollar_eh) fputc(')',file);
+      }
+
+
       break;

     case LABEL_REF:

seems to fix it, at least for x86-64.

Basically you need parentheses around names of globals (at least those that start with `$`) when they're used as operands. Adding the parentheses is what clang does. Both clang and tinycc have no problem with this. It would be great if gcc could catch up.
[Bug c/93239] Enhancement: allow unevaluated statement expressions at filescope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93239 --- Comment #1 from pskocik at gmail dot com ---

Fixing this seems as simple as removing/commenting-out:

gcc/c/c-parser.c:8195     /* If we've not yet started the current function's statement list,
gcc/c/c-parser.c:8196        or we're in the parameter scope of an old-style function
gcc/c/c-parser.c:8197        declaration, statement expressions are not allowed. */
gcc/c/c-parser.c:8198     if (!building_stmt_list_p () || old_style_parameter_scope ())
gcc/c/c-parser.c:8199       {
gcc/c/c-parser.c:8200         error_at (loc, "braced-group within expression allowed "
gcc/c/c-parser.c:8201                   "only inside a function");
gcc/c/c-parser.c:8202         parser->error = true;
gcc/c/c-parser.c:8203         c_parser_skip_until_found (parser, CPP_CLOSE_BRACE, NULL);
gcc/c/c-parser.c:8204         c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
gcc/c/c-parser.c:8205         expr.set_error ();
gcc/c/c-parser.c:8206         break;
gcc/c/c-parser.c:8207       }

This would be very useful, and it makes all kinds of sense, because other expression constructs (function calls, comma expressions, ...) aren't restricted syntactically either (just semantically), which means they _can_ be inside untaken branches of constant-forming _Generic/__builtin_choose_expr, and I can think of no good reason why statement expressions shouldn't be allowed there too.
[Bug c/92935] typeof() on an atomic type doesn't always return the corresponding unqualified type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92935 --- Comment #3 from pskocik at gmail dot com ---

jos...@codesourcery.com, that's interesting, but it seems like an unnecessary, weird special case, considering that gcc already has a qualifier-dropping mechanism that doesn't necessitate special-casing __typeof for _Atomic-qualified types. Casting an expression to its own type (which on gcc works for aggregates too) does it.

Compilable example:

    #if __clang__
        #define DROP_Q(X) ((void)0,X) //clang rejects the casts for aggregate types
    #else
        #define DROP_Q(X) (__extension__({ (__typeof(X))(X) ; }))
        //__extension__ so aggregates are accepted
        //even under -pedantic
    #endif

    int main(void)
    {
        #define TEST_RVAL_CONV(Tp) \
        do{ \
            _Atomic const volatile Tp abar; \
            const volatile Tp bar; \
            __typeof(DROP_Q(bar)) noqualif_bar; \
            __typeof(DROP_Q(abar)) noqualif_abar; \
            _Generic(&noqualif_bar, Tp*: (void)0); \
            _Generic(&noqualif_abar, Tp*: (void)0); \
        }while(0)

        TEST_RVAL_CONV(int);
        TEST_RVAL_CONV(__typeof(int*));
        typedef struct s_tp { int x; } s_tp;
        TEST_RVAL_CONV(s_tp);
    }

https://gcc.godbolt.org/z/UtMyxM

I think all lvalue-ness-dropping expressions (e.g., the comma operator or ?: ) ought to drop top-level qualifs too (and they do on clang), and such a qualif-dropping operation wouldn't then be dependent on the gcc extension of allowing casts to non-scalar types, but unfortunately, gcc does not drop top-level qualifs in rvalue-conversions, which means the clang implementation of the qualifier-dropping macro doesn't work on gcc.
[Bug c/92935] typeof() on an atomic type doesn't always return the corresponding unqualified type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92935 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #1 from pskocik at gmail dot com ---

I don't think typeof is supposed to lose qualifiers. _Generic(Expr,...) loses them for Expr in an rvalue conversion (and also decays arrays to pointers), but __typeof is supposed to preserve everything. It does preserve qualifiers in other compilers (clang/tinycc) and in gcc:

    _Atomic const int aci=0;
    _Generic(&aci, _Atomic const int*: (void)0); //ok
    _Atomic typeof(const int) aci2=0;
    _Generic(&aci2, _Atomic const int*: (void)0); //ok

but there does seem to be a bug in gcc in how typeof combines with pointer symbols (*) and other qualifiers, where gcc appears to be curiously dropping all qualifiers if (and only if) one of the original qualifiers was _Atomic:

    _Generic((typeof(aci)*)0, _Atomic const int*: (void)0); //gcc error (int*), ok on clang
    _Generic((typeof(aci2)*)0, _Atomic const int*: (void)0); //gcc error (int*), ok on clang
    _Generic((typeof(aci2) volatile*)0, _Atomic const volatile int*: (void)0); //gcc error (int volatile*), ok on clang

Clang doesn't do this, and neither gcc's nor clang's typeof drops any qualifiers if there's no _Atomic among them:

    //no qualifs dropped if no _Atomic was involved
    const int ci=0;
    _Generic(&ci, const int*: (void)0); //ok
    typeof(const int) ci2=0;
    _Generic(&ci2, const int*: (void)0); //ok
    _Generic((typeof(ci)*)0, const int*: (void)0); //ok
    _Generic((typeof(ci2)*)0, const int*: (void)0); //ok
    _Generic((typeof(ci2) volatile*)0, const volatile int*: (void)0); //ok

https://gcc.godbolt.org/z/TwtEGP
[Bug c/93265] New: memcmp comparisons of structs wrapping a primitive type not as compact/efficient as direct comparisons of the underlying primitive type under -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93265 Bug ID: 93265 Summary: memcmp comparisons of structs wrapping a primitive type not as compact/efficient as direct comparisons of the underlying primitive type under -Os Product: gcc Version: 9.2.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

`memcmp` comparisons of types such as `struct Int { int x; };` generate full `memcmp` calls under `-Os` (also `-O1`). These are much larger (and less efficient) than the direct primitive-type comparisons that could have been used.

Example code:

    #include <string.h>
    //a contiguous struct wrapping a primitive type
    typedef struct a_tp { int x; }a_tp;
    _Static_assert(sizeof(a_tp)==sizeof(int),"");

    //compare a contiguous lvalue
    #define CONTIG_EQ_EH(Ap,Bp) (!memcmp(Ap,Bp,sizeof(*(1?(Ap):(Bp)))))

    //>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    //A FULL MEMCMP :( under -Os (and -O1)
    _Bool a_is42(a_tp X) {return CONTIG_EQ_EH(&X,&(a_tp const){42});}
    //>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    _Bool i_is42(int X) {return X==42; } //direct cmp
    _Bool i2_is42(a_tp X) {return X.x==42; } //same
    _Bool i3_is42(a_tp X) {return CONTIG_EQ_EH(&X,&(int const){42});} //still a direct cmp

https://gcc.godbolt.org/z/BC_QsN
[Bug c/93241] New: _Bool casts in dead branches of integer constant expressions cause undesirable warnings under -pedantic iff the dead branch contains overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93241 Bug ID: 93241 Summary: _Bool casts in dead branches of integer constant expressions cause undesirable warnings under -pedantic iff the dead branch contains overflow Product: gcc Version: 5.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

About the simplest example of this I could get:

    //erroneously warns about non-constness under -pedantic
    _Static_assert( 0? (_Bool)(INT_MAX+1) : 1 ,"");

https://gcc.godbolt.org/z/W_tvTS

The problem seems to have existed since gcc 5.
[Bug c/93239] New: Enhancement: allow unevaluated statement expressions at filescope
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93239 Bug ID: 93239 Summary: Enhancement: allow unevaluated statement expressions at filescope Product: gcc Version: 7.5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: ---

I've noticed gcc seems to syntactically disallow statement expressions at filescope even in contexts where they wouldn't be evaluated, such as:

1) inside sizeof/__typeof/_Alignof
2) inside _Generic branches that aren't taken

I'm currently trying to use 2) to implement some generic numerical macros that evaluate to an integer constant expression iff their arguments are integer constant expressions and at the same time don't double-evaluate their arguments (if those aren't integer constant expressions).

A simple example would be:

    #define SQ(X) _Generic(0?(void*)((X)*0):(int*)0, \
        int*: /*isconstexpr(X)==1*/ (X)*(X), \
        void *: /*isconstexpr(X)==0*/ (__extension__({ __typeof(X) SQ = (X); SQ *= SQ; })) )

Interestingly, this works on tinycc (a much more primitive compiler), where it can be used at filescope to give enum values, array sizes, or bit-field widths, or inside static asserts at filescope, but on gcc/clang, all of these must be inside a function.

Of course, this can be worked around by using an (inline) function for each integer type and a second _Generic in the non-constexpr branch of the macro that enumerates the helper functions, but that seems like a rather bloated workaround necessitated only by what seems to be an unnecessary restriction in the compiler.
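For reference, here is a sketch of the same idea used inside a function, which is where gcc and clang currently accept it. It is a variant of the macro from the report rather than a copy: the `intptr_t` cast (to avoid an int-to-pointer size warning), the temporary `sq_`, and the helpers `next_value` and `square_next` are assumptions added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Selects the plain product when X is an integer constant expression
   (the conditional then has type int *), and a single-evaluation
   statement expression otherwise (the conditional has type void *). */
#define SQ(X) _Generic(0 ? (void *)((intptr_t)(X) * 0) : (int *)0, \
    int *:  (X) * (X), \
    void *: (__extension__({ __typeof(X) sq_ = (X); sq_ *= sq_; sq_; })))

static int calls;   /* counts evaluations of next_value() */

/* Hypothetical argument with a side effect. */
static int next_value(void)
{
    ++calls;
    return calls + 2;
}

/* Squares the next value; the argument is evaluated exactly once. */
static int square_next(void)
{
    int r = SQ(next_value());
    assert(calls >= 1);
    return r;
}
```

The selection relies on `(void *)0` formed from an integer constant expression being a null pointer constant, which makes the conditional's type `int *` instead of `void *`.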
[Bug c/93180] const function pointers placed in a custom section are causing that custom section to become writable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93180 --- Comment #5 from pskocik at gmail dot com --- Jakub Jelinek, I later asked how this worked on Stack Overflow (https://stackoverflow.com/questions/59629946/why-do-gcc-and-clang-place-custom-sectioned-const-funcptr-symbols-into-writable). Got no answer there (yet), but your comment explains it nicely! Thanks!
[Bug c/93180] const function pointers placed in a custom section are causing that custom section to become writable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93180 --- Comment #3 from pskocik at gmail dot com --- Thanks for explaining. Yes, -fPIC does cause the section to become writable on clang. I'm currently toying with using a custom section to gather const function-pointers, but this -fPIC stuff is causing these const-pointers to be effectively writable via __start_mysection/__stop_mysection, which is weird. I thought the const data would get relocated all once at load time and then become readonly, but it is staying writable with -fPIC. Anyway, apologies for the false alarm. Best regards, Petr Skocik
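For readers unfamiliar with the pattern mentioned above, here is a minimal sketch of gathering const function pointers in a custom section and walking them via __start_mysection/__stop_mysection. It assumes an ELF target and a linker (such as GNU ld) that defines those symbols for sections whose names are valid C identifiers. The handler functions and the REGISTER_HANDLER macro are hypothetical.

```c
#include <stddef.h>

typedef int (*handler_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }

/* Hypothetical helper macro: emits one const function pointer into
   "mysection". `used` keeps the otherwise unreferenced entry alive. */
#define REGISTER_HANDLER(fn) \
    static handler_fn const fn##_entry \
        __attribute__((used, section("mysection"))) = fn

REGISTER_HANDLER(add_one);
REGISTER_HANDLER(twice);

/* Assumed to be supplied by the linker at the section boundaries. */
extern handler_fn const __start_mysection[];
extern handler_fn const __stop_mysection[];

size_t handler_count(void)
{
    return (size_t)(__stop_mysection - __start_mysection);
}

/* Calls every registered handler on x and adds up the results; the sum
   does not depend on the order in which the entries were emitted. */
int sum_of_handlers(int x)
{
    int sum = 0;
    for (handler_fn const *p = __start_mysection; p != __stop_mysection; ++p)
        sum += (*p)(x);
    return sum;
}
```

Whether the resulting section ends up in a read-only or a writable segment depends on the flags discussed in this thread (for instance -fPIC); the code above behaves the same either way.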
[Bug c/93180] New: const function pointers placed in a custom section are causing that custom section to become writable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93180 Bug ID: 93180 Summary: const function pointers placed in a custom section are causing that custom section to become writable Product: gcc Version: 7.5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- __attribute((__section__("mysection"))) int const cx = -42; (or with multiple const data variables in the `mysection` section) results in assembly output containing .section mysection,"a",@progbits Adding a const function pointer as in #include __attribute((__section__("mysection"))) int (* const p)(char const*) = causes the section to be mapped to a writable segment .section mysection,"aw",@progbits Since the pointer is const, I think the section ought to remain read-only (on clang it does).
[Bug c/91669] New: #pragma's and _Pragma's work but _Pragma's used in an equivalent macro don't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91669 Bug ID: 91669 Summary: #pragma's and _Pragma's work but _Pragma's used in an equivalent macro don't Product: gcc Version: 5.5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- The problem appears to exist on all gcc versions. Example Code: #define BX_gcc_push(Category,...) BX_pragma(GCC Category push ) BX_pragma(GCC Category __VA_ARGS__) #define BX_gcc_pop(Category) BX_pragma(GCC Category pop) #define BX_nodiag_push(DiagStr) BX_gcc_push(diagnostic, ignored DiagStr) #define BX_nodiag_pop() BX_gcc_pop(diagnostic) #define BX_pragma(...) _Pragma(#__VA_ARGS__) int foo(void) { //This silences -Wreturn-type on the closing curly as it should BX_nodiag_push("-Wreturn-type") } BX_nodiag_pop() #define BX_retundef(Rbr) /*{{{*/ \ BX_nodiag_push("-Wreturn-type") \ Rbr \ BX_nodiag_pop() /*}}}*/ int bar(void) { //This FAILS to silence -Wreturn on the closing curly //(works on clang and the code obtained from text-expanding the macro (gcc -E) //works on gcc too) BX_retundef(})
[Bug c/90552] New: attribute((optimize(3))) not overriding -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90552 Bug ID: 90552 Summary: attribute((optimize(3))) not overriding -Os Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- I test-compiled ( https://gcc.godbolt.org/z/8bhbNa ): __attribute((optimize(3))) int div(int X) { return X/3; } with -O{0,1,2,3,s}, expecting to get the same assembly in all cases, but __attribute((optimize(3))) is failing to override the last case, namely -Os. (I'd like the function to not use the idiv instruction even if the rest of the file is compiled with -Os). Please correct me if I'm wrong to expect `__attribute((optimize(3)))` to be able to override `-Os`. This behavior appears to exist on all gcc versions.
[Bug c/39985] Type qualifiers not actually ignored on function return type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39985 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #8 from pskocik at gmail dot com --- (In reply to Eric Gallager from comment #6) > (In reply to jos...@codesourcery.com from comment #5) > > In C, in C11 mode, type qualifiers are completely ignored on function > > return types, including not affecting type compatibility, after my commit: > > > > r236231 | jsm28 | 2016-05-13 21:35:39 + (Fri, 13 May 2016) | 46 lines > > > > Implement C11 DR#423 resolution (ignore function return type qualifiers). > > So can this be closed then? As of 8.2, it doesn't appear to work properly yet. It looks like the top level qualifs on the return type aren't being ignored if the return type is sealed in a typedef or __typeof. typedef int const ic_tp; int const f(); //ignores the const here ic_tp f(); //breaks because the const isn't ignored here Same with: int const f(); //ignored here __typeof(int const) f(); //not ignored here The examples in Godbolt: https://gcc.godbolt.org/z/GVvkmJ
[Bug c/65455] typeof _Atomic fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65455 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #22 from pskocik at gmail dot com --- (In reply to Marek Polacek from comment #18) > So this looks like a dup of PR39985. It seems that, if anything, we should > modify __typeof to drop all qualifiers. I.e. that all of the following > __typeofs yield "int": > > const int a; > volatile int b; > const volatile c; > _Atomic int d; > int *restrict e; > __typeof (a) x; > __typeof (b) y; > __typeof (c) q; > __typeof (d) r; > __typeof (const int) z; > __typeof (volatile const int) w; > __typeof (volatile int) v; > __typeof (_Atomic volatile int) t; > __typeof (*e) *s; > > Or is that not so? > > What should we do for C++? As a user, I can always force top-level-qualifier dropping by rvalue conversion (e.g., with , or ?:) but it (In reply to Jens Gustedt from comment #20) > I would be much happier with a generic operator that makes any object into > an rvalue. One way that comes close would be `1 ? (X) : (X)`. This is an > expression that transforms any expression `X` that is not a narrow integer > type into an rvalue. > > Unfortunately it is too ugly that anybody ever will systematically write > `__typeof__(1?(X):(X))`. But a macro > > #define __typeof_unqual__(X) __typeof__(1?(X):(X)) > > could do. (And one could fix the finite number of cases that are not covered > with `_Generic`.) > > I'd like to have prefix `+` for that. This could be useful in `__typeof__` > but also in `_Generic`. Maybe gcc could extend that operator to be > applicable to all types. I agree __typeof should keep all top level qualifs (clang's __typeof does). But I'd rather the unary + were not extended to non-numeric types. I frequently rely on it to throw comptime errors when applied to non-numerics. I think the comma should be able to accomplish the job (__typeof(0,X)) with similar brevity as that of the unary +.
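The `1 ? (X) : (X)` idiom quoted above can be tried out with a small sketch. It assumes GNU C's __typeof__ and only exercises arithmetic operands, where the usual arithmetic conversions produce an unqualified result type. The macro is deliberately given a different, made-up name (TYPEOF_RVALUE) so that it cannot clash with any compiler-provided spelling.

```c
#include <assert.h>  /* static_assert (C11) */

/* Hypothetical name for the macro proposed in the quoted comment. */
#define TYPEOF_RVALUE(X) __typeof__(1 ? (X) : (X))

/* `limit` is const int, but the conditional expression has plain type int,
   so `copy` can be modified. */
int bump_copy(void)
{
    const int limit = 41;
    TYPEOF_RVALUE(limit) copy = limit;
    copy += 1;
    return copy;
}

/* The caveat from the quote: narrow integer types are promoted, so the
   result for a char operand is int rather than char. */
static_assert(sizeof(TYPEOF_RVALUE((char)0)) == sizeof(int),
              "narrow integer operands are promoted to int");
```

The comma form suggested in the reply, __typeof(0,X), is not shown here because whether it drops qualifiers is the behaviour under discussion in this report.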
[Bug c/66918] Disable "inline function declared but never defined" warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66918 --- Comment #8 from pskocik at gmail dot com --- I'd also very much welcome a way to silence this (like with -Wno-undefined-inline on clang). My reason for wanting it is I'd like to prototype a non-static inline function in one header (a fast-to-include header), define it in another (a slower-to-parse header that might not always be needed), and have both headers includable in the same translation unit. Dummy example: /*first.h*/ inline void f(void); /*second.h*/ //#include "first.h" inline void f(void){} Unfortunately, if only the first header is included, gcc generates this unsilenceable warning unless I drop the `inline` from the prototype, but if I do and if I then also include the second header with the definition, then the prototype without the inline will turn into an unwanted instantiation and linker errors down the road.
[Bug c/66918] Disable "inline function declared but never defined" warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66918 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #7 from pskocik at gmail dot com --- I'd also very much welcome a way to silence this (like with -Wno-undefined-inline on clang). My reason for wanting it is I'd like to prototype a non-static inline function in one header (a fast-to-include header), define it in another (a slower-to-parse header that might not always be needed), and have both headers includable in the same translation unit. Dummy example: /*first.h*/ inline void f(void); /*second.h*/ //#include "first.h" inline void f(void){} Unfortunately, if only the first header is included, gcc generates this unsilenceable warning unless I drop the `inline` from the prototype, but if I do and if I then also include the second header with the definition, then the prototype without the inline will turn into an unwanted instantiation and linker errors down the road.
[Bug c/89264] New: Incorrect bitfield type in -Wconversion warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89264 Bug ID: 89264 Summary: Incorrect bitfield type in -Wconversion warnings Product: gcc Version: 7.4.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- void f() { struct{ unsigned x:1; }x = { (unsigned){0} }; } warns about a conversion to `unsigned char:1`. It should say `unsigned int:1`.
[Bug c/89265] New: Incorrect bitfield type in -Wconversion warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89265 Bug ID: 89265 Summary: Incorrect bitfield type in -Wconversion warnings Product: gcc Version: 7.4.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- void f() { struct{ unsigned x:1; }x = { (unsigned){0} }; } warns about a conversion to `unsigned char:1`. It should say `unsigned int:1`.
[Bug target/45591] gcc generates illegal asm at -O2 with -fdollars-in-identifiers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45591 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #3 from pskocik at gmail dot com --- I think I've run into the same problem. If I compile int $ident(int X) { return X; } int main() { return $ident(1); } the generated assembly won't assemble. gcc generates call $ident where clang would have parenthesized the $-containing identifier. The missing parens result in the assembler error "operand type mismatch for call".
[Bug c/88301] New: Optimization regression with undefined unsigned overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88301 Bug ID: 88301 Summary: Optimization regression with undefined unsigned overflow Product: gcc Version: 8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- I noticed gcc 7.* did a really nice optimization that allowed me to communicate I want even some unsigned overflows to be undefined: #define ADD_NW(A,B) (__extension__({ __typeof(A+B) R; if(__builtin_add_overflow(A,B,&R)) __builtin_unreachable(); R ;})) _Bool a_b(unsigned A, unsigned B) { return A+B >= B; } _Bool a_b2(unsigned A, unsigned B) { return ADD_NW(A,B) >= B; } resulted in: a_b: add edi, esi setnc al ret a_b2: mov eax, 1 ret But on gcc 8.* it's a_b: add edi, esi setnc al ret a_b2: add edi, esi setnc al ret again.
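The same "this addition never wraps" hint can also be written as a plain inline function instead of a statement-expression macro. This is only a sketch for unsigned operands; add_nw and sum_not_below_b are hypothetical names, and whether a given gcc version actually folds the comparison to a constant is exactly what this report is about.

```c
#include <stdbool.h>

/* Adds two unsigned values and tells the optimizer that wrap-around
   cannot happen. Calling this with operands that do wrap is undefined. */
static inline unsigned add_nw(unsigned a, unsigned b)
{
    unsigned r;
    if (__builtin_add_overflow(a, b, &r))
        __builtin_unreachable();
    return r;
}

/* With the hint in place, a + b >= b holds for every permitted input,
   so a compiler is free to reduce this function to "return true". */
bool sum_not_below_b(unsigned a, unsigned b)
{
    return add_nw(a, b) >= b;
}
```

Comparing the assembly of sum_not_below_b across compiler versions (for example at -O2) is one way to check whether the optimization is applied.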
[Bug c/88131] New: `gcc -S pp_assembly.S - o OutputFile.s` writes to STDOUT instead of `OutputFile.s`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88131 Bug ID: 88131 Summary: `gcc -S pp_assembly.S - o OutputFile.s` writes to STDOUT instead of `OutputFile.s` Product: gcc Version: 7.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- `gcc -S pp_assembly.S -o OutFile.s` or `gcc -S pp_assembly.sx -o OutFile.s` should behave the same as `gcc -E pp_assembly.S -o OutFile.s` or `gcc -E pp_assembly.sx -o OutFile.s` respectively, but in the `-S` case, the `-o` option is ignored. (Clang does it correctly.)
[Bug preprocessor/82335] Incorrect _Pragma expansion in complex macro expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82335 --- Comment #1 from pskocik at gmail dot com --- This problem still persists in gcc 7.3.0. It appears pasting a macro containing `_Pragma`s into another macro is what's causing the displacement of the generated `#pragma`s. I've cleaned up the example to make it clearer: #define PRAGMA(...) _Pragma(#__VA_ARGS__) #define PASTE(Expr) Expr #define PUSH_IGN(X) PRAGMA(GCC diagnostic push) PRAGMA(GCC diagnostic ignored X) #define POP() PRAGMA(GCC diagnostic pop) #define SIGNED_EH(X) \ ({ PUSH_IGN("-Wtype-limits") \ _Bool SIGNED_EH = ((__typeof(X))-1 < 0); \ POP() \ SIGNED_EH; }) int main() { unsigned x; SIGNED_EH(x); //OK; #pragmas generated around the assignment: #if 0 //generated: ({ #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wtype-limits" _Bool SIGNED_EH = ((__typeof(x))-1 < 0); #pragma GCC diagnostic pop SIGNED_EH; }); #endif PASTE(SIGNED_EH(x)); //OOPS generates: #if 0 //generated: #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wtype-limits" #pragma GCC diagnostic pop ({ _Bool SIGNED_EH = ((__typeof(x))-1 < 0); SIGNED_EH; }); #endif } Clang's preprocessor generates correct code even for the `PASTE(SIGNED_EH(x))` case.
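For reference, the macro layering in the case reported as working (the diagnostic macros invoked directly, not passed through another macro's argument) can be sketched as below. The macro names DIAG_PUSH_IGNORE and DIAG_POP and the function is_unsigned_type_signed are hypothetical. The sketch assumes a compiler that understands `GCC diagnostic` pragmas and GNU C's __typeof__.

```c
/* Stringizes its arguments and hands them to _Pragma, so that
   PRAGMA(GCC diagnostic push) acts like `#pragma GCC diagnostic push`. */
#define PRAGMA(...) _Pragma(#__VA_ARGS__)
#define DIAG_PUSH_IGNORE(W) PRAGMA(GCC diagnostic push) PRAGMA(GCC diagnostic ignored W)
#define DIAG_POP()          PRAGMA(GCC diagnostic pop)

/* Returns 1 if the type of x is signed, 0 otherwise. For an unsigned type
   the comparison is always false, which is what -Wtype-limits would
   normally warn about. */
int is_unsigned_type_signed(void)
{
    unsigned x = 0;
    (void)x; /* x is only used inside __typeof__, which does not evaluate it */
    DIAG_PUSH_IGNORE("-Wtype-limits")
    int is_signed = ((__typeof__(x))-1 < 0);
    DIAG_POP()
    return is_signed;
}
```

Running the file through `gcc -E` shows where the generated #pragma lines land relative to the surrounding tokens, which is how the displacement in this report was observed.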
[Bug c/82335] New: Incorrect _Pragma expansion with in complex macro expressions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82335 Bug ID: 82335 Summary: Incorrect _Pragma expansion with in complex macro expressions Product: gcc Version: 5.4.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- Created attachment 42239 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42239&action=edit reproducer for "Incorrect _Pragma expansion with in complex macro expressions"; compile with -Wall -Wextra Basically, _Pragma's expansions (as shown with gcc -E) seem to get shifted out of some slightly "more complex" macro expressions, which renders them ineffective. Attached is one piece of code that reproduces the problem. When compiled with `-Wall -Wextra`, the warning which should've been silenced isn't, because the #pragma push-pop pair gets shifted out of the expression. It's hard to pinpoint exactly what causes this, and the problem goes away with minor complexity reductions (such as replacing a macro (e.g., the tof macro) with what it expands to), but it seems to stick in more complex contexts. clang handles everything fine.
[Bug pch/15351] Add option for caching headers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15351 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #3 from pskocik at gmail dot com --- From my reading of the manual, which talks about a *.gch precompiled-header directory, it seems to me like gcc would be in the perfect position to manage that directory itself. If the directory has a header matching the current compiler config, it should use it, otherwise, it should create a new entry and use that. The user could simply tell gcc which header it wants precompiled and gcc could take care of creating the precompiled versions in the appropriate gch directory as needed. (If gcc were to manage the *.gch directory itself, it also wouldn't need to try all directory entries until a match is found -- it could aim directly, based on its established naming system for the entries. The naming system could be such that entries from uninstalled compiler versions could be automatically or manually deleted.) The cache directories could be in the same directory as the found header, or in a per-user system directory that parallels the path of the found header in case the directory of the found header isn't writable by the current user.
[Bug c/78036] New: -MM suppresses error detection
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78036 Bug ID: 78036 Summary: -MM suppresses error detection Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- Example: touch in.h gcc -x c -include in.h - -MD -MF /dev/stdout <<<'int main(){x; return 42;}' fails as it should. Changing -MD to -MM causes the failure to go undetected (no stderr output, no nonzero exit status), making it look as if the compilation succeeded. (Notes: Changing -MF /dev/stdout to -MF regular_file makes no difference. Clang has this behavior too.)
[Bug c/77487] gcc reports "file shorter than expected" for regular files on stdin when the offset of fd 0 isn't 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77487 pskocik at gmail dot com changed: What|Removed |Added CC||pskocik at gmail dot com --- Comment #1 from pskocik at gmail dot com --- Created attachment 39564 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39564&action=edit example test a self-compiling c script -- passes self at an offset to gcc and then runs the a.out it should print: c: hello world but instead, there's also the cc1: warning: <stdin> is shorter than expected [enabled by default] line in there
[Bug c/77487] New: gcc reports "file shorter than expected" for regular files on stdin when the offset of fd 0 isn't 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77487 Bug ID: 77487 Summary: gcc reports "file shorter than expected" for regular files on stdin when the offset of fd 0 isn't 0 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: pskocik at gmail dot com Target Milestone: --- My program calls `gcc -x c - ` with the offset of filedescriptor 0 being larger than 0. Consequently I get: cc1: warning: <stdin> is shorter than expected [enabled by default] (clang reports no errors). I think this is caused by: libcpp/files.c:741 if (regular && total != size && STAT_SIZE_RELIABLE (file->st)) cpp_error_at (pfile, CPP_DL_WARNING, loc, "%s is shorter than expected", file->path); not taking the filedescriptor offset into account and that changing it to if (regular && total != size && STAT_SIZE_RELIABLE (file->st) - lseek(file->fd, 0, SEEK_CUR) /*should always succeed?*/ ) cpp_error_at (pfile, CPP_DL_WARNING, loc, "%s is shorter than expected", file->path); should fix it. I hope I'm making sense. Best regards, Petr Skocik
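The arithmetic behind the report can be illustrated outside of libcpp. The sketch below is not gcc code; remaining_bytes is a hypothetical POSIX helper that computes how many bytes a reader should expect from a regular file whose descriptor is already positioned past the start, namely the file size minus the current offset.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the number of bytes between the current offset of fd and the end
   of the file, or -1 if fd is not a regular file or a system call fails. */
off_t remaining_bytes(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode))
        return -1;

    off_t pos = lseek(fd, 0, SEEK_CUR); /* query the offset without moving it */
    if (pos < 0)
        return -1;

    return st.st_size > pos ? st.st_size - pos : 0;
}
```

A reader that compares the bytes actually read against st_size alone will see a mismatch whenever the offset is nonzero, which matches the warning described above; comparing against the value computed here avoids that.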