https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79748
Bug ID: 79748
Summary: [Enhancement] no_callee_saved_registers function attribute (on x86)
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: katsunori.kumatani at gmail dot com
Target Milestone: ---

This is not a bug but a feature request/enhancement (hopefully simple?).

On x86(-64), upcoming GCC has added the 'no_caller_saved_registers' function attribute, which preserves all registers that are used in a function (e.g. for interrupt handlers). This request is for an attribute with the *opposite* effect: do not save *any* register that is used in a function, for better code generation.

Rationale: since version 5, GCC does nice interprocedural optimizations (-fipa-ra) if it can see the function's definition. It knows which registers a function clobbers and which it does not, so it knows which ones it doesn't have to spill before calling the function and reload afterwards. For example, 'rcx' is a register on x86-64 that the ABI does not preserve across function calls, yet GCC can see when a function does not clobber it and then avoid the spill/reload in the *caller*.

Unfortunately, GCC will *always* save the registers that are callee-saved by the ABI/calling convention within the *callee*, even if the function is only ever called from places that don't use those registers at all.

How this attribute would help: GCC would *not* save any registers at all in a function marked with this attribute, and callers would assume all registers are dead after the function call. This may look like it produces worse code, but not when the interprocedural optimization is enabled. It works best within the same compilation unit or with LTO (which works great).
Then GCC will *only* save/spill the registers it actually uses in the caller and sees clobbered in the callee (because it sees its definition), which results in better code, especially for leaf functions with a lot of parameters. It gives GCC more freedom to generate optimal code for the way those functions are actually used.

Obviously this attribute should *not* be used on functions that are externally visible (or used as callbacks), or that otherwise interface with other libraries/programs. It's for better code generation of function calls local to the project only. It's the responsibility of the programmer, of course, to use it properly.

Please keep in mind that the interprocedural optimizations seem to already be in place (tracking which registers get used across functions), so I'm not asking for any new optimization here. I'm simply asking whether it's easy and feasible to implement an attribute that says a function does *not* save any registers at all, and let GCC's existing optimizations handle the rest.

Yes, it deviates from the ABI, but the ABI doesn't have to be followed for an internal/local function call within a compilation unit, because such functions don't interface with anything except code that GCC can see (with LTO!). That's the whole point of interprocedural optimizations anyway.

Please understand I am not claiming this would make an extreme runtime difference, just slightly better code overall. I am *hoping* it would be an easy addition because we already have the no_caller_saved_registers attribute and this is practically its exact opposite. Plus you do *not* have to worry about x87/SSE/AVX registers at all because they already don't get saved anyway. If that's not the case and it would actually be difficult to add, I can understand any hesitation to implement it.
I would try to submit a patch myself and see whether it's easily doable, but unfortunately I am not familiar with GCC's internals (to be honest, the internals documentation is pretty hard to follow, unlike Clang's; any extra advice about this would be helpful too).

LLVM, for instance, has a lot more calling conventions than GCC, and something similar to this is the "cc 11" HiPE calling convention (High Performance Erlang), which has no callee-saved registers at all on x86 (it also uses more registers for passing parameters, among other things, since it's a full-blown calling convention, but that's a different subject of course).

Would adding a 'no_callee_saved_registers' attribute be a simple addition? It would definitely have its uses if so.

One last thing: I'm aware of the -fcall-used-reg command-line options, but they are not specific to a single function. That makes them hard to use in a source file that has an externally visible function which needs to follow the ABI, alongside "helper" functions that don't get inlined (because they are used often and are not trivial), and most people won't bother at all then, which is a shame. Also, they apply even to functions called from within the functions compiled with them, which basically means GCC will assume that any function called clobbers those registers. That can be bad for code generation if you have to call a library function. That's why I'm requesting a function attribute that applies ONLY to the function it is marked with, or at least a way to restrict '-fcall-used-reg' to one function only (and not to the functions called from it). I hope that makes it clear!