https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84443
--- Comment #3 from Nicholas Piggin <npiggin at gmail dot com> ---
(In reply to Segher Boessenkool from comment #2)
> If you want some specific machine code for something complex like this, it
> is much easier to write the machine code than to fight the compiler.

Yes, understood. This may be the way we go eventually. I thought the test case
could be interesting for reference. Often times we do juggle around the
C/inline asm code to get a good pattern, which can be a bit easier than writing
it completely in asm. It does not have to be absolutely optimal of course.

Thanks for taking a look and providing some explanation of the problem. Any
improvement would be great.

>
> That said:
>
> 1) "flags" is stored in the same register everywhere. This is a problem in
> expand: it puts the return value in the same pseudo in the whole function.
> This is a bad idea because (if it did not) after optimisation the return
> value is just a hard register (r3) and putting all _other_ uses in the same
> register is a pessimisation; like here (and in many other cases, there are
> other PRs for this!) it causes that pseudo to be put in a callee-save
> register (r30).
>
> 2) That should be fixed in expand (to enable other optimisations with the
> return pseudo), but IRA should also be smarter about live-range splitting
> for shrink-wrapping.
>
> 3) Separate shrink-wrapping should deal with CRFs. I have a prototype,
> it has a few small problems (we need to handle things a little differently
> for pretty much every ABI, sigh), and it does not help very much until the
> other problems are solved; GCC 9 work);
>
> 4) Shrink-wrapping itself could also do more live-range splitting than it
> currently does.
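
For readers following along, here is a minimal sketch of the code shape that point 1 describes, together with one way of "juggling" the C by hand. It is not the test case attached to this PR. All names (lock_shape_a, lock_shape_b, contended_slow_path, slow_and_return, slow_calls) are hypothetical, and whether shape B actually produces better code depends on the GCC version and target; that has not been verified here.

```c
/* Counts slow-path entries; exists only so the sketch can be checked. */
static int slow_calls;

/* Hypothetical out-of-line slow path. noinline keeps the call visible. */
__attribute__((noinline))
static void contended_slow_path(int *lock)
{
    slow_calls++;
    *lock = 1;
}

/* Shape A: "flags" is live across the unlikely call and is also the
 * return value, so a single pseudo covers both uses. This is the kind
 * of live range that may end up in a callee-saved register. */
unsigned long lock_shape_a(int *lock, unsigned long flags)
{
    if (__builtin_expect(*lock != 0, 0))
        contended_slow_path(lock);
    else
        *lock = 1;
    return flags;
}

/* Hypothetical helper: carries "flags" through the slow path itself. */
__attribute__((noinline))
static unsigned long slow_and_return(int *lock, unsigned long flags)
{
    contended_slow_path(lock);
    return flags;
}

/* Shape B: the live range is split by hand. On the fast path "flags"
 * is never live across a call; the slow path is a tail-call candidate. */
unsigned long lock_shape_b(int *lock, unsigned long flags)
{
    if (__builtin_expect(*lock != 0, 0))
        return slow_and_return(lock, flags);
    *lock = 1;
    return flags;
}
```

Both shapes compute the same result; the only difference is where "flags" has to survive a call. Comparing the output of `gcc -O2 -S` for the two functions is one way to see whether the fast path of shape A still saves and restores a callee-saved register.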