On 6/26/23 08:50, Kito Cheng wrote:
> LLVM will try to find a scratch register even after RA to resolve the long jump issue, so maybe we could consider a similar approach? I guess the most complicated part would be the case where no scratch register is found and a spill/reload is required after RA.
Right. And the spill/reload after RA is a problem unless you pre-allocate the space. Of course in a function near 1M in size, odds are there were some calls in there and thus $ra would be saved. In the exceedingly rare case where it wasn't, allocating a single stack slot isn't going to be a major performance driver.

There are other things you can do as well. Register scavenging, jump trampolines, etc. Examples of both exist.

The point I'm trying to make is that I suspect we're better off burning $ra right now to address the correctness issue, then coming back to one of the schemes noted above when the cost/benefit analysis shows it's a reasonably high priority relative to other optimizations we could be doing.
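
To make the "burn $ra" idea concrete, here is a rough sketch of the decision involved, not the actual backend code. It assumes the standard JAL encoding (a 21-bit signed, 2-byte-aligned offset, so roughly +/-1 MiB). The names fits_in_jal and jump_sequence are made up for illustration, and the out-of-range case is written with the assembler's "jump label, reg" pseudo-instruction, which as far as I know expands to an auipc/jalr pair through the named register.

```python
# Sketch only: choose between a direct jump and a long jump through a
# scratch register. All names here are hypothetical, not GCC internals.

# JAL encodes a 21-bit signed, 2-byte-aligned offset: [-1 MiB, +1 MiB - 2].
JAL_MIN_OFFSET = -(1 << 20)
JAL_MAX_OFFSET = (1 << 20) - 2


def fits_in_jal(offset: int) -> bool:
    """Return True if a byte offset is reachable by a single JAL."""
    return offset % 2 == 0 and JAL_MIN_OFFSET <= offset <= JAL_MAX_OFFSET


def jump_sequence(offset: int, label: str, scratch: str = "ra") -> list[str]:
    """Return the assembly lines for an unconditional jump to label.

    In range: a plain "j". Out of range: a long jump that clobbers the
    scratch register (assumed to be $ra, already saved in the prologue).
    """
    if fits_in_jal(offset):
        return [f"j {label}"]
    return [f"jump {label}, {scratch}"]


if __name__ == "__main__":
    for off in (64, (1 << 20) - 2, 1 << 20, -(1 << 21)):
        print(off, jump_sequence(off, ".L5"))
```

The scratch register is fixed up front here, which is the whole point: nothing has to be found or spilled after RA. A scavenging or trampoline scheme would replace the fixed "ra" with a register chosen per jump site.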

Jeff
