On 6/26/23 08:50, Kito Cheng wrote:
> LLVM will try to find scratch register even after RA to resolve the long
> jump issue. so maybe we could consider similar approach? And I guess the
> most complicate part would be the scratch register is not found, and
> require spill/reload after RA.
Right. And the spill/reload after RA is a problem unless you
pre-allocate the space. Of course, in a function near 1M in size, odds
are there were some calls in there and thus $ra would already be saved.
In the exceedingly rare case where it wasn't, allocating a single stack
slot isn't going to be a major performance driver.
There are other things you can do as well: register scavenging, jump
trampolines, etc. Examples of both exist.
The point I'm trying to make is that I suspect we're better off burning
$ra right now to address the correctness issue, then coming back to one
of the schemes noted above when the cost/benefit analysis shows it's a
reasonably high priority relative to other optimizations we could be doing.
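
To make the trade-off concrete, here is a rough sketch of the decision
being discussed, not the actual backend code. It assumes the standard
RISC-V JAL reach of a signed 21-bit, 2-byte-aligned offset (about +/-1 MiB)
and that the long form is an auipc/jalr pair through $ra. The names
fits_jal, pick_jump, and needs_extra_slot are hypothetical and exist only
for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* JAL reach: signed 21-bit offset, low bit always zero,
   i.e. [-1 MiB, +1 MiB - 2] relative to the jump itself. */
#define JAL_MIN (-(INT64_C(1) << 20))
#define JAL_MAX ((INT64_C(1) << 20) - 2)

enum jump_kind {
    JUMP_SHORT,       /* single jal with rd = zero */
    JUMP_LONG_VIA_RA  /* auipc ra, hi ; jalr zero, lo(ra) */
};

/* Hypothetical helper: can a single JAL reach this byte offset? */
static bool fits_jal(int64_t offset)
{
    return offset >= JAL_MIN && offset <= JAL_MAX && (offset & 1) == 0;
}

/* Hypothetical helper: choose the jump form. The long form clobbers
   $ra, which is the "burn $ra" approach. */
static enum jump_kind pick_jump(int64_t offset)
{
    return fits_jal(offset) ? JUMP_SHORT : JUMP_LONG_VIA_RA;
}

/* Hypothetical helper: an extra stack slot is only needed when a long
   jump is required and the prologue did not already save $ra, which
   is the rare case for a function with no calls. */
static bool needs_extra_slot(int64_t offset, bool ra_saved_in_prologue)
{
    return pick_jump(offset) == JUMP_LONG_VIA_RA && !ra_saved_in_prologue;
}
```

With this shape, a function that already saves $ra pays nothing beyond
the longer jump sequence, and only a call-free function with an
out-of-range jump needs the one additional slot.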
Jeff