> That fixes a pair of issues. @TobiHartmann fixed one where some native > function call would mess up registers. Pushing and popping them solves it. > > The second part is about using `x29`, aka `fp` (frame pointer), as a regular > register. Even when not used as the frame pointer register `x29` is > nevertheless saved on the stack when setting up the frame, for when it's used > as actually the frame pointer. On exit, `x29` and `x30` are restored from > there. But with Valhalla, they can be saved twice on the stack! When the > non-scalarized entry point of a C2-compiled method is used, one might need a > bit more space to scalarize the arguments. This leads to shape such as drawn > here. > > https://github.com/openjdk/valhalla/blob/63314117aa0c30f1a5928b56d21d944de063b8c6/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L6107-L6126 > > Currently, `LR` and `FP` are restored from the copy `#1`. But the comment is > not quite exact: `FP #1` and `FP #2` won't be always the same in case it > contained an oop to an object that the GC moved between frame creation and > frame destruction: the GC knows only one spilled location for each register, > and in the case of `x29`, it's `FP #2.` > > It seems difficult to adapt the GC mechanisms to handle more than one > location to update for a register. Likewise, we know that `LR #1` is update, > but `LR #2` doesn't seem to be: it's fine to update only one, if we know > which one is the trustworthy one. Updating only one `FP` won't mess with the > cases where we are actually using `x29` as frame pointer since then, the GC > won't update any of them as they won't contain oops. > > The process is rather simple: since we need to load `sp_inc` to know how much > to pop from the stack, by changing the `ldr` into a `ldp`, we can load `FP > #2` into `x29` at the same time. Then, we reduce the stack by `sp_inc`, we > read LR#1 (always 1 word under the new top), and we pop the two last words > left. This works whether we need some stack extension or not, and makes sure > `sp` stays 16-byte aligned when reading `LR #1` (I've learned the hard way, > the SIGBUS way, it's a thing...). This requires only one more instruction > than the old way. Let's call that a lesser evil. > > Thanks, > Marc
Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision: Review ------------- Changes: - all: https://git.openjdk.org/valhalla/pull/1540/files - new: https://git.openjdk.org/valhalla/pull/1540/files/6bf70658..cf666298 Webrevs: - full: https://webrevs.openjdk.org/?repo=valhalla&pr=1540&range=01 - incr: https://webrevs.openjdk.org/?repo=valhalla&pr=1540&range=00-01 Stats: 3 lines in 3 files changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/valhalla/pull/1540.diff Fetch: git fetch https://git.openjdk.org/valhalla.git pull/1540/head:pull/1540 PR: https://git.openjdk.org/valhalla/pull/1540
