On Mon, 31 Jul 2023 12:22:00 GMT, Yasumasa Suenaga <ysuen...@openjdk.org> wrote:
> In FFM, a native function is called via `nep_invoker_blob`. If the
> function has two arguments, the stub looks like this:
>
>
> Decoding RuntimeStub - nep_invoker_blob 0x00007fcae394cd10
> --------------------------------------------------------------------------------
>   0x00007fcae394cd80:   pushq   %rbp
>   0x00007fcae394cd81:   movq    %rsp, %rbp
>   0x00007fcae394cd84:   subq    $0, %rsp
>  ;; { argument shuffle
>   0x00007fcae394cd88:   movq    %r8, %rax
>   0x00007fcae394cd8b:   movq    %rsi, %r10
>   0x00007fcae394cd8e:   movq    %rcx, %rsi
>   0x00007fcae394cd91:   movq    %rdx, %rdi
>  ;; } argument shuffle
>   0x00007fcae394cd94:   callq   *%r10
>   0x00007fcae394cd97:   leave
>   0x00007fcae394cd98:   retq
>
>
> `subq $0, %rsp` is for shadow space on the stack, and `movq %r8, %rax` is the
> number of args for a variadic function, so they are not necessary in some
> cases. They should be removed when they are not needed, as follows:
>
>
> Decoding RuntimeStub - nep_invoker_blob 0x00007fd8778e2810
> --------------------------------------------------------------------------------
>   0x00007fd8778e2880:   pushq   %rbp
>   0x00007fd8778e2881:   movq    %rsp, %rbp
>  ;; { argument shuffle
>   0x00007fd8778e2884:   movq    %rsi, %r10
>   0x00007fd8778e2887:   movq    %rcx, %rsi
>   0x00007fd8778e288a:   movq    %rdx, %rdi
>  ;; } argument shuffle
>   0x00007fd8778e288d:   callq   *%r10
>   0x00007fd8778e2890:   leave
>   0x00007fd8778e2891:   retq
>
>
> All java/foreign jtreg tests pass.
>
> This stub code can be seen in the [ffmasm
> testcase](https://github.com/YaSuenag/ffmasm/tree/ef7a466ca9607164dbe7be7e68ea509d4bdac998/examples/cpumodel)
> with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintStubCode` and the hsdis library.
> This testcase links the code with `Linker.Option.isTrivial()`.
>
> After this change, FFM performance on [another ffmasm
> testcase](https://github.com/YaSuenag/ffmasm/tree/ef7a466ca9607164dbe7be7e68ea509d4bdac998/benchmarks/funccall)
> was improved:
>
> before:
>
> Benchmark                           Mode  Cnt          Score          Error  Units
> FuncCallComparison.invokeFFMRDTSC  thrpt    3  106664071.816 ± 14396524.718  ops/s
> FuncCallComparison.rdtsc           thrpt    3  108024079.738 ± 13223921.011  ops/s
>
>
> after:
>
> Benchmark                           Mode  Cnt          Score          Error  Units
> FuncCallComparison.invokeFFMRDTSC  thrpt    3  107622971.525 ± 12249767.134  ops/s
> FuncCallComparison.rdtsc           thrpt    3  107695741.608 ± 23983281.346  ops/s
>
>
> Environment:
> * CPU: AMD Ryzen 3 3300X
> * OS: Fedora 38 x86_64 (Kernel 6.3.8-200.fc38.x86_64)
> * Hyper-V 4vCPU, 8GB mem

PING: could you review this PR? I need one more reviewer to push.
This PR has passed java/foreign jtreg tests and CI in Oracle.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15089#issuecomment-1676545828