On 11/19/2025 10:55 AM, Leon Hwang wrote:


On 19/11/25 10:47, Menglong Dong wrote:
On 2025/11/19 08:28, Alexei Starovoitov wrote:
On Tue, Nov 18, 2025 at 4:36 AM Menglong Dong <[email protected]> wrote:

As we can see above, the performance of fexit increases from 80.544M/s to
136.540M/s, and that of "fmodret" increases from 78.301M/s to 159.248M/s.

Nice! Now we're talking.

I think arm64 CPUs have a similar RSB-like return address predictor.
Do we need to do something similar there?
The question is not targeted to you, Menglong,
just wondering.

I did some research before, and I found that most architectures
have such an RSB-like mechanism. I'll have a look at loongarch
later (maybe after LPC, as I'm focusing on English practice),
and Leon is following up on arm64.

Yep, happy to take this on.

I'm reviewing the arm64 JIT code now and will experiment with possible
approaches to handle this as well.


Unfortunately, the arm64 trampoline uses a tricky approach to bypass BTI:
it uses a ret instruction to invoke the patched function. This conflicts
with the current approach, and it seems there is no straightforward solution.
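
To illustrate why a ret used as a call is awkward for a return address
predictor, here is a toy model in Python. It is only a sketch under stated
assumptions: the class, the labels "A", "B" and "T", and the event sequence
are hypothetical and do not describe any real CPU or the actual trampoline
code. The predictor pushes on every call and pops on every ret, so a ret
that enters a function pushes nothing for the later return to match.

```python
class ReturnStackPredictor:
    """Toy return-address predictor: push on call, pop on ret (hypothetical model)."""

    def __init__(self, depth=16):
        self.depth = depth
        self.stack = []
        self.hits = 0
        self.misses = 0

    def call(self, return_addr):
        # A call pushes the address it will return to; the oldest entry is
        # dropped when the buffer is full.
        if len(self.stack) == self.depth:
            self.stack.pop(0)
        self.stack.append(return_addr)

    def ret(self, actual_target):
        # A ret pops the predicted target and compares it with the real one.
        predicted = self.stack.pop() if self.stack else None
        if predicted == actual_target:
            self.hits += 1
        else:
            self.misses += 1


def balanced():
    # Every ret matches an earlier call.
    p = ReturnStackPredictor()
    p.call("A")
    p.call("B")
    p.ret("B")
    p.ret("A")
    return p.hits, p.misses


def ret_used_as_call():
    # Assumed sequence, for illustration only.
    p = ReturnStackPredictor()
    p.call("A")  # caller calls the traced function
    p.call("B")  # patch site calls the trampoline
    p.ret("B")   # trampoline enters the function body with ret
    p.ret("T")   # body returns to the trampoline; no call ever pushed "T"
    p.ret("A")   # trampoline returns to the caller; predictor is empty
    return p.hits, p.misses


if __name__ == "__main__":
    print("balanced:", balanced())
    print("ret used as call:", ret_used_as_call())
```

In this model the balanced sequence never mispredicts, while the second
sequence mispredicts both returns that follow the ret used as a call.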

Thanks,
Leon


For the other architectures, we don't have the machines, and I think
we will need someone else's help.

Thanks!
Menglong Dong




