On Thu, 20 Oct 2022 20:26:47 GMT, Vladimir Ivanov <vliva...@openjdk.org> wrote:
> That sounds like a very interesting idea.
>
> It would be very helpful to get an understanding how much overhead `STMXCSR`
> plus a branch adds in JNI stub to decide whether it's worth optimizing for.

It's not just Intel's implementation of x86, though. Apple M1 takes a big hit when writing the FPCR: It seems to me to wait for all instructions in progress to retire. Given that there are 600 entries in the M1 reorder buffer (!) that's a lot. Of course they could rename the FPCR like anything else, but I guess they don't.

-------------

PR: https://git.openjdk.org/jdk/pull/10661