On Mon, 30 Mar 2026 16:05:43 GMT, Andrew Haley <[email protected]> wrote:
>> src/hotspot/cpu/aarch64/c1_FrameMap_aarch64.cpp line 205:
>>
>>> 203: if (ProfileCaptureRatio > 1) {
>>> 204: // Use the highest remaining register for r_profile_rng.
>>> 205: r_profile_rng = *remaining.rbegin();
>>
>> OK, so we reserve `r26` or `r27` for RNG counter, right?
>>
>> Is this a good trade for level=1 C1 code that users run with
>> `-XX:TieredStopAtLevel=1` for the sake of startup performance? Why can't /
>> shouldn't we use the PRNG state straight in `JavaThread`? That would also
>> obviate the need to save/restore this register, thus simplifying the
>> machinery and avoiding subtle bugs.
>>
>> It is even worse on x86, which does not have many registers to begin with.
>> I wonder if there is a way to sense on this level if we are compiling for
>> tier=2,3 or tier=1, and only reserve on tier=2,3?
>>
>> If we are going to reserve more registers, maybe start writing up a release
>> note with possible caveats.
>
>> OK, so we reserve `r26` or `r27` for RNG counter, right?
>>
>> Is this a good trade for level=1 C1 code that users run with
>> `-XX:TieredStopAtLevel=1` for the sake of startup performance?
>
> No. I'll fix it.
>
>> Why can't / shouldn't we use the PRNG state straight in `JavaThread`? That
>> would also obviate the need to save/restore this register, thus simplifying
>> the machinery and avoiding subtle bugs.
>
> The random generator is _extremely_ hot, probably(?) more so than any local
> variable.
>
> (Re AArch64, a confession: when I wrote the C1 port I forgot to allocate any
> upper registers. No one noticed until Dmitry Chyuko fixed it in February
> 2015.) I guess 16 regs is usually enough.
>
>> It is even worse on x86, which does not have many registers to begin with.
>> I wonder if there is a way to sense on this level if we are compiling for
>> tier=2,3 or tier=1, and only reserve on tier=2,3?
>
> I'll check that this works correctly.
>
>> If we are going to reserve more registers, maybe start writing up a release
>> note with possible caveats.
>
> There is one other thing I could do: use a vector register instead of a core
> register. Vector registers tend to be used far less than core registers, so
> (arch-dependent) this could be a win. I'll try this on x86.

I've added an option (x86 only) to use a vector register for the random
generator state. It's less efficient than using a core register, but we don't
lose a core register.
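To make the trade-off discussed above concrete, here is a rough C++ sketch of the two places the generator state could live. It is an illustration only, not the code in this PR: `ThreadState`, `next_random`, `should_profile_mem` and `should_profile_reg` are hypothetical names, the xorshift step is an assumed stand-in for whatever generator the patch uses, and `ratio` only loosely models `ProfileCaptureRatio`.

```cpp
#include <cstdint>

// Hypothetical stand-in for per-thread state; this is NOT HotSpot's JavaThread.
struct ThreadState {
  uint32_t rng_state;
};

// One step of a 32-bit xorshift generator. Assumption: the real patch may
// use a different generator; this is only here to have something cheap.
static inline uint32_t next_random(uint32_t s) {
  s ^= s << 13;
  s ^= s >> 17;
  s ^= s << 5;
  return s;
}

// Variant A: state lives in the thread structure. Every profiling site
// pays a load and a store in addition to the arithmetic.
static inline bool should_profile_mem(ThreadState* t, uint32_t ratio) {
  uint32_t s = next_random(t->rng_state);  // load from memory
  t->rng_state = s;                        // store back to memory
  return (s % ratio) == 0;                 // sample roughly 1 in `ratio` events
}

// Variant B: state is carried in a local that the compiler can keep in a
// register. This models reserving a register for the state: no memory
// traffic per site, but the value must be saved and restored around calls.
static inline bool should_profile_reg(uint32_t& s, uint32_t ratio) {
  s = next_random(s);
  return (s % ratio) == 0;
}
```

Both variants make the same sampling decisions for the same seed; they differ only in where the state is kept, which is the cost being weighed against losing a core register.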
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28541#discussion_r3046467506