On Thu, 28 May 2026 21:56:58 GMT, Vladimir Ivanov <[email protected]> wrote:
>> On bytecode level booleans are represented as ints and HotSpot JVM
>> normalizes boolean values on memory accesses. It unconditionally applies
>> normalization on boolean stores, but trusts on-heap boolean locations to
>> hold normalized values. Normalization is applied on loads for off-heap and
>> mismatched unsafe accesses.
>>
>> There are 2 normalization procedures used: (1) cast int to byte and test it
>> against zero; and (2) truncation to least-significant bit. Truncation is
>> preferred (due to performance considerations), but JNI mandates testing
>> against zero and, historically, `#1` was used for off-heap unsafe accesses
>> as well. It complicated the implementation (leading to subtle bugs) and
>> introduced divergence in behavior at runtime (depending on execution mode
>> and JIT-compilation peculiarities).
>>
>> The fix uses truncation uniformly across all execution modes. It simplifies
>> implementation and eliminates possible divergence in behavior between
>> execution modes. Also, it drastically simplifies future Unsafe API
>> refactorings.
>>
>> There's one scenario left when it's possible to observe non-normalized
>> values: when mismatched access pollutes the Java heap with a bogus boolean
>> value, but then the value is read with a well-typed boolean access.
>>
>> Testing: hs-tier1 - hs-tier6
>>
>> - [x] I confirm that I make this contribution in accordance with the
>> [OpenJDK Interim AI Policy](https://openjdk.org/legal/ai).
>
> Vladimir Ivanov has updated the pull request incrementally with one
> additional commit since the last revision:
>
>   normalize_for_read/normalize_for_write => normalize

src/hotspot/share/opto/library_call.cpp line 2504:

> 2502:       heap_base_oop == top() ||                  // - heap_base_oop is null or
> 2503:       (can_access_non_heap && field == nullptr)) // - heap_base_oop is potentially null
> 2504:                                                  //   and the unsafe access is made to large offset

Same issues of slowness and word size here as before.

Also, the logic gating this extra normalization code is nonsense. It says "if there is anything we don't understand here, use the `x!=0` rule". There are three different scenarios, and some of them conflict with a standing decision to use the `x&1` rule for Java heap variables. This is what I meant in my top-level comment about the old logic being impossible to understand because it is self-contradictory. The net result is that, in some cases, C2 picks one of the two boolean normalization rules by a coin flip. Yes, a coin flip, because it depends on how the IR types have propagated at the time the intrinsic is expanded.

src/hotspot/share/prims/unsafe.cpp line 233:

> 231:   T get() {
> 232:     GuardUnsafeAccess guard(_thread);
> 233:     return normalize(*addr());

Excellent simplifications here. Makes it clear that cleaning a boolean is an operation on a VALUE not a MEMORY LOCATION.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/31249#discussion_r3320994129
PR Review Comment: https://git.openjdk.org/jdk/pull/31249#discussion_r3320988698
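
For readers following the thread, here is a minimal standalone sketch of why the choice between the `x!=0` and `x&1` rules is observable. The function names are hypothetical and do not correspond to HotSpot code; the sketch only assumes the two procedures as described in the quoted PR text (narrow to a byte and test against zero, versus truncate to the least-significant bit).

```cpp
#include <cstdint>

// Hypothetical helper names for illustration; not HotSpot functions.

// The `x!=0` rule: narrow the int to a byte, then test against zero.
// An unsigned byte is used so the narrowing is well-defined; for a
// zero test the signedness of the byte does not matter.
constexpr bool normalize_by_zero_test(int32_t x) {
  return static_cast<uint8_t>(x) != 0;
}

// The `x&1` rule: keep only the least-significant bit.
constexpr bool normalize_by_truncation(int32_t x) {
  return (x & 1) != 0;
}

// Already-normalized inputs (0 and 1) give the same answer under both rules.
static_assert(normalize_by_zero_test(0) == normalize_by_truncation(0), "0 agrees");
static_assert(normalize_by_zero_test(1) == normalize_by_truncation(1), "1 agrees");

// A non-normalized input such as 2 is where the rules diverge:
// the low byte is nonzero, but the least-significant bit is clear.
static_assert(normalize_by_zero_test(2), "x!=0 rule reads 2 as true");
static_assert(!normalize_by_truncation(2), "x&1 rule reads 2 as false");
```

Both rules agree on 0 and 1, so the divergence only shows up for non-normalized bytes. If the rule applied depends on execution mode or on how types propagated in the JIT, the same stored byte can read back as either `true` or `false`.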
