On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo 
<[email protected]> wrote:

>> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a 
>> virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable 
>> mount/unmount transitions. However, this only works if a JVMTI agent has 
>> attached to the VM, otherwise virtual threads don’t honor the disable 
>> request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file` 
>> and `HotSpotDiagnosticMXBean` which don’t require a JVMTI agent to be 
>> present, getting the snapshot of a virtual thread in such cases can lead to 
>> crashes.
>> 
>> This patch moves the transition-disabling mechanism out of JVMTI and into a 
>> new class, `MountUnmountDisabler`. The code has been updated so that 
>> transitions can be disabled independently of JVMTI, making JVMTI just one 
>> user of the API rather than the owner of the mechanism. Here is a summary of 
>> the key changes:
>> 
>> - Currently when a virtual thread starts a mount/unmount transition we only 
>> need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to 
>> go to the slow path. With these changes, JVMTI is now only one user of the 
>> API, so we instead check the actual transition disabling counters, i.e the 
>> per-vthread counter and the global counter. Since these can be set at any 
>> time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or 
>> during a safepoint in case of late binding agents), we follow the classic 
>> Dekker pattern for the required synchronization. That is, the virtual thread 
>> sets the “in transition” bits for the carrier and vthread *before* reading 
>> the “transition disabled” counters. The thread requesting to disable 
>> transitions increments the “transition disabled” counter *before* reading 
>> the “in transition” bits. 
>> An alternative that avoids the extra fence in `startTransition` would be to 
>> place extra overhead on the thread requesting to disable transitions (e.g. 
>> using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance 
>> analysis show no difference with current mainline so for now I kept this 
>> simpler version.
>> 
>> - Ending the transition doesn’t need to check if transitions are disabled 
>> (equivalent to not need to poll when transitioning from unsafe to safe 
>> safepoint state). But we still require to go through the slow path if there 
>> is a JVMTI agent present, since we need to check for event posting and JVMTI 
>> state rebinding. As such, the end transition follows the same pattern that 
>> we have today of only needing to check `_VTMS_notify_jvmti_events`.
>> 
>> - The code was previously structured in t...
>
> Patricio Chilano Mateo has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Rename VM methods for endFirstTransition/startFinalTransition

Hi Patricio, this is another significant piece of work. I have taken an initial 
pass through trying to digest the main parts - can't comment on the C2 code or 
the Java side. I have made a few minor comments/suggestions.

Thanks

src/hotspot/share/prims/jvm.cpp line 3668:

> 3666:   if (!DoJVMTIVirtualThreadTransitions) {
> 3667:     assert(!JvmtiExport::can_support_virtual_threads(), "sanity check");
> 3668:     return;

Does this not still need checking somewhere?

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162:

> 160:   // be executed once we go back to Java. If this is an unmount, the 
> handshake that the
> 161:   // disabler executed against this carrier thread already provided the 
> needed synchronization.
> 162:   // This matches the release fence in 
> xx_enable_for_one()/xx_enable_for_all().

Subtle. Do we have comments where the fences are to ensure people realize the 
fence is serving this purpose?

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277:

> 275: 
> 276:   // Start of the critical region. Prevent future memory
> 277:   // operations to be ordered before we read the transition flag.

Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so 
perhaps that should internally perform the `load_acquire`?

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 278:

> 276:   // Start of the critical region. Prevent future memory
> 277:   // operations to be ordered before we read the transition flag.
> 278:   // This matches the release fence in end_transition().

Suggestion:

  // This pairs with the release fence in end_transition().

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307:

> 305:   // Block while some mount/unmount transitions are in progress.
> 306:   // Debug version fails and prints diagnostic information.
> 307:   for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt = 
> jtiwh.next(); ) {

This looks very odd, having an assignment in the loop condition check and no 
actual loop-update expression.

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316:

> 314:   // operations to be ordered before we read the transition flags.
> 315:   // This matches the release fence in end_transition().
> 316:   OrderAccess::acquire();

Surely the use of the iterator already provides the necessary ordering 
guarantee here as well. ?

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327:

> 325:   // End of the critical section. Prevent previous memory operations to
> 326:   // be ordered after we clear the clear the disable transition flag.
> 327:   // This matches the equivalent acquire fence in start_transition().

Suggestion:

  // This pairs with the acquire in start_transition().

I just realized you are using "fence" to describe release and acquire memory 
barrier semantics. Given we have an operation `fence` I find this confusing for 
the reader - especially when we also have a `release_store_fence` operation 
which might be confused with "release fence".

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370:

> 368:   assert(VTMSTransition_lock->owned_by_self() || 
> SafepointSynchronize::is_at_safepoint(), "Must be locked");
> 369:   assert(_global_start_transition_disable_count >= 0, "");
> 370:   AtomicAccess::store(&_global_start_transition_disable_count, 
> _global_start_transition_disable_count + 1);

Suggestion:

  AtomicAccess::inc(&_global_start_transition_disable_count);

src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376:

> 374:   assert(VTMSTransition_lock->owned_by_self() || 
> SafepointSynchronize::is_at_safepoint(), "Must be locked");
> 375:   assert(_global_start_transition_disable_count > 0, "");
> 376:   AtomicAccess::store(&_global_start_transition_disable_count, 
> _global_start_transition_disable_count - 1);

Suggestion:

  AtomicAccess::dec(&_global_start_transition_disable_count);

src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52:

> 50:   // parameter is_SR: suspender or resumer
> 51:   MountUnmountDisabler(bool exlusive = false);
> 52:   MountUnmountDisabler(oop thread_oop);

What does the comment mean here?

-------------

PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3490207826
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547887801
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548145054
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548157390
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548150552
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548160373
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548161340
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548168223
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548169846
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548170787
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548174392

Reply via email to