If a thread tries to initialize a class that is already being initialized by 
another thread, it will block until notified. Since at this blocking point 
there are native frames on the stack, a virtual thread cannot be unmounted and 
is pinned to its carrier. Besides harming scalability, this can, in some 
pathological cases, lead to a deadlock, for example, if the thread executing 
the class initialization method is blocked waiting for some unmounted virtual 
thread to run, but all carriers are blocked waiting for that class to be 
initialized.
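
To make the blocking behaviour concrete, here is a minimal Java sketch. It is not part of the patch; it assumes JDK 21 or later, and the names `ClassInitContention` and `SlowInit` are hypothetical. A platform thread blocks inside a class initializer while several virtual threads try to use the same class. The latch is released from a platform thread, so the sketch deliberately avoids the deadlock scenario described above and only shows the waiting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not taken from the patch.
final class ClassInitContention {
    static final CountDownLatch initStarted = new CountDownLatch(1);
    static final CountDownLatch release = new CountDownLatch(1);

    static class SlowInit {
        static final int VALUE; // blank final, so it is not a compile-time constant
        static {
            initStarted.countDown(); // signal that <clinit> is running
            await(release);          // stay inside <clinit> until released
            VALUE = 42;
        }
        static int value() { return VALUE; }
    }

    static void await(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    static void join(Thread t) {
        try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    static int run(int waiters) {
        // A platform thread becomes the initializing thread and blocks in <clinit>.
        Thread initializer = Thread.ofPlatform().start(() -> SlowInit.value());
        await(initStarted);

        // Each virtual thread hits invokestatic on a class that is being
        // initialized by another thread, so it has to wait for the initializer.
        AtomicInteger sum = new AtomicInteger();
        List<Thread> virtualThreads = new ArrayList<>();
        for (int i = 0; i < waiters; i++) {
            virtualThreads.add(Thread.ofVirtual().start(() -> sum.addAndGet(SlowInit.value())));
        }

        // Released from a platform thread, so progress never depends on a free carrier.
        release.countDown();
        join(initializer);
        virtualThreads.forEach(ClassInitContention::join);
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println("sum = " + run(4));
    }
}
```

If the latch were instead released by another virtual thread, and the number of waiters were at least the number of carriers, the pinned waiters could occupy every carrier and the releasing thread would never get to run.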

As of JDK-8338383, virtual threads blocked in the VM on `ObjectMonitor` 
operations can be unmounted. Since synchronization on class initialization is 
implemented using `ObjectLocker`, we can reuse the same mechanism to unmount 
virtual threads in these cases too.

This patch adds support for unmounting virtual threads on some of the most 
common class initialization paths, specifically when calling 
`InterpreterRuntime::_new` (`new` bytecode), and 
`InterpreterRuntime::resolve_from_cache` for `invokestatic`, `getstatic` or 
`putstatic` bytecodes. In the future we might consider extending this mechanism 
to include initialization calls originating from native methods such as 
`Class.forName0`.
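
For reference, the Java-level constructs that compile to these bytecodes can be sketched as follows. This is only an illustration with hypothetical class names (`InitTriggers`, `A` through `D`), not code from the patch. Each statement in `run()` is the first active use of a distinct class and therefore triggers its initialization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not taken from the patch.
final class InitTriggers {
    static final List<String> log = new ArrayList<>();

    static class A { static { log.add("A"); } }
    static class B { static { log.add("B"); } static void touch() { } }
    static class C { static int field; static { log.add("C"); } }
    static class D { static int field; static { log.add("D"); } }

    static List<String> run() {
        new A();              // new
        B.touch();            // invokestatic
        int read = C.field;   // getstatic
        D.field = read + 1;   // putstatic
        return List.copyOf(log);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Whether a given call actually goes through `InterpreterRuntime::_new` or `InterpreterRuntime::resolve_from_cache` depends on the code running in the interpreter; the sketch only shows which source constructs map to the bytecodes named above.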

### Summary of implementation

The `ObjectLocker` class was modified to not pin the continuation if we are 
coming from a preemptable path, which will be the case when calling 
`InstanceKlass::initialize_impl` from the new method 
`InstanceKlass::initialize_preemptable`. This means that for these cases, a 
virtual thread can now be unmounted either when contending for the init_lock in 
the `ObjectLocker` constructor, or in the call to `wait_uninterruptibly`. Also, 
since the call to initialize a class includes a previous call to `link_class` 
which also uses `ObjectLocker` to protect concurrent calls from multiple 
threads, we will allow preemption there too.

If preempted, we will throw a pre-allocated exception which will get propagated 
with the `TRAPS/CHECK` macros all the way back to the VM entry point. The 
exception will be cleared and on return back to Java the virtual thread will go 
through the preempt stub and unmount. When running again, at the end of the 
thaw call we will identify this preemption case and redo the original VM call 
(either `InterpreterRuntime::_new` or 
`InterpreterRuntime::resolve_from_cache`). 

### Notes

`InterpreterMacroAssembler::call_VM_preemptable`, previously used only for 
`InterpreterRuntime::monitorenter`, was renamed to 
`InterpreterMacroAssembler::call_VM_preemptable_helper` and generalized for 
calls that take more than one argument and that can return oops and throw 
exceptions. Method `InterpreterMacroAssembler::call_VM_preemptable` is now a 
wrapper that calls the helper, following the pattern of the 
`MacroAssembler::call_VM` and `MacroAssembler::call_VM_helper` methods.

As with platform threads, a virtual thread preempted at `wait_uninterruptibly` 
that is interrupted will not throw `InterruptedException`, and will preserve 
the interrupted status. Member `_interruptible` was added to `ObjectWaiter` to 
differentiate this case from `Object.wait`. Also, field `interruptableWait` was 
added to the `VirtualThread` class, mainly to prevent an interrupted virtual 
thread in `wait_uninterruptibly` from looping and repeatedly submitting the 
continuation to the scheduler queue until the class it is waiting for is 
initialized.
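
The Java-visible side of this behaviour can be sketched as below. This is an illustration under assumptions, not a test from the patch: it needs JDK 21 or later, and the names `InterruptDuringClassInit` and `SlowInit` are hypothetical. A virtual thread is interrupted before the class it needs has finished initializing. The field read completes normally once initialization is done, and the interrupted status is still set afterwards.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example class, not taken from the patch.
final class InterruptDuringClassInit {
    static final CountDownLatch initStarted = new CountDownLatch(1);
    static final CountDownLatch release = new CountDownLatch(1);
    private static Boolean cached; // the scenario is single-use, so remember its result

    static class SlowInit {
        static final int VALUE;
        static {
            initStarted.countDown();
            await(release); // keep the class in the "being initialized" state
            VALUE = 42;
        }
    }

    static void await(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    static void join(Thread t) {
        try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    static synchronized boolean run() {
        if (cached == null) cached = runOnce();
        return cached;
    }

    private static boolean runOnce() {
        Thread initializer = Thread.ofPlatform().start(() -> { int ignored = SlowInit.VALUE; });
        await(initStarted);

        AtomicBoolean statusPreserved = new AtomicBoolean();
        CountDownLatch aboutToTouch = new CountDownLatch(1);
        Thread waiter = Thread.ofVirtual().start(() -> {
            aboutToTouch.countDown();
            int v = SlowInit.VALUE; // waits for the initializer; no InterruptedException here
            statusPreserved.set(v == 42 && Thread.currentThread().isInterrupted());
        });

        await(aboutToTouch);
        waiter.interrupt();   // delivered before initialization can complete
        release.countDown();  // only now may <clinit> finish
        join(initializer);
        join(waiter);
        return statusPreserved.get();
    }

    public static void main(String[] args) {
        System.out.println("interrupt status preserved: " + run());
    }
}
```

The sketch does not guarantee that the waiter is already blocked in the VM when the interrupt arrives. It only guarantees that the interrupt is delivered before initialization completes, which is enough to observe that the status survives the wait.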

Currently (and still with this change), when the thread responsible for 
initializing a class finishes executing the class initializer, it will set the 
initialization lock to null so the object can be GC'ed. For platform threads 
blocked waiting on the initialization lock, the `Handle` in 
`InstanceKlass::initialize_impl` will still protect the object from being 
collected until the last thread exits the monitor. For preempted virtual 
threads though, that `Handle` would have already been destroyed. In order to 
protect the init_lock from being collected while there are still virtual 
threads using the associated `ObjectMonitor`, the first preempted virtual 
thread will put the oop in an `OopHandle` in the `ObjectMonitor` (see 
`ObjectMonitor::set_object_strong()`), which will be released later when the 
monitor is deflated.

Preempting at `invokestatic` means the top frame in the `stackChunk` can now 
have the callee’s arguments at the top of the expression stack, which, during 
GC, will need to be processed as part of that frame (no callee yet). Class 
`SmallRegisterMap` was therefore modified so that we now have two static 
instances: one where `include_argument_oops()` returns true, used when 
processing the top frame in this case, and the regular one where it returns 
false, used everywhere else. Also, because 
`InterpretedArgumentOopFinder` calculates the address of oops as offsets from 
the top of the expression stack, we need to correct possible added alignment 
after the top frame is thawed, since we can safepoint while redoing the VM 
call. Class `AnchorMark` was added to deal with this.

### Testing

The changes have been running in the Loom pipeline for several months now. They 
include a new test, `KlassInit.java`, which exercises preemption in different 
class initialization cases. Also, the current patch has been run through Mach5 
tiers 1-8. I'll keep running tests periodically until integration time.

-------------

Commit messages:
 - RISC-V support
 - Fix whitespaces
 - v1

Changes: https://git.openjdk.org/jdk/pull/27802/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27802&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8369238
  Stats: 1979 lines in 94 files changed: 1628 ins; 86 del; 265 mod
  Patch: https://git.openjdk.org/jdk/pull/27802.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27802/head:pull/27802

PR: https://git.openjdk.org/jdk/pull/27802
