> A number of tests that use JDI invokeMethod() support occasionally fail when 
> run using virtual threads. The tests fail with:
> 
> 
> # ERROR: TEST FAILED: wrong invocation:
> # ERROR: invoking debuggee thread instance of 
> java.lang.VirtualThread(name='invokemethod010tThr', id=272)
> # ERROR: is not suspended again after the invocation 
> 
> 
> Although as explained later, this is a misleading message, and does not 
> accurately reflect why the test is failing. More on that below.
> 
> The root cause of the failure is due to these tests using JDI invokeMethod 
> support with the  INVOKE_SINGLE_THREADED flag. This is not always going to 
> work with virtual threads because the invoked method is doing a 
> Thread.sleep(400), and that is at risk of blocking indefinitely if all other 
> threads are blocked form making progress. Note that technically platform 
> threads could fail in the same manner. However, the reason the failure only 
> happens now with virtual threads is because the implementation of 
> Thread.sleep() differs for virtual threads, and may require ownership of a 
> monitor that sometimes is held by another thread.
> 
> Another issue is that the tests do not do a very good job of error handling 
> when this happens, and give the misleading failure reason of the invoked 
> thread not being suspended after the invoke completed. The reason it is not 
> suspended is because the invoke has actually not completed. There was a 
> timeout that the test did not properly note as the cause of the failure. The 
> test (debugger side) spawns a thread to do the JDI invokeMethod with, and 
> then waits for it with:
> 
>                 invThr.join(argHandler.getWaitTime()*60000);
> 
> This join() times out, but the test assumes once it returns the invoke is 
> complete, even though the invoked thread is actually still in the middle of 
> the invoke. So that is the reason debuggee invokemethod thread is not 
> currently suspended. I've fixed this by having a test check if invThr is 
> still alive after the join. If it is, then the test is made to fail at that 
> point, rather than continuing on and checking the debuggee threads status. 
> The failure then becomes:
> 
> nsk.share.TestFailure: TEST FAILED: invoke never completed
> 
> At that point a vm.resume() is done to allow the invoke to complete, and the 
> test will exit with this failure.
> 
> As for avoiding the failure in the first place (the deadlock in the debuggee 
> during the invoke), this is really a test bug for relying on 
> INVOKE_SINGLE_THREADED and assuming that the invoked thread won't become 
> deadlocked. Since there is a Thread.sleep() call in the invoked method, it 
> can't make this assumption. From the ObjectReference.invoke() spec:
> 
> "By default, all threads in the target VM are resumed while the method is 
> being invoked if they were previously suspended by an event or by 
> VirtualMachine.suspend() or ThreadReference.suspend(). This is done to 
> prevent the deadlocks that will occur if any of the threads own monitors that 
> will be needed by the invoked method."
> 
> "The resumption of other threads during the invocation can be prevented by 
> specifying the INVOKE_SINGLE_THREADED bit flag in the options argument; 
> however, there is no protection against or recovery from the deadlocks 
> described above, so this option should be used with great caution."
> 
> For platform threads, sleep() doesn't require any monitors, so these tests 
> never ran into problems before. For virtual threads however there is some 
> synchronization done, and potential reliance on other threads not being 
> suspended. A way around this is to always use sleep(0), which will at least 
> attempt to yield the thread. For platform threads an actual yield is likely. 
> For a virtual thread it will not yield in this particular case because the 
> virtual thread is pinned to the carrier thread due to the jvmti breakpoint 
> callback that is currently in the call chain of the invoked thread. So for 
> virtual threads this effectively the same as sitting in a spin loop with no 
> yielding. This is ok. CPU wasting is not a concern.

Chris Plummer has updated the pull request incrementally with one additional 
commit since the last revision:

  Get rid of sleep(0). Add warning comment to invoked method.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12420/files
  - new: https://git.openjdk.org/jdk/pull/12420/files/c1bcaef5..7d6cb24b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12420&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12420&range=01-02

  Stats: 36 lines in 6 files changed: 30 ins; 6 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/12420.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12420/head:pull/12420

PR: https://git.openjdk.org/jdk/pull/12420

Reply via email to