On Mon, 6 Feb 2023 21:57:11 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

>> A number of tests that use JDI invokeMethod() support occasionally fail when 
>> run using virtual threads. The tests fail with:
>> 
>> 
>> # ERROR: TEST FAILED: wrong invocation:
>> # ERROR: invoking debuggee thread instance of 
>> java.lang.VirtualThread(name='invokemethod010tThr', id=272)
>> # ERROR: is not suspended again after the invocation 
>> 
>> 
>> Although as explained later, this is a misleading message, and does not 
>> accurately reflect why the test is failing. More on that below.
>> 
>> The root cause of the failure is due to these tests using JDI invokeMethod 
>> support with the  INVOKE_SINGLE_THREADED flag. This is not always going to 
>> work with virtual threads because the invoked method is doing a 
>> Thread.sleep(400), and that is at risk of blocking indefinitely if all other 
>> threads are blocked form making progress. Note that technically platform 
>> threads could fail in the same manner. However, the reason the failure only 
>> happens now with virtual threads is because the implementation of 
>> Thread.sleep() differs for virtual threads, and may require ownership of a 
>> monitor that sometimes is held by another thread.
>> 
>> Another issue is that the tests do not do a very good job of error handling 
>> when this happens, and give the misleading failure reason of the invoked 
>> thread not being suspended after the invoke completed. The reason it is not 
>> suspended is because the invoke has actually not completed. There was a 
>> timeout that the test did not properly note as the cause of the failure. The 
>> test (debugger side) spawns a thread to do the JDI invokeMethod with, and 
>> then waits for it with:
>> 
>>                 invThr.join(argHandler.getWaitTime()*60000);
>> 
>> This join() times out, but the test assumes once it returns the invoke is 
>> complete, even though the invoked thread is actually still in the middle of 
>> the invoke. So that is the reason debuggee invokemethod thread is not 
>> currently suspended. I've fixed this by having a test check if invThr is 
>> still alive after the join. If it is, then the test is made to fail at that 
>> point, rather than continuing on and checking the debuggee threads status. 
>> The failure then becomes:
>> 
>> nsk.share.TestFailure: TEST FAILED: invoke never completed
>> 
>> At that point a vm.resume() is done to allow the invoke to complete, and the 
>> test will exit with this failure.
>> 
>> As for avoiding the failure in the first place (the deadlock in the debuggee 
>> during the invoke), this is really a test bug for relying on 
>> INVOKE_SINGLE_THREADED and assuming that the invoked thread won't become 
>> deadlocked. Since there is a Thread.sleep() call in the invoked method, it 
>> can't make this assumption. From the ObjectReference.invoke() spec:
>> 
>> "By default, all threads in the target VM are resumed while the method is 
>> being invoked if they were previously suspended by an event or by 
>> VirtualMachine.suspend() or ThreadReference.suspend(). This is done to 
>> prevent the deadlocks that will occur if any of the threads own monitors 
>> that will be needed by the invoked method."
>> 
>> "The resumption of other threads during the invocation can be prevented by 
>> specifying the INVOKE_SINGLE_THREADED bit flag in the options argument; 
>> however, there is no protection against or recovery from the deadlocks 
>> described above, so this option should be used with great caution."
>> 
>> For platform threads, sleep() doesn't require any monitors, so these tests 
>> never ran into problems before. For virtual threads however there is some 
>> synchronization done, and potential reliance on other threads not being 
>> suspended. A way around this is to always use sleep(0), which will at least 
>> attempt to yield the thread. For platform threads an actual yield is likely. 
>> For a virtual thread it will not yield in this particular case because the 
>> virtual thread is pinned to the carrier thread due to the jvmti breakpoint 
>> callback that is currently in the call chain of the invoked thread. So for 
>> virtual threads this effectively the same as sitting in a spin loop with no 
>> yielding. This is ok. CPU wasting is not a concern.
>
> Chris Plummer has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Get rid of Thread.sleep() in the invoked method can add warning comment.

Okay so these tests are incredibly fragile - as you note display() may be a 
problem in theory, as could any potential class-loading or initialization. But 
without the sleep you have a busy-wait loop that may cause problems of its own 
(e.g. it might trigger on-stack-replacement but the compiler thread may require 
a resource held by one of the suspended threads!). It is impossible to know 
that something is 100% safe to do in this kind of situation.

-------------

PR: https://git.openjdk.org/jdk/pull/12420

Reply via email to