On Thu, 24 Jun 2021 05:17:02 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

> Re-enable the assert that was disabled (with some overhead) by 
> [JDK-8265683](https://bugs.openjdk.java.net/browse/JDK-8265683). Explanation 
> is in the CR and also in comments included with the changes.
> 
> I tested by running `vmTestbase/nsk/jdb/suspend/suspend001/suspend001.java` 
> and `vmTestbase/nsk/jdb/wherei/wherei001/wherei001.java` 100's of times, and 
> did not see any failures. I also verified the original issue was still 
> reproducible by temporarily not setting `gdata->handlingVMDeath = JNI_TRUE`, 
> which did trigger the assert as expected.

There are two threads to consider here. The one that `VM_DEATH` was received on 
and the thread that reads JDWP commands from the debugger. For the latter, a 
`VM.Resume` command has been received. The debugger does this as the test wraps 
up, allowing all debuggee threads to resume and the debugeee to exit. 
`VM.resume()` calls `threadControl_resumeAll()`, which calls 
`commonResumeList()`, which walks the `runningThreads` list to build an array 
of jthreads that need to be resumed. This array is passed to JVMTI 
`ResumeThreadList()`. Any time after this point a `VM_DEATH` can be triggered 
since the debuggee main thread calls `System.exit()` and does not wait for all 
the threads it creates to exit first (no join is done). The `VM_DEATH` event 
leads to clearing all the JVMTI callbacks, so the debug agent is no longer 
notified of `THREAD_END` events as the other threads created by the debuggee 
exit. This leads to threads still being on `runningThreads`, even though they 
have exited and had t
 heir TLS cleared by JVMTI. Keep in mind we are still in `commonResumeList()` 
when all this is happening. After calling JVMTI `ResumeThreadList()`, it then 
walks that same array of jthreads that were resumed, doing a `findThread()` on 
each. This is done mainly so it can update the `suspendCount` the 
`ThreadNodes`. This is when the assert was getting triggered. Some of the 
debuggee threads had already exited and had their TLS cleared, but were still 
in `runningThreads` since the `THREAD_END` was never received. The fix is to 
not assert (and instead look up the thread in `runningThreads`) if we are 
dealing with a `VM_DEATH` event.

You mentioned `gdata->handlingVMDeath` should be volatile. I will fix that. I 
also think I will rename it to `gdata->jvmtiCallBacksCleared`. There is no need 
to ever set it back false, so I will remove that line of code. Once true, we 
know the vm is exiting anyway, and the callbacks will never be setup again. 
Also, I'm going to set it true before actually clearing the callbacks rather 
than after. Doing it early is fine. In fact if it was always true, 
`findThread()` would still work, but just be slower.

As far as any other synchronizing or memory barriers being needed, I'm a bit 
unclear on that part. The only thing that matters is that 
`gdata->jvmtiCallBacksCleared` being set true is visible to other threads 
before the callbacks get cleared. It seems the call to `SetEventCallbacks()` 
should ensure this.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4580

Reply via email to