On Fri, 4 Dec 2020 21:01:13 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:
>> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying >> problem, rather than fix the tests. >> >> The problem is that a number of JDI tests create objects on the debugger >> side with calls to `newInstance()`. However, on the debugee side, these new >> instances will only be held on to by a `JNIGlobalWeakRef`, which means they >> could be collected at any time, even before `newInstace()` returns. A number >> of JDI tests get spurious `ObjectCollectedException` thrown at them, which >> results in test failures. To make these objects stick around, a call to >> `disableCollection()` is typically needed. >> >> However, as pointer out by @plummercj in >> [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987): >> >>> Going back to the spec, ObjectReference.disableCollection() says: >>> >>> "By default all ObjectReference values returned by JDI may be collected at >>> any time the target VM is running" >>> >>> and >>> >>> "Note that while the target VM is suspended, no garbage collection will >>> occur because all threads are suspended." >>> >>> But no where does is say what is meant by the VM running or being >>> suspended, or how to get it in that state. One might assume that this ties >>> in with VirtualMachine.suspend(), but it says: >>> >>> "Suspends the execution of the application running in this virtual machine. >>> All threads currently running will be suspended." >>> >>> No mention of suspending the VM, but that certainly seems to be what is >>> implied by the method name and also by the loose wording in >>> disableCollection(). >> >> Most of our spuriously failing tests do actually make a call to >> `VirtualMachine.suspend()`, presumably to prevent objects from being garbage >> collected. However, the current implementation of `VirtualMachine.suspend()` >> will only suspend all Java threads. That is not enough to prevent objects >> from being garbage collected. The GC can basically run at any time, and >> there is no relation to whether all Java threads are suspended or not. >> >> However, as suggested by @plummercj, we could emulate the behaviour implied >> by the spec by letting a call to `VirtualMachine.suspend()` also convert all >> existing JDI objects references to be backed by a (strong) `JNIGlobalRef` >> rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from >> running, but it will prevent any object visible to a JDI client from being >> garbage collected. Of course, a call to `VirtualMachine.resume()` would >> convert all references back to being weak again. >> >> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" >> all objects. These new functions are then used by the underpinnings of >> `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the >> behaviour described above. >> >> Note that there are still a few tests that needed adjustments to guard >> against `ObjectCollectionException`. These are: >> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This >> test seems to have been forgotten by >> [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a >> similar fix in the other `ArrayType/newinstance` tests. >> - >> *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* >> - We just want to allocate as much as we can, so catching an ignoring >> `ObjectCollectedException` seems reasonable here. >> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to >> prevent `TestClassLoader` from being unloaded to avoid invalidating code >> locations. >> - >> *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* >> - This test keeps the VM suspended, and then expects objects to be garbage >> collected, which they now won't. >> >> Testing: >> - More than 50 iterations of the `vmTestbase/nsk/jdi` and >> `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and >> locally. > > src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 632: > >> 630: if (weakRef == NULL) { >> 631: >> EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewWeakGlobalRef"); >> 632: } > > I'm not so sure I agree that having a fatal error here is the right thing to > do. The only other user of `weakenNode()` is > `ObjectReference.disableCollection()`. It returns an error to the debugger if > `weakenNode()` returns `NULL`. However, I'm not so sure that's a good thing > to do here either, since it means the `VM.resume()` will need to fail. > Possibly the error should just be ignored, and we live with the ref staying > strong. Another options is to save away the weakref in the node when strengthening. This would benefit `ObjectReference.disableCollection()` also, since it would no longer need to deal with a potential OOM. However, I'm not so sure it's actually worth doing. Trying to keep the debug session alive will having allocation errors is probably a fools errand. ------------- PR: https://git.openjdk.java.net/jdk/pull/1595