> The imasm::remove_activation() call does not deal with safepoints very well. > However, when the MethodExit JVMTI event is being called, we call into the > runtime in the middle of remove_activation(). If the value being returned is > an object type, then the top-of-stack contains the oop. However, the GC does > not traverse said oop in any oop map, because it is simply not expected that > we safepoint in the middle of remove_activation(). > > The JvmtiExport::post_method_exit() function we end up calling, reads the > top-of-stack oop, and puts it in a handle. Then it calls JVMTI callbacks, > that eventually call Java and a bunch of stuff that safepoints. So after the > JVMTI callback, we can expect the top-of-stack oop to be broken. > Unfortunately, when we continue, we therefore end up returning a broken oop. > > Notably, the fact that InterpreterRuntime::post_method_exit is a JRT_ENTRY, > is wrong, as we can safepoint on the way back to Java, which will break the > return oop in a similar way. So this patch makes it a JRT_BLOCK_ENTRY, moving > the transition to VM and back, into a block of code that is protected against > GC. Before the JRT_BLOCK is called, we stash away the return oop, and after > the JRT_BLOCK_END, we restore the top-of-stack oop. In the path when > InterpreterRuntime::post_method_exit is called when throwing an exception, we > don't have the same problem of retaining an oop result, and hence the > JRT_BLOCK/JRT_BLOCK_END section is not performed in this case; the logic is > the same as before for this path. > > This is a JVMTI bug that has probably been around for a long time. It crashes > with all GCs, but was discovered recently after concurrent stack processing, > as StefanK has been running better GC stressing code in JVMTI, and the bug > reproduced more easily with concurrent stack processing, as the timings were > a bit different. The following reproducer failed pretty much 100% of the time: > while true; do make test JTREG="RETAIN=all" > TEST=test/hotspot/jtreg/vmTestbase/nsk/jdi/MethodExitEvent/returnValue/returnValue003/returnValue003.java > TEST_OPTS_JAVA_OPTIONS="-XX:+UseZGC -Xmx2g -XX:ZCollectionInterval=0.0001 > -XX:ZFragmentationLimit=0.01 -XX:+VerifyOops -XX:+ZVerifyViews -Xint" ; done > > With my fix I can run this repeatedly without any more failures. I have also > sanity checked the patch by running tier 1-5, so that it does not introduces > any new issues on its own. I have also used Stefan's nice external GC > stressing with jcmd technique that was used to trigger crashes with other > GCs, to make sure said crashes no longer reproduce either.
Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: Serguei CR2: Don't check interpreted only ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/930/files - new: https://git.openjdk.java.net/jdk/pull/930/files/4d68c624..95306514 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=930&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=930&range=02-03 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/930.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/930/head:pull/930 PR: https://git.openjdk.java.net/jdk/pull/930