Re: RFR: 8274196: Crashes in VM_HeapDumper::work after JDK-8252842 [v2]
On Mon, 27 Sep 2021 09:39:59 GMT, Per Liden wrote: > > The root cause for crash in ZGC is that the JNIHandles are processed before > > object iteration. And ZGC would update the JNIHandles at object iteration > > with read barrier. So the crash is cause by accessing the invalid address > > which can be dummy info after zgc, and hence crash. > > The fix here should not be to change the order of stuff, so that heap > iteration happens first, that will just hide the real bug. The real bug is > that the `JNIGlobalsDumper::do_oop()` is missing a load barrier. In other > words, keep the order and just make sure to add a load barrier, like this: > > ``` > void JNIGlobalsDumper::do_oop(oop* obj_p) { > oop o = NativeAccess::oop_load(obj_p); > ... > ``` Hi Per @pliden , Thanks a lot! Correct!I am just puzzling why the sequency of root type dump is a must as there is no such request in spec, and your suggestion definitely help to answer that, I took the wrong fix and neglect that there is a read barrier missed. I will make the change. BRs, Lin - PR: https://git.openjdk.java.net/jdk/pull/5681
Re: RFR: 8274196: Crashes in VM_HeapDumper::work after JDK-8252842 [v2]
On Sun, 26 Sep 2021 08:02:33 GMT, Lin Zang wrote: >> The root cause for crash in ZGC is that the JNIHandles are processed before >> object iteration. And ZGC would update the JNIHandles at object iteration >> with read barrier. So the crash is cause by accessing the invalid address >> which can be dummy info after zgc, and hence crash. >> >> The lock rank issue can be fixed because the related mutexes are acquired in >> safepoint. so the safepoint_check_required could be safepoint_check_always. >> >> The Epsilon issue is caused by wrong _num_dumper_thread calculated when the >> gang==NULL. > > Lin Zang has updated the pull request with a new target base due to a merge > or a rebase. The pull request now contains three commits: > > - un-ProblemList BasicJMapTest.java > - Merge branch 'master' into pd-fix > - 8274196: Crashes in VM_HeapDumper::work after JDK-8252842 > The root cause for crash in ZGC is that the JNIHandles are processed before > object iteration. And ZGC would update the JNIHandles at object iteration > with read barrier. So the crash is cause by accessing the invalid address > which can be dummy info after zgc, and hence crash. The fix here should not be to change the order of stuff, so that heap iteration happens first, that will just hide the real bug. The real bug is that the `JNIGlobalsDumper::do_oop()` is missing a load barrier. In other words, keep the order and just make sure to add a load barrier, like this: void JNIGlobalsDumper::do_oop(oop* obj_p) { oop o = NativeAccess::oop_load(obj_p); ... - Changes requested by pliden (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/5681
Re: RFR: 8274196: Crashes in VM_HeapDumper::work after JDK-8252842 [v2]
> The root cause for crash in ZGC is that the JNIHandles are processed before > object iteration. And ZGC would update the JNIHandles at object iteration > with read barrier. So the crash is cause by accessing the invalid address > which can be dummy info after zgc, and hence crash. > > The lock rank issue can be fixed because the related mutexes are acquired in > safepoint. so the safepoint_check_required could be safepoint_check_always. > > The Epsilon issue is caused by wrong _num_dumper_thread calculated when the > gang==NULL. Lin Zang has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - un-ProblemList BasicJMapTest.java - Merge branch 'master' into pd-fix - 8274196: Crashes in VM_HeapDumper::work after JDK-8252842 - Changes: https://git.openjdk.java.net/jdk/pull/5681/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk=5681=01 Stats: 53 lines in 3 files changed: 22 ins; 28 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/5681.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/5681/head:pull/5681 PR: https://git.openjdk.java.net/jdk/pull/5681