Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 17:46:31 GMT, Tobias Holenstein wrote: >> I've asked this myself (after making the change). >> Being in `WXWrite` mode would be wrong if the thread would execute >> dynamically generated code. There's not too much happening outside the scope >> of the `ThreadInVMfromNative` instances. I see jni calls >> (`GetObjectArrayElement`, `ExceptionOccurred`) and a jvmti call >> (`RetransformClasses`) but these are safe because the callees enter the vm >> right away. We even avoid switching to `WXWrite` and back there. >> So I thought it would be ok to coarsen the WXMode switching. >> But maybe it's still better to avoid any risk especially since there's >> likely no performance effect. > > Or could the `ThreadInVMfromNative tvmfn(THREAD);` in > `check_exception_and_log` be move out to > `JfrJvmtiAgent::retransform_classes`? And then only use one > `ThreadInVMfromNative` in `JfrJvmtiAgent::retransform_classes` I think this would require hoisting the `ThreadInVMfromNative` out of the loop with the `check_exception_and_log` call but then the thread would be in `_thread_in_vm` when doing the `GetObjectArrayElement` jni call which would be wrong. - PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1532121487
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and >> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Not ideal but it fixes a real problem. - Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947550278
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and >> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Marked as reviewed by aph (Reviewer). - PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947320660
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 15:17:50 GMT, Thomas Stuefe wrote: > > Not necessarily. It may well remove some transitions from paths that don't > > need it, but if you move the state change too low down the call chain you > > could end up transitioning much more often in code that does need it e.g. > > if a transitioning method is called in a loop. > > Not if you do the switching lazily. The first iteration would switch to the > needed state; subsequent iterations would not do anything since the state > already matches. Unless you interleave writes and execs, but then you would > need the state changes anyway. Exactly. You do the transition when it's needed. - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2008070652
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and >> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm Marked as reviewed by tholenstein (Reviewer). - PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947241693
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and >> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm +1 - Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947160377
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and >> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm I think this patch makes sense, and does not compete with a long-term solution. - Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947069985
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Wed, 13 Mar 2024 13:53:46 GMT, Richard Reingruber wrote: >> src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp line 160: >> >>> 158: ResourceMark rm(THREAD); >>> 159: // WXWrite is needed before entering the vm below and in callee >>> methods. >>> 160: MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, THREAD)); >> >> I understand you placed this here to cover the transition inside >> `create_classes_array` and the immediate one at line 170, but doesn't this >> risk having the wrong WX state for code in between? > > I've asked this myself (after making the change). > Being in `WXWrite` mode would be wrong if the thread would execute > dynamically generated code. There's not too much happening outside the scope > of the `ThreadInVMfromNative` instances. I see jni calls > (`GetObjectArrayElement`, `ExceptionOccurred`) and a jvmti call > (`RetransformClasses`) but these are safe because the callees enter the vm > right away. We even avoid switching to `WXWrite` and back there. > So I thought it would be ok to coarsen the WXMode switching. > But maybe it's still better to avoid any risk especially since there's likely > no performance effect. Or could the `ThreadInVMfromNative tvmfn(THREAD);` in `check_exception_and_log` be move out to `JfrJvmtiAgent::retransform_classes`? And then only use one `ThreadInVMfromNative` in `JfrJvmtiAgent::retransform_classes` - PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1530831165
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 17:08:52 GMT, Richard Reingruber wrote: > I see it differently. This PR is just a simple attempt to get clean test runs > with AssertWXAtThreadSync (after fixing an actual crash > https://bugs.openjdk.org/browse/JDK-8327036). While the violating locations > in this PR might be unlikely to produce actual crashes I think it is > worthwhile to have clean testing with AssertWXAtThreadSync because this will > help prevent regressions that are more likely. I agree. Fixing the current state with this PR makes sense to me. Changing the logic of W^X will take more time and discussion. So from my point of view this PR is ready and should be integrated. If no-one disagrees I will approve - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007742855
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes wrote: > But all this discussion suggests to me that this PR is not really worth > pursuing at this time - IIUC no actual failures are observed other than those > pertaining to AssertWXAtThreadSync and that flag will be gone if we do decide > to be more fine-grained about WX management. I see it differently. This PR is just a simple attempt to get clean test runs with AssertWXAtThreadSync (after fixing an actual crash https://bugs.openjdk.org/browse/JDK-8327036). While the violating locations in this PR might be unlikely to produce actual crashes I think it is worthwhile to have clean testing with AssertWXAtThreadSync because this will help prevent regressions that are more likely. Beyond the trivial fixes of this PR I'm very much in favor of further enhancements as the aforementioned https://bugs.openjdk.org/browse/JDK-8307817. My recommendation would be to remove as much non-constant data from the code cache as possible. - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007709981
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes wrote: >> Instead, could we tag code that needs one or the other, keep track of the >> current WX state in thread-local memory, and flip WX only when we know we >> need to? The first part we already do. I wonder wheter we could - at least as workaround for if we missed a spot - do wx switching as a reaction to a SIBBUS related to WX violation in code cache. Switch state around, return from signal handler and retry operation. (Edit: tested it, does not seem to work. I guess when the SIGBUS is triggered in the kernel thread WX state had already been processed somehow). > > That's very odd. The example there doesn't even involve MAP_JIT memory, so > > what does it have to do with WX? > > @theRealAph that is the mystery we hope will be resolved once we know the > nature of the underlying OS bug. Somehow switching to exec mode > fixes/works-around the issue. I can imagine a missing conditional to check if > the region is MAP_JIT. > > > Changing WX at VM state transitions is a form of temporal coupling, a > > classic design smell that has caused problems for decades. > > The original introducers of WXEnable made the decision that the VM should be > in WRITE mode unless it needs EXEC. That is the state we are presently trying > to achieve with this change. If that original design choice is wrong then ... > > > Instead, could we tag code that needs one or the other, keep track of the > > current WX state in thread-local memory, and flip WX only when we know we > > need to? > > And I've asked about this every time a missing WXEnable has had to be added. > We seem to be generically able to describe what kind of code needs which > mode, but we seem to struggle to pin it down. Though that is what > https://bugs.openjdk.org/browse/JDK-8307817 is looking at doing. > > > That'd (by definition) reduce the number of transitions to the minimum if > > we were through. > > Not necessarily. It may well remove some transitions from paths that don't > need it, but if you move the state change too low down the call chain you > could end up transitioning much more often in code that does need it e.g. if > a transitioning method is called in a loop. Not if you do the switching lazily. The first iteration would switch to the needed state; subsequent iterations would not do anything since the state already matches. Unless you interleave writes and execs, but then you would need the state changes anyway. - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007469410
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes wrote: > IIUC no actual failures are observed other than those pertaining to > AssertWXAtThreadSync We saw sporadic crashes in our jtreg (maybe also jck?) runs; only **_later_** we enabled AssertWXAtThreadSync for more checking. - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007159560
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 09:30:52 GMT, Andrew Haley wrote: > That's very odd. The example there doesn't even involve MAP_JIT memory, so > what does it have to do with WX? @theRealAph that is the mystery we hope will be resolved once we know the nature of the underlying OS bug. Somehow switching to exec mode fixes/works-around the issue. I can imagine a missing conditional to check if the region is MAP_JIT. > Changing WX at VM state transitions is a form of temporal coupling, a classic > design smell that has caused problems for decades. The original introducers of WXEnable made the decision that the VM should be in WRITE mode unless it needs EXEC. That is the state we are presently trying to achieve with this change. If that original design choice is wrong then ... > Instead, could we tag code that needs one or the other, keep track of the > current WX state in thread-local memory, and flip WX only when we know we > need to? And I've asked about this every time a missing WXEnable has had to be added. We seem to be generically able to describe what kind of code needs which mode, but we seem to struggle to pin it down. Though that is what https://bugs.openjdk.org/browse/JDK-8307817 is looking at doing. > That'd (by definition) reduce the number of transitions to the minimum if we > were through. Not necessarily. It may well remove some transitions from paths that don't need it, but if you move the state change too low down the call chain you could end up transitioning much more often in code that does need it e.g. if a transitioning method is called in a loop. We need to optimise the actual call paths as well as identify specific methods. But all this discussion suggests to me that this PR is not really worth pursuing at this time - IIUC no actual failures are observed other than those pertaining to `AssertWXAtThreadSync` and that flag will be gone if we do decide to be more fine-grained about WX management. - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007029830
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Tue, 19 Mar 2024 04:37:41 GMT, David Holmes wrote: > That said we have had a lot of people looking at the overall WX state > management logic in the past week or so due to > https://bugs.openjdk.org/browse/JDK-8327860. The workaround there requires us > to be in EXEC mode That's very odd. The example there doesn't even involve MAP_JIT memory, so what does it have to do with WX? > and there are a number of folk who are questioning why "we" chose WRITE mode > as the default with a switch to EXEC, instead of EXEC as the default with a > switch to WRITE. > But whichever way that goes I think the VM state transitions are the places > to enforce that choice. Hmm. Changing WX at VM state transitions is a form of temporal coupling, a classic design smell that has caused problems for decades. It's causing problems for us now. Instead, could we tag code that needs one or the other, keep track of the current WX state in thread-local memory, and flip WX only when we know we need to? That'd (by definition) reduce the number of transitions to the minimum if we were through. - PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2006498201
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber wrote: >> This pr changes `JfrJvmtiAgent::retransform_classes()` and >> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm. >> >> Testing: >> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java >> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync >> >> More tests are pending. > > Richard Reingruber has updated the pull request with a new target base due to > a merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite > - Set WXWrite at more locations > - Switch to WXWrite before entering the vm I still feel that this should be fixed inside the `ThreadInVMfromNative` transition - the number of callsites that need it just reinforces that for me. Granted we then need to look at where we would now have redundant calls. That said we have had a lot of people looking at the overall WX state management logic in the past week or so due to https://bugs.openjdk.org/browse/JDK-8327860. The workaround there requires us to be in EXEC mode and there are a number of folk who are questioning why "we" chose WRITE mode as the default with a switch to EXEC, instead of EXEC as the default with a switch to WRITE. But whichever way that goes I think the VM state transitions are the places to enforce that choice. - PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1945158370
Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]
> This pr changes `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` > to switch to `WXWrite` before transitioning to the vm. > > Testing: > make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java > TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java > TEST_VM_OPTS=-XX:+AssertWXAtThreadSync > > More tests are pending. Richard Reingruber has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite - Set WXWrite at more locations - Switch to WXWrite before entering the vm - Changes: - all: https://git.openjdk.org/jdk/pull/18238/files - new: https://git.openjdk.org/jdk/pull/18238/files/9075f4be..1b4d7606 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18238=02 - incr: https://webrevs.openjdk.org/?repo=jdk=18238=01-02 Stats: 293771 lines in 434 files changed: 6197 ins; 6221 del; 281353 mod Patch: https://git.openjdk.org/jdk/pull/18238.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18238/head:pull/18238 PR: https://git.openjdk.org/jdk/pull/18238