Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-20 Thread Richard Reingruber
On Tue, 19 Mar 2024 17:46:31 GMT, Tobias Holenstein  
wrote:

>> I've asked this myself (after making the change).
>> Being in `WXWrite` mode would be wrong if the thread would execute 
>> dynamically generated code. There's not too much happening outside the scope 
>> of the `ThreadInVMfromNative` instances. I see jni calls 
>> (`GetObjectArrayElement`, `ExceptionOccurred`) and a jvmti call 
>> (`RetransformClasses`) but these are safe because the callees enter the vm 
>> right away. We even avoid switching to `WXWrite` and back there.
>> So I thought it would be ok to coarsen the WXMode switching.
>> But maybe it's still better to avoid any risk especially since there's 
>> likely no performance effect.
>
> Or could the  `ThreadInVMfromNative tvmfn(THREAD);` in 
> `check_exception_and_log` be move out to 
> `JfrJvmtiAgent::retransform_classes`?  And then only use one 
> `ThreadInVMfromNative` in `JfrJvmtiAgent::retransform_classes`

I think this would require hoisting the `ThreadInVMfromNative` out of the loop 
with the `check_exception_and_log` call but then the thread would be in 
`_thread_in_vm` when doing the `GetObjectArrayElement` jni call which would be 
wrong.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1532121487


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread David Holmes
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber  wrote:

>> This pr changes  `JfrJvmtiAgent::retransform_classes()` and 
>> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>> 
>> Testing:
>> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> 
>> More tests are pending.
>
> Richard Reingruber has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
>  - Set WXWrite at more locations
>  - Switch to WXWrite before entering the vm

Not ideal but it fixes a real problem.

-

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947550278


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Andrew Haley
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber  wrote:

>> This pr changes  `JfrJvmtiAgent::retransform_classes()` and 
>> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>> 
>> Testing:
>> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> 
>> More tests are pending.
>
> Richard Reingruber has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
>  - Set WXWrite at more locations
>  - Switch to WXWrite before entering the vm

Marked as reviewed by aph (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947320660


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Andrew Haley
On Tue, 19 Mar 2024 15:17:50 GMT, Thomas Stuefe  wrote:

> > Not necessarily. It may well remove some transitions from paths that don't 
> > need it, but if you move the state change too low down the call chain you 
> > could end up transitioning much more often in code that does need it e.g. 
> > if a transitioning method is called in a loop.
> 
> Not if you do the switching lazily. The first iteration would switch to the 
> needed state; subsequent iterations would not do anything since the state 
> already matches. Unless you interleave writes and execs, but then you would 
> need the state changes anyway.

Exactly. You do the transition when it's needed.

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2008070652


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Tobias Holenstein
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber  wrote:

>> This pr changes  `JfrJvmtiAgent::retransform_classes()` and 
>> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>> 
>> Testing:
>> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> 
>> More tests are pending.
>
> Richard Reingruber has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
>  - Set WXWrite at more locations
>  - Switch to WXWrite before entering the vm

Marked as reviewed by tholenstein (Reviewer).

-

PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947241693


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Martin Doerr
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber  wrote:

>> This pr changes  `JfrJvmtiAgent::retransform_classes()` and 
>> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>> 
>> Testing:
>> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> 
>> More tests are pending.
>
> Richard Reingruber has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
>  - Set WXWrite at more locations
>  - Switch to WXWrite before entering the vm

+1

-

Marked as reviewed by mdoerr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947160377


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Thomas Stuefe
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber  wrote:

>> This pr changes  `JfrJvmtiAgent::retransform_classes()` and 
>> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>> 
>> Testing:
>> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> 
>> More tests are pending.
>
> Richard Reingruber has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
>  - Set WXWrite at more locations
>  - Switch to WXWrite before entering the vm

I think this patch makes sense, and does not compete with a long-term solution.

-

Marked as reviewed by stuefe (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1947069985


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Tobias Holenstein
On Wed, 13 Mar 2024 13:53:46 GMT, Richard Reingruber  wrote:

>> src/hotspot/share/jfr/instrumentation/jfrJvmtiAgent.cpp line 160:
>> 
>>> 158:   ResourceMark rm(THREAD);
>>> 159:   // WXWrite is needed before entering the vm below and in callee 
>>> methods.
>>> 160:   MACOS_AARCH64_ONLY(ThreadWXEnable __wx(WXWrite, THREAD));
>> 
>> I understand you placed this here to cover the transition inside 
>> `create_classes_array` and the immediate one at line 170, but doesn't this 
>> risk having the wrong WX state for code in between?
>
> I've asked this myself (after making the change).
> Being in `WXWrite` mode would be wrong if the thread would execute 
> dynamically generated code. There's not too much happening outside the scope 
> of the `ThreadInVMfromNative` instances. I see jni calls 
> (`GetObjectArrayElement`, `ExceptionOccurred`) and a jvmti call 
> (`RetransformClasses`) but these are safe because the callees enter the vm 
> right away. We even avoid switching to `WXWrite` and back there.
> So I thought it would be ok to coarsen the WXMode switching.
> But maybe it's still better to avoid any risk especially since there's likely 
> no performance effect.

Or could the  `ThreadInVMfromNative tvmfn(THREAD);` in 
`check_exception_and_log` be move out to `JfrJvmtiAgent::retransform_classes`?  
And then only use one `ThreadInVMfromNative` in 
`JfrJvmtiAgent::retransform_classes`

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18238#discussion_r1530831165


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Tobias Holenstein
On Tue, 19 Mar 2024 17:08:52 GMT, Richard Reingruber  wrote:

> I see it differently. This PR is just a simple attempt to get clean test runs 
> with AssertWXAtThreadSync (after fixing an actual crash 
> https://bugs.openjdk.org/browse/JDK-8327036). While the violating locations 
> in this PR might be unlikely to produce actual crashes I think it is 
> worthwhile to have clean testing with AssertWXAtThreadSync because this will 
> help prevent regressions that are more likely.

I agree. Fixing the current state with this PR makes sense to me. Changing the 
logic of W^X will take more time and discussion. So from my point of view this 
PR is ready and should be integrated. If no-one disagrees I will approve

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007742855


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Richard Reingruber
On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes  wrote:

> But all this discussion suggests to me that this PR is not really worth 
> pursuing at this time - IIUC no actual failures are observed other than those 
> pertaining to AssertWXAtThreadSync and that flag will be gone if we do decide 
> to be more fine-grained about WX management.

I see it differently. This PR is just a simple attempt to get clean test runs 
with AssertWXAtThreadSync (after fixing an actual crash 
https://bugs.openjdk.org/browse/JDK-8327036). While the violating locations in 
this PR might be unlikely to produce actual crashes I think it is worthwhile to 
have clean testing with AssertWXAtThreadSync because this will help prevent 
regressions that are more likely.

Beyond the trivial fixes of this PR I'm very much in favor of further 
enhancements as the aforementioned https://bugs.openjdk.org/browse/JDK-8307817.
My recommendation would be to remove as much non-constant data from the code 
cache as possible.

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007709981


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Thomas Stuefe
On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes  wrote:

>> Instead, could we tag code that needs one or the other, keep track of the 
>> current WX state in thread-local memory, and flip WX only when we know we 
>> need to? 

The first part we already do.

I wonder wheter we could - at least as workaround for if we missed a spot - do 
wx switching as a reaction to a SIBBUS related to WX violation in code cache. 
Switch state around, return from signal handler and retry operation.

(Edit: tested it, does not seem to work. I guess when the SIGBUS is triggered 
in the kernel thread WX state had already been processed somehow).

> > That's very odd. The example there doesn't even involve MAP_JIT memory, so 
> > what does it have to do with WX?
> 
> @theRealAph that is the mystery we hope will be resolved once we know the 
> nature of the underlying OS bug. Somehow switching to exec mode 
> fixes/works-around the issue. I can imagine a missing conditional to check if 
> the region is MAP_JIT.
> 
> > Changing WX at VM state transitions is a form of temporal coupling, a 
> > classic design smell that has caused problems for decades.
> 
> The original introducers of WXEnable made the decision that the VM should be 
> in WRITE mode unless it needs EXEC. That is the state we are presently trying 
> to achieve with this change. If that original design choice is wrong then ...
> 
> > Instead, could we tag code that needs one or the other, keep track of the 
> > current WX state in thread-local memory, and flip WX only when we know we 
> > need to?
> 
> And I've asked about this every time a missing WXEnable has had to be added. 
> We seem to be generically able to describe what kind of code needs which 
> mode, but we seem to struggle to pin it down. Though that is what 
> https://bugs.openjdk.org/browse/JDK-8307817 is looking at doing.
> 
> > That'd (by definition) reduce the number of transitions to the minimum if 
> > we were through.
> 
> Not necessarily. It may well remove some transitions from paths that don't 
> need it, but if you move the state change too low down the call chain you 
> could end up transitioning much more often in code that does need it e.g. if 
> a transitioning method is called in a loop. 

Not if you do the switching lazily. The first iteration would switch to the 
needed state; subsequent iterations would not do anything since the state 
already matches. Unless you interleave writes and execs, but then you would 
need the state changes anyway.

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007469410


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Matthias Baesken
On Tue, 19 Mar 2024 12:17:22 GMT, David Holmes  wrote:

>  IIUC no actual failures are observed other than those pertaining to 
> AssertWXAtThreadSync

We saw sporadic crashes in our jtreg (maybe also jck?) runs; only **_later_** 
we enabled AssertWXAtThreadSync for more checking.

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007159560


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread David Holmes
On Tue, 19 Mar 2024 09:30:52 GMT, Andrew Haley  wrote:

> That's very odd. The example there doesn't even involve MAP_JIT memory, so 
> what does it have to do with WX?

@theRealAph that is the mystery we hope will be resolved once we know the 
nature of the underlying OS bug. Somehow switching to exec mode 
fixes/works-around the issue. I can imagine a missing conditional to check if 
the region is MAP_JIT.

> Changing WX at VM state transitions is a form of temporal coupling, a classic 
> design smell that has caused problems for decades.

The original introducers of WXEnable made the decision that the VM should be in 
WRITE mode unless it needs EXEC. That is the state we are presently trying to 
achieve with this change. If that original design choice is wrong then ...

> Instead, could we tag code that needs one or the other, keep track of the 
> current WX state in thread-local memory, and flip WX only when we know we 
> need to?

And I've asked about this every time a missing WXEnable has had to be added. We 
seem to be generically able to  describe what kind of code needs which mode, 
but we seem to struggle to pin it down. Though that is what 
https://bugs.openjdk.org/browse/JDK-8307817 is looking at doing.

> That'd (by definition) reduce the number of transitions to the minimum if we 
> were through.

Not necessarily. It may well remove some transitions from paths that don't need 
it, but if you move the state change too low down the call chain you could end 
up transitioning much more often in code that does need it e.g. if a 
transitioning method is called in a loop. We need to optimise the actual call 
paths as well as identify specific methods.

But all this discussion suggests to me that this PR is not really worth 
pursuing at this time - IIUC no actual failures are observed other than those 
pertaining to `AssertWXAtThreadSync` and that flag will be gone if we do decide 
to be more fine-grained about WX management.

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2007029830


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-19 Thread Andrew Haley
On Tue, 19 Mar 2024 04:37:41 GMT, David Holmes  wrote:

> That said we have had a lot of people looking at the overall WX state 
> management logic in the past week or so due to 
> https://bugs.openjdk.org/browse/JDK-8327860. The workaround there requires us 
> to be in EXEC mode

That's very odd. The example there doesn't even involve MAP_JIT memory, so what 
does it have to do with WX?

> and there are a number of folk who are questioning why "we" chose WRITE mode 
> as the default with a switch to EXEC, instead of EXEC as the default with a 
> switch to WRITE.
> But whichever way that goes I think the VM state transitions are the places 
> to enforce that choice.

Hmm. Changing WX at VM state transitions is a form of temporal coupling, a 
classic design smell that has caused problems for decades. It's causing 
problems for us now. Instead, could we tag code that needs one or the other, 
keep track of the current WX state in thread-local memory, and flip WX only 
when we know we need to? That'd (by definition) reduce the number of 
transitions to the minimum if we were through.

-

PR Comment: https://git.openjdk.org/jdk/pull/18238#issuecomment-2006498201


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-18 Thread David Holmes
On Mon, 18 Mar 2024 13:14:41 GMT, Richard Reingruber  wrote:

>> This pr changes  `JfrJvmtiAgent::retransform_classes()` and 
>> `jfr_set_enabled` to switch to `WXWrite` before transitioning to the vm.
>> 
>> Testing:
>> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
>> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
>> 
>> More tests are pending.
>
> Richard Reingruber has updated the pull request with a new target base due to 
> a merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
>  - Set WXWrite at more locations
>  - Switch to WXWrite before entering the vm

I still feel that this should be fixed inside the `ThreadInVMfromNative` 
transition - the number of callsites that need it just reinforces that for me. 
Granted we then need to look at where we would now have redundant calls.

That said we have had a lot of people looking at the overall WX state 
management logic in the past week or so due to 
https://bugs.openjdk.org/browse/JDK-8327860. The workaround there requires us 
to be in EXEC mode and there are a number of folk who are questioning why "we" 
chose WRITE mode as the default with a switch to EXEC, instead of EXEC as the 
default with a switch to WRITE. But whichever way that goes I think the VM 
state transitions are the places to enforce that choice.

-

PR Review: https://git.openjdk.org/jdk/pull/18238#pullrequestreview-1945158370


Re: RFR: 8327990: [macosx-aarch64] Various tests fail with -XX:+AssertWXAtThreadSync [v3]

2024-03-18 Thread Richard Reingruber
> This pr changes  `JfrJvmtiAgent::retransform_classes()` and `jfr_set_enabled` 
> to switch to `WXWrite` before transitioning to the vm.
> 
> Testing:
> make test TEST=jdk/jfr/event/runtime/TestClassLoadEvent.java 
> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
> make test TEST=compiler/intrinsics/klass/CastNullCheckDroppingsTest.java 
> TEST_VM_OPTS=-XX:+AssertWXAtThreadSync
> 
> More tests are pending.

Richard Reingruber has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains three additional 
commits since the last revision:

 - Merge branch 'master' into 8327990_jfr_enters_vm_without_wxwrite
 - Set WXWrite at more locations
 - Switch to WXWrite before entering the vm

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18238/files
  - new: https://git.openjdk.org/jdk/pull/18238/files/9075f4be..1b4d7606

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=18238=02
 - incr: https://webrevs.openjdk.org/?repo=jdk=18238=01-02

  Stats: 293771 lines in 434 files changed: 6197 ins; 6221 del; 281353 mod
  Patch: https://git.openjdk.org/jdk/pull/18238.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18238/head:pull/18238

PR: https://git.openjdk.org/jdk/pull/18238