from:"Jaroslav Bachorik"

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]

2023-12-06 Thread Jaroslav Bachorik

On Mon, 4 Dec 2023 23:32:52 GMT, David Holmes  wrote:

> The skara tooling does not currently support our rules but it remains as 
> always that non-trivial Hotspot changes require two reviewers.

Thanks, I will keep this in mind. And I apologise for not following the 
process, though not intentionally.

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1842611183

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]

2023-12-04 Thread Jaroslav Bachorik

On Fri, 1 Dec 2023 05:15:15 GMT, Thomas Stuefe  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Restrict cleanup to obsolete methods only
>
> I won't be able to review this this week, too snowed in atm. I can take a 
> look next week. We can always just revert the change if needed.
> 
> Thinking about Skara, I think as long as we have this confusing mixture of 
> rules (hotspot wants 2 reviewers that are Reviewer/Committer, but some jdk 
> libs only want one, but then you need two for desktop I think otherwise Phil 
> gets angry) - we should hard-code the 2-reviewer rule into skara as default 
> since it affects the lion's share of all changes.

@tstuefe I got confused by the Skara tooling. I had a vague memory of some 
discussions going on about relaxing the requirement of 2 reviewers for some 
parts to the code base and I thought I was in a good shape seeing the Skara 
checkbox.
![Screenshot 2023-12-04 at 11 04 
00](https://github.com/openjdk/jdk/assets/738413/a5e363ee-a9e0-4121-9677-c059aa299dd4)

As for not having a review for the final version - I am not that restless. I 
specifically dismissed the previous review to avoid incidentally integrating 
based on a review of a version that was not actual. Then I asked @coleenp to 
re-do the review on the final bits 
(https://github.com/openjdk/jdk/pull/16662#issuecomment-1827432032)

@dholmes-ora  
>Okay so why does this happen and is it a reasonable thing to be happening? On 
>the surface it sounds wrong to deallocate anything associated with a live 
>classloader.

This is happening for previous versions of retransformed methods. As long as 
those methods are still on stack they are kept alive. But once they are not 
executing they are free to be destroyed. And this is where the problem was 
happening - the previous versions of methods were being destroyed but the 
associated jmethodIDs were not updated not to point to what became an invalid 
memory block.

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1838221097

Integrated: 8313816: Accessing jmethodID might lead to spurious crashes

2023-11-29 Thread Jaroslav Bachorik

On Tue, 14 Nov 2023 17:56:09 GMT, Jaroslav Bachorik  
wrote:

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> ~Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.~
> 
> Therefore, we need to perform `jmethodID` lookup for each method in an old 
> class version that is getting purged, and null out the pointer of that 
> `jmethodID` to break the link from `jmethodID` to the method instance that is 
> about to get deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

This pull request has now been integrated.

Changeset: cdd1a6e8
Author:Jaroslav Bachorik 
URL:   
https://git.openjdk.org/jdk/commit/cdd1a6e851bcaf4a25d4a405b8ee0b0d5b83a4a9
Stats: 206 lines in 8 files changed: 206 ins; 0 del; 0 mod

8313816: Accessing jmethodID might lead to spurious crashes

Reviewed-by: coleenp

-

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]

2023-11-29 Thread Jaroslav Bachorik

On Wed, 29 Nov 2023 11:49:31 GMT, Jaroslav Bachorik  
wrote:

>> Please, review this fix for a corner case handling of `jmethodID` values.
>> 
>> The issue is related to the interplay between `jmethodID` values and method 
>> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
>> instance. Once that method gets redefined, the `jmethodID` is updated to 
>> point to the last `Method` version. 
>> Unless the method is still on stack/running, in which case the original 
>> `jmethodID` will be redirected to the latest `Method` version and at the 
>> same time the 'previous' `Method` version will receive a new `jmethodID` 
>> pointing to that previous version.
>> 
>> If we happen to capture stacktrace via `GetStackTrace` or 
>> `GetAllStackTraces` JVMTI calls while this previous `Method` version is 
>> still on stack we will have the corresponding frame identified by a 
>> `jmethodID` pointing to that version.
>> However, sooner or later the 'previous' class version becomes eligible for 
>> cleanup at what time all contained `Method` instances. The cleanup process 
>> will not perform the `jmethodID` pointer maintenance and we will end up with 
>> pointers to deallocated memory. 
>> This is caused by the fact that the `jmethodID` lifecycle is bound to 
>> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
>> batch-updated when the class loader is being released and all its classes 
>> are getting unloaded. 
>> 
>> This means that we need to make sure that if a `Method` instance is being 
>> deallocate the associated `jmethodID` (if any) must not point to the 
>> deallocated instance once we are finished. Unfortunately, we can not just 
>> update the `jmethodID` values in bulk when purging an old class version - 
>> the per `InstanceKlass` jmethodID cache is present only for the main class 
>> version and contains `jmethodID` values for both the old and current method 
>> versions. 
>> 
>> ~Therefore we need to perform `jmethodID` lookup when we are about to 
>> deallocate a `Method` instance and clean up the pointer only if that 
>> `jmethodID` is pointing to the `Method` instance which is being deallocated.~
>> 
>> Therefore, we need to perform `jmethodID` lookup for each method in an old 
>> class version that is getting purged, and null out the pointer of that 
>> `jmethodID` to break the link from `jmethodID` to the method instance that 
>> is about to get deallocated.
>> 
>> _(For anyone interested, a much lengthier writeup is available in [my 
>> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_
>
> Jaroslav Bachorik has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Restrict cleanup to obsolete methods only

Thanks everyone involved in reviewing this PR! You were awesome and helped me 
drive the PR to better place than it started!

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1832386439

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]

2023-11-29 Thread Jaroslav Bachorik

On Wed, 29 Nov 2023 14:57:34 GMT, Jaroslav Bachorik  
wrote:

>> Can you confirm my observation above, that EMCP jmethodIDs are replaced?  I 
>> haven't looked at this code in a while.  Thanks.
>
> I am going to take a look.

Ok, I found it.
The reason for the jmethodID not being cleaned out is this assignment of a new 
jmethodID to obsolete methods - 
https://github.com/openjdk/jdk/blob/a2c5f1fc914ef5c28d044b75598f895cf6097138/src/hotspot/share/prims/jvmtiRedefineClasses.cpp#L3887

Since this is not done for EMCP (or more generic, non-obsolete) methods 
restricting the cleanup process to obsolete methods still sounds like the right 
thing to do.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1409436435

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]

2023-11-29 Thread Jaroslav Bachorik

On Wed, 29 Nov 2023 13:29:18 GMT, Coleen Phillimore  wrote:

>> Restricting to obsolete methods sounds like a good idea.
>
> Can you confirm my observation above, that EMCP jmethodIDs are replaced?  I 
> haven't looked at this code in a while.  Thanks.

I am going to take a look.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1409410538

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]

2023-11-29 Thread Jaroslav Bachorik

On Tue, 28 Nov 2023 21:30:16 GMT, Coleen Phillimore  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Remove unnecessary assert
>
> src/hotspot/share/oops/instanceKlass.cpp line 4236:
> 
>> 4234: if (method != nullptr) {
>> 4235:   method->clear_jmethod_id();
>> 4236: }
> 
> This loops through the methods in the InstanceKlass that was a previous 
> version klass, and clears the jmethodIDs for all the methods.  Will it clear 
> the jmethodIDs for the EMCP methods also, and should it?
> The jmethodID for EMCP methods are replaced with a the new version, so the 
> Method* in this list won't find a matching jmethodID.  Maybe this can be 
> restricted to obsolete methods?

Restricting to obsolete methods sounds like a good idea.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1409164812

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]

2023-11-29 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> ~Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.~
> 
> Therefore, we need to perform `jmethodID` lookup for each method in an old 
> class version that is getting purged, and null out the pointer of that 
> `jmethodID` to break the link from `jmethodID` to the method instance that is 
> about to get deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Restrict cleanup to obsolete methods only

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/81e31dae..aae367fb

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=10
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=09-10

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]

2023-11-27 Thread Jaroslav Bachorik

On Mon, 20 Nov 2023 22:09:47 GMT, Coleen Phillimore  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Remove unnecessary assert
>
> Good analysis for a very subtle bug.  I have a couple of comments, and maybe 
> the test can be simplified but approving the change.

@coleenp I am sorry for bothering you again - the implementation changed 
slightly based on @tstuefe comments so I would like get your   before 
proceeding with this version.
Thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1827432032

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]

2023-11-27 Thread Jaroslav Bachorik

On Sat, 25 Nov 2023 02:20:18 GMT, David Holmes  wrote:

>> This has gotten a lot more complicated. All I was suggesting was if this:
>> 
>> if (with_method_holders) {
>>   method->clear_jmethod_id();
>> }
>> 
>> could be changed to 
>> 
>> if (method->method_holder() == nullptr) {
>>   method->clear_jmethod_id();
>> }
>> 
>> Now I'm not at all sure what you are doing.
>
>> @dholmes-ora Unfortunately, I can not just do `method->method_holder() == 
>> nullptr` as `method_holder()` is expanding to 
>> `Method::constants()->pool_holder()` and `Method::constants()` is expanding 
>> to `Method::constMethod()->constants()`. This can cause SIGSEGV if either 
>> `Method::_constMethod` or `ConstMethod::_constants` is NULL.
> 
> I'm getting a strange sense of deja-vu here. This API is flawed if you cannot 
> even ask for the method holder without some intervening value causing a SEGV. 
> I've lost sight of the big picture here in terms of the lifecycle of the 
> Method we are querying, the methodID we may be clearing and the existence or 
> not of a method_holder().

@dholmes-ora I removed the assert. It is not necessary any more as the call to 
`Method::clear_jmethod_id()` was moved from the more generic 
`Method::deallocate_contents()` to `InstanceKlass::clear_jmethod_ids()` which 
is called if and only if the previous class versions are being purged. Because 
the issue with the method holder is related to `ClassParser` and not fully 
initialized classes only, the assert can safely be removed. The 
`Method::clear_jmethod_id()` will never be called in a context when the link to 
its method holder is broken.

> I've lost sight of the big picture here in terms of the lifecycle of the 
> Method we are querying, the methodID we may be clearing and the existence or 
> not of a method_holder().

I have updated the PR description to correspond to the actual state of affairs 
- the change is that instead of doing the `jmethodID` pointer maintenance for 
each `Method::deallocate_contents()` call it will be done only for methods 
contained by the old class versions that are getting purged. This has two 
benefits compared to the original proposal:
- we add the overhead of `jmethodID` lookup only to the problematic case of 
purging old class versions
- method holder is always valid when purging old class versions so we don't 
need to have checks/asserts for that

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1827409695
PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1827430093

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]

2023-11-27 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> ~Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.~
> 
> Therefore, we need to perform `jmethodID` lookup for each method in an old 
> class version that is getting purged, and null out the pointer of that 
> `jmethodID` to break the link from `jmethodID` to the method instance that is 
> about to get deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Remove unnecessary assert

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/554b3ae0..81e31dae

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=09
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=08-09

  Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v9]

2023-11-24 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Comment adjustments

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/46eff8d3..554b3ae0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=08
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=07-08

  Stats: 12 lines in 2 files changed: 7 ins; 4 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]

2023-11-24 Thread Jaroslav Bachorik

On Fri, 24 Nov 2023 06:49:14 GMT, Thomas Stuefe  wrote:

>I wondered about that but IIUC the pointers inside IK->_methods_jmethod_ids 
>may refer to jmethodID slots that have been reused for different methods?

Yes. The reason is that if a class has previous versions, these versions do not 
contain their own jmethodID cache but rather delegate to the 'main' version. So 
we have nothing to iterate over when a previous class version gets unloaded - 
also, to make things more interesting, the previous method versions are 
pointing to the main class version as their `method_holder`. Making it 
impossible to iterate over the full jmethodID cache in the main class version 
and nulling out a jmethodID only if the method it points to has `method_holder` 
pointing to the class being unloaded (it will always point to the main class 
version ...). 

If there is an easy way out of this without incurring O(n^2) complexity, please 
speak up. I was not able to figure out an alternative without eg. keeping an 
explicit method->jmethodID link but that would increase the Method instance 
footprint and it felt like a more invasive change for what I want to achieve.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1404260866

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]

2023-11-24 Thread Jaroslav Bachorik

On Fri, 24 Nov 2023 05:47:14 GMT, David Holmes  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Reinstate mistakenly deleted comment
>
> This has gotten a lot more complicated. All I was suggesting was if this:
> 
> if (with_method_holders) {
>   method->clear_jmethod_id();
> }
> 
> could be changed to 
> 
> if (method->method_holder() == nullptr) {
>   method->clear_jmethod_id();
> }
> 
> Now I'm not at all sure what you are doing.

@dholmes-ora Unfortunately, I can not just do `method->method_holder() == 
nullptr` as `method_holder()` is expanding to 
`Method::constants()->pool_holder()` and `Method::constants()` is expanding to 
`Method::constMethod()->constants()`. This can cause SIGSEGV if either 
`Method::_constMethod` or `ConstMethod::_constants` is NULL.

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1825532431

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Reinstate mistakenly deleted comment

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/4c6a57e8..46eff8d3

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=07
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=06-07

  Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]

2023-11-23 Thread Jaroslav Bachorik

On Thu, 23 Nov 2023 16:02:18 GMT, Thomas Stuefe  wrote:

> sorry for not reading the description carefully.

No worries. This is a rather convoluted issue. I don't mind being challenged - 
I am quite new to this part of the code and I don't want to accidentally break 
something ...

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1824691458

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]

2023-11-23 Thread Jaroslav Bachorik

On Thu, 23 Nov 2023 15:46:10 GMT, Jaroslav Bachorik  
wrote:

>> Sadly, this is not async-profiler specific. The same issue can be observed 
>> by JVMTI only code grabbing a stacktrace.
>> What do you mean exactly by 'conditional'? Introducing a new JVM flag or 
>> something else?
>
> Ok, I see now - I could do the jmethodID maintenance only from 
> `purge_previous_version_list()` call, leaving the proper metaspace 
> deallocation untouched, therefore not adding unnecessary overhead there.

I have modified the code to do jmethodID cleanup only when in 
`purge_previous_version_list()` - this should help with the added overhead.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403578138

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v7]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Do expensive cleanup only in `purge_previous_version_list()`

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/eaf2720e..4c6a57e8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=06
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=05-06

  Stats: 25 lines in 3 files changed: 17 ins; 6 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]

2023-11-23 Thread Jaroslav Bachorik

On Thu, 23 Nov 2023 15:19:43 GMT, Jaroslav Bachorik  
wrote:

>> src/hotspot/share/oops/instanceKlass.cpp line 541:
>> 
>>> 539:   assert (!method->on_stack(), "shouldn't be called with methods 
>>> on stack");
>>> 540:   // Do the pointer maintenance before releasing the metadata
>>> 541:   method->clear_jmethod_id();
>> 
>> IIUC this is O(n^2) over number of methods. Seeing that this is a workaround 
>> for a special case (an app that does a lot of retransforms *and* uses async 
>> profiler), I'd opt for making it conditional.
>
> Sadly, this is not async-profiler specific. The same issue can be observed by 
> JVMTI only code grabbing a stacktrace.
> What do you mean exactly by 'conditional'? Introducing a new JVM flag or 
> something else?

Ok, I see now - I could do the jmethodID maintenance only from 
`purge_previous_version_list()` call, leaving the proper metaspace deallocation 
untouched, therefore not adding unnecessary overhead there.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403544462

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]

2023-11-23 Thread Jaroslav Bachorik

On Thu, 23 Nov 2023 14:15:54 GMT, Thomas Stuefe  wrote:

>> @dholmes-ora 
>>> Can't we just check method->method_holder() for null rather than passing in 
>>> a parameter like this?
>> 
>> I have removed the argument and I am now performing the check for 
>> `method_holder() != nullptr` as recommended. The code is a bit simpler and 
>> the cost of resolving the method holder for each method is probably quite 
>> low so we should be ok here.
>
> @jbachorik You are aware that this fix only works for some uncommon corner 
> cases, right?
> 
> It only works if the Method is explicitly deallocated. The vast bulk of 
> Method aren't. Method, as a Metaspace object, is released in bulk when the 
> class is unloaded. The `::deallocate` path you fixed up - that eventually 
> ends up in `MetaspaceArena::deallocate()` - is a rare case that only gets 
> invoked if
> - a class cannot be loaded but parts of it have already been loaded into 
> Metaspace. 
> - a class gets transformed
> 
> In case the class gets unloaded via conventional means, your fix won't get 
> invoked (nor should it; releasing in bulk without having to care for 
> individual allocations is the point of Metaspace).

@tstuefe 

> In case the class gets unloaded via conventional means, your fix won't get 
> invoked (nor should it; releasing in bulk without having to care for 
> individual allocations is the point of Metaspace).

Yes. This is intended. The deallocation path via Metaspace is fine. It is just 
the code that is purging previous versions of a redefined class that has this 
bug.

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1824641628

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]

2023-11-23 Thread Jaroslav Bachorik

On Thu, 23 Nov 2023 14:20:48 GMT, Thomas Stuefe  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Add exhaustive check for method holder availability (1)
>
> src/hotspot/share/oops/instanceKlass.cpp line 541:
> 
>> 539:   assert (!method->on_stack(), "shouldn't be called with methods on 
>> stack");
>> 540:   // Do the pointer maintenance before releasing the metadata
>> 541:   method->clear_jmethod_id();
> 
> IIUC this is O(n^2) over number of methods. Seeing that this is a workaround 
> for a special case (an app that does a lot of retransforms *and* uses async 
> profiler), I'd opt for making it conditional.

Sadly, this is not async-profiler specific. The same issue can be observed by 
JVMTI only code grabbing a stacktrace.
What do you mean exactly by 'conditional'? Introducing a new JVM flag or 
something else?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403515842

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]

2023-11-23 Thread Jaroslav Bachorik

On Mon, 20 Nov 2023 22:08:49 GMT, Coleen Phillimore  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Add exhaustive check for method holder availability (1)
>
> src/hotspot/share/classfile/classFileParser.cpp line 5579:
> 
>> 5577: 
>> 5578:   if (_methods != nullptr) {
>> 5579: // Free methods - those methods are not fully wired and miss the 
>> method holder
> 
> How about saying: for methods whose InstanceKlass as method holder is not yet 
> created?

And is now back, but I applied your suggestion, @coleenp

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403513054

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Add exhaustive check for method holder availability (1)

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/00f644a0..eaf2720e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=05
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=04-05

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v5]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Add exhaustive check for method holder availability

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/9f61d8e0..00f644a0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=04
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=03-04

  Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v4]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Move assert

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/967813b6..9f61d8e0

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=03
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=02-03

  Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]

2023-11-23 Thread Jaroslav Bachorik

On Wed, 22 Nov 2023 01:21:15 GMT, David Holmes  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Rewerite the test to use RedefineClassHelper
>
> test/hotspot/jtreg/serviceability/jvmti/thread/GetStackTrace/GetStackTraceAndRetransformTest/GetStackTraceAndRetransformTest.java
>  line 2:
> 
>> 1: /*
>> 2:  * Copyright (c) 2023 Oracle and/or its affiliates. All rights reserved.
> 
> An Oracle copyright is not needed here if you wrote this test from scratch. 
> If it is present then we need a comma after the copyright year please.

Fixed

> test/hotspot/jtreg/serviceability/jvmti/thread/GetStackTrace/GetStackTraceAndRetransformTest/libGetStackTraceAndRetransformTest.cpp
>  line 2:
> 
>> 1: /*
>> 2:  * Copyright (c) 2023 Oracle and/or its affiliates. All rights reserved.
> 
> Ditto comment about Oracle copyright.

Fixed

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403054811
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403055037

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]

2023-11-23 Thread Jaroslav Bachorik

On Mon, 20 Nov 2023 22:08:49 GMT, Coleen Phillimore  wrote:

>> Jaroslav Bachorik has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Rewerite the test to use RedefineClassHelper
>
> src/hotspot/share/classfile/classFileParser.cpp line 5579:
> 
>> 5577: 
>> 5578:   if (_methods != nullptr) {
>> 5579: // Free methods - those methods are not fully wired and miss the 
>> method holder
> 
> How about saying: for methods whose InstanceKlass as method holder is not yet 
> created?

This comment went away with the change @dholmes-ora recommended

> test/hotspot/jtreg/serviceability/jvmti/thread/GetStackTrace/GetStackTraceAndRetransformTest/GetStackTraceAndRetransformTest.java
>  line 53:
> 
>> 51: import java.util.List;
>> 52: import java.util.concurrent.CyclicBarrier;
>> 53: import java.util.concurrent.locks.LockSupport;
> 
> Do you need all these imports?
> 
> There's a simple RedefineClassHelper class that does most of the work, but 
> maybe you need the explicit agent code to reproduce the crash?  See 
> test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/RedefineRunningMethodsWithBacktrace.java
>  as an example.

I did clean up the imports and switched to `RedefineClassHelper`. Thanks for 
pointing that out!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403064332
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403054332

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]

2023-11-23 Thread Jaroslav Bachorik

On Wed, 22 Nov 2023 01:27:25 GMT, David Holmes  wrote:

>> I see, holder is the right word and concept.  So the parameter means 
>> has_method_holder, in that the InstanceKlass has been fully parsed at the 
>> point of clearing the jmethodIDs.
>
> Can't we just check `method->method_holder()` for null rather than passing in 
> a parameter like this?

Yes. I changed the code to do that. We will attempt to resolve the method 
holder for each method instead of just using one single boolean argument but I 
believe the resolution should be fast enough not to matter in this context.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403068144

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]

2023-11-23 Thread Jaroslav Bachorik

On Thu, 23 Nov 2023 08:32:36 GMT, Jaroslav Bachorik  
wrote:

>> Please, review this fix for a corner case handling of `jmethodID` values.
>> 
>> The issue is related to the interplay between `jmethodID` values and method 
>> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
>> instance. Once that method gets redefined, the `jmethodID` is updated to 
>> point to the last `Method` version. 
>> Unless the method is still on stack/running, in which case the original 
>> `jmethodID` will be redirected to the latest `Method` version and at the 
>> same time the 'previous' `Method` version will receive a new `jmethodID` 
>> pointing to that previous version.
>> 
>> If we happen to capture stacktrace via `GetStackTrace` or 
>> `GetAllStackTraces` JVMTI calls while this previous `Method` version is 
>> still on stack we will have the corresponding frame identified by a 
>> `jmethodID` pointing to that version.
>> However, sooner or later the 'previous' class version becomes eligible for 
>> cleanup at what time all contained `Method` instances. The cleanup process 
>> will not perform the `jmethodID` pointer maintenance and we will end up with 
>> pointers to deallocated memory. 
>> This is caused by the fact that the `jmethodID` lifecycle is bound to 
>> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
>> batch-updated when the class loader is being released and all its classes 
>> are getting unloaded. 
>> 
>> This means that we need to make sure that if a `Method` instance is being 
>> deallocate the associated `jmethodID` (if any) must not point to the 
>> deallocated instance once we are finished. Unfortunately, we can not just 
>> update the `jmethodID` values in bulk when purging an old class version - 
>> the per `InstanceKlass` jmethodID cache is present only for the main class 
>> version and contains `jmethodID` values for both the old and current method 
>> versions. 
>> 
>> Therefore we need to perform `jmethodID` lookup when we are about to 
>> deallocate a `Method` instance and clean up the pointer only if that 
>> `jmethodID` is pointing to the `Method` instance which is being deallocated.
>> 
>> _(For anyone interested, a much lengthier writeup is available in [my 
>> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_
>
> Jaroslav Bachorik has updated the pull request incrementally with three 
> additional commits since the last revision:
> 
>  - Clean up imports
>  - Simplify Method::clear_jmethod_id()
>  - Use correct copyrights

@dholmes-ora 
> Can't we just check method->method_holder() for null rather than passing in a 
> parameter like this?

I have removed the argument and I am now performing the check for 
`method_holder() != nullptr` as recommended. The code is a bit simpler and the 
cost of resolving the method holder for each method is probably quite low so we 
should be ok here.

-

PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1823997622

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with one 
additional commit since the last revision:

  Rewerite the test to use RedefineClassHelper

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/f47e9499..967813b6

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=02
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=01-02

  Stats: 124 lines in 2 files changed: 15 ins; 93 del; 16 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]

2023-11-23 Thread Jaroslav Bachorik

> Please, review this fix for a corner case handling of `jmethodID` values.
> 
> The issue is related to the interplay between `jmethodID` values and method 
> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
> instance. Once that method gets redefined, the `jmethodID` is updated to 
> point to the last `Method` version. 
> Unless the method is still on stack/running, in which case the original 
> `jmethodID` will be redirected to the latest `Method` version and at the same 
> time the 'previous' `Method` version will receive a new `jmethodID` pointing 
> to that previous version.
> 
> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
> JVMTI calls while this previous `Method` version is still on stack we will 
> have the corresponding frame identified by a `jmethodID` pointing to that 
> version.
> However, sooner or later the 'previous' class version becomes eligible for 
> cleanup at what time all contained `Method` instances. The cleanup process 
> will not perform the `jmethodID` pointer maintenance and we will end up with 
> pointers to deallocated memory. 
> This is caused by the fact that the `jmethodID` lifecycle is bound to 
> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
> batch-updated when the class loader is being released and all its classes are 
> getting unloaded. 
> 
> This means that we need to make sure that if a `Method` instance is being 
> deallocate the associated `jmethodID` (if any) must not point to the 
> deallocated instance once we are finished. Unfortunately, we can not just 
> update the `jmethodID` values in bulk when purging an old class version - the 
> per `InstanceKlass` jmethodID cache is present only for the main class 
> version and contains `jmethodID` values for both the old and current method 
> versions. 
> 
> Therefore we need to perform `jmethodID` lookup when we are about to 
> deallocate a `Method` instance and clean up the pointer only if that 
> `jmethodID` is pointing to the `Method` instance which is being deallocated.
> 
> _(For anyone interested, a much lengthier writeup is available in [my 
> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

Jaroslav Bachorik has updated the pull request incrementally with three 
additional commits since the last revision:

 - Clean up imports
 - Simplify Method::clear_jmethod_id()
 - Use correct copyrights

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/16662/files
  - new: https://git.openjdk.org/jdk/pull/16662/files/b2ca5e69..f47e9499

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=16662=01
 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=00-01

  Stats: 25 lines in 6 files changed: 5 ins; 13 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes

2023-11-17 Thread Jaroslav Bachorik

On Fri, 17 Nov 2023 22:53:58 GMT, Coleen Phillimore  wrote:

>> Please, review this fix for a corner case handling of `jmethodID` values.
>> 
>> The issue is related to the interplay between `jmethodID` values and method 
>> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
>> instance. Once that method gets redefined, the `jmethodID` is updated to 
>> point to the last `Method` version. 
>> Unless the method is still on stack/running, in which case the original 
>> `jmethodID` will be redirected to the latest `Method` version and at the 
>> same time the 'previous' `Method` version will receive a new `jmethodID` 
>> pointing to that previous version.
>> 
>> If we happen to capture stacktrace via `GetStackTrace` or 
>> `GetAllStackTraces` JVMTI calls while this previous `Method` version is 
>> still on stack we will have the corresponding frame identified by a 
>> `jmethodID` pointing to that version.
>> However, sooner or later the 'previous' class version becomes eligible for 
>> cleanup at what time all contained `Method` instances. The cleanup process 
>> will not perform the `jmethodID` pointer maintenance and we will end up with 
>> pointers to deallocated memory. 
>> This is caused by the fact that the `jmethodID` lifecycle is bound to 
>> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
>> batch-updated when the class loader is being released and all its classes 
>> are getting unloaded. 
>> 
>> This means that we need to make sure that if a `Method` instance is being 
>> deallocate the associated `jmethodID` (if any) must not point to the 
>> deallocated instance once we are finished. Unfortunately, we can not just 
>> update the `jmethodID` values in bulk when purging an old class version - 
>> the per `InstanceKlass` jmethodID cache is present only for the main class 
>> version and contains `jmethodID` values for both the old and current method 
>> versions. 
>> 
>> Therefore we need to perform `jmethodID` lookup when we are about to 
>> deallocate a `Method` instance and clean up the pointer only if that 
>> `jmethodID` is pointing to the `Method` instance which is being deallocated.
>> 
>> _(For anyone interested, a much lengthier writeup is available in [my 
>> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_
>
> src/hotspot/share/oops/instanceKlass.cpp line 541:
> 
>> 539:   // The previous version will point to them so they're not totally 
>> dangling
>> 540:   assert (!method->on_stack(), "shouldn't be called with methods on 
>> stack");
>> 541:   // Do the pointer maintenance before releasing the metadata, but 
>> not for incomplete methods
> 
> I'm confused by what you mean by method holder, which I think of as 
> methodHandle.  Or InstanceKlass is the holder of the methods.  Maybe this 
> should be more explicit that it's talking about clearing any associated 
> jmethodIDs.

The method holder is an `InstanceKlass` object which can be retrieved as 
`method->method_holder()` (I apologize if I am using not completely correct 
terms - this is what I grokked from the sources). And incomplete methods 
created by the `ClassParser` from the class data stream will not have the link 
to that `InstanceKlass` set up if the `ClassParser` is already having its 
`_klass` field set to a non-null value.

If we are talking about clearing any jmetbodIDs associated with an 
`InstanceKlass` instance it is not really possible for old method versions 
because only the current `InstanceKlass` version has the jmethodID cache 
associated with it and it contains jmethodIDs pointing to bot the old and 
current methods.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1398009493

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes

2023-11-16 Thread Jaroslav Bachorik

On Wed, 15 Nov 2023 18:47:19 GMT, Jaroslav Bachorik  
wrote:

>> src/hotspot/share/oops/instanceKlass.cpp line 531:
>> 
>>> 529: 
>>> 530: void InstanceKlass::deallocate_methods(ClassLoaderData* loader_data,
>>> 531:Array* methods, 
>>> InstanceKlass* klass) {
>> 
>> An explicit boolean parameter would be cleaner/clearer.
>
> I just removed the `klass` argument. It is not really used anyway.

I actually ended up with a boolean parameter here.

>> src/hotspot/share/oops/method.hpp line 730:
>> 
>>> 728:   // so handles are not used to avoid deadlock.
>>> 729:   jmethodID find_jmethod_id_or_null()   {
>>> 730: return method_holder() != nullptr ? 
>>> method_holder()->jmethod_id_or_null(this) : nullptr;
>> 
>> If `method_holder()` is null at this point what does that mean for the 
>> lifecycle of the Method?
>
> Please, ignore this part of code for the time being. I had a crash in CI 
> which was pointing vaguely to this code - unfortunately, the hs_err.log files 
> are not preserved in the test archives and I am not able to reproduce the 
> failure locally. I need to debug the crash and make sure I understand the 
> root cause.

_Update:_ I was able to get to the bottom of the methods not having method 
holder associated with them.

The `ClassFileParser` does not finalize initialization of the `InstanceKlass` 
it has created if `_klass != nullptr` 
(https://github.com/openjdk/jdk/blob/9727f4bdddc071e6f59806087339f345405ab004/src/hotspot/share/classfile/classFileParser.cpp#L5161).
 This also means, that the `Method` instances are not wired to their method 
holders via 'constant method'->'constant pool'->'pool holder' chain. However, 
they need to be deallocated and as such I really need a distinguishing argument 
for `InstanceKlass::deallocate_methods` call such that I don't attempt to 
resolve `jmethodid` values in that case.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1396296501
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1396175846

Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes

2023-11-16 Thread Jaroslav Bachorik

On Wed, 15 Nov 2023 05:35:49 GMT, David Holmes  wrote:

>> Please, review this fix for a corner case handling of `jmethodID` values.
>> 
>> The issue is related to the interplay between `jmethodID` values and method 
>> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
>> instance. Once that method gets redefined, the `jmethodID` is updated to 
>> point to the last `Method` version. 
>> Unless the method is still on stack/running, in which case the original 
>> `jmethodID` will be redirected to the latest `Method` version and at the 
>> same time the 'previous' `Method` version will receive a new `jmethodID` 
>> pointing to that previous version.
>> 
>> If we happen to capture stacktrace via `GetStackTrace` or 
>> `GetAllStackTraces` JVMTI calls while this previous `Method` version is 
>> still on stack we will have the corresponding frame identified by a 
>> `jmethodID` pointing to that version.
>> However, sooner or later the 'previous' class version becomes eligible for 
>> cleanup at what time all contained `Method` instances. The cleanup process 
>> will not perform the `jmethodID` pointer maintenance and we will end up with 
>> pointers to deallocated memory. 
>> This is caused by the fact that the `jmethodID` lifecycle is bound to 
>> `ClassLoaderData` instance and all relevant `jmethodID`s will get 
>> batch-updated when the class loader is being released and all its classes 
>> are getting unloaded. 
>> 
>> This means that we need to make sure that if a `Method` instance is being 
>> deallocate the associated `jmethodID` (if any) must not point to the 
>> deallocated instance once we are finished. Unfortunately, we can not just 
>> update the `jmethodID` values in bulk when purging an old class version - 
>> the per `InstanceKlass` jmethodID cache is present only for the main class 
>> version and contains `jmethodID` values for both the old and current method 
>> versions. 
>> 
>> Therefore we need to perform `jmethodID` lookup when we are about to 
>> deallocate a `Method` instance and clean up the pointer only if that 
>> `jmethodID` is pointing to the `Method` instance which is being deallocated.
>> 
>> _(For anyone interested, a much lengthier writeup is available in [my 
>> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_
>
> src/hotspot/share/oops/instanceKlass.cpp line 531:
> 
>> 529: 
>> 530: void InstanceKlass::deallocate_methods(ClassLoaderData* loader_data,
>> 531:Array* methods, 
>> InstanceKlass* klass) {
> 
> An explicit boolean parameter would be cleaner/clearer.

I just removed the `klass` argument. It is not really used anyway.

> src/hotspot/share/oops/instanceKlass.cpp line 542:
> 
>> 540:   if (klass) {
>> 541: jmethodID jmid = method->find_jmethod_id_or_null();
>> 542: // Do the pointer maintenance before releasing the metadata, 
>> just in case
> 
> I assume there should be a period after 'case`. But just in case of what?

The code was moved to`method.cpp` and this particular comment line became 
obsolete

> src/hotspot/share/oops/instanceKlass.cpp line 549:
> 
>> 547: if (jmid != nullptr && *((Method**)jmid) == method) {
>> 548:   *((Method**)jmid) = nullptr;
>> 549: }
> 
> This should be abstracted behind a utility function e.g. 
> `method->clear_jmethod_id()`.

Done

> src/hotspot/share/oops/method.cpp line 2277:
> 
>> 2275:   }
>> 2276: }
>> 2277: 
> 
> Can this race with redefinition?

The cleanup of previous versions is executed in VM_Operation at a safepoint - 
therefore we should be safe against races with class redefinitions. 
I am adding an assert to `clear_jmethod_id()` to check for being at a safepoint.

> src/hotspot/share/oops/method.hpp line 730:
> 
>> 728:   // so handles are not used to avoid deadlock.
>> 729:   jmethodID find_jmethod_id_or_null()   {
>> 730: return method_holder() != nullptr ? 
>> method_holder()->jmethod_id_or_null(this) : nullptr;
> 
> If `method_holder()` is null at this point what does that mean for the 
> lifecycle of the Method?

Please, ignore this part of code for the time being. I had a crash in CI which 
was pointing vaguely to this code - unfortunately, the hs_err.log files are not 
preserved in the test archives and I am not able to reproduce the failure 
locally. I need to debug the crash and make sure I understand the root cause.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394642247
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394647034
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394647173
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1395100685
PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1395102362

RFR: 8313816: Accessing jmethodID might lead to spurious crashes

2023-11-16 Thread Jaroslav Bachorik

Please, review this fix for a corner case handling of `jmethodID` values.

The issue is related to the interplay between `jmethodID` values and method 
redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` 
instance. Once that method gets redefined, the `jmethodID` is updated to point 
to the last `Method` version. 
Unless the method is still on stack/running, in which case the original 
`jmethodID` will be redirected to the latest `Method` version and at the same 
time the 'previous' `Method` version will receive a new `jmethodID` pointing to 
that previous version.

If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` 
JVMTI calls while this previous `Method` version is still on stack we will have 
the corresponding frame identified by a `jmethodID` pointing to that version.
However, sooner or later the 'previous' class version becomes eligible for 
cleanup at what time all contained `Method` instances. The cleanup process will 
not perform the `jmethodID` pointer maintenance and we will end up with 
pointers to deallocated memory. 
This is caused by the fact that the `jmethodID` lifecycle is bound to 
`ClassLoaderData` instance and all relevant `jmethodID`s will get batch-updated 
when the class loader is being released and all its classes are getting 
unloaded. 

This means that we need to make sure that if a `Method` instance is being 
deallocate the associated `jmethodID` (if any) must not point to the 
deallocated instance once we are finished. Unfortunately, we can not just 
update the `jmethodID` values in bulk when purging an old class version - the 
per `InstanceKlass` jmethodID cache is present only for the main class version 
and contains `jmethodID` values for both the old and current method versions. 

Therefore we need to perform `jmethodID` lookup when we are about to deallocate 
a `Method` instance and clean up the pointer only if that `jmethodID` is 
pointing to the `Method` instance which is being deallocated.

_(For anyone interested, a much lengthier writeup is available in [my 
blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_

-

Commit messages:
 - 8313816: Accessing jmethodID might lead to spurious crashes

Changes: https://git.openjdk.org/jdk/pull/16662/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk=16662=00
  Issue: https://bugs.openjdk.org/browse/JDK-8313816
  Stats: 282 lines in 9 files changed: 278 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/16662.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662

PR: https://git.openjdk.org/jdk/pull/16662

Re: RFR: 8304725: AsyncGetCallTrace can cause SIGBUS on M1 [v3]

2023-03-24 Thread Jaroslav Bachorik

On Thu, 23 Mar 2023 12:45:27 GMT, Johannes Bechberger  
wrote:

> Yes, but while JFR interrupts threads too, its sampler runs in its own 
> thread, so the async-safety of the interrupted code should not matter, or?

And this sampling approach is quite subpar when compared with the signal based 
profiling - both in terms of bias and precision. Meaning that if/when JFR would 
implement a state-of-the-art CPU profiler it will have to deal with the same 
issues we are seeing here.

-

PR Comment: https://git.openjdk.org/jdk/pull/13144#issuecomment-1482370219

Re: [External] : Re: Disallowing the dynamic loading of agents by default

2023-03-20 Thread Jaroslav Bachorik

Hi,

On Mon, Mar 20, 2023 at 11:11 AM Ron Pressler 
wrote:

> Hi.
>
> The majority of serviceability tools don’t require dynamically loading an
> agent, and the majority of applications never load an agent dynamically.
>

The majority of the JDK built-in tools, I would say. What about eg. the JMC
agent?

>
> True, there are some tools that will be affected, which is why the
> decision was to introduce the flag in JDK 9 and to announce this change,
> but change the default in a later version to give tools ample time to
> prepare their users. The rationale for this change then hasn’t changed, but
> will be reiterated in a JEP (we just wanted to announce this ahead of the
> JEP to give tool authors another reminder more than six months ahead of JDK
> 21). The only change between then and now is that even fewer use cases
> require dynamically loaded agents, and so the impact is even smaller.
>

As a maintainer of one of such tools I can confidently say that this change
will either kill the tool as the ease of use will be gone or the workaround
(eg. using JAVA_TOOL_OPTIONS) will completely defeat the purpose of this
change. Having to put a flag when starting the JVM to allow dynamic loading
of agents sounds a bit nonsensical to me - it would be much easier to
directly add the agent to the JVM startup and then implement a lightweight
control protocol over socket/shared memory to enabled/disable the agent
features dynamically.

>
> It is also true that, when starting an application you don’t know that you
> *will* need to load an agent, but in most situations you know that you
> might. E.g. processes that are too critical to bring down even for deep
> maintenance (although not many of these are written in modern version of
> Java anyone) or canary services that are under trial. The relatively few
> sophisticated users who know how to write ad-hoc agents can even opt to
> enable dynamic agent loading on all their servers; these users are better
> equipped to can weigh the risks and tradeoffs involved.
>

Wouldn't having this enabled system-wide actually defeat the purpose of
having this flag? Considering that the dynamic attach can be performed only
on the same host under the same user as the target process there seems to
be a very small chance of loading agents accidentally. In the end people
would set up their systems to enabled dynamic agent loading via eg.
JAVA_TOOL_OPTIONS and we will be in the same place as before, with the
additional hurdle of setting everything up.

> Finally, some tools that require a dynamically loaded JVM TI agents, such
> as profilers that profile native code, are so tied to the VM's internals
> that the best place for them is in the JDK. If anything, the bigger problem
> is not that profilers are used too much in production, but too little,
> including less advanced ones that don’t require an agent. There is plenty
> of time to enhance the JDK’s built-in profiling capabilities ahead of
> demand.
>

I think this is an overly optimistic view. It is *much more* difficult to
enhance the JDK's built-in profiling capabilities than do the same in an
external profiling agent.

Overall, I don't seem to understand the anticipated attack vectors this
change is supposed to prevent. AFAIK, in order to do the dynamic agent load
one needs to have full access to the target process. That means that there
are more convenient and straightforward ways to do anything nefarious than
loading a JVMTI agent. Am I missing some other usages where the JVMTI agent
would actually give access to something which would be otherwise
inaccessible considering that the attacher and attachee must be on the same
host and under the same user?

Cheers,

-JB-

>
> — Ron
>
> On 20 Mar 2023, at 01:21, Andrei Pangin  wrote:
>
> Hi all,
>
> Serviceability has been one of the biggest Java strengths, but the
> proposed change is going to have a large negative impact on it.
>
> Disallowing dynamic agents by default means it will no longer be possible
> to attach a profiler to a running app in runtime. JFR cannot close this gap
> due to lack of capabilities modern Java profilers have (that's a separate
> topic though).
>
> When an issue happens with a live app, it's already too late to add a
> command line argument. Furthermore, it may not be even feasible to add an
> agent at startup in containerized applications. Starting profiler on demand
> from the host OS or from a sidecar is the only viable solution in these
> cases.
>
> Next, it's hard to predict beforehand what tools exactly might be useful
> for troubleshooting: e.g., one tool may be better for finding memory leaks,
> a different one for analyzing CPU performance. Adding all possible tools at
> startup does not seem a reasonable approach, especially when tools may
> conflict with each other.
>
> The most important aspect of dynamic agents is the possibility to make a
> special tool just in time for solving a particular problem. A typical
> example is to get a value of

37 matches

Mail list logo