Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]
On Mon, 4 Dec 2023 23:32:52 GMT, David Holmes wrote: > The skara tooling does not currently support our rules but it remains as > always that non-trivial Hotspot changes require two reviewers. Thanks, I will keep this in mind. And I apologise for not following the process, though not intentionally. - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1842611183
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]
On Fri, 1 Dec 2023 05:15:15 GMT, Thomas Stuefe wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Restrict cleanup to obsolete methods only > > I won't be able to review this this week, too snowed in atm. I can take a > look next week. We can always just revert the change if needed. > > Thinking about Skara, I think as long as we have this confusing mixture of > rules (hotspot wants 2 reviewers that are Reviewer/Committer, but some jdk > libs only want one, but then you need two for desktop I think otherwise Phil > gets angry) - we should hard-code the 2-reviewer rule into skara as default > since it affects the lion's share of all changes. @tstuefe I got confused by the Skara tooling. I had a vague memory of some discussions going on about relaxing the requirement of 2 reviewers for some parts to the code base and I thought I was in a good shape seeing the Skara checkbox. ![Screenshot 2023-12-04 at 11 04 00](https://github.com/openjdk/jdk/assets/738413/a5e363ee-a9e0-4121-9677-c059aa299dd4) As for not having a review for the final version - I am not that restless. I specifically dismissed the previous review to avoid incidentally integrating based on a review of a version that was not actual. Then I asked @coleenp to re-do the review on the final bits (https://github.com/openjdk/jdk/pull/16662#issuecomment-1827432032) @dholmes-ora >Okay so why does this happen and is it a reasonable thing to be happening? On >the surface it sounds wrong to deallocate anything associated with a live >classloader. This is happening for previous versions of retransformed methods. As long as those methods are still on stack they are kept alive. But once they are not executing they are free to be destroyed. And this is where the problem was happening - the previous versions of methods were being destroyed but the associated jmethodIDs were not updated not to point to what became an invalid memory block. - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1838221097
Integrated: 8313816: Accessing jmethodID might lead to spurious crashes
On Tue, 14 Nov 2023 17:56:09 GMT, Jaroslav Bachorik wrote: > Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > ~Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated.~ > > Therefore, we need to perform `jmethodID` lookup for each method in an old > class version that is getting purged, and null out the pointer of that > `jmethodID` to break the link from `jmethodID` to the method instance that is > about to get deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ This pull request has now been integrated. Changeset: cdd1a6e8 Author:Jaroslav Bachorik URL: https://git.openjdk.org/jdk/commit/cdd1a6e851bcaf4a25d4a405b8ee0b0d5b83a4a9 Stats: 206 lines in 8 files changed: 206 ins; 0 del; 0 mod 8313816: Accessing jmethodID might lead to spurious crashes Reviewed-by: coleenp - PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]
On Wed, 29 Nov 2023 11:49:31 GMT, Jaroslav Bachorik wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method >> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` >> instance. Once that method gets redefined, the `jmethodID` is updated to >> point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original >> `jmethodID` will be redirected to the latest `Method` version and at the >> same time the 'previous' `Method` version will receive a new `jmethodID` >> pointing to that previous version. >> >> If we happen to capture stacktrace via `GetStackTrace` or >> `GetAllStackTraces` JVMTI calls while this previous `Method` version is >> still on stack we will have the corresponding frame identified by a >> `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for >> cleanup at what time all contained `Method` instances. The cleanup process >> will not perform the `jmethodID` pointer maintenance and we will end up with >> pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to >> `ClassLoaderData` instance and all relevant `jmethodID`s will get >> batch-updated when the class loader is being released and all its classes >> are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being >> deallocate the associated `jmethodID` (if any) must not point to the >> deallocated instance once we are finished. Unfortunately, we can not just >> update the `jmethodID` values in bulk when purging an old class version - >> the per `InstanceKlass` jmethodID cache is present only for the main class >> version and contains `jmethodID` values for both the old and current method >> versions. >> >> ~Therefore we need to perform `jmethodID` lookup when we are about to >> deallocate a `Method` instance and clean up the pointer only if that >> `jmethodID` is pointing to the `Method` instance which is being deallocated.~ >> >> Therefore, we need to perform `jmethodID` lookup for each method in an old >> class version that is getting purged, and null out the pointer of that >> `jmethodID` to break the link from `jmethodID` to the method instance that >> is about to get deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my >> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > Jaroslav Bachorik has updated the pull request incrementally with one > additional commit since the last revision: > > Restrict cleanup to obsolete methods only Thanks everyone involved in reviewing this PR! You were awesome and helped me drive the PR to better place than it started! - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1832386439
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]
On Wed, 29 Nov 2023 14:57:34 GMT, Jaroslav Bachorik wrote: >> Can you confirm my observation above, that EMCP jmethodIDs are replaced? I >> haven't looked at this code in a while. Thanks. > > I am going to take a look. Ok, I found it. The reason for the jmethodID not being cleaned out is this assignment of a new jmethodID to obsolete methods - https://github.com/openjdk/jdk/blob/a2c5f1fc914ef5c28d044b75598f895cf6097138/src/hotspot/share/prims/jvmtiRedefineClasses.cpp#L3887 Since this is not done for EMCP (or more generic, non-obsolete) methods restricting the cleanup process to obsolete methods still sounds like the right thing to do. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1409436435
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]
On Wed, 29 Nov 2023 13:29:18 GMT, Coleen Phillimore wrote: >> Restricting to obsolete methods sounds like a good idea. > > Can you confirm my observation above, that EMCP jmethodIDs are replaced? I > haven't looked at this code in a while. Thanks. I am going to take a look. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1409410538
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]
On Tue, 28 Nov 2023 21:30:16 GMT, Coleen Phillimore wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Remove unnecessary assert > > src/hotspot/share/oops/instanceKlass.cpp line 4236: > >> 4234: if (method != nullptr) { >> 4235: method->clear_jmethod_id(); >> 4236: } > > This loops through the methods in the InstanceKlass that was a previous > version klass, and clears the jmethodIDs for all the methods. Will it clear > the jmethodIDs for the EMCP methods also, and should it? > The jmethodID for EMCP methods are replaced with a the new version, so the > Method* in this list won't find a matching jmethodID. Maybe this can be > restricted to obsolete methods? Restricting to obsolete methods sounds like a good idea. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1409164812
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > ~Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated.~ > > Therefore, we need to perform `jmethodID` lookup for each method in an old > class version that is getting purged, and null out the pointer of that > `jmethodID` to break the link from `jmethodID` to the method instance that is > about to get deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Restrict cleanup to obsolete methods only - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/81e31dae..aae367fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=10 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]
On Mon, 20 Nov 2023 22:09:47 GMT, Coleen Phillimore wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Remove unnecessary assert > > Good analysis for a very subtle bug. I have a couple of comments, and maybe > the test can be simplified but approving the change. @coleenp I am sorry for bothering you again - the implementation changed slightly based on @tstuefe comments so I would like get your before proceeding with this version. Thanks! - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1827432032
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]
On Sat, 25 Nov 2023 02:20:18 GMT, David Holmes wrote: >> This has gotten a lot more complicated. All I was suggesting was if this: >> >> if (with_method_holders) { >> method->clear_jmethod_id(); >> } >> >> could be changed to >> >> if (method->method_holder() == nullptr) { >> method->clear_jmethod_id(); >> } >> >> Now I'm not at all sure what you are doing. > >> @dholmes-ora Unfortunately, I can not just do `method->method_holder() == >> nullptr` as `method_holder()` is expanding to >> `Method::constants()->pool_holder()` and `Method::constants()` is expanding >> to `Method::constMethod()->constants()`. This can cause SIGSEGV if either >> `Method::_constMethod` or `ConstMethod::_constants` is NULL. > > I'm getting a strange sense of deja-vu here. This API is flawed if you cannot > even ask for the method holder without some intervening value causing a SEGV. > I've lost sight of the big picture here in terms of the lifecycle of the > Method we are querying, the methodID we may be clearing and the existence or > not of a method_holder(). @dholmes-ora I removed the assert. It is not necessary any more as the call to `Method::clear_jmethod_id()` was moved from the more generic `Method::deallocate_contents()` to `InstanceKlass::clear_jmethod_ids()` which is called if and only if the previous class versions are being purged. Because the issue with the method holder is related to `ClassParser` and not fully initialized classes only, the assert can safely be removed. The `Method::clear_jmethod_id()` will never be called in a context when the link to its method holder is broken. > I've lost sight of the big picture here in terms of the lifecycle of the > Method we are querying, the methodID we may be clearing and the existence or > not of a method_holder(). I have updated the PR description to correspond to the actual state of affairs - the change is that instead of doing the `jmethodID` pointer maintenance for each `Method::deallocate_contents()` call it will be done only for methods contained by the old class versions that are getting purged. This has two benefits compared to the original proposal: - we add the overhead of `jmethodID` lookup only to the problematic case of purging old class versions - method holder is always valid when purging old class versions so we don't need to have checks/asserts for that - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1827409695 PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1827430093
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v10]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > ~Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated.~ > > Therefore, we need to perform `jmethodID` lookup for each method in an old > class version that is getting purged, and null out the pointer of that > `jmethodID` to break the link from `jmethodID` to the method instance that is > about to get deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Remove unnecessary assert - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/554b3ae0..81e31dae Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=09 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=08-09 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v9]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Comment adjustments - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/46eff8d3..554b3ae0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=08 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=07-08 Stats: 12 lines in 2 files changed: 7 ins; 4 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]
On Fri, 24 Nov 2023 06:49:14 GMT, Thomas Stuefe wrote: >I wondered about that but IIUC the pointers inside IK->_methods_jmethod_ids >may refer to jmethodID slots that have been reused for different methods? Yes. The reason is that if a class has previous versions, these versions do not contain their own jmethodID cache but rather delegate to the 'main' version. So we have nothing to iterate over when a previous class version gets unloaded - also, to make things more interesting, the previous method versions are pointing to the main class version as their `method_holder`. Making it impossible to iterate over the full jmethodID cache in the main class version and nulling out a jmethodID only if the method it points to has `method_holder` pointing to the class being unloaded (it will always point to the main class version ...). If there is an easy way out of this without incurring O(n^2) complexity, please speak up. I was not able to figure out an alternative without eg. keeping an explicit method->jmethodID link but that would increase the Method instance footprint and it felt like a more invasive change for what I want to achieve. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1404260866
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]
On Fri, 24 Nov 2023 05:47:14 GMT, David Holmes wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Reinstate mistakenly deleted comment > > This has gotten a lot more complicated. All I was suggesting was if this: > > if (with_method_holders) { > method->clear_jmethod_id(); > } > > could be changed to > > if (method->method_holder() == nullptr) { > method->clear_jmethod_id(); > } > > Now I'm not at all sure what you are doing. @dholmes-ora Unfortunately, I can not just do `method->method_holder() == nullptr` as `method_holder()` is expanding to `Method::constants()->pool_holder()` and `Method::constants()` is expanding to `Method::constMethod()->constants()`. This can cause SIGSEGV if either `Method::_constMethod` or `ConstMethod::_constants` is NULL. - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1825532431
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v8]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Reinstate mistakenly deleted comment - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/4c6a57e8..46eff8d3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=07 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=06-07 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]
On Thu, 23 Nov 2023 16:02:18 GMT, Thomas Stuefe wrote: > sorry for not reading the description carefully. No worries. This is a rather convoluted issue. I don't mind being challenged - I am quite new to this part of the code and I don't want to accidentally break something ... - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1824691458
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]
On Thu, 23 Nov 2023 15:46:10 GMT, Jaroslav Bachorik wrote: >> Sadly, this is not async-profiler specific. The same issue can be observed >> by JVMTI only code grabbing a stacktrace. >> What do you mean exactly by 'conditional'? Introducing a new JVM flag or >> something else? > > Ok, I see now - I could do the jmethodID maintenance only from > `purge_previous_version_list()` call, leaving the proper metaspace > deallocation untouched, therefore not adding unnecessary overhead there. I have modified the code to do jmethodID cleanup only when in `purge_previous_version_list()` - this should help with the added overhead. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403578138
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v7]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Do expensive cleanup only in `purge_previous_version_list()` - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/eaf2720e..4c6a57e8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=06 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=05-06 Stats: 25 lines in 3 files changed: 17 ins; 6 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]
On Thu, 23 Nov 2023 15:19:43 GMT, Jaroslav Bachorik wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 541: >> >>> 539: assert (!method->on_stack(), "shouldn't be called with methods >>> on stack"); >>> 540: // Do the pointer maintenance before releasing the metadata >>> 541: method->clear_jmethod_id(); >> >> IIUC this is O(n^2) over number of methods. Seeing that this is a workaround >> for a special case (an app that does a lot of retransforms *and* uses async >> profiler), I'd opt for making it conditional. > > Sadly, this is not async-profiler specific. The same issue can be observed by > JVMTI only code grabbing a stacktrace. > What do you mean exactly by 'conditional'? Introducing a new JVM flag or > something else? Ok, I see now - I could do the jmethodID maintenance only from `purge_previous_version_list()` call, leaving the proper metaspace deallocation untouched, therefore not adding unnecessary overhead there. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403544462
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]
On Thu, 23 Nov 2023 14:15:54 GMT, Thomas Stuefe wrote: >> @dholmes-ora >>> Can't we just check method->method_holder() for null rather than passing in >>> a parameter like this? >> >> I have removed the argument and I am now performing the check for >> `method_holder() != nullptr` as recommended. The code is a bit simpler and >> the cost of resolving the method holder for each method is probably quite >> low so we should be ok here. > > @jbachorik You are aware that this fix only works for some uncommon corner > cases, right? > > It only works if the Method is explicitly deallocated. The vast bulk of > Method aren't. Method, as a Metaspace object, is released in bulk when the > class is unloaded. The `::deallocate` path you fixed up - that eventually > ends up in `MetaspaceArena::deallocate()` - is a rare case that only gets > invoked if > - a class cannot be loaded but parts of it have already been loaded into > Metaspace. > - a class gets transformed > > In case the class gets unloaded via conventional means, your fix won't get > invoked (nor should it; releasing in bulk without having to care for > individual allocations is the point of Metaspace). @tstuefe > In case the class gets unloaded via conventional means, your fix won't get > invoked (nor should it; releasing in bulk without having to care for > individual allocations is the point of Metaspace). Yes. This is intended. The deallocation path via Metaspace is fine. It is just the code that is purging previous versions of a redefined class that has this bug. - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1824641628
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]
On Thu, 23 Nov 2023 14:20:48 GMT, Thomas Stuefe wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add exhaustive check for method holder availability (1) > > src/hotspot/share/oops/instanceKlass.cpp line 541: > >> 539: assert (!method->on_stack(), "shouldn't be called with methods on >> stack"); >> 540: // Do the pointer maintenance before releasing the metadata >> 541: method->clear_jmethod_id(); > > IIUC this is O(n^2) over number of methods. Seeing that this is a workaround > for a special case (an app that does a lot of retransforms *and* uses async > profiler), I'd opt for making it conditional. Sadly, this is not async-profiler specific. The same issue can be observed by JVMTI only code grabbing a stacktrace. What do you mean exactly by 'conditional'? Introducing a new JVM flag or something else? - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403515842
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]
On Mon, 20 Nov 2023 22:08:49 GMT, Coleen Phillimore wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Add exhaustive check for method holder availability (1) > > src/hotspot/share/classfile/classFileParser.cpp line 5579: > >> 5577: >> 5578: if (_methods != nullptr) { >> 5579: // Free methods - those methods are not fully wired and miss the >> method holder > > How about saying: for methods whose InstanceKlass as method holder is not yet > created? And is now back, but I applied your suggestion, @coleenp - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403513054
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v6]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Add exhaustive check for method holder availability (1) - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/00f644a0..eaf2720e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=05 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=04-05 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v5]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Add exhaustive check for method holder availability - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/9f61d8e0..00f644a0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=04 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=03-04 Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v4]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Move assert - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/967813b6..9f61d8e0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=03 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=02-03 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]
On Wed, 22 Nov 2023 01:21:15 GMT, David Holmes wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Rewerite the test to use RedefineClassHelper > > test/hotspot/jtreg/serviceability/jvmti/thread/GetStackTrace/GetStackTraceAndRetransformTest/GetStackTraceAndRetransformTest.java > line 2: > >> 1: /* >> 2: * Copyright (c) 2023 Oracle and/or its affiliates. All rights reserved. > > An Oracle copyright is not needed here if you wrote this test from scratch. > If it is present then we need a comma after the copyright year please. Fixed > test/hotspot/jtreg/serviceability/jvmti/thread/GetStackTrace/GetStackTraceAndRetransformTest/libGetStackTraceAndRetransformTest.cpp > line 2: > >> 1: /* >> 2: * Copyright (c) 2023 Oracle and/or its affiliates. All rights reserved. > > Ditto comment about Oracle copyright. Fixed - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403054811 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403055037
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]
On Mon, 20 Nov 2023 22:08:49 GMT, Coleen Phillimore wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Rewerite the test to use RedefineClassHelper > > src/hotspot/share/classfile/classFileParser.cpp line 5579: > >> 5577: >> 5578: if (_methods != nullptr) { >> 5579: // Free methods - those methods are not fully wired and miss the >> method holder > > How about saying: for methods whose InstanceKlass as method holder is not yet > created? This comment went away with the change @dholmes-ora recommended > test/hotspot/jtreg/serviceability/jvmti/thread/GetStackTrace/GetStackTraceAndRetransformTest/GetStackTraceAndRetransformTest.java > line 53: > >> 51: import java.util.List; >> 52: import java.util.concurrent.CyclicBarrier; >> 53: import java.util.concurrent.locks.LockSupport; > > Do you need all these imports? > > There's a simple RedefineClassHelper class that does most of the work, but > maybe you need the explicit agent code to reproduce the crash? See > test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/RedefineRunningMethodsWithBacktrace.java > as an example. I did clean up the imports and switched to `RedefineClassHelper`. Thanks for pointing that out! - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403064332 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403054332
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]
On Wed, 22 Nov 2023 01:27:25 GMT, David Holmes wrote: >> I see, holder is the right word and concept. So the parameter means >> has_method_holder, in that the InstanceKlass has been fully parsed at the >> point of clearing the jmethodIDs. > > Can't we just check `method->method_holder()` for null rather than passing in > a parameter like this? Yes. I changed the code to do that. We will attempt to resolve the method holder for each method instead of just using one single boolean argument but I believe the resolution should be fast enough not to matter in this context. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1403068144
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]
On Thu, 23 Nov 2023 08:32:36 GMT, Jaroslav Bachorik wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method >> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` >> instance. Once that method gets redefined, the `jmethodID` is updated to >> point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original >> `jmethodID` will be redirected to the latest `Method` version and at the >> same time the 'previous' `Method` version will receive a new `jmethodID` >> pointing to that previous version. >> >> If we happen to capture stacktrace via `GetStackTrace` or >> `GetAllStackTraces` JVMTI calls while this previous `Method` version is >> still on stack we will have the corresponding frame identified by a >> `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for >> cleanup at what time all contained `Method` instances. The cleanup process >> will not perform the `jmethodID` pointer maintenance and we will end up with >> pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to >> `ClassLoaderData` instance and all relevant `jmethodID`s will get >> batch-updated when the class loader is being released and all its classes >> are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being >> deallocate the associated `jmethodID` (if any) must not point to the >> deallocated instance once we are finished. Unfortunately, we can not just >> update the `jmethodID` values in bulk when purging an old class version - >> the per `InstanceKlass` jmethodID cache is present only for the main class >> version and contains `jmethodID` values for both the old and current method >> versions. >> >> Therefore we need to perform `jmethodID` lookup when we are about to >> deallocate a `Method` instance and clean up the pointer only if that >> `jmethodID` is pointing to the `Method` instance which is being deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my >> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > Jaroslav Bachorik has updated the pull request incrementally with three > additional commits since the last revision: > > - Clean up imports > - Simplify Method::clear_jmethod_id() > - Use correct copyrights @dholmes-ora > Can't we just check method->method_holder() for null rather than passing in a > parameter like this? I have removed the argument and I am now performing the check for `method_holder() != nullptr` as recommended. The code is a bit simpler and the cost of resolving the method holder for each method is probably quite low so we should be ok here. - PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1823997622
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v3]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Rewerite the test to use RedefineClassHelper - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/f47e9499..967813b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=02 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=01-02 Stats: 124 lines in 2 files changed: 15 ins; 93 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v2]
> Please, review this fix for a corner case handling of `jmethodID` values. > > The issue is related to the interplay between `jmethodID` values and method > redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` > instance. Once that method gets redefined, the `jmethodID` is updated to > point to the last `Method` version. > Unless the method is still on stack/running, in which case the original > `jmethodID` will be redirected to the latest `Method` version and at the same > time the 'previous' `Method` version will receive a new `jmethodID` pointing > to that previous version. > > If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` > JVMTI calls while this previous `Method` version is still on stack we will > have the corresponding frame identified by a `jmethodID` pointing to that > version. > However, sooner or later the 'previous' class version becomes eligible for > cleanup at what time all contained `Method` instances. The cleanup process > will not perform the `jmethodID` pointer maintenance and we will end up with > pointers to deallocated memory. > This is caused by the fact that the `jmethodID` lifecycle is bound to > `ClassLoaderData` instance and all relevant `jmethodID`s will get > batch-updated when the class loader is being released and all its classes are > getting unloaded. > > This means that we need to make sure that if a `Method` instance is being > deallocate the associated `jmethodID` (if any) must not point to the > deallocated instance once we are finished. Unfortunately, we can not just > update the `jmethodID` values in bulk when purging an old class version - the > per `InstanceKlass` jmethodID cache is present only for the main class > version and contains `jmethodID` values for both the old and current method > versions. > > Therefore we need to perform `jmethodID` lookup when we are about to > deallocate a `Method` instance and clean up the pointer only if that > `jmethodID` is pointing to the `Method` instance which is being deallocated. > > _(For anyone interested, a much lengthier writeup is available in [my > blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ Jaroslav Bachorik has updated the pull request incrementally with three additional commits since the last revision: - Clean up imports - Simplify Method::clear_jmethod_id() - Use correct copyrights - Changes: - all: https://git.openjdk.org/jdk/pull/16662/files - new: https://git.openjdk.org/jdk/pull/16662/files/b2ca5e69..f47e9499 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=16662=01 - incr: https://webrevs.openjdk.org/?repo=jdk=16662=00-01 Stats: 25 lines in 6 files changed: 5 ins; 13 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes
On Fri, 17 Nov 2023 22:53:58 GMT, Coleen Phillimore wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method >> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` >> instance. Once that method gets redefined, the `jmethodID` is updated to >> point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original >> `jmethodID` will be redirected to the latest `Method` version and at the >> same time the 'previous' `Method` version will receive a new `jmethodID` >> pointing to that previous version. >> >> If we happen to capture stacktrace via `GetStackTrace` or >> `GetAllStackTraces` JVMTI calls while this previous `Method` version is >> still on stack we will have the corresponding frame identified by a >> `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for >> cleanup at what time all contained `Method` instances. The cleanup process >> will not perform the `jmethodID` pointer maintenance and we will end up with >> pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to >> `ClassLoaderData` instance and all relevant `jmethodID`s will get >> batch-updated when the class loader is being released and all its classes >> are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being >> deallocate the associated `jmethodID` (if any) must not point to the >> deallocated instance once we are finished. Unfortunately, we can not just >> update the `jmethodID` values in bulk when purging an old class version - >> the per `InstanceKlass` jmethodID cache is present only for the main class >> version and contains `jmethodID` values for both the old and current method >> versions. >> >> Therefore we need to perform `jmethodID` lookup when we are about to >> deallocate a `Method` instance and clean up the pointer only if that >> `jmethodID` is pointing to the `Method` instance which is being deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my >> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > src/hotspot/share/oops/instanceKlass.cpp line 541: > >> 539: // The previous version will point to them so they're not totally >> dangling >> 540: assert (!method->on_stack(), "shouldn't be called with methods on >> stack"); >> 541: // Do the pointer maintenance before releasing the metadata, but >> not for incomplete methods > > I'm confused by what you mean by method holder, which I think of as > methodHandle. Or InstanceKlass is the holder of the methods. Maybe this > should be more explicit that it's talking about clearing any associated > jmethodIDs. The method holder is an `InstanceKlass` object which can be retrieved as `method->method_holder()` (I apologize if I am using not completely correct terms - this is what I grokked from the sources). And incomplete methods created by the `ClassParser` from the class data stream will not have the link to that `InstanceKlass` set up if the `ClassParser` is already having its `_klass` field set to a non-null value. If we are talking about clearing any jmetbodIDs associated with an `InstanceKlass` instance it is not really possible for old method versions because only the current `InstanceKlass` version has the jmethodID cache associated with it and it contains jmethodIDs pointing to bot the old and current methods. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1398009493
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes
On Wed, 15 Nov 2023 18:47:19 GMT, Jaroslav Bachorik wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 531: >> >>> 529: >>> 530: void InstanceKlass::deallocate_methods(ClassLoaderData* loader_data, >>> 531:Array* methods, >>> InstanceKlass* klass) { >> >> An explicit boolean parameter would be cleaner/clearer. > > I just removed the `klass` argument. It is not really used anyway. I actually ended up with a boolean parameter here. >> src/hotspot/share/oops/method.hpp line 730: >> >>> 728: // so handles are not used to avoid deadlock. >>> 729: jmethodID find_jmethod_id_or_null() { >>> 730: return method_holder() != nullptr ? >>> method_holder()->jmethod_id_or_null(this) : nullptr; >> >> If `method_holder()` is null at this point what does that mean for the >> lifecycle of the Method? > > Please, ignore this part of code for the time being. I had a crash in CI > which was pointing vaguely to this code - unfortunately, the hs_err.log files > are not preserved in the test archives and I am not able to reproduce the > failure locally. I need to debug the crash and make sure I understand the > root cause. _Update:_ I was able to get to the bottom of the methods not having method holder associated with them. The `ClassFileParser` does not finalize initialization of the `InstanceKlass` it has created if `_klass != nullptr` (https://github.com/openjdk/jdk/blob/9727f4bdddc071e6f59806087339f345405ab004/src/hotspot/share/classfile/classFileParser.cpp#L5161). This also means, that the `Method` instances are not wired to their method holders via 'constant method'->'constant pool'->'pool holder' chain. However, they need to be deallocated and as such I really need a distinguishing argument for `InstanceKlass::deallocate_methods` call such that I don't attempt to resolve `jmethodid` values in that case. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1396296501 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1396175846
Re: RFR: 8313816: Accessing jmethodID might lead to spurious crashes
On Wed, 15 Nov 2023 05:35:49 GMT, David Holmes wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method >> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` >> instance. Once that method gets redefined, the `jmethodID` is updated to >> point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original >> `jmethodID` will be redirected to the latest `Method` version and at the >> same time the 'previous' `Method` version will receive a new `jmethodID` >> pointing to that previous version. >> >> If we happen to capture stacktrace via `GetStackTrace` or >> `GetAllStackTraces` JVMTI calls while this previous `Method` version is >> still on stack we will have the corresponding frame identified by a >> `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for >> cleanup at what time all contained `Method` instances. The cleanup process >> will not perform the `jmethodID` pointer maintenance and we will end up with >> pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to >> `ClassLoaderData` instance and all relevant `jmethodID`s will get >> batch-updated when the class loader is being released and all its classes >> are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being >> deallocate the associated `jmethodID` (if any) must not point to the >> deallocated instance once we are finished. Unfortunately, we can not just >> update the `jmethodID` values in bulk when purging an old class version - >> the per `InstanceKlass` jmethodID cache is present only for the main class >> version and contains `jmethodID` values for both the old and current method >> versions. >> >> Therefore we need to perform `jmethodID` lookup when we are about to >> deallocate a `Method` instance and clean up the pointer only if that >> `jmethodID` is pointing to the `Method` instance which is being deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my >> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > src/hotspot/share/oops/instanceKlass.cpp line 531: > >> 529: >> 530: void InstanceKlass::deallocate_methods(ClassLoaderData* loader_data, >> 531:Array* methods, >> InstanceKlass* klass) { > > An explicit boolean parameter would be cleaner/clearer. I just removed the `klass` argument. It is not really used anyway. > src/hotspot/share/oops/instanceKlass.cpp line 542: > >> 540: if (klass) { >> 541: jmethodID jmid = method->find_jmethod_id_or_null(); >> 542: // Do the pointer maintenance before releasing the metadata, >> just in case > > I assume there should be a period after 'case`. But just in case of what? The code was moved to`method.cpp` and this particular comment line became obsolete > src/hotspot/share/oops/instanceKlass.cpp line 549: > >> 547: if (jmid != nullptr && *((Method**)jmid) == method) { >> 548: *((Method**)jmid) = nullptr; >> 549: } > > This should be abstracted behind a utility function e.g. > `method->clear_jmethod_id()`. Done > src/hotspot/share/oops/method.cpp line 2277: > >> 2275: } >> 2276: } >> 2277: > > Can this race with redefinition? The cleanup of previous versions is executed in VM_Operation at a safepoint - therefore we should be safe against races with class redefinitions. I am adding an assert to `clear_jmethod_id()` to check for being at a safepoint. > src/hotspot/share/oops/method.hpp line 730: > >> 728: // so handles are not used to avoid deadlock. >> 729: jmethodID find_jmethod_id_or_null() { >> 730: return method_holder() != nullptr ? >> method_holder()->jmethod_id_or_null(this) : nullptr; > > If `method_holder()` is null at this point what does that mean for the > lifecycle of the Method? Please, ignore this part of code for the time being. I had a crash in CI which was pointing vaguely to this code - unfortunately, the hs_err.log files are not preserved in the test archives and I am not able to reproduce the failure locally. I need to debug the crash and make sure I understand the root cause. - PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394642247 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394647034 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394647173 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1395100685 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1395102362
RFR: 8313816: Accessing jmethodID might lead to spurious crashes
Please, review this fix for a corner case handling of `jmethodID` values. The issue is related to the interplay between `jmethodID` values and method redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` instance. Once that method gets redefined, the `jmethodID` is updated to point to the last `Method` version. Unless the method is still on stack/running, in which case the original `jmethodID` will be redirected to the latest `Method` version and at the same time the 'previous' `Method` version will receive a new `jmethodID` pointing to that previous version. If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` JVMTI calls while this previous `Method` version is still on stack we will have the corresponding frame identified by a `jmethodID` pointing to that version. However, sooner or later the 'previous' class version becomes eligible for cleanup at what time all contained `Method` instances. The cleanup process will not perform the `jmethodID` pointer maintenance and we will end up with pointers to deallocated memory. This is caused by the fact that the `jmethodID` lifecycle is bound to `ClassLoaderData` instance and all relevant `jmethodID`s will get batch-updated when the class loader is being released and all its classes are getting unloaded. This means that we need to make sure that if a `Method` instance is being deallocate the associated `jmethodID` (if any) must not point to the deallocated instance once we are finished. Unfortunately, we can not just update the `jmethodID` values in bulk when purging an old class version - the per `InstanceKlass` jmethodID cache is present only for the main class version and contains `jmethodID` values for both the old and current method versions. Therefore we need to perform `jmethodID` lookup when we are about to deallocate a `Method` instance and clean up the pointer only if that `jmethodID` is pointing to the `Method` instance which is being deallocated. _(For anyone interested, a much lengthier writeup is available in [my blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ - Commit messages: - 8313816: Accessing jmethodID might lead to spurious crashes Changes: https://git.openjdk.org/jdk/pull/16662/files Webrev: https://webrevs.openjdk.org/?repo=jdk=16662=00 Issue: https://bugs.openjdk.org/browse/JDK-8313816 Stats: 282 lines in 9 files changed: 278 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16662.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16662/head:pull/16662 PR: https://git.openjdk.org/jdk/pull/16662
Re: RFR: 8304725: AsyncGetCallTrace can cause SIGBUS on M1 [v3]
On Thu, 23 Mar 2023 12:45:27 GMT, Johannes Bechberger wrote: > Yes, but while JFR interrupts threads too, its sampler runs in its own > thread, so the async-safety of the interrupted code should not matter, or? And this sampling approach is quite subpar when compared with the signal based profiling - both in terms of bias and precision. Meaning that if/when JFR would implement a state-of-the-art CPU profiler it will have to deal with the same issues we are seeing here. - PR Comment: https://git.openjdk.org/jdk/pull/13144#issuecomment-1482370219
Re: [External] : Re: Disallowing the dynamic loading of agents by default
Hi, On Mon, Mar 20, 2023 at 11:11 AM Ron Pressler wrote: > Hi. > > The majority of serviceability tools don’t require dynamically loading an > agent, and the majority of applications never load an agent dynamically. > The majority of the JDK built-in tools, I would say. What about eg. the JMC agent? > > True, there are some tools that will be affected, which is why the > decision was to introduce the flag in JDK 9 and to announce this change, > but change the default in a later version to give tools ample time to > prepare their users. The rationale for this change then hasn’t changed, but > will be reiterated in a JEP (we just wanted to announce this ahead of the > JEP to give tool authors another reminder more than six months ahead of JDK > 21). The only change between then and now is that even fewer use cases > require dynamically loaded agents, and so the impact is even smaller. > As a maintainer of one of such tools I can confidently say that this change will either kill the tool as the ease of use will be gone or the workaround (eg. using JAVA_TOOL_OPTIONS) will completely defeat the purpose of this change. Having to put a flag when starting the JVM to allow dynamic loading of agents sounds a bit nonsensical to me - it would be much easier to directly add the agent to the JVM startup and then implement a lightweight control protocol over socket/shared memory to enabled/disable the agent features dynamically. > > It is also true that, when starting an application you don’t know that you > *will* need to load an agent, but in most situations you know that you > might. E.g. processes that are too critical to bring down even for deep > maintenance (although not many of these are written in modern version of > Java anyone) or canary services that are under trial. The relatively few > sophisticated users who know how to write ad-hoc agents can even opt to > enable dynamic agent loading on all their servers; these users are better > equipped to can weigh the risks and tradeoffs involved. > Wouldn't having this enabled system-wide actually defeat the purpose of having this flag? Considering that the dynamic attach can be performed only on the same host under the same user as the target process there seems to be a very small chance of loading agents accidentally. In the end people would set up their systems to enabled dynamic agent loading via eg. JAVA_TOOL_OPTIONS and we will be in the same place as before, with the additional hurdle of setting everything up. > Finally, some tools that require a dynamically loaded JVM TI agents, such > as profilers that profile native code, are so tied to the VM's internals > that the best place for them is in the JDK. If anything, the bigger problem > is not that profilers are used too much in production, but too little, > including less advanced ones that don’t require an agent. There is plenty > of time to enhance the JDK’s built-in profiling capabilities ahead of > demand. > I think this is an overly optimistic view. It is *much more* difficult to enhance the JDK's built-in profiling capabilities than do the same in an external profiling agent. Overall, I don't seem to understand the anticipated attack vectors this change is supposed to prevent. AFAIK, in order to do the dynamic agent load one needs to have full access to the target process. That means that there are more convenient and straightforward ways to do anything nefarious than loading a JVMTI agent. Am I missing some other usages where the JVMTI agent would actually give access to something which would be otherwise inaccessible considering that the attacher and attachee must be on the same host and under the same user? Cheers, -JB- > > — Ron > > On 20 Mar 2023, at 01:21, Andrei Pangin wrote: > > Hi all, > > Serviceability has been one of the biggest Java strengths, but the > proposed change is going to have a large negative impact on it. > > Disallowing dynamic agents by default means it will no longer be possible > to attach a profiler to a running app in runtime. JFR cannot close this gap > due to lack of capabilities modern Java profilers have (that's a separate > topic though). > > When an issue happens with a live app, it's already too late to add a > command line argument. Furthermore, it may not be even feasible to add an > agent at startup in containerized applications. Starting profiler on demand > from the host OS or from a sidecar is the only viable solution in these > cases. > > Next, it's hard to predict beforehand what tools exactly might be useful > for troubleshooting: e.g., one tool may be better for finding memory leaks, > a different one for analyzing CPU performance. Adding all possible tools at > startup does not seem a reasonable approach, especially when tools may > conflict with each other. > > The most important aspect of dynamic agents is the possibility to make a > special tool just in time for solving a particular problem. A typical > example is to get a value of