On Fri, 5 Jun 2026 10:46:20 GMT, Serguei Spitsyn <[email protected]> wrote:
>> The cur_stack_depth is the cache that is maintained in interponly mode to >> optimize getting number of thread in the debugger operations. >> >> Historically, it cause a lot of issues. Most oftenly, related with wrong >> calculation in the case if stack depth changed during yield or stack >> debugger operations. Also, there is a gep between decrement of >> cur_stack_depth and actual stack change where cur_stack_depth is invalid. >> >> The current problem happens when operations like PopFrame or >> ForceEarlyReturn happen when MethodExit is completed and cur_stack_depth is >> decremented while physical stack is not removed yet. It is reproduced by >> internal stress test. >> >> We discussed privately with @sspitsyn and there is an idea that this cache >> is not worth to maintain. It is used in interp-only mode which is slow >> already. Really this cur_stack_depth() is not so often used and quite often >> invalidated. >> >> Currently, the function is uses in 3 places: >> >> 1. JvmtiExport::continuation_yield_cleanup, where >> invalidate_cur_stack_depth() is called right before cur_stack_depth(), >> making the caching useless. So no performance degradation here is expected. >> 2. In the post_method_exit_inner when 'ets->has_frame_pops()' is true. And >> actually it might crash in this place. This happens when NotifyFramePop >> event is requested. It doesn't require the interponly mode after fix of >> [JDK-6960970](https://bugs.openjdk.org/browse/JDK-6960970) Debugger very >> slow during stepping >> 3. JvmtiThreadState::process_pending_step_for_popframe which is called in >> UpdateForPopTopFrameClosure during PopFrame when performance is not >> important. Also, the crash happens when 'process_pending_step_for_popframe' >> is executed while cache doesn't correspond the actual state. >> >> >> So I wonder if there are any scenario when the impact of the removal of >> cur_stack_depth is so significant that it makes sense to still maintain it? >> >> i am running tier1-8 and some internal stress testing to verify this fix. >> >> --------- >> - [x] I confirm that I make this contribution in accordance with the >> [OpenJDK Interim AI Policy](https://openjdk.org/legal/ai). > > Overall, it looks good. It is my long time dream to get rid of this > mechanism. :) > Thank you for this attempt to do this! > However, I still have a performance concern which I'd prefer to discuss > offline. This potential performance issue held me of this simplification > before. My concern is about `PopFrame` related fragment in the function > `post_method_exit_inner()`. If stack is big and the `interp_only_mode` has > been already enforced for target thread which has and a `FramePop` request > then recalculation of current stack depth can give a significant performance > overhead. At least, we need to somehow mitigate this issue and measure the > performance impact with a specific benchmark test. @sspitsyn, @plummercj I agree with the concerns. The interp_only mode is used for 5 events now: INTERP_EVENT_BITS = SINGLE_STEP_BIT | METHOD_ENTRY_BIT | METHOD_EXIT_BIT | FIELD_ACCESS_BIT | FIELD_MODIFICATION_BIT; The single-stepping, is pretty slow and I don't think that some performance impact is important here. The methodEntry/Exit are also very slow, JVMTI spec recommend them to use only to trace every method not to find specific method calls. The performance here is also not the case, I suppose. The FieldWatchers indeed might be a concern. The question is is performance matter if field watches are enabled and a lot of method exit are executed with pop frame notifications are enabled. The main issue might be switching to interp only mode by itself. ------------- PR Comment: https://git.openjdk.org/jdk/pull/31389#issuecomment-4635233348
