Hi,

I would like to propose to

1. Replace the duplicated stack walking code with a unified API
2. Create a new version of AsyncGetCallTrace, tentatively called "AsyncGetCallTrace2", with more information on more frames using the unified API

A demo (as well as this text) is available at https://github.com/parttimenerd/asgct2-demo if you want to see a prototype of this proposal in action.

Unify Stack Walking
================

There are currently multiple implementations of stack walking in JFR and for AsyncGetCallTrace. Each implements its own extension of vframeStream, but with comparable features and comparable checks for problematic frames.

My proposal is, therefore, to replace the stack walking code with a unified API that includes all error checking and vframeStream extensions in a single place. The proposed new class is called StackWalker and could be part of `jfr/recorder/stacktrace` [1]. This class also supports getting information on C frames, so it could potentially be used for walking stacks in VMError (used to create hs_err files), further reducing the amount of distinct stack walking code.

AsyncGetCallTrace2
================

The AsyncGetCallTrace call has seen increasing use in recent years in profilers like async-profiler. But it is not really an API (it is not exported in any header), and the information on frames it returns is pretty limited (only the method and bci for Java frames), which makes implementing profilers and other tooling harder. Tools like async-profiler have to resort to complicated code to partially obtain information that the JVM already has.

Information that is currently hidden and impossible to obtain is

- whether a compiled frame is inlined (currently only obtainable for the topmost compiled frames)
  - although this can be obtained using JFR
- C frames that are not at the top of the stack
- compilation level (C1 or C2 compiled)

This information is helpful when profiling and tuning the VM for a given application and also for profiling code that uses JNI heavily.

Using the proposed StackWalker class, implementing a new API that returns more information on frames is possible as a thin wrapper over the StackWalker API [2]. This also improves maintainability, as the code behind this API is used in multiple places and is therefore better tested than the previous implementation; see [1] for the implementation.

The following describes the proposed API:

```cpp
void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext);
```

The structure of `CallTrace` is the same as the original `ASGCT_CallTrace`, with the same error codes encoded in <= 0 values of `num_frames`.

```cpp
typedef struct {
  JNIEnv *env_id;     // Env where trace was recorded
  jint num_frames;    // number of frames in this trace
  CallFrame *frames;  // frames
  void* frame_info;   // more information on frames
} CallTrace;
```

The only differences are that the `frames` array also contains information on C frames and that there is a new field, `frame_info`. The `frame_info` field is currently null and can later be used for extended information on each frame, being an array with one element per frame. The type of the elements in this array is implementation specific. This is akin to the `compile_info` field in JVMTI's CompiledMethodLoad [3] and is meant for extending the information returned by the API later.

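As a rough sketch of how a caller might consume such a trace, the snippet below treats any `num_frames` <= 0 as an error code and only looks at `frames` for positive counts. The JNI stand-in types, the placeholder `CallFrame` and the helper names `usable_frames` and `has_frame_info` are assumptions made for illustration only; they are not part of the prototype.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch compiles without jni.h (assumption, not the real JNI types).
using jint = int32_t;
struct JNIEnv;  // opaque in this sketch

// Placeholder frame layout; the concrete CallFrame is a separate question.
struct CallFrame {
  jint bci;
  void *method_id;
};

// Same fields as the CallTrace structure described above.
struct CallTrace {
  JNIEnv *env_id;     // Env where trace was recorded
  jint num_frames;    // number of frames, or an error code if <= 0
  CallFrame *frames;  // frames
  void *frame_info;   // more information on frames (currently null)
};

// Hypothetical helper: number of frames that may be read from trace.frames.
// Values <= 0 are error codes, so no frame may be touched in that case.
inline jint usable_frames(const CallTrace &trace) {
  assert(trace.num_frames <= 0 || trace.frames != nullptr);
  return trace.num_frames > 0 ? trace.num_frames : 0;
}

// Hypothetical helper: whether extended per-frame information is present.
inline bool has_frame_info(const CallTrace &trace) {
  return trace.frame_info != nullptr;
}
```

A profiler would call `usable_frames` right after the trace has been filled and loop over exactly that many entries, so error traces are skipped without any special casing.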
Prototype
------------

Currently `CallFrame` is implemented in the prototype [4] as

```cpp
typedef struct {
  void *machine_pc;     // program counter, for C and native frames (frames of native methods)
  uint8_t type;         // frame type (single byte)
  uint8_t comp_level;   // highest compilation level of a method related to a Java frame
  // information from original CallFrame
  jint bci;             // bci for Java frames
  jmethodID method_id;  // method ID for Java frames
} CallFrame;
```

The `FrameTypeId` is based on the frame type in JFRStackFrame:

```cpp
enum FrameTypeId {
  FRAME_INTERPRETED = 0,
  FRAME_JIT         = 1, // JIT compiled
  FRAME_INLINE      = 2, // inlined JITed methods
  FRAME_NATIVE      = 3, // native wrapper to call C methods from Java
  FRAME_CPP         = 4  // c/c++/... frames, stub frames have CompLevel_all
};
```

The `comp_level` states the compilation level of the method related to the frame, with higher numbers representing "more" compilation. `0` is defined as interpreted. It is modeled after the `CompLevel` enum in `compiler/compilerDefinitions`:

```cpp
// Enumeration to distinguish tiers of compilation
enum CompLevel {
  // ...
  CompLevel_none              = 0, // Interpreter
  CompLevel_simple            = 1, // C1
  CompLevel_limited_profile   = 2, // C1, invocation & backedge counters
  CompLevel_full_profile      = 3, // C1, invocation & backedge counters + mdo
  CompLevel_full_optimization = 4  // C2 or JVMCI
};
```

The traces produced by this prototype are fairly large (each frame requires 24 bytes instead of 16 bytes on 64 bit systems) and some data is duplicated. The reason for this is that it simplified the extension of async-profiler for the prototype, as it only extends the data structures of the original AsyncGetCallTrace API without changing the original fields.

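The 24 versus 16 bytes can be checked with a small layout sketch. It assumes a 64 bit platform with 8 byte pointers and a 32 bit `jint`. The type aliases are stand-ins for the real JNI typedefs, and `OriginalCallFrame`, `PrototypeCallFrame` and `extra_bytes` are names chosen for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the JNI typedefs (assumption: jint is 32 bits, jmethodID is a pointer).
using jint = int32_t;
using jmethodID = void *;

// Frame layout of the original AsyncGetCallTrace (the lineno field carries the bci).
struct OriginalCallFrame {
  jint lineno;
  jmethodID method_id;
};

// Frame layout of the prototype, with the same fields as listed above.
struct PrototypeCallFrame {
  void *machine_pc;
  uint8_t type;
  uint8_t comp_level;
  jint bci;
  jmethodID method_id;
};

constexpr bool is_64_bit = sizeof(void *) == 8;

// Only checked on 64 bit targets; other targets have different padding.
static_assert(!is_64_bit || sizeof(OriginalCallFrame) == 16, "4 bytes bci + 4 padding + 8 pointer");
static_assert(!is_64_bit || offsetof(PrototypeCallFrame, bci) == 12, "2 bytes padding after comp_level");
static_assert(!is_64_bit || sizeof(PrototypeCallFrame) == 24, "8 + 1 + 1 + 2 padding + 4 + 8");

// Additional bytes the prototype needs for a trace buffer of the given depth.
inline std::size_t extra_bytes(jint depth) {
  assert(depth >= 0);
  return static_cast<std::size_t>(depth) *
         (sizeof(PrototypeCallFrame) - sizeof(OriginalCallFrame));
}
```

Under these assumptions, a trace buffer with a depth of 128 frames grows by 1024 bytes compared to the original layout.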
Proposal
------------

But packing the information and reducing duplication is of course possible if we step away from the former constraint:

```cpp
enum FrameTypeId {
  FRAME_JAVA         = 1, // JIT compiled and interpreted
  FRAME_JAVA_INLINED = 2, // inlined JIT compiled
  FRAME_NATIVE       = 3, // native wrapper to call C methods from Java
  FRAME_STUB         = 4, // VM generated stubs
  FRAME_CPP          = 5  // C/C++/... frames
};

typedef struct {
  uint8_t type;            // frame type (a FrameTypeId value)
  uint8_t comp_level;
  uint16_t bci;            // 0 <= bci < 65536
  jmethodID method_id;
} JavaFrame;               // used for FRAME_JAVA and FRAME_JAVA_INLINED

typedef struct {
  uint8_t type;            // frame type (a FrameTypeId value), single byte
  void *machine_pc;
} NonJavaFrame;            // used for FRAME_NATIVE, FRAME_STUB and FRAME_CPP

typedef union {
  uint8_t type;            // first byte of both variants, to distinguish between JavaFrame and NonJavaFrame
  JavaFrame java_frame;
  NonJavaFrame non_java_frame;
} CallFrame;
```

This uses the same amount of space per frame (16 bytes) as the original but encodes far more information.

Best regards
Johannes

[1] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/jfr/recorder/stacktrace/stackWalker.hpp
[2] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.cpp
[3] https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#CompiledMethodLoad
[4] https://github.com/parttimenerd/jdk/blob/parttimenerd_asgct2/src/hotspot/share/prims/asgct2.hpp