Hi David, the src/hotspot/share/prims/stackwalk.cpp has been made for another purpose: It is used in java.lang.StackWalker to obtain the Java frames and allocates memory on the heap. It is not used in the places that my proposed Stackwalker could be used: in profiling and error stack traces (hs_err file), where memory allocation might cause problems. The class in prims also lacks the ability to obtain the native frames.
Best regards Johannes From: David Holmes <david.hol...@oracle.com> Date: Sunday, 20. March 2022 at 23:41 To: Bechberger, Johannes <johannes.bechber...@sap.com>, hotspot-...@openjdk.java.net <hotspot-...@openjdk.java.net>, hotspot-jfr-...@openjdk.java.net <hotspot-jfr-...@openjdk.java.net>, serviceability-dev@openjdk.java.net <serviceability-dev@openjdk.java.net> Subject: Re: Proposal of a new version of AsyncGetCallTrace Hi Johannes, On 18/03/2022 7:43 pm, Bechberger, Johannes wrote: > Hi, > > I would like propose to > > 1. Replace duplicated stack walking code with unified API > 2. Create a new version of AsyncGetCallTrace, tentatively called > "AsyncGetCallTrace2", with more information on more frames using the unified > API > > A demo (as well as this text) is available at > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fparttimenerd%2Fasgct2-demo&data=04%7C01%7Cjohannes.bechberger%40sap.com%7Ceb5ca380974b40d8ad1908da0ac2c949%7C42f7676cf455423c82f6dc2d99791af7%7C0%7C0%7C637834128980129964%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=WzQw1M5pnBdK5PxtohnURFAJeSJOZy1ZAuRGSaczjOU%3D&reserved=0 > if you want to see a prototype of this proposal in action. > > Unify Stack Walking > ================ > > There are currently multiple implementations of stack walking in JFR and for > AsyncGetCallTrace. > They each implement their own extension of vframeStream but with comparable > features > and check for problematic frames. > > My proposal is, therefore, to replace the stack walking code with a unified > API that > includes all error checking and vframeStream extensions in a single place. > The prosposed new class is called StackWalker and could be part of > `jfr/recorder/stacktrace` [1]. So we already have the StackWalker API provided at the Java level and with the implementation in the VM in src/hotspot/share/prims/stackwalk.cpp. How does that fit in with what you propose? Cheers, David ----- > This class also supports getting information on C frames so it can be > potentially > used for walking stacks in VMError (used to create hs_err files), further > reducing the amount of different stack walking code. > > AsyncGetCallTrace2 > ================ > > The AsyncGetCallTrace call has seen increasing use in recent years > in profilers like async-profiler. > But it is not really an API (not exported in any header) and > the information on frames it returns is pretty limited > (only the method and bci for Java frames) which makes implementing > profilers and other tooling harder. Tools like async-profiler > have to resort to complicated code to partially obtain the information > that the JVM already has. > Information that is currently hidden and impossible to obtain is > > - whether a compiled frame is inlined (currently only obtainable for the > topmost compiled frames) > - although this can be obtained using JFR > - C frames that are not at the top of the stack > - compilation level (C1 or C2 compiled) > > This information is helpful when profiling and tuning the VM for > a given application and also for profiling code that uses > JNI heavily. > > Using the proposed StackWalker class, implementing a new API > that returns more information on frames is possible > as a thin wrapper over the StackWalker API [2]. > This also improves the maintainability as the code used > in this API is used in multiple places and is therefore > also better tested than the previous implementation, see > [1] for the implementation. > > The following describes the proposed API: > > ```cpp > void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext); > ``` > > The structure of `CallTrace` is the same as the original > `ASGCT_CallTrace` with the same error codes encoded in <= 0 > values of `num_frames`. > > ```cpp > typedef struct { > JNIEnv *env_id; // Env where trace was recorded > jint num_frames; // number of frames in this trace > CallFrame *frames; // frames > void* frame_info; // more information on frames > } CallTrace; > ``` > > The only difference is that the `frames` array also contains > information on C frames and the field `frame_info`. > The `frame_info` is currently null and can later be used > for extended information on each frame, being an array with > an element for each frame. But the type of the > elements in this array is implementation specific. > This akin to `compile_info` field in JVMTI's CompiledMethodLoad > [3] and used for extending the information returned by the > API later. > > Protoype > ------------ > > Currently `CallFrame` is implemented in the prototype [4] as > > ```cpp > typedef struct { > void *machine_pc; // program counter, for C and native frames > (frames of native methods) > uint8_t type; // frame type (single byte) > uint8_t comp_level; // highest compilation level of a method > related to a Java frame > // information from original CallFrame > jint bci; // bci for Java frames > jmethodID method_id; // method ID for Java frames > } CallFrame; > ``` > > The `FrameTypeId` is based on the frame type in JFRStackFrame: > > ```cpp > enum FrameTypeId { > FRAME_INTERPRETED = 0, > FRAME_JIT = 1, // JIT compiled > FRAME_INLINE = 2, // inlined JITed methods > FRAME_NATIVE = 3, // native wrapper to call C methods from Java > FRAME_CPP = 4 // c/c++/... frames, stub frames have CompLevel_all > }; > ``` > > The `comp_level` states the compilation level of the method related to the > frame > with higher numbers representing "more" compilation. `0` is defined as > interpreted. It is modeled after the `CompLevel` enum in > `compiler/compilerDefinitions`: > > ```cpp > // Enumeration to distinguish tiers of compilation > enum CompLevel { > // ... > CompLevel_none = 0, // Interpreter > CompLevel_simple = 1, // C1 > CompLevel_limited_profile = 2, // C1, invocation & backedge > counters > CompLevel_full_profile = 3, // C1, invocation & backedge > counters + mdo > CompLevel_full_optimization = 4 // C2 or JVMCI > }; > ``` > > The traces produced by this prototype are fairly large > (each frame requires 24 is instead of 16 bytes on 64 bit systems) and some > data is > duplicated. > The reason for this is that it simplified the extension of async-profiler > for the prototype, as it only extends the data structures of > the original AsyncGetCallTrace API without changing the original fields. > > Proposal > ------------ > > But packing the information and reducing duplication is of course possible > if we step away from the former constraint: > > ```cpp > enum FrameTypeId { > FRAME_JAVA = 1, // JIT compiled and interpreted > FRAME_JAVA_INLINED = 2, // inlined JIT compiled > FRAME_NATIVE = 3, // native wrapper to call C methods from Java > FRAME_STUB = 4, // VM generated stubs > FRAME_CPP = 5 // C/C++/... frames > }; > > typedef struct { > uint8_t type; // frame type > uint8_t comp_level; > uint16_t bci; // 0 < bci < 65536 > jmethodID method_id; > } JavaFrame; // used for FRAME_JAVA and FRAME_JAVA_INLINED > > typedef struct { > FrameTypeId type; // single byte type > void *machine_pc; > } NonJavaFrame; // used for FRAME_NATIVE, FRAME_STUB and FRAME_CPP > > typedef union { > FrameTypeId type; // to distinguish between JavaFrame and NonJavaFrame > JavaFrame java_frame; > NonJavaFrame non_java_frame; > } CallFrame; > ``` > > This uses the same amount of space per frame (16 bytes) as the original but > encodes far more information. > > Best regards > Johannes > > [1] > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fparttimenerd%2Fjdk%2Fblob%2Fparttimenerd_asgct2%2Fsrc%2Fhotspot%2Fshare%2Fjfr%2Frecorder%2Fstacktrace%2FstackWalker.hpp&data=04%7C01%7Cjohannes.bechberger%40sap.com%7Ceb5ca380974b40d8ad1908da0ac2c949%7C42f7676cf455423c82f6dc2d99791af7%7C0%7C0%7C637834128980129964%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=u8imCsxZLO5qHcB3Kdn%2FiC5FQ2NGwxJzBocy2QGKngM%3D&reserved=0 > > [2] > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fparttimenerd%2Fjdk%2Fblob%2Fparttimenerd_asgct2%2Fsrc%2Fhotspot%2Fshare%2Fprims%2Fasgct2.cpp****&data=04%7C01%7Cjohannes.bechberger%40sap.com%7Ceb5ca380974b40d8ad1908da0ac2c949%7C42f7676cf455423c82f6dc2d99791af7%7C0%7C0%7C637834128980129964%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=CVP%2BZFYiz%2Brmvn4RSFmYxttL6SttV9AxSLgWnGoBbXY%3D&reserved=0 > > [3] > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.oracle.com%2Fjavase%2F8%2Fdocs%2Fplatform%2Fjvmti%2Fjvmti.html%23CompiledMethodLoad&data=04%7C01%7Cjohannes.bechberger%40sap.com%7Ceb5ca380974b40d8ad1908da0ac2c949%7C42f7676cf455423c82f6dc2d99791af7%7C0%7C0%7C637834128980129964%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=UmILa1T1BI4SKK3k%2BTYnCLE7dTwSL6jmUVvTMQIdA3E%3D&reserved=0 > > [4] > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fparttimenerd%2Fjdk%2Fblob%2Fparttimenerd_asgct2%2Fsrc%2Fhotspot%2Fshare%2Fprims%2Fasgct2.hpp&data=04%7C01%7Cjohannes.bechberger%40sap.com%7Ceb5ca380974b40d8ad1908da0ac2c949%7C42f7676cf455423c82f6dc2d99791af7%7C0%7C0%7C637834128980129964%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=ROyKJ2ADjrQvVIoLBxwBdKm3AWAdQ2Gk4m6f7He5mNI%3D&reserved=0 >