Re: Proposal of a new version of AsyncGetCallTrace

2022-03-21 Thread Bechberger, Johannes
Hi David,

the src/hotspot/share/prims/stackwalk.cpp has been made for another purpose: It 
is used in java.lang.StackWalker to obtain the Java frames and allocates memory 
on the heap.
It is not used in the places that my proposed Stackwalker could be used: in 
profiling and error stack traces (hs_err file), where memory allocation might 
cause problems. The class in prims also lacks the ability to obtain the native 
frames.

Best regards
Johannes

From: David Holmes 
Date: Sunday, 20. March 2022 at 23:41
To: Bechberger, Johannes , 
hotspot-...@openjdk.java.net , 
hotspot-jfr-...@openjdk.java.net , 
serviceability-dev@openjdk.java.net 
Subject: Re: Proposal of a new version of AsyncGetCallTrace
Hi Johannes,

On 18/03/2022 7:43 pm, Bechberger, Johannes wrote:
> Hi,
>
> I would like propose to
>
> 1. Replace duplicated stack walking code with unified API
> 2. Create a new version of AsyncGetCallTrace, tentatively called 
> "AsyncGetCallTrace2", with more information on more frames using the unified 
> API
>
> A demo (as well as this text) is available at 
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fparttimenerd%2Fasgct2-demodata=04%7C01%7Cjohannes.bechberger%40sap.com%7Ceb5ca380974b40d8ad1908da0ac2c949%7C42f7676cf455423c82f6dc2d99791af7%7C0%7C0%7C637834128980129964%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=WzQw1M5pnBdK5PxtohnURFAJeSJOZy1ZAuRGSaczjOU%3Dreserved=0
> if you want to see a prototype of this proposal in action.
>
> Unify Stack Walking
> 
>
> There are currently multiple implementations of stack walking in JFR and for 
> AsyncGetCallTrace.
> They each implement their own extension of vframeStream but with comparable 
> features
> and check for problematic frames.
>
> My proposal is, therefore, to replace the stack walking code with a unified 
> API that
> includes all error checking and vframeStream extensions in a single place.
> The prosposed new class is called StackWalker and could be part of
> `jfr/recorder/stacktrace` [1].

So we already have the StackWalker API provided at the Java level and
with the implementation in the VM in
src/hotspot/share/prims/stackwalk.cpp. How does that fit in with what
you propose?

Cheers,
David
-

> This class also supports getting information on C frames so it can be 
> potentially
> used for walking stacks in VMError (used to create hs_err files), further
> reducing the amount of different stack walking code.
>
> AsyncGetCallTrace2
> 
>
> The AsyncGetCallTrace call has seen increasing use in recent years
> in profilers like async-profiler.
> But it is not really an API (not exported in any header) and
> the information on frames it returns is pretty limited
> (only the method and bci for Java frames) which makes implementing
> profilers and other tooling harder. Tools like async-profiler
> have to resort to complicated code to partially obtain the information
> that the JVM already has.
> Information that is currently hidden and impossible to obtain is
>
> - whether a compiled frame is inlined (currently only obtainable for the 
> topmost compiled frames)
>-  although this can be obtained using JFR
> - C frames that are not at the top of the stack
> - compilation level (C1 or C2 compiled)
>
> This information is helpful when profiling and tuning the VM for
> a given application and also for profiling code that uses
> JNI heavily.
>
> Using the proposed StackWalker class, implementing a new API
> that returns more information on frames is possible
> as a thin wrapper over the StackWalker API [2].
> This also improves the maintainability as the code used
> in this API is used in multiple places and is therefore
> also better tested than the previous implementation, see
> [1] for the implementation.
>
> The following describes the proposed API:
>
> ```cpp
> void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext);
> ```
>
> The structure of `CallTrace` is the same as the original
> `ASGCT_CallTrace` with the same error codes encoded in <= 0
> values of `num_frames`.
>
> ```cpp
> typedef struct {
>JNIEnv *env_id;   // Env where trace was recorded
>jint num_frames;  // number of frames in this trace
>CallFrame *frames;// frames
>void* frame_info; // more information on frames
> } CallTrace;
> ```
>
> The only difference is that the `frames` array also contains
> information on C frames and the field `frame_info`.
> The `frame_info` is currently null and can later be used
> for extended information on each frame, being an array with
> an element for eac

Re: Proposal of a new version of AsyncGetCallTrace

2022-03-20 Thread David Holmes

Hi Johannes,

On 18/03/2022 7:43 pm, Bechberger, Johannes wrote:

Hi,

I would like propose to

1. Replace duplicated stack walking code with unified API
2. Create a new version of AsyncGetCallTrace, tentatively called 
"AsyncGetCallTrace2", with more information on more frames using the unified API

A demo (as well as this text) is available at 
https://github.com/parttimenerd/asgct2-demo
if you want to see a prototype of this proposal in action.

Unify Stack Walking


There are currently multiple implementations of stack walking in JFR and for 
AsyncGetCallTrace.
They each implement their own extension of vframeStream but with comparable 
features
and check for problematic frames.

My proposal is, therefore, to replace the stack walking code with a unified API 
that
includes all error checking and vframeStream extensions in a single place.
The prosposed new class is called StackWalker and could be part of
`jfr/recorder/stacktrace` [1].


So we already have the StackWalker API provided at the Java level and 
with the implementation in the VM in 
src/hotspot/share/prims/stackwalk.cpp. How does that fit in with what 
you propose?


Cheers,
David
-


This class also supports getting information on C frames so it can be 
potentially
used for walking stacks in VMError (used to create hs_err files), further
reducing the amount of different stack walking code.

AsyncGetCallTrace2


The AsyncGetCallTrace call has seen increasing use in recent years
in profilers like async-profiler.
But it is not really an API (not exported in any header) and
the information on frames it returns is pretty limited
(only the method and bci for Java frames) which makes implementing
profilers and other tooling harder. Tools like async-profiler
have to resort to complicated code to partially obtain the information
that the JVM already has.
Information that is currently hidden and impossible to obtain is

- whether a compiled frame is inlined (currently only obtainable for the 
topmost compiled frames)
   -  although this can be obtained using JFR
- C frames that are not at the top of the stack
- compilation level (C1 or C2 compiled)

This information is helpful when profiling and tuning the VM for
a given application and also for profiling code that uses
JNI heavily.

Using the proposed StackWalker class, implementing a new API
that returns more information on frames is possible
as a thin wrapper over the StackWalker API [2].
This also improves the maintainability as the code used
in this API is used in multiple places and is therefore
also better tested than the previous implementation, see
[1] for the implementation.

The following describes the proposed API:

```cpp
void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext);
```

The structure of `CallTrace` is the same as the original
`ASGCT_CallTrace` with the same error codes encoded in <= 0
values of `num_frames`.

```cpp
typedef struct {
   JNIEnv *env_id;   // Env where trace was recorded
   jint num_frames;  // number of frames in this trace
   CallFrame *frames;// frames
   void* frame_info; // more information on frames
} CallTrace;
```

The only difference is that the `frames` array also contains
information on C frames and the field `frame_info`.
The `frame_info` is currently null and can later be used
for extended information on each frame, being an array with
an element for each frame. But the type of the
elements in this array is implementation specific.
This akin to `compile_info` field in JVMTI's CompiledMethodLoad
[3] and used for extending the information returned by the
API later.

Protoype


Currently `CallFrame` is implemented in the prototype [4] as

```cpp
typedef struct {
   void *machine_pc;   // program counter, for C and native frames 
(frames of native methods)
   uint8_t type;   // frame type (single byte)
   uint8_t comp_level; // highest compilation level of a method related 
to a Java frame
   // information from original CallFrame
   jint bci;   // bci for Java frames
   jmethodID method_id;// method ID for Java frames
} CallFrame;
```

The `FrameTypeId` is based on the frame type in JFRStackFrame:

```cpp
enum FrameTypeId {
   FRAME_INTERPRETED = 0,
   FRAME_JIT = 1, // JIT compiled
   FRAME_INLINE  = 2, // inlined JITed methods
   FRAME_NATIVE  = 3, // native wrapper to call C methods from Java
   FRAME_CPP = 4  // c/c++/... frames, stub frames have CompLevel_all
};
```

The `comp_level` states the compilation level of the method related to the frame
with higher numbers representing "more" compilation. `0` is defined as
interpreted. It is modeled after the `CompLevel` enum in 
`compiler/compilerDefinitions`:

```cpp
// Enumeration to distinguish tiers of compilation
enum CompLevel {
   // ...
   CompLevel_none  = 0, // 

Re: Proposal of a new version of AsyncGetCallTrace

2022-03-18 Thread Remi Forax
Knowing if there is a C stackframe in the middle of the stack while blocking on 
a synchronized is an important feature for a profiler when loom will land.

RĂ©mi

- Original Message -
> From: "Bechberger, Johannes" 
> To: "hotspot-dev" , 
> hotspot-jfr-...@openjdk.java.net, "serviceability-dev"
> 
> Sent: Friday, March 18, 2022 10:43:58 AM
> Subject: Proposal of a new version of AsyncGetCallTrace

> Hi,
> 
> I would like propose to
> 
> 1. Replace duplicated stack walking code with unified API
> 2. Create a new version of AsyncGetCallTrace, tentatively called
> "AsyncGetCallTrace2", with more information on more frames using the unified
> API
> 
> A demo (as well as this text) is available at
> https://github.com/parttimenerd/asgct2-demo
> if you want to see a prototype of this proposal in action.
> 
> Unify Stack Walking
> 
> 
> There are currently multiple implementations of stack walking in JFR and for
> AsyncGetCallTrace.
> They each implement their own extension of vframeStream but with comparable
> features
> and check for problematic frames.
> 
> My proposal is, therefore, to replace the stack walking code with a unified 
> API
> that
> includes all error checking and vframeStream extensions in a single place.
> The prosposed new class is called StackWalker and could be part of
> `jfr/recorder/stacktrace` [1].
> This class also supports getting information on C frames so it can be
> potentially
> used for walking stacks in VMError (used to create hs_err files), further
> reducing the amount of different stack walking code.
> 
> AsyncGetCallTrace2
> 
> 
> The AsyncGetCallTrace call has seen increasing use in recent years
> in profilers like async-profiler.
> But it is not really an API (not exported in any header) and
> the information on frames it returns is pretty limited
> (only the method and bci for Java frames) which makes implementing
> profilers and other tooling harder. Tools like async-profiler
> have to resort to complicated code to partially obtain the information
> that the JVM already has.
> Information that is currently hidden and impossible to obtain is
> 
> - whether a compiled frame is inlined (currently only obtainable for the 
> topmost
> compiled frames)
>  -  although this can be obtained using JFR
> - C frames that are not at the top of the stack
> - compilation level (C1 or C2 compiled)
> 
> This information is helpful when profiling and tuning the VM for
> a given application and also for profiling code that uses
> JNI heavily.
> 
> Using the proposed StackWalker class, implementing a new API
> that returns more information on frames is possible
> as a thin wrapper over the StackWalker API [2].
> This also improves the maintainability as the code used
> in this API is used in multiple places and is therefore
> also better tested than the previous implementation, see
> [1] for the implementation.
> 
> The following describes the proposed API:
> 
> ```cpp
> void AsyncGetCallTrace2(asgct2::CallTrace *trace, jint depth, void* ucontext);
> ```
> 
> The structure of `CallTrace` is the same as the original
> `ASGCT_CallTrace` with the same error codes encoded in <= 0
> values of `num_frames`.
> 
> ```cpp
> typedef struct {
>  JNIEnv *env_id;   // Env where trace was recorded
>  jint num_frames;  // number of frames in this trace
>  CallFrame *frames;// frames
>  void* frame_info; // more information on frames
> } CallTrace;
> ```
> 
> The only difference is that the `frames` array also contains
> information on C frames and the field `frame_info`.
> The `frame_info` is currently null and can later be used
> for extended information on each frame, being an array with
> an element for each frame. But the type of the
> elements in this array is implementation specific.
> This akin to `compile_info` field in JVMTI's CompiledMethodLoad
> [3] and used for extending the information returned by the
> API later.
> 
> Protoype
> 
> 
> Currently `CallFrame` is implemented in the prototype [4] as
> 
> ```cpp
> typedef struct {
>  void *machine_pc;   // program counter, for C and native frames 
> (frames
>  of native methods)
>  uint8_t type;   // frame type (single byte)
>  uint8_t comp_level; // highest compilation level of a method related 
> to
>  a Java frame
>  // information from original CallFrame
>  jint bci;   // bci for Java frames
>  jmethodID method_id;// method ID for Java frames
> } CallFrame;
> ```
> 
> The `FrameTypeId` is based on the frame type in JFRStackFrame:
> 
> ```cpp
> enum FrameTypeId {
>  FRAME_INTERPRETED = 0,
>  FRAME_JIT = 1, // JIT compiled
>  FRAME_INLINE  = 2, // inlined JITed methods
>  FRAME_NATIVE  = 3, // native wrapper to call C methods from Java
>  FRAME_CPP = 4  // c/c++/... frames, stub frames have CompLevel_all
> };
> ```
> 
> The `comp_level` states the