Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-11-27 Thread Štefan Miklošovič
Hey,

just a heads-up that I have spent some time moving the work on
bringing async-profiler to Cassandra to a more production-ready state.
I continued from the original code by Yaman and Bernardo, who resolved
the comments.

For anybody interested here is the branch

https://github.com/apache/cassandra/pull/4487

and here is the usage:

https://github.com/apache/cassandra/pull/4487/files#diff-374bb067d80842cd2ebd903b290ef247c0e66ee6ad8235eeb0c567ff6af28431

It would be great to gather more feedback on the UX here. Currently it
is possible to do the following via nodetool profile:

- start / stop - start or stop profiling
- list - get the list of resulting profile files on a node
- purge - remove results
- fetch - transfer results from the node to the local machine

The reason for this design is that nodetool does not necessarily have
to be executed on the same machine Cassandra runs on. That means that
if we profile Cassandra remotely (calling respective MBean), then
results (flamegraphs as html etc) would be stored on that node as
well, but then a user does not necessarily have the access to that
disk to actually see the results.
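
To make the list / fetch / purge split concrete, here is a minimal Java sketch of the node-side bookkeeping, assuming results land as plain files in a single directory. The class name `ProfileResultsSketch`, the `store` helper and the directory layout are hypothetical and are not taken from the PR; a remote caller would reach operations like these through the MBean rather than directly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical node-side view of the directory where profile results are written.
class ProfileResultsSketch {
    private final Path directory;

    ProfileResultsSketch(Path directory) {
        this.directory = directory.toAbsolutePath().normalize();
    }

    // Stands in for the profiler writing a result file on the node.
    void store(String name, byte[] content) {
        try {
            Files.write(resolve(name), content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "list": names of the result files currently on the node, sorted.
    List<String> list() {
        try (Stream<Path> entries = Files.list(directory)) {
            return entries.filter(Files::isRegularFile)
                          .map(p -> p.getFileName().toString())
                          .sorted()
                          .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "fetch": the bytes a remote caller would receive for one result.
    byte[] fetch(String name) {
        try {
            return Files.readAllBytes(resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "purge": delete every result file and report how many were removed.
    int purge() {
        List<String> names = list();
        for (String name : names) {
            try {
                Files.delete(resolve(name));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return names.size();
    }

    // Reject names that would escape the results directory (for example "../x").
    private Path resolve(String name) {
        Path resolved = directory.resolve(name).normalize();
        if (resolved.equals(directory) || !resolved.startsWith(directory)) {
            throw new IllegalArgumentException("Not a result file name: " + name);
        }
        return resolved;
    }
}

class ProfileResultsDemo {
    public static void main(String[] args) throws IOException {
        ProfileResultsSketch results = new ProfileResultsSketch(Files.createTempDirectory("profiles"));
        results.store("cpu.html", "<html></html>".getBytes());
        System.out.println(results.list());                   // names a remote caller could request
        System.out.println(results.fetch("cpu.html").length); // size of the payload to transfer
        System.out.println(results.purge());                  // number of files removed
    }
}
```

The name check in `resolve` matters because the caller is remote: a fetch request should only ever be able to read files inside the results directory.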

Also, as with any other new feature, the profiling capability is
disabled by default (although it can be enabled at runtime if one
wants, via a JMX method only).

One interesting feature to add would be converting the JFR (Java
Flight Recorder) files that async-profiler produces into an output
which also shows "heatmaps". This is a little more involved, and I am
not sure it has to be part of the initial implementation, but it would
be super cool if we could profile out of the box and get heatmaps
automatically too.

I am personally very stoked about having async-profiler in Cassandra.
The experience in e.g. IDEA is also great as there is a very nice
plugin to visualise JFR files so one can just look at them in IDEA, or
we can just export into HTML etc ...

Do you have any ideas / insights about what was done so far? I think
this is a great way to shape the solution for now. Sidecar integration
will be easy as well, as it will just call the related MBean methods.

Regards



On Thu, Jul 17, 2025 at 6:02 PM Doug Rohrer  wrote:
>
> We could build this as part of the existing work - just make the endpoint 
> take an option to allow the user to pick which one gets profiled? Or at least 
> have the API capable of it for the first iteration and then add support for 
> sidecar after the C* side is done?
>
> Doug
>
> > On Jul 16, 2025, at 6:27 PM, Francisco Guerrero  wrote:
> >
> >> Do we want to create a new JIRA ticket to have the async profiler
> >> integration in Sidecar?
> >
> > Yes, +1 for this. I think it makes sense to also have async profiling 
> > capability in Sidecar itself
> >
> > On 2025/07/16 22:22:17 Yifan Cai wrote:
> >> Late to the party.
> >>
> >> Sidecar is a JVM process that would benefit from async profiling too.
> >>
> >> Do we want to create a new JIRA ticket to have the async profiler
> >> integration in Sidecar?
> >>
> >> - Yifan
> >>
> >> On Wed, Jul 16, 2025 at 3:17 PM Yaman Ziadeh (BLOOMBERG/ 919 3RD A) <
> >> [email protected]> wrote:
> >>
> >>> Hi all, I've opened a draft PR for the C* async-profiler feature here
> >>> https://github.com/apache/cassandra/pull/4255 with some initial basic
> >>> implementation - The PR is currently incomplete, but thought I'd open it 
> >>> to
> >>> get any feedback and pointers along the way!
> >>>
> >>
>


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-07-17 Thread Doug Rohrer
We could build this as part of the existing work - just make the endpoint take 
an option to allow the user to pick which one gets profiled? Or at least have 
the API capable of it for the first iteration and then add support for sidecar 
after the C* side is done?

Doug

> On Jul 16, 2025, at 6:27 PM, Francisco Guerrero  wrote:
> 
>> Do we want to create a new JIRA ticket to have the async profiler
>> integration in Sidecar?
> 
> Yes, +1 for this. I think it makes sense to also have async profiling 
> capability in Sidecar itself
> 
> On 2025/07/16 22:22:17 Yifan Cai wrote:
>> Late to the party.
>> 
>> Sidecar is a JVM process that would benefit from async profiling too.
>> 
>> Do we want to create a new JIRA ticket to have the async profiler
>> integration in Sidecar?
>> 
>> - Yifan
>> 
>> On Wed, Jul 16, 2025 at 3:17 PM Yaman Ziadeh (BLOOMBERG/ 919 3RD A) <
>> [email protected]> wrote:
>> 
>>> Hi all, I've opened a draft PR for the C* async-profiler feature here
>>> https://github.com/apache/cassandra/pull/4255 with some initial basic
>>> implementation - The PR is currently incomplete, but thought I'd open it to
>>> get any feedback and pointers along the way!
>>> 
>> 



Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-07-16 Thread Francisco Guerrero
> Do we want to create a new JIRA ticket to have the async profiler
> integration in Sidecar?

Yes, +1 for this. I think it makes sense to also have async profiling 
capability in Sidecar itself

On 2025/07/16 22:22:17 Yifan Cai wrote:
> Late to the party.
> 
> Sidecar is a JVM process that would benefit from async profiling too.
> 
> Do we want to create a new JIRA ticket to have the async profiler
> integration in Sidecar?
> 
> - Yifan
> 
> On Wed, Jul 16, 2025 at 3:17 PM Yaman Ziadeh (BLOOMBERG/ 919 3RD A) <
> [email protected]> wrote:
> 
> > Hi all, I've opened a draft PR for the C* async-profiler feature here
> > https://github.com/apache/cassandra/pull/4255 with some initial basic
> > implementation - The PR is currently incomplete, but thought I'd open it to
> > get any feedback and pointers along the way!
> >
> 


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-07-16 Thread Yifan Cai
Late to the party.

Sidecar is a JVM process that would benefit from async profiling too.

Do we want to create a new JIRA ticket to have the async profiler
integration in Sidecar?

- Yifan

On Wed, Jul 16, 2025 at 3:17 PM Yaman Ziadeh (BLOOMBERG/ 919 3RD A) <
[email protected]> wrote:

> Hi all, I've opened a draft PR for the C* async-profiler feature here
> https://github.com/apache/cassandra/pull/4255 with some initial basic
> implementation - The PR is currently incomplete, but thought I'd open it to
> get any feedback and pointers along the way!
>


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-07-16 Thread Yaman Ziadeh (BLOOMBERG/ 919 3RD A)
Hi all, I've opened a draft PR for the C* async-profiler feature here 
https://github.com/apache/cassandra/pull/4255 with some initial basic 
implementation - The PR is currently incomplete, but thought I'd open it to get 
any feedback and pointers along the way!

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-07-09 Thread Doug Rohrer
Yaman:

Thank you for taking this on - the plan looks great to me, and I'm looking 
forward to seeing it progress.

Doug

> On Jul 2, 2025, at 3:26 PM, Yaman Ziadeh (BLOOMBERG/ 919 3RD A) 
>  wrote:
> 
> Hey everyone!
> 
> Thanks for the great discussion - I've consolidated the discussion points 
> into the following set of requirements:
> * Include the async-profiler library with Cassandra
> * Allow for easy async-profiler library version upgrades independent of
>   Cassandra
> * Expose a JMX interface to access common commands (start, stop, etc.)
> * Expose a default-disabled interface to access the 'execute' method for
>   advanced usage
> * Create a nodetool command that reaches out to this interface
> * Expose this feature through Sidecar's API
>
> Given these requirements, I'll be developing this feature with the following
> general plan in mind:
> * Drop-in async-profiler jar & native lib files into top-level cassandra repo
> * Include these files into the Cassandra build via build.xml
> * Make an AsyncProfilerService class for loading native lib file and calling
>   out to the async-profiler.jar
> * Register an MBean for the AsyncProfilerService
>   * MBean provides 3 methods: start, stop, and raw
>     * "Simple mode": start/stop will allow a common subset of flags (duration,
>       event, output format, etc.)
>     * raw will expose the execute method for free-use
>   * default disabled - calls are rejected at runtime unless a JVM flag is
>     included to enable this feature
> * Add new `nodetool profile` command
>   * `nodetool profile start --duration 30 --event cpu`
>   * `nodetool profile stop`
>   * `nodetool profile --raw ...`
> * Create Sidecar API interface to call out to C* node-specific
>   AsyncProfilerService
> * Build a unit/integration test suite for this async-profiler integration in C*
>   and Sidecar



Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-07-02 Thread Yaman Ziadeh (BLOOMBERG/ 919 3RD A)
Hey everyone!

Thanks for the great discussion - I've consolidated the discussion points into 
the following set of requirements:

* Include the async-profiler library with Cassandra
* Allow for easy async-profiler library version upgrades independent of Cassandra
* Expose a JMX interface to access common commands (start, stop, etc.)
* Expose a default-disabled interface to access the 'execute' method for
  advanced usage
* Create a nodetool command that reaches out to this interface
* Expose this feature through Sidecar's API

Given these requirements, I'll be developing this feature with the following 
general plan in mind:

* Drop-in async-profiler jar & native lib files into top-level cassandra repo
* Include these files into the Cassandra build via build.xml
* Make an AsyncProfilerService class for loading native lib file and calling out
  to the async-profiler.jar
* Register an MBean for the AsyncProfilerService
  * MBean provides 3 methods: start, stop, and raw
    * "Simple mode": start/stop will allow a common subset of flags (duration,
      event, output format, etc.)
    * raw will expose the execute method for free-use
  * default disabled - calls are rejected at runtime unless a JVM flag is included
    to enable this feature
* Add new `nodetool profile` command
  * `nodetool profile start --duration 30 --event cpu`
  * `nodetool profile stop`
  * `nodetool profile --raw ...`
* Create Sidecar API interface to call out to C* node-specific
  AsyncProfilerService
* Build a unit/integration test suite for this async-profiler integration in C*
  and Sidecar
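
The "default disabled" part of the plan can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the interface and class names, both system property names, and the choice to give raw its own separate opt-in are hypothetical, and the methods return status strings instead of calling async-profiler. A real service would additionally be registered as an MBean with the platform MBeanServer.

```java
import java.util.Set;

// Hypothetical management interface; the three operations mirror the plan above.
interface ProfilerOperations {
    String start(String event, int durationSeconds);
    String stop();
    String raw(String command);
}

class GatedProfiler implements ProfilerOperations {
    // Both property names are made up for this sketch.
    static final String ENABLED = "sketch.profiler.enabled";
    static final String RAW_ENABLED = "sketch.profiler.raw.enabled";

    private static final Set<String> SIMPLE_EVENTS = Set.of("cpu", "alloc", "wall", "lock");
    private boolean running;

    // Reject the call unless the named system property is set to true,
    // for example by starting the JVM with -D<property>=true.
    private static void requireFlag(String property) {
        if (!Boolean.getBoolean(property)) {
            throw new IllegalStateException("Rejected: start the JVM with -D" + property + "=true");
        }
    }

    @Override
    public synchronized String start(String event, int durationSeconds) {
        requireFlag(ENABLED);
        if (!SIMPLE_EVENTS.contains(event)) {
            throw new IllegalArgumentException("Unsupported event: " + event);
        }
        if (durationSeconds <= 0) {
            throw new IllegalArgumentException("Duration must be positive");
        }
        if (running) {
            throw new IllegalStateException("A profiling session is already running");
        }
        running = true; // a real service would hand the request to async-profiler here
        return "started " + event + " profiling for " + durationSeconds + "s";
    }

    @Override
    public synchronized String stop() {
        requireFlag(ENABLED);
        if (!running) {
            throw new IllegalStateException("No profiling session is running");
        }
        running = false;
        return "stopped";
    }

    @Override
    public String raw(String command) {
        requireFlag(ENABLED);
        requireFlag(RAW_ENABLED); // assumption: the escape hatch needs its own opt-in
        return "would execute: " + command;
    }
}

class GatedProfilerDemo {
    public static void main(String[] args) {
        GatedProfiler profiler = new GatedProfiler();
        try {
            profiler.start("cpu", 30);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // rejected while the flag is unset
        }
        System.setProperty(GatedProfiler.ENABLED, "true"); // stands in for the JVM flag
        System.out.println(profiler.start("cpu", 30));
        System.out.println(profiler.stop());
    }
}
```

Checking the flag inside every operation, rather than once at registration time, keeps the MBean visible to tooling while still rejecting calls on nodes that have not opted in.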

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-23 Thread Jon Haddad
easy-cass-lab has a few shell functions that I use often; they're defined
as c-flame* [1].

The arguments I've found most useful have been -X for excluding Parked
threads, -I for narrowing scope to particular callstacks (compaction), -e
for switching between allocation / cpu / wall profiling, -o for different
formats, but when I look at the profiler [2] options, I find I've used
almost all of them at one point or another...

I haven't looked at the full history of the async profiler project, so I
might be mistaken, but I can't remember a time where a change was made that
wasn't backwards compatible.  I have a hard time thinking of how or why
they'd opt to do that.  For example, would they ever remove -X, which
allows you to remove frames that match a regex?  Seems unlikely.  Every
option in there that I've used has been critical in some form or another in
doing a performance analysis.  I am deeply skeptical that we'd ever
actually encounter this problem.

I do appreciate the forward thinking here but I want to just caution
against putting a solution in place to solve a theoretical problem that
might never exist, and have that solution introduce problems of its
own.  For a concrete example:

- Let's say we had started including the profiler with C* 5.0.  At the
time of release, v4 of asprof wasn't available.
- V4 is released with an awesome new feature, continuous profiling.
- Now I want to upgrade asprof and drop in a new jar with C* 5.0
- Without the escape hatch, now I have to go back to maintaining my own
tools

TL;DR: I think it's important that we support users being able to upgrade
asprof independently from C*.

To the subject of disabling it by default, I guess I'm -0 on that right
now, but that's not an opinion I hold strongly, and if you think there's a
good case for it, I'm not going to spend any time trying to convince you
otherwise :)

Jon

[1]
https://github.com/rustyrazorblade/easy-cass-lab/blob/5d4874bbdbaadcf6e33651e19d8332c8c9383961/src/main/resources/com/rustyrazorblade/easycasslab/configuration/env.sh#L46


[2]
https://github.com/async-profiler/async-profiler/blob/master/docs/ProfilerOptions.md


On Mon, Jun 23, 2025 at 8:11 AM Doug Rohrer  wrote:

> A few thoughts here:
>
> 1) Run-time configuration (or even compile-time inclusion/exclusion?) that
> allows you to enable/disable the “raw” mode in both Cassandra and Sidecar
> would be a reasonable middle-ground here. I’m not crazy about exposing it
> by default though, so I’d at a minimum have it default to disabled.
> 2) Different APIs exposed via `nodetool` (where you generally already have
> node-local access) and Sidecar, with different levels of complexity.
> 3) The more input we get from “users who actually use the profiler today,”
> the better the “safe” API can be, so maybe you won’t (generally) need to
> deploy with the raw endpoint enabled. To Jon specifically,  do you have in
> easy-cass-lab or anywhere else examples of how you’re using the profiler
> today that we could use to help guide the API design? I know you’ve got
> plenty to do so if there’s something we can dig into without requiring you
> to do it yourself I’d be happy to try to dig out requirements from there.
>
>
> Doug
>
>
> On Jun 22, 2025, at 8:10 PM, Josh McKenzie  wrote:
>
> If sidecar wishes to expose exec and take the fact that this API could
> break on it, I am +0 to that.  I mostly am trying to highlight the risk
>
> Trying to disambiguate here.
>
> "This API": are we referring to our friendly simple exposed subset? Or are we
> referring to "you passed --raw and whatever is parsing that could drift."
>
> The former we have control over. The latter not so much.
>
> I'm +0 to taking on (and breaking) the latter; we either allow power users
> to pass arg strings directly and stay frozen if the API in the profiler
> changes, or we just rev the profile dep as needed and let power users eat
> the re-architecting costs. In my head: they're power users. They can update
> their profiler... profiles... locally; not so big a burden.
>
> On Sun, Jun 22, 2025, at 3:40 PM, David Capwell wrote:
>
>  it sounds like you’re saying users who actually use the profiler today
> are SOL and need to roll their own solution.
>
>
> No, I am saying it’s good to have sidecar expose this and expose common
> patterns that people actually use.
>
> If sidecar wishes to expose exec and take the fact that this API could
> break on it, I am +0 to that.  I mostly am trying to highlight the risk
>
> On Jun 20, 2025, at 2:54 PM, Jon Haddad  wrote:
>
> Well, the discussion is about sidecar doing it. it sounds like you’re
> saying users who actually use the profiler today are SOL and need to roll
> their own solution.
>
>
> On Fri, Jun 20, 2025 at 10:24 AM David Capwell  wrote:
>
> However, for folks like me that know the command line options and
> regularly do things that you might not have planned out, I'd appreciate an
> escape hatch where I can pass my raw commands
>
>
> For more “advanced” users,

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-23 Thread Doug Rohrer
A few thoughts here:

1) Run-time configuration (or even compile-time inclusion/exclusion?) that 
allows you to enable/disable the “raw” mode in both Cassandra and Sidecar would 
be a reasonable middle-ground here. I’m not crazy about exposing it by default 
though, so I’d at a minimum have it default to disabled.
2) Different APIs exposed via `nodetool` (where you generally already have 
node-local access) and Sidecar, with different levels of complexity. 
3) The more input we get from “users who actually use the profiler today,” the 
better the “safe” API can be, so maybe you won’t (generally) need to deploy 
with the raw endpoint enabled. To Jon specifically,  do you have in 
easy-cass-lab or anywhere else examples of how you’re using the profiler today 
that we could use to help guide the API design? I know you’ve got plenty to do 
so if there’s something we can dig into without requiring you to do it yourself 
I’d be happy to try to dig out requirements from there.


Doug


> On Jun 22, 2025, at 8:10 PM, Josh McKenzie  wrote:
> 
>> If sidecar wishes to expose exec and take the fact that this API could break 
>> on it, I am +0 to that.  I mostly am trying to highlight the risk
> Trying to disambiguate here.
> 
> "This API": are we referring to our friendly simple exposed subset? Or are we 
> referring to "you passed --raw and whatever is parsing that could drift."
> 
> The former we have control over. The latter not so much.
> 
> I'm +0 to taking on (and breaking) the latter; we either allow power users to 
> pass arg strings directly and stay frozen if the API in the profiler changes, 
> or we just rev the profile dep as needed and let power users eat the 
> re-architecting costs. In my head: they're power users. They can update their 
> profiler... profiles... locally; not so big a burden.
> 
> On Sun, Jun 22, 2025, at 3:40 PM, David Capwell wrote:
>>>  it sounds like you’re saying users who actually use the profiler today are 
>>> SOL and need to roll their own solution.
>> 
>> No, I am saying it’s good to have sidecar expose this and expose common 
>> patterns that people actually use. 
>> 
>> If sidecar wishes to expose exec and take the fact that this API could break 
>> on it, I am +0 to that.  I mostly am trying to highlight the risk
>> 
>>> On Jun 20, 2025, at 2:54 PM, Jon Haddad  wrote:
>>> 
>>> Well, the discussion is about sidecar doing it. it sounds like you’re 
>>> saying users who actually use the profiler today are SOL and need to roll 
>>> their own solution. 
>>> 
>>> 
>>> On Fri, Jun 20, 2025 at 10:24 AM David Capwell wrote:
>>>> However, for folks like me that know the command line options and
>>>> regularly do things that you might not have planned out, I'd appreciate an
>>>> escape hatch where I can pass my raw commands
>>>
>>> For more “advanced” users, normal profile.sh would still be able to
>>> profile, just requires more steps.
>>>
>>>> I think supporting both an abstraction-layer bound "simple mode" and a
>>>> "--raw for experts" is the way to go.
>>>
>>> How do we say “this API has 0 compatibility for C* and can break w/e”?
>>>
>>>> On Jun 20, 2025, at 5:22 AM, Josh McKenzie wrote:
>>>>
>>>> I think supporting both an abstraction-layer bound "simple mode" and a
>>>> "--raw for experts" is the way to go.
>>>>
>>>> On Thu, Jun 19, 2025, at 1:23 PM, Jon Haddad wrote:
>>>>> I understand the motivation to decouple the command line configuration
>>>>> from what we expose to end users, and to an extent, agree with the
>>>>> reasoning.  However, for folks like me that know the command line options
>>>>> and regularly do things that you might not have planned out, I'd
>>>>> appreciate an escape hatch where I can pass my raw commands.  Whatever
>>>>> you end up implementing, there's almost certainly commands that
>>>>> experienced async-profiler folks will want to use that weren't planned
>>>>> for.
>>>>>
>>>>> I am also not particularly interested in learning another syntax only to
>>>>> have it transformed into the thing I want to use.  I expect that would be
>>>>> a fairly simple flag (nodetool profile --raw xyz) that would skip the
>>>>> parse logic, so hopefully it's not too much trouble to add.  Reverse
>>>>> engineering the async profiler syntax into the thing we decide to use is,
>>>>> at least for me, will be a source of frustration.
>>>>>
>>>>> Thanks,
>>>>> Jon
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky wrote:
>>>>> Another vote in favor of including async-profiler as a library in C*. The
>>>>> new heatmap format in async-profiler 4.0[1] is excellent and makes
>>>>> long-running profiles miles more useful than a plain flamegraph, but it
>>>>> requires a post-processing step after a JFR is collected, which would
>>>>> require a dependency on jfr-converter.jar[2]. Exposing the JFR files
>>>>> directly would be reasonable but sli

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-23 Thread David Capwell
>  it sounds like you’re saying users who actually use the profiler today are 
> SOL and need to roll their own solution.

No, I am saying it’s good to have sidecar expose this and expose common 
patterns that people actually use. 

If sidecar wishes to expose exec and take the fact that this API could break on 
it, I am +0 to that.  I mostly am trying to highlight the risk

> On Jun 20, 2025, at 2:54 PM, Jon Haddad  wrote:
> 
> Well, the discussion is about sidecar doing it. it sounds like you’re saying 
> users who actually use the profiler today are SOL and need to roll their own 
> solution. 
> 
> 
> On Fri, Jun 20, 2025 at 10:24 AM David Capwell wrote:
>>> However, for folks like me that know the command line options and regularly 
>>> do things that you might not have planned out, I'd appreciate an escape 
>>> hatch where I can pass my raw commands
>> 
>> For more “advanced” users, normal profile.sh would still be able to profile, 
>> just requires more steps.
>> 
>>> I think supporting both an abstraction-layer bound "simple mode" and a 
>>> "--raw for experts" is the way to go.
>> 
>> How do we say “this API has 0 compatibility for C* and can break w/e”? 
>> 
>>> On Jun 20, 2025, at 5:22 AM, Josh McKenzie wrote:
>>> 
>>> I think supporting both an abstraction-layer bound "simple mode" and a 
>>> "--raw for experts" is the way to go.
>>> 
>>> On Thu, Jun 19, 2025, at 1:23 PM, Jon Haddad wrote:
>>>> I understand the motivation to decouple the command line configuration
>>>> from what we expose to end users, and to an extent, agree with the
>>>> reasoning.  However, for folks like me that know the command line options
>>>> and regularly do things that you might not have planned out, I'd
>>>> appreciate an escape hatch where I can pass my raw commands.  Whatever you
>>>> end up implementing, there's almost certainly commands that experienced
>>>> async-profiler folks will want to use that weren't planned for.
>>>>
>>>> I am also not particularly interested in learning another syntax only to
>>>> have it transformed into the thing I want to use.  I expect that would be
>>>> a fairly simple flag (nodetool profile --raw xyz) that would skip the
>>>> parse logic, so hopefully it's not too much trouble to add.  Reverse
>>>> engineering the async profiler syntax into the thing we decide to use is,
>>>> at least for me, will be a source of frustration.
>>>>
>>>> Thanks,
>>>> Jon
>>>>
>>>>
>>>>
>>>> On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky wrote:
>>>> Another vote in favor of including async-profiler as a library in C*. The
>>>> new heatmap format in async-profiler 4.0[1] is excellent and makes
>>>> long-running profiles miles more useful than a plain flamegraph, but it
>>>> requires a post-processing step after a JFR is collected, which would
>>>> require a dependency on jfr-converter.jar[2]. Exposing the JFR files
>>>> directly would be reasonable but slightly less useful, and the
>>>> post-processed heatmap HTML files are much smaller and self-contained. A
>>>> recent example on my machine shows HTML at 1/20th the size of the raw JFR
>>>> dump, which is meaningful especially for uploading to Jira.
>>>>
>>>> Note that JDK25 will have experimental support for better CPU
>>>> profiling[3], but async-profiler is still more mature and featureful,
>>>> especially for other profiling types (wall, alloc).
>>>>
>>>> [1]:
>>>> https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
>>>> [2]:
>>>> https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
>>>> [3]:
>>>> https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/
>>>>
>>>>
>> 



Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-22 Thread Josh McKenzie
> If sidecar wishes to expose exec and take the fact that this API could break 
> on it, I am +0 to that.  I mostly am trying to highlight the risk

Trying to disambiguate here.

"This API": are we referring to our friendly simple exposed subset? Or are we 
referring to "you passed --raw and whatever is parsing that could drift."

The former we have control over. The latter not so much.

I'm +0 to taking on (and breaking) the latter; we either allow power users to 
pass arg strings directly and stay frozen if the API in the profiler changes, 
or we just rev the profile dep as needed and let power users eat the 
re-architecting costs. In my head: they're power users. They can update their 
profiler... profiles... locally; not so big a burden.
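
The two surfaces being contrasted here can be sketched in Java as follows. The class and method names are hypothetical. The generated string is modelled on async-profiler's comma-separated agent arguments (`event=`, `timeout=`, `file=`), but the exact option spellings are an assumption here and should be checked against the profiler options documentation before relying on them.

```java
import java.util.Set;

// Hypothetical translation layer: "simple mode" is a validated subset we control,
// while raw mode hands the caller's string through without interpreting it.
class ProfileCommandSketch {
    private static final Set<String> SIMPLE_EVENTS = Set.of("cpu", "alloc", "wall");

    // Simple mode: validate each option, then build the profiler command ourselves.
    // If the profiler's syntax drifts, only this method has to change.
    static String simple(String event, int durationSeconds, String outputFile) {
        if (!SIMPLE_EVENTS.contains(event)) {
            throw new IllegalArgumentException("Unsupported event: " + event);
        }
        if (durationSeconds <= 0) {
            throw new IllegalArgumentException("Duration must be positive");
        }
        if (outputFile.isEmpty() || outputFile.contains(",")) {
            // A comma would be read as an option separator in the command string.
            throw new IllegalArgumentException("Invalid output file: " + outputFile);
        }
        // Assumed option spellings, modelled on async-profiler agent arguments.
        return "start,event=" + event + ",timeout=" + durationSeconds + ",file=" + outputFile;
    }

    // Raw mode: no parsing and no compatibility promise; the string goes through as-is.
    static String raw(String command) {
        if (command == null || command.trim().isEmpty()) {
            throw new IllegalArgumentException("Empty raw command");
        }
        return command;
    }
}

class ProfileCommandDemo {
    public static void main(String[] args) {
        System.out.println(ProfileCommandSketch.simple("cpu", 30, "profile.html"));
        System.out.println(ProfileCommandSketch.raw("start,event=wall,interval=10ms"));
    }
}
```

In this shape the compatibility question only applies to `raw`: whatever the caller passes is interpreted by whichever profiler version is installed, which is the drift risk described above.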

On Sun, Jun 22, 2025, at 3:40 PM, David Capwell wrote:
>>  it sounds like you’re saying users who actually use the profiler today are 
>> SOL and need to roll their own solution.
> 
> No, I am saying it’s good to have sidecar expose this and expose common 
> patterns that people actually use. 
> 
> If sidecar wishes to expose exec and take the fact that this API could break 
> on it, I am +0 to that.  I mostly am trying to highlight the risk
> 
>> On Jun 20, 2025, at 2:54 PM, Jon Haddad  wrote:
>> 
>> Well, the discussion is about sidecar doing it. it sounds like you’re saying 
>> users who actually use the profiler today are SOL and need to roll their own 
>> solution. 
>> 
>> 
>> On Fri, Jun 20, 2025 at 10:24 AM David Capwell  wrote:
>>>> However, for folks like me that know the command line options and
>>>> regularly do things that you might not have planned out, I'd appreciate an
>>>> escape hatch where I can pass my raw commands
>>>
>>> For more “advanced” users, normal profile.sh would still be able to
>>> profile, just requires more steps.
>>>
>>>> I think supporting both an abstraction-layer bound "simple mode" and a
>>>> "--raw for experts" is the way to go.
>>>
>>> How do we say “this API has 0 compatibility for C* and can break w/e”?
>>>
>>>> On Jun 20, 2025, at 5:22 AM, Josh McKenzie wrote:
>>>>
>>>> I think supporting both an abstraction-layer bound "simple mode" and a
>>>> "--raw for experts" is the way to go.
>>>>
>>>> On Thu, Jun 19, 2025, at 1:23 PM, Jon Haddad wrote:
>>>>> I understand the motivation to decouple the command line configuration
>>>>> from what we expose to end users, and to an extent, agree with the
>>>>> reasoning.  However, for folks like me that know the command line options
>>>>> and regularly do things that you might not have planned out, I'd
>>>>> appreciate an escape hatch where I can pass my raw commands.  Whatever
>>>>> you end up implementing, there's almost certainly commands that
>>>>> experienced async-profiler folks will want to use that weren't planned
>>>>> for.
>>>>>
>>>>> I am also not particularly interested in learning another syntax only to
>>>>> have it transformed into the thing I want to use.  I expect that would be
>>>>> a fairly simple flag (nodetool profile --raw xyz) that would skip the
>>>>> parse logic, so hopefully it's not too much trouble to add.  Reverse
>>>>> engineering the async profiler syntax into the thing we decide to use is,
>>>>> at least for me, will be a source of frustration.
>>>>>
>>>>> Thanks,
>>>>> Jon
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky wrote:
>>>>>> Another vote in favor of including async-profiler as a library in C*.
>>>>>> The new heatmap format in async-profiler 4.0[1] is excellent and makes
>>>>>> long-running profiles miles more useful than a plain flamegraph, but it
>>>>>> requires a post-processing step after a JFR is collected, which would
>>>>>> require a dependency on jfr-converter.jar[2]. Exposing the JFR files
>>>>>> directly would be reasonable but slightly less useful, and the
>>>>>> post-processed heatmap HTML files are much smaller and self-contained. A
>>>>>> recent example on my machine shows HTML at 1/20th the size of the raw
>>>>>> JFR dump, which is meaningful especially for uploading to Jira.
>>>>>>
>>>>>> Note that JDK25 will have experimental support for better CPU
>>>>>> profiling[3], but async-profiler is still more mature and featureful,
>>>>>> especially for other profiling types (wall, alloc).
>>>>>>
>>>>>> [1]:
>>>>>> https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
>>>>>> [2]:
>>>>>> https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
>>>>>> [3]:
>>>>>> https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/
 


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-20 Thread Jon Haddad
Well, the discussion is about sidecar doing it. It sounds like you’re
saying users who actually use the profiler today are SOL and need to roll
their own solution.


On Fri, Jun 20, 2025 at 10:24 AM David Capwell  wrote:

> However, for folks like me that know the command line options and
> regularly do things that you might not have planned out, I'd appreciate an
> escape hatch where I can pass my raw commands
>
>
> For more “advanced” users, normal profile.sh would still be able to
> profile, just requires more steps.
>
> I think supporting both an abstraction-layer bound "simple mode" and a
> "--raw for experts" is the way to go.
>
>
> How do we say “this API has 0 compatibility for C* and can break w/e”?
>
> On Jun 20, 2025, at 5:22 AM, Josh McKenzie  wrote:
>
> I think supporting both an abstraction-layer bound "simple mode" and a
> "--raw for experts" is the way to go.
>
> On Thu, Jun 19, 2025, at 1:23 PM, Jon Haddad wrote:
>
> I understand the motivation to decouple the command line configuration
> from what we expose to end users, and to an extent, agree with the
> reasoning.  However, for folks like me that know the command line options
> and regularly do things that you might not have planned out, I'd appreciate
> an escape hatch where I can pass my raw commands.  Whatever you end up
> implementing, there's almost certainly commands that experienced
> async-profiler folks will want to use that weren't planned for.
>
> I am also not particularly interested in learning another syntax only to
> have it transformed into the thing I want to use.  I expect that would be a
> fairly simple flag (nodetool profile --raw xyz) that would skip the parse
> logic, so hopefully it's not too much trouble to add.  Reverse engineering
> the async profiler syntax into the thing we decide to use is, at least for
> me, will be a source of frustration.
>
> Thanks,
> Jon
>
>
>
> On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky  wrote:
>
> Another vote in favor of including async-profiler as a library in C*. The
> new heatmap format in async-profiler 4.0[1] is excellent and makes
> long-running profiles miles more useful than a plain flamegraph, but it
> requires a post-processing step after a JFR is collected, which would
> require a dependency on jfr-converter.jar[2]. Exposing the JFR files
> directly would be reasonable but slightly less useful, and the
> post-processed heatmap HTML files are much smaller and self-contained. A
> recent example on my machine shows HTML at 1/20th the size of the raw JFR
> dump, which is meaningful especially for uploading to Jira.
>
> Note that JDK25 will have experimental support for better CPU
> profiling[3], but async-profiler is still more mature and featureful,
> especially for other profiling types (wall, alloc).
>
> [1]:
> https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
> [2]:
> https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
> [3]:
> https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/
>
>
>
>


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-20 Thread David Capwell
> However, for folks like me that know the command line options and regularly 
> do things that you might not have planned out, I'd appreciate an escape hatch 
> where I can pass my raw commands

For more “advanced” users, the normal profile.sh would still be able to profile; 
it just requires more steps.

> I think supporting both an abstraction-layer bound "simple mode" and a "--raw 
> for experts" is the way to go.

How do we say “this API has zero compatibility guarantees in C* and can break whenever”? 

> On Jun 20, 2025, at 5:22 AM, Josh McKenzie  wrote:
> 
> I think supporting both an abstraction-layer bound "simple mode" and a "--raw 
> for experts" is the way to go.
> 
> On Thu, Jun 19, 2025, at 1:23 PM, Jon Haddad wrote:
>> I understand the motivation to decouple the command line configuration from 
>> what we expose to end users, and to an extent, agree with the reasoning.  
>> However, for folks like me that know the command line options and regularly 
>> do things that you might not have planned out, I'd appreciate an escape 
>> hatch where I can pass my raw commands.  Whatever you end up implementing, 
>> there's almost certainly commands that experienced async-profiler folks will 
>> want to use that weren't planned for.
>> 
>> I am also not particularly interested in learning another syntax only to 
>> have it transformed into the thing I want to use.  I expect that would be a 
>> fairly simple flag (nodetool profile --raw xyz) that would skip the parse 
>> logic, so hopefully it's not too much trouble to add.  Reverse engineering 
>> the async profiler syntax into the thing we decide to use is, at least for 
>> me, will be a source of frustration.
>> 
>> Thanks,
>> Jon
>> 
>> 
>> 
>> On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky wrote:
>> Another vote in favor of including async-profiler as a library in C*. The 
>> new heatmap format in async-profiler 4.0[1] is excellent and makes 
>> long-running profiles miles more useful than a plain flamegraph, but it 
>> requires a post-processing step after a JFR is collected, which would 
>> require a dependency on jfr-converter.jar[2]. Exposing the JFR files 
>> directly would be reasonable but slightly less useful, and the 
>> post-processed heatmap HTML files are much smaller and self-contained. A 
>> recent example on my machine shows HTML at 1/20th the size of the raw JFR 
>> dump, which is meaningful especially for uploading to Jira.
>> 
>> Note that JDK25 will have experimental support for better CPU profiling[3], 
>> but async-profiler is still more mature and featureful, especially for other 
>> profiling types (wall, alloc).
>> 
>> [1]: 
>> https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
>> [2]: 
>> https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
>> [3]: 
>> https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/
>>  
>> 



Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-20 Thread Josh McKenzie
I think supporting both an abstraction-layer bound "simple mode" and a "--raw 
for experts" is the way to go.

On Thu, Jun 19, 2025, at 1:23 PM, Jon Haddad wrote:
> I understand the motivation to decouple the command line configuration from 
> what we expose to end users, and to an extent, agree with the reasoning.  
> However, for folks like me that know the command line options and regularly 
> do things that you might not have planned out, I'd appreciate an escape hatch 
> where I can pass my raw commands.  Whatever you end up implementing, there's 
> almost certainly commands that experienced async-profiler folks will want to 
> use that weren't planned for.
> 
> I am also not particularly interested in learning another syntax only to have 
> it transformed into the thing I want to use.  I expect that would be a fairly 
> simple flag (nodetool profile --raw xyz) that would skip the parse logic, so 
> hopefully it's not too much trouble to add.  Reverse engineering the async 
> profiler syntax into the thing we decide to use is, at least for me, will be 
> a source of frustration.
> 
> Thanks,
> Jon
> 
> 
> 
> On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky  wrote:
>> Another vote in favor of including async-profiler as a library in C*. The 
>> new heatmap format in async-profiler 4.0[1] is excellent and makes 
>> long-running profiles miles more useful than a plain flamegraph, but it 
>> requires a post-processing step after a JFR is collected, which would 
>> require a dependency on jfr-converter.jar[2]. Exposing the JFR files 
>> directly would be reasonable but slightly less useful, and the 
>> post-processed heatmap HTML files are much smaller and self-contained. A 
>> recent example on my machine shows HTML at 1/20th the size of the raw JFR 
>> dump, which is meaningful especially for uploading to Jira.
>> 
>> Note that JDK25 will have experimental support for better CPU profiling[3], 
>> but async-profiler is still more mature and featureful, especially for other 
>> profiling types (wall, alloc).
>> 
>> [1]: 
>> https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
>> [2]: 
>> https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
>> [3]: 
>> https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-19 Thread Jon Haddad
I understand the motivation to decouple the command line configuration from
what we expose to end users, and to an extent, agree with the reasoning.
However, for folks like me that know the command line options and regularly
do things that you might not have planned out, I'd appreciate an escape
hatch where I can pass my raw commands.  Whatever you end up implementing,
there are almost certainly commands that experienced async-profiler folks
will want to use that weren't planned for.

I am also not particularly interested in learning another syntax only to
have it transformed into the thing I want to use.  I expect that would be a
fairly simple flag (nodetool profile --raw xyz) that would skip the parse
logic, so hopefully it's not too much trouble to add.  Reverse engineering
the async profiler syntax into the thing we decide to use will, at least
for me, be a source of frustration.
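
A minimal sketch of how the two paths could sit side by side, assuming the profiler ultimately receives a comma-separated command string such as "start,event=cpu". The class and method names (ProfileCommandBuilder, fromOptions, fromRaw) are hypothetical and are not taken from the nodetool patch; only the raw path skips the translation step.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical sketch: builds the command string handed to the profiler. */
final class ProfileCommandBuilder {

    /** "Simple mode": structured options are translated into key=value pairs. */
    static String fromOptions(String action, Map<String, String> options) {
        StringJoiner joiner = new StringJoiner(",");
        joiner.add(action);
        for (Map.Entry<String, String> option : options.entrySet()) {
            joiner.add(option.getKey() + "=" + option.getValue());
        }
        return joiner.toString();
    }

    /** "--raw" mode: no parsing or translation, the text is passed through as given. */
    static String fromRaw(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("raw command must not be empty");
        }
        return raw.trim();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("event", "wall");
        options.put("interval", "10ms");
        System.out.println(fromOptions("start", options));
        System.out.println(fromRaw("start,event=alloc"));
    }
}
```

With this split, anything the structured mode does not model can still be expressed through the raw path, at the cost of exposing the profiler's own syntax directly.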

Thanks,
Jon



On Wed, Jun 18, 2025 at 4:01 PM Abe Ratnofsky  wrote:

> Another vote in favor of including async-profiler as a library in C*. The
> new heatmap format in async-profiler 4.0[1] is excellent and makes
> long-running profiles miles more useful than a plain flamegraph, but it
> requires a post-processing step after a JFR is collected, which would
> require a dependency on jfr-converter.jar[2]. Exposing the JFR files
> directly would be reasonable but slightly less useful, and the
> post-processed heatmap HTML files are much smaller and self-contained. A
> recent example on my machine shows HTML at 1/20th the size of the raw JFR
> dump, which is meaningful especially for uploading to Jira.
>
> Note that JDK25 will have experimental support for better CPU
> profiling[3], but async-profiler is still more mature and featureful,
> especially for other profiling types (wall, alloc).
>
> [1]:
> https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
> [2]:
> https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
> [3]:
> https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/
>
>


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-18 Thread Abe Ratnofsky
Another vote in favor of including async-profiler as a library in C*. The new 
heatmap format in async-profiler 4.0[1] is excellent and makes long-running 
profiles miles more useful than a plain flamegraph, but it requires a 
post-processing step after a JFR is collected, which would require a dependency 
on jfr-converter.jar[2]. Exposing the JFR files directly would be reasonable 
but slightly less useful, and the post-processed heatmap HTML files are much 
smaller and self-contained. A recent example on my machine shows HTML at 1/20th 
the size of the raw JFR dump, which is meaningful especially for uploading to 
Jira.

Note that JDK25 will have experimental support for better CPU profiling[3], but 
async-profiler is still more mature and featureful, especially for other 
profiling types (wall, alloc).

[1]: 
https://github.com/async-profiler/async-profiler/blob/master/docs/Heatmap.md
[2]: 
https://github.com/async-profiler/async-profiler?tab=readme-ov-file#stable-release-40
[3]: https://mostlynerdless.de/blog/2025/06/11/java-25s-new-cpu-time-profiler-1/
 


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-18 Thread Doug Rohrer
I agree that exposing the raw execute method is a bad idea, both for the reason 
David mentions and because of the foot-gun problem: there are a lot of ways that 
calling “execute” can cause you to overwrite files, and we really shouldn’t 
expose an arbitrary file overwrite feature on purpose if we can avoid it.
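
One way to narrow that foot-gun, sketched under assumptions: output names are resolved inside a single managed directory, and raw commands that carry their own file= argument are refused. ProfileOutputGuard and its methods are hypothetical names, and the check covers only file=; any other profiler argument that accepts a path would need the same treatment.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical guard; not part of async-profiler or Cassandra. */
final class ProfileOutputGuard {

    /** Resolves a requested file name inside the managed directory, or throws. */
    static Path resolveOutput(Path managedDir, String requestedName) {
        Path base = managedDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(requestedName).normalize();
        // Absolute names and "../" sequences end up outside base and are refused.
        if (!resolved.startsWith(base) || resolved.equals(base)) {
            throw new IllegalArgumentException("output must stay inside " + base);
        }
        return resolved;
    }

    /** Refuses raw commands that name their own output file. */
    static void rejectExplicitFile(String rawCommand) {
        for (String part : rawCommand.split(",")) {
            if (part.trim().startsWith("file=")) {
                throw new IllegalArgumentException("file= is not allowed in raw commands");
            }
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get("profiles");
        System.out.println(resolveOutput(dir, "cpu.html"));
        try {
            resolveOutput(dir, "../cassandra.yaml");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        try {
            rejectExplicitFile("start,event=cpu,file=/etc/passwd");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```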

Looking forward to seeing what Yaman comes back with after doing some 
additional research.

Doug

> On Jun 17, 2025, at 7:45 PM, David Capwell  wrote:
> 
> I am in favor of the project adopting as a library.
> 
> My automation is very outdated, so what I am saying maybe a legacy thing… so 
> w/e is the “new” way is what we should promote…. I rely a lot on the 
> collapsed format and wish to migrate to the JFR format so I can collect CPU / 
> Memory at the same time; it would be great for us to expose this as a 
> promoted ability (curl cassandra/profile -o result.jfr). One issue I see with 
> exposing the raw “execute” method is that it tied our API with the tools API, 
> so any breaking changes there break our API; I am not against this, but it is 
> something to consider.
> 
> As Scott has pointed out, there have been stability issues, so we should be 
> able to dynamically flag the feature off.
> 
>> On Jun 16, 2025, at 9:26 AM, Jaydeep Chovatia wrote:
>> 
>> >Previous experiences (good or bad)
>> I have been using an async-profiler in my project for quite some time to 
>> profile the CPU. Additionally, I have wrapped it with an HTTP interface, 
>> allowing one to open a browser and view the CPU flame graph in real-time, 
>> which further simplifies the process.
>> It is integrated as a library, and my preference is to include it as a 
>> library, rather than forking processes.
>> 
>> Jaydeep
>> 
>> On Sat, Jun 14, 2025 at 8:14 AM Josh McKenzie wrote:
>>>> I have seen cases where specific async-profiler/JVM/Cassandra version 
>>>> combos (JDK11/4.1-derived source tree) will immediately crash the JVM on 
>>>> profile - especially successive profile invocations on the same process
>>> This would be a great candidate for testing to ensure that, at least for 
>>> provided profiles, this doesn't happen.
>>> 
>>> On Fri, Jun 13, 2025, at 10:41 PM, C. Scott Andreas wrote:
>>>> Supportive of inclusion as well. General preference for invoking as a 
>>>> library rather than forking processes.
>>>> 
>>>> Jon, thanks for the tips on off-CPU profiling - added to my personal cheat 
>>>> sheet.
>>>> 
>>>> I have seen cases where specific async-profiler/JVM/Cassandra version 
>>>> combos (JDK11/4.1-derived source tree) will immediately crash the JVM on 
>>>> profile - especially successive profile invocations on the same process - 
>>>> but have not observed this on JDK21 or trunk-derived source trees. If we 
>>>> have user reports of that happening, we’ll need to figure out how to 
>>>> reproduce and get to the bottom of it.
>>>> 
>>>> – Scott
>>>> 
>>>> > On Jun 13, 2025, at 5:24 PM, Francisco Guerrero wrote:
>>>> > 
>>>> > Thanks for bringing this discussion Doug. I didn't realize that 
>>>> > async-profiler allows you to
>>>> > bring it as a dependency. It looks pretty neat from what I could tell. I 
>>>> > also think bringing
>>>> > this to Cassandra as a dependency is a reasonable approach. We need to 
>>>> > come up with
>>>> > a solid way to expose this via JMX / vtable.
>>>> > 
>>>> > Best,
>>>> > - Francisco
>>>> > 
>>>> >> On 2025/06/13 21:08:28 Doug Rohrer wrote:
>>>> >> The nice thing from what I can tell about using the Java API per [6] 
>>>> >> below is that you can literally just get an instance of the profiler 
>>>> >> and pass it some commands in the `execute` method… just need to be 
>>>> >> careful how much of that surface area we expose. Jon (and others 
>>>> >> obviously) I’d love to get your take on how we could make a useful 
>>>> >> interface to the async-profiler, maybe exposed via JMX, that doesn’t 
>>>> >> require someone to read the entirety of the async-profiler docs and 
>>>> >> provides some useful profiles without the rough edges (things like 
>>>> >> managing temp files so users don’t have to know the layout of the 
>>>> >> filesystem C* is running on, for example, since at least in the Sidecar 
>>>> >> we’d be executing this on behalf of a remote user, with all of the 
>>>> >> constraints that implies).
>>>> >> 
>>>> >> We can always be more protective in the Sidecar than we are server-side 
>>>> >> as well, but it seems like helping operators not do bad things is a 
>>>> >> good thing.
>>>> >> 
>>>> >> Obviously we’d want the ability Cassandra-side to disable this 
>>>> >> functionality all together however we implement it.
>>>> >> 
>>>> >> Doug
>>>> >> 
>>>> >>>> On Jun 13, 2025, at 2:38 PM, Jon Haddad wrote:
>>>> >>> 
>>>> >>> I'd be very happy to see async-profiler included with C*  I've made 
>>>> >>> extensive use of it in my performance evaluations [1][2], a

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-17 Thread David Capwell
I am in favor of the project adopting it as a library.

My automation is very outdated, so what I am saying may be a legacy thing… so 
whatever the “new” way is, that is what we should promote. I rely a lot on the 
collapsed format and wish to migrate to the JFR format so I can collect CPU / 
memory at the same time; it would be great for us to expose this as a promoted 
ability (curl cassandra/profile -o result.jfr). One issue I see with exposing 
the raw “execute” method is that it ties our API to the tool's API, so any 
breaking changes there break our API; I am not against this, but it is 
something to consider.

As Scott has pointed out, there have been stability issues, so we should be 
able to dynamically flag the feature off.
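
A minimal sketch of such a runtime switch, assuming every profiler entry point is routed through one guard object. ProfilingSwitch is a hypothetical name, and starting in the disabled state is an assumption made for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical runtime kill switch placed in front of all profiler calls. */
final class ProfilingSwitch {

    // Assumption for this sketch: the feature starts disabled.
    private final AtomicBoolean enabled = new AtomicBoolean(false);

    void setEnabled(boolean value) {
        enabled.set(value);
    }

    boolean isEnabled() {
        return enabled.get();
    }

    /** Runs the profiler call only while the feature is switched on. */
    <T> T runIfEnabled(Supplier<T> profilerCall) {
        if (!enabled.get()) {
            throw new IllegalStateException("profiling is disabled on this node");
        }
        return profilerCall.get();
    }

    public static void main(String[] args) {
        ProfilingSwitch profiling = new ProfilingSwitch();
        profiling.setEnabled(true);
        System.out.println(profiling.runIfEnabled(() -> "profile started"));
        profiling.setEnabled(false);
        try {
            profiling.runIfEnabled(() -> "profile started");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Flipping the flag only blocks new requests; stopping a profile that is already running would need separate handling.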

> On Jun 16, 2025, at 9:26 AM, Jaydeep Chovatia wrote:
> 
> >Previous experiences (good or bad)
> I have been using an async-profiler in my project for quite some time to 
> profile the CPU. Additionally, I have wrapped it with an HTTP interface, 
> allowing one to open a browser and view the CPU flame graph in real-time, 
> which further simplifies the process.
> It is integrated as a library, and my preference is to include it as a 
> library, rather than forking processes.
> 
> Jaydeep
> 
> On Sat, Jun 14, 2025 at 8:14 AM Josh McKenzie wrote:
>>> I have seen cases where specific async-profiler/JVM/Cassandra version 
>>> combos (JDK11/4.1-derived source tree) will immediately crash the JVM on 
>>> profile - especially successive profile invocations on the same process
>> This would be a great candidate for testing to ensure that, at least for 
>> provided profiles, this doesn't happen.
>> 
>> On Fri, Jun 13, 2025, at 10:41 PM, C. Scott Andreas wrote:
>>> Supportive of inclusion as well. General preference for invoking as a 
>>> library rather than forking processes.
>>> 
>>> Jon, thanks for the tips on off-CPU profiling - added to my personal cheat 
>>> sheet.
>>> 
>>> I have seen cases where specific async-profiler/JVM/Cassandra version 
>>> combos (JDK11/4.1-derived source tree) will immediately crash the JVM on 
>>> profile - especially successive profile invocations on the same process - 
>>> but have not observed this on JDK21 or trunk-derived source trees. If we 
>>> have user reports of that happening, we’ll need to figure out how to 
>>> reproduce and get to the bottom of it.
>>> 
>>> – Scott
>>> 
>>> > On Jun 13, 2025, at 5:24 PM, Francisco Guerrero wrote:
>>> > 
>>> > Thanks for bringing this discussion Doug. I didn't realize that 
>>> > async-profiler allows you to
>>> > bring it as a dependency. It looks pretty neat from what I could tell. I 
>>> > also think bringing
>>> > this to Cassandra as a dependency is a reasonable approach. We need to 
>>> > come up with
>>> > a solid way to expose this via JMX / vtable.
>>> > 
>>> > Best,
>>> > - Francisco
>>> > 
>>> >> On 2025/06/13 21:08:28 Doug Rohrer wrote:
>>> >> The nice thing from what I can tell about using the Java API per [6] 
>>> >> below is that you can literally just get an instance of the profiler and 
>>> >> pass it some commands in the `execute` method… just need to be careful 
>>> >> how much of that surface area we expose. Jon (and others obviously) I’d 
>>> >> love to get your take on how we could make a useful interface to the 
>>> >> async-profiler, maybe exposed via JMX, that doesn’t require someone to 
>>> >> read the entirety of the async-profiler docs and provides some useful 
>>> >> profiles without the rough edges (things like managing temp files so 
>>> >> users don’t have to know the layout of the filesystem C* is running on, 
>>> >> for example, since at least in the Sidecar we’d be executing this on 
>>> >> behalf of a remote user, with all of the constraints that implies).
>>> >> 
>>> >> We can always be more protective in the Sidecar than we are server-side 
>>> >> as well, but it seems like helping operators not do bad things is a good 
>>> >> thing.
>>> >> 
>>> >> Obviously we’d want the ability Cassandra-side to disable this 
>>> >> functionality all together however we implement it.
>>> >> 
>>> >> Doug
>>> >> 
>>> >>>> On Jun 13, 2025, at 2:38 PM, Jon Haddad wrote:
>>> >>> 
>>> >>> I'd be very happy to see async-profiler included with C*  I've made 
>>> >>> extensive use of it in my performance evaluations [1][2], and even 
>>> >>> posted a video about it [3] for general Java perf analysis (among 
>>> >>> others).  It's part of easy-cass-lab and is easily the most informative 
>>> >>> tool I've found for the getting to the bottom of anything performance 
>>> >>> related.
>>> >>> 
>>> >>> There's probably a good case to be made for including it with the C* 
>>> >>> artifact as well as having it be something you can drop in. I lean 
>>> >>> towards including it all the time, but I haven't run it this way myself 
>>> >>> yet, so there might be some downside I'm unaware of.
>>> >>> 
>>> 

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-16 Thread Yaman Ziadeh (BLOOMBERG/ 919 3RD A)
Thanks everyone for your input!

I'm looking to work on this, and will circle back with any recommendations or 
discussion points moving forward - excited to get this into C*!

From: [email protected] At: 06/13/25 14:40:24 UTC-4:00
To: [email protected]
Subject: Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async 
profiles

I'd be very happy to see async-profiler included with C*.  I've made extensive 
use of it in my performance evaluations [1][2], and even posted a video about 
it [3] for general Java perf analysis (among others).  It's part of 
easy-cass-lab and is easily the most informative tool I've found for getting 
to the bottom of anything performance related.

There's probably a good case to be made for including it with the C* artifact 
as well as having it be something you can drop in. I lean towards including it 
all the time, but I haven't run it this way myself yet, so there might be some 
downside I'm unaware of.

When you call the asprof executable, it attaches the async-profiler to the 
running JVM using jattach [4].  We could do this as well, if we wanted to 
avoid including it with the release, but I don't know how much we really 
benefit from that.  I've run into issues with it when it's unable to detach 
correctly, and then you're unable to reattach it until after the server is 
restarted.  On the flip side, I don't know if you're able to set up all the 
same options for arbitrary profiling when it's loaded as an agent and turned 
on/off dynamically.  I think we can, based on the integration page [6], but I 
haven't tried it yet.  It would be a bummer if we only had a single mode of 
profiling available.  

The default mode, CPU profiling, is fantastic, but I've also made extensive use 
of allocation profiling [5] to identify perf issues as well so having that 
available is a must, imo. Wall clock / off cpu profiling is great for 
identifying when IO is the root cause, which isn't clearly revealed by on-cpu 
profiling due to the way threads are scheduled.  When I look at a system I 
typically do CPU / Wall / Alloc / Off-CPU to be thorough, and the last thing 
you want to do is have to restart between each one.  You can also specify 
specific Java methods, include or exclude frames matching specific regex, and a 
whole slew of other options.  The latest version even supports continuous 
profiling with heatmaps although I haven't tried it yet.  

So hopefully the option we go with allows all of that, otherwise the limits 
would impose more of a headache to me as I'd need to remove it and continue to 
bring my own.

Under the hood, the async-profiler uses Linux perf events + asynchronous 
polling of the Java stack to match them up and generate its reports.  As a 
result, it requires certain permissions to run and get all the details I like.  
Specifically these kernel parameters:

sudo sysctl kernel.perf_event_paranoid=1
sudo sysctl kernel.kptr_restrict=0
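
A small preflight sketch for the two settings above, assuming a Linux host where they are readable under /proc/sys/kernel. PerfPreflight is a hypothetical helper, not something async-profiler or Cassandra ships, and treating any perf_event_paranoid value of 1 or lower as acceptable is an interpretation of the command above rather than a documented requirement.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical preflight check run before starting a profile. */
final class PerfPreflight {

    /** Parses the single integer that a /proc/sys/kernel file contains. */
    static int parseSetting(String fileContent) {
        return Integer.parseInt(fileContent.trim());
    }

    /** True when the values are at least as permissive as the ones set above. */
    static boolean permissiveEnough(int perfEventParanoid, int kptrRestrict) {
        return perfEventParanoid <= 1 && kptrRestrict == 0;
    }

    /** Reads one setting from procfs (Linux only). */
    static int readSetting(String name) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get("/proc/sys/kernel", name));
        return parseSetting(new String(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        try {
            int paranoid = readSetting("perf_event_paranoid");
            int kptr = readSetting("kptr_restrict");
            System.out.println(permissiveEnough(paranoid, kptr)
                    ? "kernel settings look fine for profiling"
                    : "kernel settings are more restrictive than suggested");
        } catch (IOException | NumberFormatException e) {
            System.out.println("could not read kernel settings: " + e.getMessage());
        }
    }
}
```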

You also need to enable some capabilities for off-CPU profiling:

sudo find /usr/lib/jvm/ -type f -name 'java' -exec setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" {} \;

Then you can do off-cpu with this wild cryptic version (shout out to Andrei 
Pangin for helping me with this [7]):

asprof -e kprobe:schedule -i 2 --cstack dwarf -X '*Unsafe.park*' "${@:2}" $PID

There are also some subtle issues when it's run in a container, since by default 
you don't have access to the perf_event_open syscall.  Just something to keep 
in mind.  This is one of my main grievances with container deployments.

Indeed Patrick, I am very happy to see this discussion!  Thanks Doug for 
starting the thread.

Jon

[1] https://issues.apache.org/jira/browse/CASSANDRA-15452
[2] https://issues.apache.org/jira/browse/CASSANDRA-19477
[3] 
https://www.youtube.com/watch?v=yNZtnzjyJRI&t=212s&pp=ygUOYXN5bmMgcHJvZmlsZXI%3D
[4] 
https://github.com/async-profiler/async-profiler/blob/2b556680dc8f5d02c3f26ac119d835dc2381e604/src/jattach/jattach_hotspot.c#L38
[5] https://issues.apache.org/jira/browse/CASSANDRA-20428
[6] 
https://github.com/async-profiler/async-profiler/blob/master/docs/IntegratingAsyncProfiler.md
[7] https://github.com/async-profiler/async-profiler/issues/907


On Fri, Jun 13, 2025 at 10:18 AM Patrick McFadin  wrote:

The fact o3 used "Bus-factor" as a dimension is just amazing. 

After reading more about the project, the possibilities are pretty interesting. 
I suspect we'll see this in a Haddad talk soon. 

On Fri, Jun 13, 2025 at 1:57 AM Josh McKenzie wrote:

I was curious if o3 (model from OpenAI) would be able to do a deep dive health 
check on a repo to assist in considering taking it as a dependency. The results 
can be found here: 
https://chatgpt.com/share/684be703-1d4c-8002-b831-f997f829f4b4

Apparently it can, and can do it q

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-16 Thread Jaydeep Chovatia
>Previous experiences (good or bad)
I have been using async-profiler in my project for quite some time to
profile the CPU. Additionally, I have wrapped it with an HTTP interface,
allowing one to open a browser and view the CPU flame graph in real-time,
which further simplifies the process.
It is integrated as a library, and my preference is to include it as a
library, rather than forking processes.

Jaydeep

On Sat, Jun 14, 2025 at 8:14 AM Josh McKenzie  wrote:

> I have seen cases where specific async-profiler/JVM/Cassandra version
> combos (JDK11/4.1-derived source tree) will immediately crash the JVM on
> profile - especially successive profile invocations on the same process
>
> This would be a great candidate for testing to ensure that, at least for
> provided profiles, this doesn't happen.
>
> On Fri, Jun 13, 2025, at 10:41 PM, C. Scott Andreas wrote:
>
> Supportive of inclusion as well. General preference for invoking as a
> library rather than forking processes.
>
> Jon, thanks for the tips on off-CPU profiling - added to my personal cheat
> sheet.
>
> I have seen cases where specific async-profiler/JVM/Cassandra version
> combos (JDK11/4.1-derived source tree) will immediately crash the JVM on
> profile - especially successive profile invocations on the same process -
> but have not observed this on JDK21 or trunk-derived source trees. If we
> have user reports of that happening, we’ll need to figure out how to
> reproduce and get to the bottom of it.
>
> – Scott
>
> > On Jun 13, 2025, at 5:24 PM, Francisco Guerrero wrote:
> >
> > Thanks for bringing this discussion Doug. I didn't realize that
> async-profiler allows you to
> > bring it as a dependency. It looks pretty neat from what I could tell. I
> also think bringing
> > this to Cassandra as a dependency is a reasonable approach. We need to
> come up with
> > a solid way to expose this via JMX / vtable.
> >
> > Best,
> > - Francisco
> >
> >> On 2025/06/13 21:08:28 Doug Rohrer wrote:
> >> The nice thing from what I can tell about using the Java API per [6]
> below is that you can literally just get an instance of the profiler and
> pass it some commands in the `execute` method… just need to be careful how
> much of that surface area we expose. Jon (and others obviously) I’d love to
> get your take on how we could make a useful interface to the
> async-profiler, maybe exposed via JMX, that doesn’t require someone to read
> the entirety of the async-profiler docs and provides some useful profiles
> without the rough edges (things like managing temp files so users don’t
> have to know the layout of the filesystem C* is running on, for example,
> since at least in the Sidecar we’d be executing this on behalf of a remote
> user, with all of the constraints that implies).
> >>
> >> We can always be more protective in the Sidecar than we are server-side
> as well, but it seems like helping operators not do bad things is a good
> thing.
> >>
> >> Obviously we’d want the ability Cassandra-side to disable this
> functionality all together however we implement it.
> >>
> >> Doug
> >>
> >>>> On Jun 13, 2025, at 2:38 PM, Jon Haddad wrote:
> >>>
> >>> I'd be very happy to see async-profiler included with C*  I've made
> extensive use of it in my performance evaluations [1][2], and even posted a
> video about it [3] for general Java perf analysis (among others).  It's
> part of easy-cass-lab and is easily the most informative tool I've found
> for the getting to the bottom of anything performance related.
> >>>
> >>> There's probably a good case to be made for including it with the C*
> artifact as well as having it be something you can drop in. I lean towards
> including it all the time, but I haven't run it this way myself yet, so
> there might be some downside I'm unaware of.
> >>>
> >>> When you call the asprof executable, it attaches the async-profiler to
> the running jvm using jattach [4].  We could do this as well, if we wanted
> to avoid including it with the release, but I don't know how much we really
> benefit from that.  I've run into issues with it when it's unable to
> detatch correctly, then you're unable to reattach it until after the server
> is restarted.  On the flip side, I don't know if you're able to set up all
> the same options for arbitrary profiling when it's loaded as an agent and
> turned on/off dynamically.  I think we can, based on the integration page
> [6], but I haven't tried it yet.  It would be a bummer if we only had a
> single mode of profiling available.
> >>>
> >>> The default mode, CPU profiling, is fantastic, but I've also made
> extensive use of allocation profiling [5] to identify perf issues as well
> so having that available is a must, imo. Wall clock / off cpu profiling is
> great for identifying when IO is the root cause, which isn't clearly
> revealed by on-cpu profiling due to the way threads are scheduled.  When I
> look at a system I typically do CPU / Wall / Alloc / Off-CPU to be
> thorough, and t

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-14 Thread Josh McKenzie
> I have seen cases where specific async-profiler/JVM/Cassandra version combos 
> (JDK11/4.1-derived source tree) will immediately crash the JVM on profile - 
> especially successive profile invocations on the same process
This would be a great candidate for testing to ensure that, at least for 
provided profiles, this doesn't happen.

On Fri, Jun 13, 2025, at 10:41 PM, C. Scott Andreas wrote:
> Supportive of inclusion as well. General preference for invoking as a library 
> rather than forking processes.
> 
> Jon, thanks for the tips on off-CPU profiling - added to my personal cheat 
> sheet.
> 
> I have seen cases where specific async-profiler/JVM/Cassandra version combos 
> (JDK11/4.1-derived source tree) will immediately crash the JVM on profile - 
> especially successive profile invocations on the same process - but have not 
> observed this on JDK21 or trunk-derived source trees. If we have user reports 
> of that happening, we’ll need to figure out how to reproduce and get to the 
> bottom of it.
> 
> – Scott
> 
> > On Jun 13, 2025, at 5:24 PM, Francisco Guerrero  wrote:
> > 
> > Thanks for bringing this discussion Doug. I didn't realize that 
> > async-profiler allows you to
> > bring it as a dependency. It looks pretty neat from what I could tell. I 
> > also think bringing
> > this to Cassandra as a dependency is a reasonable approach. We need to come 
> > up with
> > a solid way to expose this via JMX / vtable.
> > 
> > Best,
> > - Francisco
> > 
> >> On 2025/06/13 21:08:28 Doug Rohrer wrote:
> >> The nice thing from what I can tell about using the Java API per [6] below 
> >> is that you can literally just get an instance of the profiler and pass it 
> >> some commands in the `execute` method… just need to be careful how much of 
> >> that surface area we expose. Jon (and others obviously) I’d love to get 
> >> your take on how we could make a useful interface to the async-profiler, 
> >> maybe exposed via JMX, that doesn’t require someone to read the entirety 
> >> of the async-profiler docs and provides some useful profiles without the 
> >> rough edges (things like managing temp files so users don’t have to know 
> >> the layout of the filesystem C* is running on, for example, since at least 
> >> in the Sidecar we’d be executing this on behalf of a remote user, with all 
> >> of the constraints that implies).
> >> 
> >> We can always be more protective in the Sidecar than we are server-side as 
> >> well, but it seems like helping operators not do bad things is a good 
> >> thing.
> >> 
> >> Obviously we’d want the ability Cassandra-side to disable this 
> >> functionality all together however we implement it.
> >> 
> >> Doug
> >> 
> >>> On Jun 13, 2025, at 2:38 PM, Jon Haddad  wrote:
> >>> 
> >>> I'd be very happy to see async-profiler included with C*  I've made 
> >>> extensive use of it in my performance evaluations [1][2], and even posted 
> >>> a video about it [3] for general Java perf analysis (among others).  It's 
> >>> part of easy-cass-lab and is easily the most informative tool I've found 
> >>> for getting to the bottom of anything performance related.
> >>> 
> >>> There's probably a good case to be made for including it with the C* 
> >>> artifact as well as having it be something you can drop in. I lean 
> >>> towards including it all the time, but I haven't run it this way myself 
> >>> yet, so there might be some downside I'm unaware of.
> >>> 
> >>> When you call the asprof executable, it attaches the async-profiler to 
> >>> the running jvm using jattach [4].  We could do this as well, if we 
> >>> wanted to avoid including it with the release, but I don't know how much 
> >>> we really benefit from that.  I've run into issues with it when it's 
> >>> unable to detach correctly, then you're unable to reattach it until 
> >>> after the server is restarted.  On the flip side, I don't know if you're 
> >>> able to set up all the same options for arbitrary profiling when it's 
> >>> loaded as an agent and turned on/off dynamically.  I think we can, based 
> >>> on the integration page [6], but I haven't tried it yet.  It would be a 
> >>> bummer if we only had a single mode of profiling available.  
> >>> 
> >>> The default mode, CPU profiling, is fantastic, but I've also made 
> >>> extensive use of allocation profiling [5] to identify perf issues as well 
> >>> so having that available is a must, imo. Wall clock / off cpu profiling 
> >>> is great for identifying when IO is the root cause, which isn't clearly 
> >>> revealed by on-cpu profiling due to the way threads are scheduled.  When 
> >>> I look at a system I typically do CPU / Wall / Alloc / Off-CPU to be 
> >>> thorough, and the last thing you want to do is have to restart between 
> >>> each one.  You can also specify specific Java methods, include or exclude 
> >>> frames matching specific regex, and a whole slew of other options.  The 
> >>> latest version even supports continuous profiling with heatmaps 

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread C. Scott Andreas
Supportive of inclusion as well. General preference for invoking as a library 
rather than forking processes.

Jon, thanks for the tips on off-CPU profiling - added to my personal cheat 
sheet.

I have seen cases where specific async-profiler/JVM/Cassandra version combos 
(JDK11/4.1-derived source tree) will immediately crash the JVM on profile - 
especially successive profile invocations on the same process - but have not 
observed this on JDK21 or trunk-derived source trees. If we have user reports 
of that happening, we’ll need to figure out how to reproduce and get to the 
bottom of it.

– Scott

> On Jun 13, 2025, at 5:24 PM, Francisco Guerrero  wrote:
> 
> Thanks for bringing this discussion Doug. I didn't realize that 
> async-profiler allows you to
> bring it as a dependency. It looks pretty neat from what I could tell. I also 
> think bringing
> this to Cassandra as a dependency is a reasonable approach. We need to come 
> up with
> a solid way to expose this via JMX / vtable.
> 
> Best,
> - Francisco
> 
>> On 2025/06/13 21:08:28 Doug Rohrer wrote:
>> The nice thing from what I can tell about using the Java API per [6] below 
>> is that you can literally just get an instance of the profiler and pass it 
>> some commands in the `execute` method… just need to be careful how much of 
>> that surface area we expose. Jon (and others obviously) I’d love to get your 
>> take on how we could make a useful interface to the async-profiler, maybe 
>> exposed via JMX, that doesn’t require someone to read the entirety of the 
>> async-profiler docs and provides some useful profiles without the rough 
>> edges (things like managing temp files so users don’t have to know the 
>> layout of the filesystem C* is running on, for example, since at least in 
>> the Sidecar we’d be executing this on behalf of a remote user, with all of 
>> the constraints that implies).
>> 
>> We can always be more protective in the Sidecar than we are server-side as 
>> well, but it seems like helping operators not do bad things is a good thing.
>> 
>> Obviously we’d want the ability Cassandra-side to disable this functionality 
>> all together however we implement it.
>> 
>> Doug
>> 
>>> On Jun 13, 2025, at 2:38 PM, Jon Haddad  wrote:
>>> 
>>> I'd be very happy to see async-profiler included with C*  I've made 
>>> extensive use of it in my performance evaluations [1][2], and even posted a 
>>> video about it [3] for general Java perf analysis (among others).  It's 
>>> part of easy-cass-lab and is easily the most informative tool I've found 
>>> for getting to the bottom of anything performance related.
>>> 
>>> There's probably a good case to be made for including it with the C* 
>>> artifact as well as having it be something you can drop in. I lean towards 
>>> including it all the time, but I haven't run it this way myself yet, so 
>>> there might be some downside I'm unaware of.
>>> 
>>> When you call the asprof executable, it attaches the async-profiler to the 
>>> running jvm using jattach [4].  We could do this as well, if we wanted to 
>>> avoid including it with the release, but I don't know how much we really 
>>> benefit from that.  I've run into issues with it when it's unable to 
>>> detach correctly, then you're unable to reattach it until after the server 
>>> is restarted.  On the flip side, I don't know if you're able to set up all 
>>> the same options for arbitrary profiling when it's loaded as an agent and 
>>> turned on/off dynamically.  I think we can, based on the integration page 
>>> [6], but I haven't tried it yet.  It would be a bummer if we only had a 
>>> single mode of profiling available.  
>>> 
>>> The default mode, CPU profiling, is fantastic, but I've also made extensive 
>>> use of allocation profiling [5] to identify perf issues as well so having 
>>> that available is a must, imo. Wall clock / off cpu profiling is great for 
>>> identifying when IO is the root cause, which isn't clearly revealed by 
>>> on-cpu profiling due to the way threads are scheduled.  When I look at a 
>>> system I typically do CPU / Wall / Alloc / Off-CPU to be thorough, and the 
>>> last thing you want to do is have to restart between each one.  You can 
>>> also specify specific Java methods, include or exclude frames matching 
>>> specific regex, and a whole slew of other options.  The latest version even 
>>> supports continuous profiling with heatmaps although I haven't tried it 
>>> yet.  
>>> 
>>> So hopefully the option we go with allows all of that, otherwise the limits 
>>> would impose more of a headache to me as I'd need to remove it and continue 
>>> to bring my own.
>>> 
>>> Under the hood, the async-profiler uses Linux perf events + asynchronous 
>>> polling of the java stack to match them up and generate its reports.  As a 
>>> result, it requires certain permissions to run and get all the details I 
>>> like.  Specifically these kernel parameters:
>>> 
>>> sudo sysctl kernel.perf_event_paranoid

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread Francisco Guerrero
Thanks for bringing up this discussion, Doug. I didn't realize that
async-profiler allows you to bring it in as a dependency. It looks pretty
neat from what I could tell. I also think bringing this to Cassandra as a
dependency is a reasonable approach. We need to come up with a solid way to
expose this via JMX / vtable.
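
As a rough illustration of the JMX side, here is a minimal sketch in Java.
Every name in it (ProfilerControlMXBean, ProfilerControl, the
org.example.sketch ObjectName) is hypothetical and not taken from Cassandra
or the Sidecar, and the start/stop bodies only track state where a real
implementation would call into async-profiler. It assumes the feature stays
disabled until someone enables it.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: none of these names come from Cassandra or Sidecar.
final class ProfilerJmxSketch {

    /** Management interface; JMX requires MXBean interfaces to be public. */
    public interface ProfilerControlMXBean {
        boolean isEnabled();
        void setEnabled(boolean enabled);
        String start(String event);
        String stop();
    }

    /** Tracks state only; a real version would delegate to async-profiler. */
    public static final class ProfilerControl implements ProfilerControlMXBean {
        private boolean enabled;      // off until an operator enables it
        private String runningEvent;  // null when no profile is running

        @Override public synchronized boolean isEnabled() { return enabled; }
        @Override public synchronized void setEnabled(boolean enabled) { this.enabled = enabled; }

        @Override public synchronized String start(String event) {
            if (!enabled) throw new IllegalStateException("profiling is disabled");
            if (runningEvent != null) throw new IllegalStateException("already profiling " + runningEvent);
            runningEvent = event;
            return "started " + event;
        }

        @Override public synchronized String stop() {
            if (runningEvent == null) throw new IllegalStateException("no profile is running");
            String stopped = runningEvent;
            runningEvent = null;
            return "stopped " + stopped;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example.sketch:type=ProfilerControl");
        server.registerMBean(new ProfilerControl(), name);

        // Drive the bean through JMX, the way a remote client would.
        server.setAttribute(name, new Attribute("Enabled", true));
        Object started = server.invoke(name, "start",
                new Object[] {"cpu"}, new String[] {String.class.getName()});
        System.out.println(started);
        System.out.println(server.invoke(name, "stop", new Object[0], new String[0]));
    }
}
```

Keeping the enable switch as a plain attribute means it can be flipped at
runtime over JMX without touching the start/stop operations.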

Best,
- Francisco

On 2025/06/13 21:08:28 Doug Rohrer wrote:
> The nice thing from what I can tell about using the Java API per [6] below is 
> that you can literally just get an instance of the profiler and pass it some 
> commands in the `execute` method… just need to be careful how much of that 
> surface area we expose. Jon (and others obviously) I’d love to get your take 
> on how we could make a useful interface to the async-profiler, maybe exposed 
> via JMX, that doesn’t require someone to read the entirety of the 
> async-profiler docs and provides some useful profiles without the rough edges 
> (things like managing temp files so users don’t have to know the layout of 
> the filesystem C* is running on, for example, since at least in the Sidecar 
> we’d be executing this on behalf of a remote user, with all of the 
> constraints that implies).
> 
> We can always be more protective in the Sidecar than we are server-side as 
> well, but it seems like helping operators not do bad things is a good thing.
> 
> Obviously we’d want the ability Cassandra-side to disable this functionality 
> all together however we implement it.
> 
> Doug
> 
> > On Jun 13, 2025, at 2:38 PM, Jon Haddad  wrote:
> > 
> > I'd be very happy to see async-profiler included with C*  I've made 
> > extensive use of it in my performance evaluations [1][2], and even posted a 
> > video about it [3] for general Java perf analysis (among others).  It's 
> > part of easy-cass-lab and is easily the most informative tool I've found 
> > for getting to the bottom of anything performance related.
> > 
> > There's probably a good case to be made for including it with the C* 
> > artifact as well as having it be something you can drop in. I lean towards 
> > including it all the time, but I haven't run it this way myself yet, so 
> > there might be some downside I'm unaware of.
> > 
> > When you call the asprof executable, it attaches the async-profiler to the 
> > running jvm using jattach [4].  We could do this as well, if we wanted to 
> > avoid including it with the release, but I don't know how much we really 
> > benefit from that.  I've run into issues with it when it's unable to 
> > detach correctly, then you're unable to reattach it until after the server 
> > is restarted.  On the flip side, I don't know if you're able to set up all 
> > the same options for arbitrary profiling when it's loaded as an agent and 
> > turned on/off dynamically.  I think we can, based on the integration page 
> > [6], but I haven't tried it yet.  It would be a bummer if we only had a 
> > single mode of profiling available.  
> > 
> > The default mode, CPU profiling, is fantastic, but I've also made extensive 
> > use of allocation profiling [5] to identify perf issues as well so having 
> > that available is a must, imo. Wall clock / off cpu profiling is great for 
> > identifying when IO is the root cause, which isn't clearly revealed by 
> > on-cpu profiling due to the way threads are scheduled.  When I look at a 
> > system I typically do CPU / Wall / Alloc / Off-CPU to be thorough, and the 
> > last thing you want to do is have to restart between each one.  You can 
> > also specify specific Java methods, include or exclude frames matching 
> > specific regex, and a whole slew of other options.  The latest version even 
> > supports continuous profiling with heatmaps although I haven't tried it 
> > yet.  
> > 
> > So hopefully the option we go with allows all of that, otherwise the limits 
> > would impose more of a headache to me as I'd need to remove it and continue 
> > to bring my own.
> > 
> > Under the hood, the async-profiler uses Linux perf events + asynchronous 
> > polling of the java stack to match them up and generate its reports.  As a 
> > result, it requires certain permissions to run and get all the details I 
> > like.  Specifically these kernel parameters:
> > 
> > sudo sysctl kernel.perf_event_paranoid=1
> > sudo sysctl kernel.kptr_restrict=0
> > 
> > You also need to enable some capabilities for off-cpu profiling:
> > 
> > sudo find /usr/lib/jvm/ -type f -name 'java' -exec setcap 
> > "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" {} \;
> > 
> > Then you can do off-cpu with this wild cryptic version (shout out to Andrei 
> > Pangin for helping me with this [7]):
> > 
> > asprof -e kprobe:schedule -i 2 --cstack dwarf -X '*Unsafe.park*' "${@:2}" 
> > $PID
> > 
> > There's also some subtle issues when it's run in a container, since by 
> > default you don't have access to the perf_event_open syscall.  Just 
> > something to keep in mind.  This is one of my main grievances with 
> > container deployments.
> > 

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread Doug Rohrer
The nice thing from what I can tell about using the Java API per [6] below is 
that you can literally just get an instance of the profiler and pass it some 
commands in the `execute` method… just need to be careful how much of that 
surface area we expose. Jon (and others obviously) I’d love to get your take on 
how we could make a useful interface to the async-profiler, maybe exposed via 
JMX, that doesn’t require someone to read the entirety of the async-profiler 
docs and provides some useful profiles without the rough edges (things like 
managing temp files so users don’t have to know the layout of the filesystem C* 
is running on, for example, since at least in the Sidecar we’d be executing 
this on behalf of a remote user, with all of the constraints that implies).
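
To make the "careful how much surface area we expose" point a bit more
concrete, here is a small Java sketch that builds the command string for
`execute` from an allow-list instead of passing user input through. The
helper name ProfileCommands, the allow-list, and the file-name rule are all
assumptions for illustration. The "start,event=..." and "stop,file=..."
strings follow the async-profiler agent argument format as I understand it
from [6], so please check them against the docs before relying on this.

```java
import java.nio.file.Path;
import java.util.Set;

// Hypothetical helper, not part of async-profiler, Cassandra, or Sidecar.
final class ProfileCommands {
    // Assumed allow-list: only the modes discussed in this thread.
    private static final Set<String> ALLOWED_EVENTS = Set.of("cpu", "wall", "alloc");

    private ProfileCommands() {}

    /** Builds a start command such as "start,event=cpu". */
    static String start(String event) {
        if (!ALLOWED_EVENTS.contains(event)) {
            throw new IllegalArgumentException("Unsupported event: " + event);
        }
        return "start,event=" + event;
    }

    /**
     * Builds a stop command that writes into a server-chosen directory.
     * Only bare file names ending in .html or .jfr are accepted, so a
     * caller cannot point the output somewhere else on the filesystem.
     */
    static String stop(Path outputDir, String fileName) {
        if (!fileName.matches("[A-Za-z0-9_-]+\\.(html|jfr)")) {
            throw new IllegalArgumentException("Unsupported file name: " + fileName);
        }
        return "stop,file=" + outputDir.resolve(fileName);
    }

    public static void main(String[] args) {
        // With the library on the classpath, each string would be handed to
        // one.profiler.AsyncProfiler.getInstance().execute(command).
        System.out.println(start("wall"));
        System.out.println(stop(Path.of("/tmp/profiles"), "wall-1.html"));
    }
}
```

The output directory is chosen by the server, which is one way to keep
remote users from needing to know the filesystem layout.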

We can always be more protective in the Sidecar than we are server-side as 
well, but it seems like helping operators not do bad things is a good thing.

Obviously we’d want the ability Cassandra-side to disable this functionality 
altogether, however we implement it.

Doug

> On Jun 13, 2025, at 2:38 PM, Jon Haddad  wrote:
> 
> I'd be very happy to see async-profiler included with C*  I've made extensive 
> use of it in my performance evaluations [1][2], and even posted a video about 
> it [3] for general Java perf analysis (among others).  It's part of 
> easy-cass-lab and is easily the most informative tool I've found for 
> getting to the bottom of anything performance related.
> 
> There's probably a good case to be made for including it with the C* artifact 
> as well as having it be something you can drop in. I lean towards including 
> it all the time, but I haven't run it this way myself yet, so there might be 
> some downside I'm unaware of.
> 
> When you call the asprof executable, it attaches the async-profiler to the 
> running jvm using jattach [4].  We could do this as well, if we wanted to 
> avoid including it with the release, but I don't know how much we really 
> benefit from that.  I've run into issues with it when it's unable to detach 
> correctly, then you're unable to reattach it until after the server is 
> restarted.  On the flip side, I don't know if you're able to set up all the 
> same options for arbitrary profiling when it's loaded as an agent and turned 
> on/off dynamically.  I think we can, based on the integration page [6], but I 
> haven't tried it yet.  It would be a bummer if we only had a single mode of 
> profiling available.  
> 
> The default mode, CPU profiling, is fantastic, but I've also made extensive 
> use of allocation profiling [5] to identify perf issues as well so having 
> that available is a must, imo. Wall clock / off cpu profiling is great for 
> identifying when IO is the root cause, which isn't clearly revealed by on-cpu 
> profiling due to the way threads are scheduled.  When I look at a system I 
> typically do CPU / Wall / Alloc / Off-CPU to be thorough, and the last thing 
> you want to do is have to restart between each one.  You can also specify 
> specific Java methods, include or exclude frames matching specific regex, and 
> a whole slew of other options.  The latest version even supports continuous 
> profiling with heatmaps although I haven't tried it yet.  
> 
> So hopefully the option we go with allows all of that, otherwise the limits 
> would impose more of a headache to me as I'd need to remove it and continue 
> to bring my own.
> 
> Under the hood, the async-profiler uses Linux perf events + asynchronous 
> polling of the java stack to match them up and generate its reports.  As a 
> result, it requires certain permissions to run and get all the details I 
> like.  Specifically these kernel parameters:
> 
> sudo sysctl kernel.perf_event_paranoid=1
> sudo sysctl kernel.kptr_restrict=0
> 
> You also need to enable some capabilities for off-cpu profiling:
> 
> sudo find /usr/lib/jvm/ -type f -name 'java' -exec setcap 
> "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" {} \;
> 
> Then you can do off-cpu with this wild cryptic version (shout out to Andrei 
> Pangin for helping me with this [7]):
> 
> asprof -e kprobe:schedule -i 2 --cstack dwarf -X '*Unsafe.park*' "${@:2}" $PID
> 
> There's also some subtle issues when it's run in a container, since by 
> default you don't have access to the perf_event_open syscall.  Just something 
> to keep in mind.  This is one of my main grievances with container 
> deployments.
> 
> Indeed Patrick, I am very happy to see this discussion!  Thanks Doug for 
> starting the thread.
> 
> Jon
> 
> [1] https://issues.apache.org/jira/browse/CASSANDRA-15452
> [2] https://issues.apache.org/jira/browse/CASSANDRA-19477
> [3] 
> https://www.youtube.com/watch?v=yNZtnzjyJRI&t=212s&pp=ygUOYXN5bmMgcHJvZmlsZXI%3D
> [4] 
> https://github.com/async-profiler/async-profiler/blob/2b556680dc8f5d02c3f26ac119d835dc2381e604/src/jattach/jattach_hotspot.c#L38
> [5] https://issues.apache.org/jira/browse/CASSANDRA-20428
> [6] 
> https://github.com/async-p

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread Josh McKenzie
> The fact o3 used "Bus-factor" as a dimension is just amazing. 
Yeah - that got me too.

On Fri, Jun 13, 2025, at 2:38 PM, Jon Haddad wrote:
> I'd be very happy to see async-profiler included with C*  I've made extensive 
> use of it in my performance evaluations [1][2], and even posted a video about 
> it [3] for general Java perf analysis (among others).  It's part of 
> easy-cass-lab and is easily the most informative tool I've found for 
> getting to the bottom of anything performance related.
> 
> There's probably a good case to be made for including it with the C* artifact 
> as well as having it be something you can drop in. I lean towards including 
> it all the time, but I haven't run it this way myself yet, so there might be 
> some downside I'm unaware of.
> 
> When you call the asprof executable, it attaches the async-profiler to the 
> running jvm using jattach [4].  We could do this as well, if we wanted to 
> avoid including it with the release, but I don't know how much we really 
> benefit from that.  I've run into issues with it when it's unable to detach 
> correctly, then you're unable to reattach it until after the server is 
> restarted.  On the flip side, I don't know if you're able to set up all the 
> same options for arbitrary profiling when it's loaded as an agent and turned 
> on/off dynamically.  I think we can, based on the integration page [6], but I 
> haven't tried it yet.  It would be a bummer if we only had a single mode of 
> profiling available.  
> 
> The default mode, CPU profiling, is fantastic, but I've also made extensive 
> use of allocation profiling [5] to identify perf issues as well so having 
> that available is a must, imo. Wall clock / off cpu profiling is great for 
> identifying when IO is the root cause, which isn't clearly revealed by on-cpu 
> profiling due to the way threads are scheduled.  When I look at a system I 
> typically do CPU / Wall / Alloc / Off-CPU to be thorough, and the last thing 
> you want to do is have to restart between each one.  You can also specify 
> specific Java methods, include or exclude frames matching specific regex, and 
> a whole slew of other options.  The latest version even supports continuous 
> profiling with heatmaps although I haven't tried it yet.  
> 
> So hopefully the option we go with allows all of that, otherwise the limits 
> would impose more of a headache to me as I'd need to remove it and continue 
> to bring my own.
> 
> Under the hood, the async-profiler uses Linux perf events + asynchronous 
> polling of the java stack to match them up and generate its reports.  As a 
> result, it requires certain permissions to run and get all the details I 
> like.  Specifically these kernel parameters:
> 
> sudo sysctl kernel.perf_event_paranoid=1
> sudo sysctl kernel.kptr_restrict=0
> 
> You also need to enable some capabilities for off-cpu profiling:
> 
> sudo find /usr/lib/jvm/ -type f -name 'java' -exec setcap 
> "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" {} \;
> 
> Then you can do off-cpu with this wild cryptic version (shout out to Andrei 
> Pangin for helping me with this [7]):
> 
> asprof -e kprobe:schedule -i 2 --cstack dwarf -X '*Unsafe.park*' "${@:2}" $PID
> 
> There's also some subtle issues when it's run in a container, since by 
> default you don't have access to the perf_event_open syscall.  Just something 
> to keep in mind.  This is one of my main grievances with container 
> deployments.
> 
> Indeed Patrick, I am very happy to see this discussion!  Thanks Doug for 
> starting the thread.
> 
> Jon
> 
> [1] https://issues.apache.org/jira/browse/CASSANDRA-15452
> [2] https://issues.apache.org/jira/browse/CASSANDRA-19477
> [3] 
> https://www.youtube.com/watch?v=yNZtnzjyJRI&t=212s&pp=ygUOYXN5bmMgcHJvZmlsZXI%3D
> [4] 
> https://github.com/async-profiler/async-profiler/blob/2b556680dc8f5d02c3f26ac119d835dc2381e604/src/jattach/jattach_hotspot.c#L38
> [5] https://issues.apache.org/jira/browse/CASSANDRA-20428
> [6] 
> https://github.com/async-profiler/async-profiler/blob/master/docs/IntegratingAsyncProfiler.md
> [7] https://github.com/async-profiler/async-profiler/issues/907
> 
> 
> On Fri, Jun 13, 2025 at 10:18 AM Patrick McFadin  wrote:
>> The fact o3 used "Bus-factor" as a dimension is just amazing. 
>> 
>> After reading more about the project, the possibilities are pretty 
>> interesting. I suspect we'll see this in a Haddad talk soon. 
>> 
>> On Fri, Jun 13, 2025 at 1:57 AM Josh McKenzie  wrote:
>>> __
>>> I was curious if o3 (model from OpenAI) would be able to do a deep dive 
>>> health check on a repo to assist in considering taking it as a dependency. 
>>> The results can be found here: 
>>> https://chatgpt.com/share/684be703-1d4c-8002-b831-f997f829f4b4
>>> 
>>> Apparently it can, and can do it quite well. This was a useful time saver 
>>> (and honestly did a better job than I usually can in > 10x the time)
>>> 
>>> I'm +1 to taking this as a dependency on the lib in core C*. The rest of 
>

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread Jon Haddad
I'd be very happy to see async-profiler included with C*.  I've made
extensive use of it in my performance evaluations [1][2], and even posted a
video about it [3] for general Java perf analysis (among others).  It's
part of easy-cass-lab and is easily the most informative tool I've found
for getting to the bottom of anything performance related.

There's probably a good case to be made for including it with the C*
artifact as well as having it be something you can drop in. I lean towards
including it all the time, but I haven't run it this way myself yet, so
there might be some downside I'm unaware of.

When you call the asprof executable, it attaches the async-profiler to the
running jvm using jattach [4].  We could do this as well, if we wanted to
avoid including it with the release, but I don't know how much we really
benefit from that.  I've run into issues with it when it's unable to
detach correctly, then you're unable to reattach it until after the server
is restarted.  On the flip side, I don't know if you're able to set up all
the same options for arbitrary profiling when it's loaded as an agent and
turned on/off dynamically.  I think we can, based on the integration page
[6], but I haven't tried it yet.  It would be a bummer if we only had a
single mode of profiling available.

The default mode, CPU profiling, is fantastic, but I've also made extensive
use of allocation profiling [5] to identify perf issues as well so having
that available is a must, imo. Wall clock / off cpu profiling is great for
identifying when IO is the root cause, which isn't clearly revealed by
on-cpu profiling due to the way threads are scheduled.  When I look at a
system I typically do CPU / Wall / Alloc / Off-CPU to be thorough, and the
last thing you want to do is have to restart between each one.  You can
also specify specific Java methods, include or exclude frames matching
specific regex, and a whole slew of other options.  The latest version even
supports continuous profiling with heatmaps although I haven't tried it
yet.

So hopefully the option we go with allows all of that, otherwise the limits
would impose more of a headache to me as I'd need to remove it and continue
to bring my own.

Under the hood, the async-profiler uses Linux perf events + asynchronous
polling of the java stack to match them up and generate its reports.  As a
result, it requires certain permissions to run and get all the details I
like.  Specifically these kernel parameters:

sudo sysctl kernel.perf_event_paranoid=1
sudo sysctl kernel.kptr_restrict=0
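
If this ends up inside C* or the Sidecar, a preflight check for those two
settings could look roughly like the Java sketch below. The class name
PerfPreflight and the wording of the messages are hypothetical, and the
thresholds simply mirror the values above rather than anything
async-profiler mandates. It reads the same values that sysctl reports from
/proc/sys/kernel, so it only does something useful on Linux.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical preflight check; names and thresholds are assumptions.
final class PerfPreflight {
    private PerfPreflight() {}

    /** Reads one integer setting from a /proc/sys/kernel-style directory. */
    static int readInt(Path kernelDir, String name) throws IOException {
        return Integer.parseInt(Files.readString(kernelDir.resolve(name)).trim());
    }

    /** Returns one message per setting that differs from the values above. */
    static List<String> check(Path kernelDir) throws IOException {
        List<String> problems = new ArrayList<>();
        int paranoid = readInt(kernelDir, "perf_event_paranoid");
        if (paranoid > 1) {
            problems.add("kernel.perf_event_paranoid=" + paranoid + ", expected 1 or lower");
        }
        int kptr = readInt(kernelDir, "kptr_restrict");
        if (kptr != 0) {
            problems.add("kernel.kptr_restrict=" + kptr + ", expected 0");
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        // Fails with NoSuchFileException where /proc/sys/kernel does not exist.
        List<String> problems = check(Path.of("/proc/sys/kernel"));
        if (problems.isEmpty()) {
            System.out.println("kernel settings look fine for profiling");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Taking the directory as a parameter keeps the check testable against a
temporary directory instead of the real /proc.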

You also need to enable some capabilities for off-cpu profiling:

sudo find /usr/lib/jvm/ -type f -name 'java' -exec setcap \
  "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" {} \;

Then you can do off-cpu with this wild cryptic version (shout out to Andrei
Pangin for helping me with this [7]):

asprof -e kprobe:schedule -i 2 --cstack dwarf -X '*Unsafe.park*' "${@:2}" $PID

There are also some subtle issues when it's run in a container, since by
default you don't have access to the perf_event_open syscall.  Just
something to keep in mind.  This is one of my main grievances with
container deployments.

Indeed Patrick, I am very happy to see this discussion!  Thanks Doug for
starting the thread.

Jon

[1] https://issues.apache.org/jira/browse/CASSANDRA-15452
[2] https://issues.apache.org/jira/browse/CASSANDRA-19477
[3]
https://www.youtube.com/watch?v=yNZtnzjyJRI&t=212s&pp=ygUOYXN5bmMgcHJvZmlsZXI%3D
[4]
https://github.com/async-profiler/async-profiler/blob/2b556680dc8f5d02c3f26ac119d835dc2381e604/src/jattach/jattach_hotspot.c#L38
[5] https://issues.apache.org/jira/browse/CASSANDRA-20428
[6]
https://github.com/async-profiler/async-profiler/blob/master/docs/IntegratingAsyncProfiler.md
[7] https://github.com/async-profiler/async-profiler/issues/907


On Fri, Jun 13, 2025 at 10:18 AM Patrick McFadin  wrote:

> The fact o3 used "Bus-factor" as a dimension is just amazing.
>
> After reading more about the project, the possibilities are pretty
> interesting. I suspect we'll see this in a Haddad talk soon.
>
> On Fri, Jun 13, 2025 at 1:57 AM Josh McKenzie 
> wrote:
>
>> I was curious if o3 (model from OpenAI) would be able to do a deep dive
>> health check on a repo to assist in considering taking it as a dependency.
>> The results can be found here:
>> https://chatgpt.com/share/684be703-1d4c-8002-b831-f997f829f4b4
>>
>> Apparently it can, and can do it quite well. This was a useful time saver
>> (and honestly did a better job than I usually can in > 10x the time)
>>
>> I'm +1 to taking this as a dependency on the lib in core C*. The rest of
>> the ecosystem can consume it (more easily if we move to a cassandra-shared
>> regime shared library build as well), and it opens up some interesting
>> opportunities for us in both how we test core C* proper and what we expose
>> in tooling.
>>
>> On Thu, Jun 12, 2025, at 7:36 PM, Paulo Motta wrote:
>>
>> I'd prefer to avoid calling an external process and use the l

Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread Patrick McFadin
The fact o3 used "Bus-factor" as a dimension is just amazing.

After reading more about the project, the possibilities are pretty
interesting. I suspect we'll see this in a Haddad talk soon.

On Fri, Jun 13, 2025 at 1:57 AM Josh McKenzie  wrote:

> I was curious if o3 (model from OpenAI) would be able to do a deep dive
> health check on a repo to assist in considering taking it as a dependency.
> The results can be found here:
> https://chatgpt.com/share/684be703-1d4c-8002-b831-f997f829f4b4
>
> Apparently it can, and can do it quite well. This was a useful time saver
> (and honestly did a better job than I usually can in > 10x the time)
>
> I'm +1 to taking this as a dependency on the lib in core C*. The rest of
> the ecosystem can consume it (more easily if we move to a cassandra-shared
> regime shared library build as well), and it opens up some interesting
> opportunities for us in both how we test core C* proper and what we expose
> in tooling.
>
> On Thu, Jun 12, 2025, at 7:36 PM, Paulo Motta wrote:
>
> I'd prefer to avoid calling an external process and use the library if
> possible. Not sure about including it in the project by default, but also
> not against.
>
> If there's contention about including it, I wonder if it would make sense
> to explore  java's optional module extension[1] to make this available
> optionally ? I can see this being useful for other extensions if we haven't
> explored that option.
>
> Then we could have another project cassandra-sidecar-extensions (or
> similar) that would be linked by sidecar/advanced operators to enable
> extended featureset in the main process.
>
>
> [1] -
> https://openjdk.org/projects/jigsaw/doc/topics/optional.html
>
> On Thu, 12 Jun 2025 at 17:57 Doug Rohrer  wrote:
>
> Hey folks!
>
> We're looking into enabling the sidecar to collect async profiles from
> Cassandra and, digging through the async-profiler code and usage, it seems
> like there may be a few different ways to do it. I’m curious if other folks
> have already done this beyond just “run asprof with the pid of the
> Cassandra process”, as I’m a bit hesitant to depend on executing an
> external process from the Sidecar to gather the actual profile if we can
> avoid it.
>
> There seem to be some opportunities to integrate the profiler into another
> project (see
> https://github.com/async-profiler/async-profiler/blob/master/docs/IntegratingAsyncProfiler.md#using-java-api)
> but it seems this would end up having to be part of Cassandra, and somehow
> callable via the sidecar (JMX? Some virtual table interface where you
> insert a row to start a profile with the profiler options, and it kicks off
> the profile, dumping the results into the table when it’s done?).
>
> The benefit in putting this functionality into Cassandra would be that
> other consumers (in-jvm dtests, python dtests, other monitoring systems
> where Sidecar isn’t available, easy-cass-lab) would be able to leverage the
> same interface rather than having to re-invent the wheel each time.
>
> Drawback is it’s another library, and one with native library
> dependencies, added to the class path and loaded at runtime.
>
> Thoughts? Previous experiences (good or bad)?
>
> Thanks,
>
> Doug
>
>
>


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-13 Thread Josh McKenzie
I was curious if o3 (model from OpenAI) would be able to do a deep dive health 
check on a repo to assist in considering taking it as a dependency. The results 
can be found here: 
https://chatgpt.com/share/684be703-1d4c-8002-b831-f997f829f4b4

Apparently it can, and can do it quite well. This was a useful time saver (and 
honestly did a better job than I usually can in > 10x the time).

I'm +1 to taking the lib as a dependency in core C*. The rest of the 
ecosystem can consume it (more easily if we also move to a cassandra-shared 
shared-library build), and it opens up some interesting opportunities 
for us in both how we test core C* proper and what we expose in tooling.

On Thu, Jun 12, 2025, at 7:36 PM, Paulo Motta wrote:
> I'd prefer to avoid calling an external process and use the library if 
> possible. Not sure about including it in the project by default, but also not 
> against.
> 
> If there's contention about including it, I wonder if it would make sense to 
> explore Java's optional module extension[1] to make this optionally 
> available? I can see this being useful for other extensions if we haven't 
> explored that option.
> 
> Then we could have another project, cassandra-sidecar-extensions (or similar), 
> that would be linked by sidecar/advanced operators to enable an extended 
> feature set in the main process.
> 
> 
> [1] - 
> https://openjdk.org/projects/jigsaw/doc/topics/optional.html


Re: [DISCUSS] CASSSIDECAR-254 - Enabling sidecar to collect async profiles

2025-06-12 Thread Paulo Motta
I'd prefer to avoid calling an external process and use the library if
possible. Not sure about including it in the project by default, but also
not against.

If there's contention about including it, I wonder if it would make sense
to explore Java's optional module extension[1] to make this optionally
available? I can see this being useful for other extensions if we haven't
explored that option.

Then we could have another project, cassandra-sidecar-extensions (or
similar), that would be linked by sidecar/advanced operators to enable an
extended feature set in the main process.

[1] -
https://openjdk.org/projects/jigsaw/doc/topics/optional.html
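
As a rough illustration of the "available optionally" idea, here is a
minimal sketch of a runtime presence check. In the module system the
compile-time counterpart would be a `requires static` clause; the check
below works on the plain class path as well. OptionalProfilerSupport is a
hypothetical helper name, and it assumes the profiler's Java API entry
point is the class one.profiler.AsyncProfiler.

```java
// Hypothetical helper; not part of Cassandra, Sidecar, or async-profiler.
final class OptionalProfilerSupport {

    // Assumed entry-point class of the async-profiler Java API.
    static final String PROFILER_CLASS = "one.profiler.AsyncProfiler";

    // Returns true if the named class can be located, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, OptionalProfilerSupport.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing jar or unloadable class: treat the feature as unavailable.
            return false;
        }
    }

    // Human-readable status an operator-facing tool could report.
    static String status() {
        return isPresent(PROFILER_CLASS)
                ? "profiling extension available"
                : "profiling extension not installed";
    }
}
```

With a check like this, the main process could keep the profiling
endpoints disabled unless the extension jar has been added by the
operator.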

On Thu, 12 Jun 2025 at 17:57 Doug Rohrer wrote:

> Hey folks!
>
> We're looking into enabling the sidecar to collect async profiles from
> Cassandra and, digging through the async-profiler code and usage, it seems
> like there may be a few different ways to do it. I’m curious if other folks
> have already done this beyond just “run asprof with the pid of the
> Cassandra process”, as I’m a bit hesitant to depend on executing an
> external process from the Sidecar to gather the actual profile if we can
> avoid it.
>
> There seem to be some opportunities to integrate the profiler into another
> project (see
> https://github.com/async-profiler/async-profiler/blob/master/docs/IntegratingAsyncProfiler.md#using-java-api)
> but it seems this would end up having to be part of Cassandra, and somehow
> callable via the sidecar (JMX? Some virtual table interface where you
> insert a row to start a profile with the profiler options, and it kicks off
> the profile, dumping the results into the table when it’s done?).
>
> The benefit in putting this functionality into Cassandra would be that
> other consumers (in-jvm dtests, python dtests, other monitoring systems
> where Sidecar isn’t available, easy-cass-lab) would be able to leverage the
> same interface rather than having to re-invent the wheel each time.
>
> Drawback is it’s another library, and one with native library
> dependencies, added to the class path and loaded at runtime.
>
> Thoughts? Previous experiences (good or bad)?
>
> Thanks,
>
> Doug
>