Hey! Instead of adding new logic for this, can we make the flamegraphs enabled by default?
Based on my experience, almost everyone wants it enabled, and it doesn't seem to add any overhead when they are not actually checked on the UI.

Cheers,
Gyula

On Tue, Aug 19, 2025 at 1:27 PM Danny Cranmer <dannycran...@apache.org> wrote:

> Hello Poorvank,
>
> Thanks for driving this, I can understand how dynamically enabling
> FlameGraphs can be powerful, so +1 on the general idea.
>
> 1. Instead of adding a FlameGraph specific REST API did you consider adding
> a more general config API? Similar to that of the dynamic job
> configuration [1] endpoint but for cluster configs instead of job? We could
> add an allow list of supported config options and start with Flamegraph.
> This would allow other configs to use the API in the future without adding
> more APIs.
> 2. nit: As for the UI, I would prefer for the settings to take up less
> space. The new options are at the top of the view, even when not expanded.
>
> Thanks,
> Danny
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-530%3A+Dynamic+job+configuration
>
> On Tue, Aug 12, 2025 at 4:31 AM Poorvank Bhatia <puravbhat...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I would like to open a discussion proposing the ability to enable
> > flamegraphs at runtime and make their configuration, i.e. number of
> > samples, delay between samples, and stack depth, *dynamically adjustable
> > via the Web UI*, without requiring any job or cluster restarts.
> >
> > As of now, enabling flamegraphs requires setting
> > *rest.flamegraph.enabled=true* and restarting the Job. This is not ideal
> > for debugging live issues, especially in production environments.
> >
> > I discussed this idea offline with Roman Khachatryan (author of FLIP-530
> > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-530%3A+Dynamic+job+configuration>),
> > Rui Fan, and Arvid Heise.
> > While Rui noted that this could potentially align
> > with FLIP-530’s direction, Roman confirmed that it’s better handled as a
> > separate effort, since FLIP-530
> > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-530%3A+Dynamic+job+configuration>
> > is scoped to job-level config, whereas this proposal addresses
> > cluster-level observability via RestOptions.
> >
> > For design details, please refer to: Dynamic Flamegraph via UI
> > <https://docs.google.com/document/d/1A9fLFgXMGxQQn6X8WCv7mLL21AnLqrDFvLSHnUg8rLA/edit?tab=t.0#heading=h.s351fc464ma6>
> >
> > I’ve attached a short demo to help visualize the proposed feature and
> > gather feedback. Demo
> > <https://drive.google.com/file/d/1iik6aOc2uc9sFlHFlT8YDX5TKFdoD15u/view?usp=sharing>
> >
> > Looking forward to your thoughts.
> >
> > Regards,
> >
> > Poorvank Bhatia