I'd second Alex's concerns. Is there a reason why you can't use the
async-profiler directly? In what kind of environment are your Flink
clusters running (YARN / k8s / ...)?

Best,
D.

On Wed, Jan 26, 2022 at 4:32 PM Alexander Fedulov <alexan...@ververica.com>
wrote:

> Hi Jacky,
>
> Could you please clarify what kind of *problems* you experience with the
> large parallelism? You referred to D3, is it something related to rendering
> on the browser side or is it about the samples collection process? Were you
> able to identify the bottleneck?
>
> Fundamentally I have some concerns regarding the proposed approach:
> 1. Calling shell scripts triggered via the web UI is a security concern, and
> we need to evaluate carefully whether it could introduce any unexpected
> attack vectors (depending on the implementation, passed parameters, etc.).
> 2. My understanding is that the async-profiler implementation is
> system-dependent. How do you propose to handle multiple architectures?
> Would you like to ship each available implementation within Flink? [1]
> 3. Do you plan to make use of full async-profiler features including native
> calls sampling with perf_events? If so, the issue I see is that some
> environments restrict ptrace calls by default [2]
>
> [1] https://github.com/jvm-profiling-tools/async-profiler#download
> [2]
> https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces
>
>
> Best,
> Alexander Fedulov
>
> On Wed, Jan 26, 2022 at 1:59 PM 李森 <lisen...@icloud.com.invalid> wrote:
>
> > This is a feature we have been looking forward to, as we have also
> > experienced browser crashes with the existing operator-level flame graphs.
> >
> > Best,
> > Echo Lee
> >
> > > On Jan 24, 2022, at 6:16 PM, David Morávek <david.mora...@gmail.com> wrote:
> > >
> > > Hi Jacky,
> > >
> > > The link seems to be broken, here is the correct one [1].
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs
> > >
> > > Best,
> > > D.
> > >
> > >> On Mon, Jan 24, 2022 at 9:48 AM Jacky Lau <281293...@qq.com.invalid> wrote:
> > >>
> > >> Hi All,
> > >>
> > >> I would like to start the discussion on FLIP-213
> > >> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs>,
> > >> which aims to provide TaskManager-level (process-level) flame graphs
> > >> using async-profiler, which is the most popular tool for Java performance
> > >> profiling. Arthas and IntelliJ both use it, and we already support it
> > >> internally at Ant Group.
> > >>
> > >> Flink already supports FLIP-165: Operator's Flame Graphs. It draws
> > >> flame graphs with the front-end library d3-flame-graph, which has some
> > >> problems with jobs of large parallelism.
> > >>
> > >> Please be aware that the FLIP wiki page is not fully complete, since I
> > >> don't know yet whether it will be accepted by the Flink community.
> > >>
> > >> Feel free to add your thoughts to make this feature better! I am
> > >> looking forward to all your responses. Thanks a lot!
> > >>
> > >> Best,
> > >> Jacky Lau
> >
>
