Hi Sebastian!

There is some profiling code that was used by previous versions of Flink
(Stratosphere). The profiling works, but there is currently nothing that
displays the profiling data.

It would be a great addition to display the profiling data in the web
frontend, or to make it available for download.

Have a look at those classes here:
 - JobManager side :
https://github.com/apache/incubator-flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/profiling/impl/JobManagerProfilerImpl.java
 - TaskManager side :
https://github.com/apache/incubator-flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/profiling/impl/TaskManagerProfilerImpl.java

Daniel Warneke authored those; maybe he can chime in and give a few pointers.

Greetings,
Stephan




On Tue, Aug 19, 2014 at 11:08 AM, Kruse, Sebastian <[email protected]>
wrote:

> Hi everyone,
>
> I want to profile my Flink jobs to find bottlenecks. I read the issue
> https://issues.apache.org/jira/browse/FLINK-964 and my question is
> whether there are currently ongoing efforts to bring the profiling data to
> the web frontend.
>
> Additionally, I was thinking of some kind of logical profiling, that
> measures the elements (like tuples) being passed among the operators. That
> way one could better understand the properties of intermediate data, e.g.,
> join cardinalities. Plotting these data against a time axis, one would come
> up with something like a data flow profile of the job. However, before
> engaging in creating such profiles, I wanted to ask you if the system
> already keeps track of such data. For instance, the job history graphs
> provide something similar, but the scheduling states of tasks are not
> necessarily identical to the data flow through them.
> I would be happy to receive any comments!
>
> Cheers,
> Sebastian
>
