Hi everyone,

I want to profile my Flink jobs to find bottlenecks. I read the issue 
https://issues.apache.org/jira/browse/FLINK-964 and would like to know whether 
there are any ongoing efforts to bring the profiling data to the web 
frontend.

Additionally, I was thinking of some kind of logical profiling that counts 
the elements (e.g., tuples) passed between operators. That way one could 
better understand the properties of intermediate data, e.g., join 
cardinalities. Plotting these counts against a time axis, one would end up with 
something like a data flow profile of the job. However, before starting to 
build such profiles, I wanted to ask whether the system already keeps track 
of such data. For instance, the job history graphs provide something similar, 
but the scheduling states of tasks are not necessarily identical to the data 
flow through them.
I would be happy about any comments!
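
To make the logical profiling idea a bit more concrete, here is a rough sketch 
in plain Java. It does not use any Flink API: CountingIterator, its record() 
method, and the bucket size are hypothetical names chosen for illustration. 
The assumption is that an operator's input or output can be wrapped in an 
iterator, so that every element passing through is counted per time bucket.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical helper (not a Flink class): wraps an iterator and counts
 * the elements that pass through it, grouped into fixed-size time buckets.
 */
class CountingIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final long bucketMillis;
    // bucket start timestamp (ms) -> number of elements seen in that bucket
    private final TreeMap<Long, Long> profile = new TreeMap<>();

    CountingIterator(Iterator<T> source, long bucketMillis) {
        if (bucketMillis <= 0) {
            throw new IllegalArgumentException("bucketMillis must be positive");
        }
        this.source = source;
        this.bucketMillis = bucketMillis;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        T element = source.next();
        record(System.currentTimeMillis());
        return element;
    }

    /** Counts one element at the given timestamp (exposed to allow explicit timestamps). */
    void record(long timestampMillis) {
        long bucket = Math.floorDiv(timestampMillis, bucketMillis) * bucketMillis;
        profile.merge(bucket, 1L, Long::sum);
    }

    /** Total number of elements seen so far, e.g., a join cardinality. */
    long total() {
        long sum = 0L;
        for (long count : profile.values()) {
            sum += count;
        }
        return sum;
    }

    /** Read-only view of the per-bucket counts, ordered by time. */
    Map<Long, Long> profile() {
        return Collections.unmodifiableMap(profile);
    }
}

class DataFlowProfileDemo {
    public static void main(String[] args) {
        // Stand-in for the output of some operator, e.g., a join.
        List<String> joinOutput = Arrays.asList("a", "b", "c");
        CountingIterator<String> counted =
                new CountingIterator<>(joinOutput.iterator(), 1000L);
        while (counted.hasNext()) {
            counted.next();
        }
        System.out.println("elements: " + counted.total());
        System.out.println("per-bucket counts: " + counted.profile());
    }
}
```

The per-bucket map is what one would plot against the time axis. In a real 
job the counts would of course have to be collected per parallel task and 
merged somewhere, which this sketch leaves out entirely.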

Cheers,
Sebastian
