Hey Ufuk and Stephan,

you've replied on dev@ to a conversation happening on JIRA. I suggest re-posting your messages in JIRA (there is no automated mirroring).
-- Robert

On Tue, Aug 26, 2014 at 11:57 AM, Stephan Ewen <[email protected]> wrote:

> Very cool first prototype, I like it!
>
> I am posting a quick summary of the status and the other ideas that have
> been floating around in the context of the job profiling:
>
> - There is quite a bit of profiling data gathered, but I think some stuff
> is also a bit out of date (for example the gate profiling does not work or
> make sense any more because the internal models changed)
>
> - We are currently thinking to gather data stats (byte and record counts)
> from the operators as well. This could go well together with the profiling.
> It would be good if the profiling code was generic in the sense that it
> allows to transfer arbitrary time series of metrics. It makes sense to
> define scopes for these metrics, such as for example "global (cluster
> profiling)", "single machine (machine profiling)", "operator", so these
> metrics would be displayed in the web frontend in the respective section.
>
> - The memory profiling is a bit senseless right now, because the JVMs are
> always of the roughly same memory size, once ramped up. Instead, I would
> add the "managed memory" of Flink.
>
> - I think a lot of the machine profiling code (cpu utilization, network
> throughput) works currently only on Linux.
>
>
> As a side note: I think it makes sense to integrate the currently separate
> profiling code communication (RPC) with the regular coordination RPCs. That
> is a transparent (probably 50 lines) change once we have Till's changes
> merged, which bases the distributed coordination on Akka.
>
>
> On Tue, Aug 26, 2014 at 10:20 AM, Ufuk Celebi <[email protected]> wrote:
>
> > This GSoC proposal [1] might also be of interest.
> >
> > [1]
> > https://github.com/stratosphere/stratosphere/wiki/GSoC-2014-Project-Proposal-Draft-by-Rajika-Kumarasiri
> >
> >
> > On Tue, Aug 26, 2014 at 10:12 AM, Sebastian Kruse (JIRA) <[email protected]>
> > wrote:
> >
> > >
> > > [
> > > https://issues.apache.org/jira/browse/FLINK-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110468#comment-14110468
> > > ]
> > >
> > > Sebastian Kruse commented on FLINK-964:
> > > ---------------------------------------
> > >
> > > Hey guys,
> > >
> > > I am happy to hear that you like it! :)
> > >
> > > But please also consider that this prototype was thought as a first spike
> > > and baseline for further discussion. There is a lot more profiling data
> > > available, e.g., stats per task manager and execution vertex. I propose to
> > > have a bit of a discussion about what of those data to include and how.
> > >
> > > Cheers,
> > > Sebastian
> > >
> > > > Integrate profiling code with web interface
> > > > -------------------------------------------
> > > >
> > > > Key: FLINK-964
> > > > URL: https://issues.apache.org/jira/browse/FLINK-964
> > > > Project: Flink
> > > > Issue Type: Improvement
> > > > Components: Local Runtime, Webfrontend
> > > > Affects Versions: 0.6-incubating
> > > > Reporter: Stephan Ewen
> > > > Assignee: Jonathan Hasenburg
> > > >
> > > > This issue is subject to discussion.
> > > > The profiling code currently needs to be kept in sync with the job graph
> > > > code, execution graph code, and runtime code.
> > > > Since that part of the code is undergoing quite some changes and the
> > > > profiling code is not used right now, I suggest to remove it, or move it to
> > > > an artifact repository.
> > > >
> > >
> > > --
> > > This message was sent by Atlassian JIRA
> > > (v6.2#6252)
