Hi! Thanks for writing this up. I think it looks quite reasonable (I hope I understood the design correctly).

There is one point of confusion left for me, though: the MetricDumper and MetricSnapshot. I think it is just the names that confuse me here. It looks like they define a way to query the metrics in the Metric Registry in a standard schema (independent of the scope formats). Should the "dumper" maybe be called "MetricsQueryService" or so? (The query service returns a MetricSnapshot, if I understand correctly.)

It would be great if the "query service" did not need metrics to be registered; that saves us some effort during startup / teardown. It looks as if the query service could just use the root-most component metric groups to walk the tree of whatever metrics are currently there and put them into the current snapshot.

One open question that I have: how do you know how to merge the metrics from the subtasks, for example in case you want a metric across subtasks?

In general, not transferring objects (only strings / numbers) would be preferable, because the WebMonitor may run in an environment where no user-code classloader can be used. It may run in the dispatcher (which must be trusted and cannot execute user code).

Greetings,
Stephan


On Thu, Jul 28, 2016 at 3:12 PM, Chesnay Schepler <ches...@apache.org> wrote:

> Hello,
>
> I just created a new FLIP which aims at exposing our metrics to the
> WebInterface.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-7%3A+Expose+metrics+to+WebInterface
>
> Looking forward to feedback :)
>
> Regards,
> Chesnay Schepler
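
P.S. To make the tree-walk idea concrete, here is a rough sketch of what I have in mind. It is only an illustration under my own assumptions: GroupNode, MetricQuerySketch, gauge() and snapshot() are made-up names, not Flink's actual MetricGroup API and not anything from the FLIP. The point is just that the query side walks from a root group on demand (no separate registration step) and emits only strings, so nothing needs the user-code classloader on the receiving side.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a component metric group (NOT Flink's MetricGroup).
final class GroupNode {
    final String name;
    final Map<String, Supplier<Number>> metrics = new LinkedHashMap<>();
    final Map<String, GroupNode> children = new LinkedHashMap<>();

    GroupNode(String name) {
        this.name = name;
    }

    // Returns the child group with the given name, creating it if absent.
    GroupNode child(String childName) {
        return children.computeIfAbsent(childName, GroupNode::new);
    }

    // Attaches a numeric metric to this group; returns this for chaining.
    GroupNode gauge(String metricName, Supplier<Number> value) {
        metrics.put(metricName, value);
        return this;
    }
}

// Hypothetical query service: walks the tree on demand instead of
// requiring every metric to be registered with it up front.
final class MetricQuerySketch {

    // Builds a snapshot mapping "group.path.metricName" to the value as a String.
    static Map<String, String> snapshot(GroupNode root) {
        Map<String, String> out = new LinkedHashMap<>();
        walk(root, root.name, out);
        return out;
    }

    private static void walk(GroupNode node, String path, Map<String, String> out) {
        // Read each metric's current value and convert it to a plain string.
        for (Map.Entry<String, Supplier<Number>> e : node.metrics.entrySet()) {
            out.put(path + "." + e.getKey(), String.valueOf(e.getValue().get()));
        }
        // Recurse into the child groups, extending the path.
        for (GroupNode c : node.children.values()) {
            walk(c, path + "." + c.name, out);
        }
    }

    public static void main(String[] args) {
        GroupNode tm = new GroupNode("taskmanager");
        tm.gauge("heapUsed", () -> 42L);
        tm.child("job").child("task").child("0").gauge("numRecordsIn", () -> 7L);
        System.out.println(snapshot(tm));
    }
}
```

The snapshot here is a flat map of strings only for simplicity. The sketch deliberately leaves the merging question open: the subtask index is just another path segment, and how to aggregate across it would still need to be decided.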