[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622421#comment-15622421 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol closed the pull request at:

    https://github.com/apache/flink/pull/2616

> Port WebFrontend to new metric system
> -
>
>                 Key: FLINK-4733
>                 URL: https://issues.apache.org/jira/browse/FLINK-4733
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics, TaskManager, Webfrontend
>    Affects Versions: 1.1.2
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>             Fix For: 1.2.0
>
> While the WebFrontend has access to the metric system it still relies on
> older code in some parts.
> The TaskManager metrics are still gathered using the Codahale library and
> sent with the heartbeats.
> Task related metrics (numRecordsIn etc) are still gathered using
> accumulators, which are accessed through the execution graph.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15615178#comment-15615178 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2616

The `CoordinatorShutdownTest` fixes look reasonable.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614841#comment-15614841 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2616

Both issues should be fixed now, but I'll let travis take another stab at it.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614772#comment-15614772 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2616

The `CoordinatorShutdownTest` can't be fixed with a cast. This test assumes that the actual ExecutionGraph is still available when a job is finished, since it tries to access the CheckpointCoordinator within it. This no longer works, however: finished jobs are immediately archived, and the archived version does not contain the CheckpointCoordinator.

A possible fix is to not let the job fail immediately but to block it, ask for the ExecutionGraph while it is blocked (which, since the job is still running, actually returns the runtime ExecutionGraph), store the reference, and then let the job fail.
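The "block, capture, then fail" idea described above can be sketched in plain Java. This is only an illustration of the ordering, not the actual test: the class `BlockingFailureSketch`, the latches, and the string placeholders `RUNTIME_GRAPH` / `ARCHIVED_GRAPH` are all hypothetical and stand in for the real job, the runtime ExecutionGraph, and its archived version.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; none of these names come from the Flink code base. */
final class BlockingFailureSketch {

    /** Returns "<state captured while blocked>/<state after the failure>". */
    static String run() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch mayFail = new CountDownLatch(1);
        // Placeholder for "what a graph lookup currently returns".
        AtomicReference<String> lookup = new AtomicReference<>("RUNTIME_GRAPH");

        Thread job = new Thread(() -> {
            started.countDown();
            try {
                mayFail.await(); // the job is blocked instead of failing right away
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // The job fails now; from here on only the archived version is visible.
            lookup.set("ARCHIVED_GRAPH");
        });

        try {
            job.start();
            started.await();
            String captured = lookup.get(); // ask while the job is still running
            mayFail.countDown();            // now let the job fail
            job.join();
            return captured + "/" + lookup.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the latch pair is that the reference is read strictly before the job is released, so the test keeps hold of the runtime object even though later lookups only see the archived one.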
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614739#comment-15614739 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2616

I just found another bug related to forked chains. Right now only the output of a single operator is used as the task output counter, whereas we should actually use both of them. Will fix it while merging.
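One way to picture the forked-chain fix is a task-level counter that sums the output counters of every chain branch instead of reading a single one. The sketch below is an assumption about the shape of such a fix, not the code that was merged; `TaskOutputCounter` and `addBranch` are hypothetical names, and `AtomicLong` stands in for the operator counters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical task-level counter that aggregates the outputs of all chain branches. */
final class TaskOutputCounter {
    private final List<AtomicLong> branches = new ArrayList<>();

    /** Registers the numRecordsOut counter of one operator that ends a chain branch. */
    void addBranch(AtomicLong operatorCounter) {
        branches.add(operatorCounter);
    }

    /** Sums the live values, so later increments on any branch are reflected. */
    long getCount() {
        long sum = 0;
        for (AtomicLong counter : branches) {
            sum += counter.get();
        }
        return sum;
    }
}
```

Because the counters are held by reference and summed on read, the task-level value stays current without any extra bookkeeping in the operators.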
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612284#comment-15612284 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2616

I tested the change locally, it works. +1 to merge.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15612172#comment-15612172 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2616

Thank you for rebasing.

This run had the following error:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/170428661/log.txt

```
Failed tests:
  CoordinatorShutdownTest.testCoordinatorShutsDownOnFailure:94 org.apache.flink.runtime.executiongraph.ArchivedExecutionGraph cannot be cast to org.apache.flink.runtime.executiongraph.ExecutionGraph
```

I suspect we need to change the cast there to `AccessExecutionGraph`.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605046#comment-15605046 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2616

Note that 2 new methods were introduced into the `StreamConfig`: `setChainStart() / isChainStart()` to better determine which output counter should be used for the task. Previously this used the same logic as the operator name extraction, which was bugged for multi-chains.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605036#comment-15605036 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2616

@rmetzger Rebased version is up.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604837#comment-15604837 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/2616

Will rebase now, this will also fix the test failure.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604835#comment-15604835 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2616#discussion_r84868336

    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JobVertexDetailsHandler.java ---
    @@ -99,11 +83,34 @@ public String handleRequest(ExecutionJobVertex jobVertex, Mappa
         gen.writeNumberField("end-time", endTime);
         gen.writeNumberField("duration", duration);

    +    IOMetrics ioMetrics = vertex.getCurrentExecutionAttempt().getIOMetrics();
    +
    +    long numBytesIn = 0;
    +    long numBytesOut = 0;
    +    long numRecordsIn = 0;
    +    long numRecordsOut = 0;
    +
    +    if (ioMetrics != null) { // execAttempt is already finished, use final metrics stored in ExecutionGraph
    +        numBytesIn = ioMetrics.getNumBytesInLocal() + ioMetrics.getNumBytesInRemote();
    +        numBytesOut = ioMetrics.getNumBytesOut();
    +        numRecordsIn = ioMetrics.getNumRecordsIn();
    +        numRecordsOut = ioMetrics.getNumRecordsOut();
    +    } else { // execAttempt is still running, use MetricQueryService instead
    +        fetcher.update();
    +        MetricStore.SubtaskMetricStore metrics = fetcher.getMetricStore().getSubtaskMetricStore(vertex.getJobId().toString(), vertex.getJobvertexId().toString(), vertex.getParallelSubtaskIndex());
    +        if (metrics != null) {
    +            numBytesIn += Long.valueOf(metrics.getMetric("numBytesInLocal", "0")) + Long.valueOf(metrics.getMetric("numBytesInRemote", "0"));
    +            numBytesOut += Long.valueOf(metrics.getMetric("numBytesOut", "0"));
    +            numRecordsIn += Long.valueOf(metrics.getMetric("numRecordsIn", "0"));
    +            numRecordsOut += Long.valueOf(metrics.getMetric("numRecordsOut", "0"));
    --- End diff --

Agreed, I've opened a JIRA for that, see FLINK-4906.
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604826#comment-15604826 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2616#discussion_r84867488

    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JobDetailsHandler.java ---
    @@ -147,11 +143,36 @@ public String handleRequest(ExecutionGraph graph, Mapparams) th
         }
         gen.writeEndObject();

    +    long numBytesIn = 0;
    +    long numBytesOut = 0;
    +    long numRecordsIn = 0;
    +    long numRecordsOut = 0;
    +
    +    for (ExecutionVertex vertex : ejv.getTaskVertices()) {
    +        IOMetrics ioMetrics = vertex.getCurrentExecutionAttempt().getIOMetrics();
    +
    +        if (ioMetrics != null) { // execAttempt is already finished, use final metrics stored in ExecutionGraph
    +            numBytesIn += ioMetrics.getNumBytesInLocal() + ioMetrics.getNumBytesInRemote();
    --- End diff --

That's a relic from the previous iterations, `getNumBytesInTotal()` did not always exist :)
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604780#comment-15604780 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2616#discussion_r84697528

    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JobVertexDetailsHandler.java ---
    @@ -99,11 +83,34 @@ public String handleRequest(ExecutionJobVertex jobVertex, Mappa
         gen.writeNumberField("end-time", endTime);
         gen.writeNumberField("duration", duration);

    +    IOMetrics ioMetrics = vertex.getCurrentExecutionAttempt().getIOMetrics();
    +
    +    long numBytesIn = 0;
    +    long numBytesOut = 0;
    +    long numRecordsIn = 0;
    +    long numRecordsOut = 0;
    +
    +    if (ioMetrics != null) { // execAttempt is already finished, use final metrics stored in ExecutionGraph
    +        numBytesIn = ioMetrics.getNumBytesInLocal() + ioMetrics.getNumBytesInRemote();
    +        numBytesOut = ioMetrics.getNumBytesOut();
    +        numRecordsIn = ioMetrics.getNumRecordsIn();
    +        numRecordsOut = ioMetrics.getNumRecordsOut();
    +    } else { // execAttempt is still running, use MetricQueryService instead
    +        fetcher.update();
    +        MetricStore.SubtaskMetricStore metrics = fetcher.getMetricStore().getSubtaskMetricStore(vertex.getJobId().toString(), vertex.getJobvertexId().toString(), vertex.getParallelSubtaskIndex());
    +        if (metrics != null) {
    +            numBytesIn += Long.valueOf(metrics.getMetric("numBytesInLocal", "0")) + Long.valueOf(metrics.getMetric("numBytesInRemote", "0"));
    +            numBytesOut += Long.valueOf(metrics.getMetric("numBytesOut", "0"));
    +            numRecordsIn += Long.valueOf(metrics.getMetric("numRecordsIn", "0"));
    +            numRecordsOut += Long.valueOf(metrics.getMetric("numRecordsOut", "0"));
    --- End diff --

The metric names are used in many places. I think we should use constants for them.
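As a rough illustration of the suggestion to use constants, the metric names from the diff could be collected in one place. The holder class `IoMetricNames` and its helpers are hypothetical (the actual change is tracked in FLINK-4906 and may look different), and a plain `Map<String, String>` stands in for the metric store.

```java
import java.util.Map;

/** Hypothetical holder for the metric names used in the diff; not a Flink class. */
final class IoMetricNames {
    static final String NUM_BYTES_IN_LOCAL = "numBytesInLocal";
    static final String NUM_BYTES_IN_REMOTE = "numBytesInRemote";
    static final String NUM_BYTES_OUT = "numBytesOut";
    static final String NUM_RECORDS_IN = "numRecordsIn";
    static final String NUM_RECORDS_OUT = "numRecordsOut";

    private IoMetricNames() {}

    /** Reads a metric that is stored as a string, falling back to 0 when it is absent. */
    static long readLong(Map<String, String> metrics, String name) {
        return Long.parseLong(metrics.getOrDefault(name, "0"));
    }

    /** Total incoming bytes: local plus remote. */
    static long totalBytesIn(Map<String, String> metrics) {
        return readLong(metrics, NUM_BYTES_IN_LOCAL) + readLong(metrics, NUM_BYTES_IN_REMOTE);
    }
}
```

With the names defined once, a typo in a handler becomes a compile error rather than a metric that silently reads as 0.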
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15604779#comment-15604779 ]

ASF GitHub Bot commented on FLINK-4733:
---

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2616#discussion_r84696513

    --- Diff: flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/handlers/JobDetailsHandler.java ---
    @@ -147,11 +143,36 @@ public String handleRequest(ExecutionGraph graph, Mapparams) th
         }
         gen.writeEndObject();

    +    long numBytesIn = 0;
    +    long numBytesOut = 0;
    +    long numRecordsIn = 0;
    +    long numRecordsOut = 0;
    +
    +    for (ExecutionVertex vertex : ejv.getTaskVertices()) {
    +        IOMetrics ioMetrics = vertex.getCurrentExecutionAttempt().getIOMetrics();
    +
    +        if (ioMetrics != null) { // execAttempt is already finished, use final metrics stored in ExecutionGraph
    +            numBytesIn += ioMetrics.getNumBytesInLocal() + ioMetrics.getNumBytesInRemote();
    --- End diff --

(no need to update) There's a `getNumBytesInTotal()` method for this ;)
[jira] [Commented] (FLINK-4733) Port WebFrontend to new metric system
[ https://issues.apache.org/jira/browse/FLINK-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559484#comment-15559484 ]

ASF GitHub Bot commented on FLINK-4733:
---

GitHub user zentol opened a pull request:

    https://github.com/apache/flink/pull/2616

[FLINK-4733] Port WebInterface to metric system

# This PR relies on #2613, #2614 and #2615. Thus, the first 5 commits should not be reviewed here.

This PR ports the remaining parts of the WebInterface to rely on the metric system.

# TaskManager metrics

In a7011e8305d7c828fabc4245358c2d21568fd561 the TaskManagersHandler is modified to use the metric system. In addition, the garbage collector section in the WebInterface was enhanced to no longer rely on hard-coded GC names, but instead be dynamic. The recently introduced network metrics have been added as well.

cbff6d6aab80bc423a09aa6b62c80a2f409d796a then removes the remnants of the old metrics that are now unused. This affects the TaskManager (which no longer gathers these metrics) and the Heartbeat messages (which no longer include a metrics report). As a result the DropWizard dependency was removed. The transitive jackson dependency is now explicitly set for both flink-runtime and flink-runtime-web.

# Task metrics

The WebInterface shows how many records/bytes each task has received or sent. Until now these were gathered with system-specific accumulators. In cab25496ff5991de60e757f68c5d5139c86f34ba these accumulators were removed.

Under the new system, bytes In/Out is measured per task (since it doesn't make sense within chained operators), while records In/Out is measured per operator. In order to display the records metrics for each task it was thus necessary to "reuse" some operator counters for the task.
This is implemented in 16983485198a61bec0418adb833508dcaf276170 by re-registering the numRecordsIn counter of the first operator in the chain and the numRecordsOut counter of the last operator on the task level.

This re-use could (sadly) not be done automatically within the metric system. Instead, 2 helper methods were added to the OperatorIOMetricGroup. They are called, for example, within BatchTask#invoke() and forward the counters to the TaskIOMetricGroup, where they are stored and re-registered.

With these metrics being re-registered they can be accessed easily via the MetricQueryService from the WebInterface handlers. The downside is that this service provides no guarantee that the most up-to-date metrics for a finished task will be transferred. It was thus necessary to store a snapshot of these IOMetrics within the ExecutionGraph, similar to the system accumulators, which the handlers could access as well.

The handlers were finally adjusted in 8be5145a9406dc8d6d661299c9ee98aa09233df4. For running tasks they access metrics via the MetricQueryService, whereas for finished tasks they rely on the metrics stored in the ExecutionGraph.
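The handler behaviour described above (final snapshot for finished tasks, queried metrics for running ones) can be sketched roughly as follows. The types here are simplified stand-ins, not Flink classes: `IoSnapshot` and `IoMetricsResolver` are hypothetical names, and a `Map<String, String>` plays the role of the metrics fetched via the MetricQueryService.

```java
import java.util.Map;

/** Hypothetical stand-in for the final IO metrics stored with a finished attempt. */
final class IoSnapshot {
    final long numRecordsIn;
    final long numRecordsOut;

    IoSnapshot(long numRecordsIn, long numRecordsOut) {
        this.numRecordsIn = numRecordsIn;
        this.numRecordsOut = numRecordsOut;
    }
}

/** Hypothetical helper that picks the metric source the way the handlers are described to. */
final class IoMetricsResolver {
    private IoMetricsResolver() {}

    /** Returns {numRecordsIn, numRecordsOut}. */
    static long[] resolve(IoSnapshot snapshot, Map<String, String> liveMetrics) {
        if (snapshot != null) {
            // Attempt already finished: use the final values stored with the graph.
            return new long[] {snapshot.numRecordsIn, snapshot.numRecordsOut};
        }
        if (liveMetrics == null) {
            // Attempt still running but nothing fetched yet: report zeros.
            return new long[] {0L, 0L};
        }
        // Attempt still running: use the most recently fetched values.
        return new long[] {
            Long.parseLong(liveMetrics.getOrDefault("numRecordsIn", "0")),
            Long.parseLong(liveMetrics.getOrDefault("numRecordsOut", "0"))
        };
    }
}
```

Preferring the snapshot whenever it exists reflects the stated concern that the query path gives no guarantee of delivering the last values of a finished task.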
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zentol/flink 4733_metrics_port

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2616.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2616

commit 5f0f3598fa5d0fdf8b61d591e2bb94b74924ee0d
Author: zentol
Date:   2016-10-07T11:02:10Z

    [FLINK-4773] [metrics] [refactor] Rename IOMetricGroup to TaskIOMetricGroup

commit df40a58c74e7f0fc3feec4a5848f1627bf4537dd
Author: zentol
Date:   2016-10-05T13:04:03Z

    [FLINK-4773] [metrics] [refactor] Introduce OperatorIOMetricGroup

commit 2685f6a908a0ce4cc9fe3d97beca005ea3d59ee5
Author: zentol
Date:   2016-10-07T08:11:31Z

    [FLINK-4772] [metrics] Store metrics as strings in MetricStore

commit 33297e716a0a327fad20331813a582642c5e68e3
Author: zentol
Date:   2016-10-07T08:16:49Z

    [FLINK-4775] [metrics] Simplify MetricStore access

commit dfed8166272b361684594f61b401c38f0d68ebd6
Author: zentol
Date:   2016-10-07T11:11:58Z

    [FLINK-4774] [metrics] [hotfix] Fix scope concatenation in QueryScopeInfo

commit a7011e8305d7c828fabc4245358c2d21568fd561
Author: zentol
Date:   2016-10-07T11:12:31Z

    [FLINK-4733] [metrics] Port TaskManagersHandler

commit cbff6d6aab80bc423a09aa6b62c80a2f409d796a
Author: zentol
Date:   2016-10-07T11:12:41Z

    [FLINK-4733] [metrics] Remove old TaskManager metrics

commit cab25496ff5991de60e757f68c5d5139c86f34ba
Author: zentol
Date:   2016-10-05T13:12:22Z

    [FLINK-4733] [metrics] Remove system accumulators

commit 16983485198a61bec0418adb833508dcaf276170