[
https://issues.apache.org/jira/browse/FLINK-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922825#comment-15922825
]
ASF GitHub Bot commented on FLINK-5090:
---------------------------------------
Github user zentol commented on a diff in the pull request:
https://github.com/apache/flink/pull/3348#discussion_r105753418
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/InputGateMetrics.java
---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.partition.consumer;
+
+import org.apache.flink.metrics.Gauge;
+import org.apache.flink.metrics.MetricGroup;
+
+import static org.apache.flink.util.Preconditions.checkNotNull;
+
+public class InputGateMetrics {
+
+ private final SingleInputGate inputGate;
+
+ private long lastTotal = -1;
+
+ private int lastMin = -1;
+
+ private int lastMax = -1;
+
+ private float lastAvg = -1.0f;
+
+ // ------------------------------------------------------------------------
+
+ private InputGateMetrics(SingleInputGate inputGate) {
+ this.inputGate = checkNotNull(inputGate);
+ }
+
+ // ------------------------------------------------------------------------
+
+ // these methods are package private to make access from the nested classes faster
+
+ long refreshAndGetTotal() {
+ long total;
+ if ((total = lastTotal) == -1) {
+ refresh();
--- End diff --
Custom objects can't be displayed properly in the web interface since we
call ```toString()``` on them. The same happens in most reporters, so this isn't
really an option.
As it stands we don't have a single metric that is guaranteed to be 100%
consistent with other metrics. numRecordsOut and numBytesOut do not describe
the same moment in time. Neither is this guaranteed for the checkpoint metrics;
while these are updated all at once (from the outside), there is no mechanism
that prevents this update from happening in the middle of a report.
I don't know a lot about the network stack, so I can't say whether it is truly
necessary to have all metrics describe one point in time.
If this is necessary, the only way I can think of right now is abusing the
View metric type. Views are meant as an add-on for metrics that want to be
updated in regular intervals (5 seconds) regardless of when their value is
actually requested. A metric that only implements the View interface is never
reported, but still updated, so you could have this view update a shared
data structure from which the other gauges simply retrieve the current value.
If this is not necessary, I would simply separate them and not worry about
the performance overhead of the metrics, as long as this doesn't affect the job
via taking locks or similar.
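For illustration, the View-based idea from the comment above could look roughly like the sketch below. It is not code from the pull request. The `Gauge` and `View` interfaces are local stand-ins that only mirror the shape of the Flink metric interfaces, and `QueueLengthSnapshotSketch`, `Snapshot`, `SnapshotView` and the `Supplier<int[]>` queue-length source are hypothetical names. Registration with a metric group and the periodic 5-second invocation of `update()` are left out, so `main` calls `update()` by hand.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class QueueLengthSnapshotSketch {

    /** Local stand-in for a gauge metric (assumption: a single getValue() method). */
    interface Gauge<T> {
        T getValue();
    }

    /** Local stand-in for a view metric (assumption: a single update() method). */
    interface View {
        void update();
    }

    /** Immutable set of values that were all computed in the same update() call. */
    static final class Snapshot {
        static final Snapshot EMPTY = new Snapshot(0L, 0, 0, 0.0f);

        final long total;
        final int min;
        final int max;
        final float avg;

        Snapshot(long total, int min, int max, float avg) {
            this.total = total;
            this.min = min;
            this.max = max;
            this.avg = avg;
        }
    }

    /** View-only metric: never reported itself, it just refreshes the shared snapshot. */
    static final class SnapshotView implements View {
        // Hypothetical source of the current per-channel queue lengths.
        private final Supplier<int[]> queueLengths;
        private final AtomicReference<Snapshot> current =
                new AtomicReference<>(Snapshot.EMPTY);

        SnapshotView(Supplier<int[]> queueLengths) {
            this.queueLengths = queueLengths;
        }

        @Override
        public void update() {
            int[] lengths = queueLengths.get();
            if (lengths.length == 0) {
                current.set(Snapshot.EMPTY);
                return;
            }
            long total = 0;
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int length : lengths) {
                total += length;
                min = Math.min(min, length);
                max = Math.max(max, length);
            }
            // Publish all four values in one atomic swap; no lock is held by readers.
            current.set(new Snapshot(total, min, max, (float) total / lengths.length));
        }

        // The gauges only read the last published snapshot; they never recompute.
        Gauge<Long> totalGauge() { return () -> current.get().total; }
        Gauge<Integer> minGauge() { return () -> current.get().min; }
        Gauge<Integer> maxGauge() { return () -> current.get().max; }
        Gauge<Float> avgGauge() { return () -> current.get().avg; }
    }

    public static void main(String[] args) {
        int[][] source = {{3, 1, 5}};
        SnapshotView view = new SnapshotView(() -> source[0]);

        view.update();                    // would be called periodically by the metric system
        source[0] = new int[] {10, 20};   // queues change, but no update() has run yet

        // Both gauges still report values from the snapshot taken by the last update().
        System.out.println(view.totalGauge().getValue() + " " + view.maxGauge().getValue());
    }
}
```

Note that this only narrows the window for inconsistency. If `update()` runs between two gauge reads within one report, those two gauges still see different snapshots, which is the same limitation the comment describes for the checkpoint metrics.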
> Expose optionally detailed metrics about network queue lengths
> --------------------------------------------------------------
>
> Key: FLINK-5090
> URL: https://issues.apache.org/jira/browse/FLINK-5090
> Project: Flink
> Issue Type: New Feature
> Components: Metrics, Network
> Affects Versions: 1.1.3
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
>
> For debugging purposes, it is important to have access to more detailed
> metrics about the length of network input and output queues.