zentol commented on a change in pull request #6850: [FLINK-10252] Handle 
oversized metric messages
URL: https://github.com/apache/flink/pull/6850#discussion_r238982072
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/dump/MetricQueryService.java
 ##########
 @@ -119,6 +121,101 @@ public void onReceive(Object message) {
                }
        }
 
+       public void unregister(Metric metric) {
+               if (metric instanceof Counter) {
+                       this.counters.remove(metric);
+               } else if (metric instanceof Gauge) {
+                       this.gauges.remove(metric);
+               } else if (metric instanceof Histogram) {
+                       this.histograms.remove(metric);
+               } else if (metric instanceof Meter) {
+                       this.meters.remove(metric);
+               }
+       }
+
+       private MetricDumpSerialization.MetricSerializationResult verifyResultAndUnregisterOversizedMetrics(
+               MetricDumpSerialization.MetricSerializationResult serializationResult) {
+
+               byte[] serializedCounters;
+               int numCounters;
+               if (serializationResult.serializedCounters.length > maximumFramesize) {
+                       LOG.warn("The serialized counter metric is larger than the maximum frame size, " +
+                               " so maybe not all metrics would be reported.");
+                       serializedCounters = new byte[0];
+                       numCounters = 0;
+
+                       for (Metric metric : this.counters.keySet()) {
+                               unregister(metric);
 
 Review comment:
   So I have been thinking about this for a while now. Un-registering metrics 
because they don't fit may be a bad idea after all.
   
   These metrics will never be registered again; they are lost, potentially due 
to some short-term issue with the number of metrics. In other words, _some_ 
short-term issue affects the cluster in a permanent way and can only be 
resolved by restarting it.
   
   Let's not un-register them for now. We pay the _kind of unnecessary_ 
serialization cost, but that's fine for now. The PR improves the current 
situation anyway; we can handle this kind of optimization in a follow-up.
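   
   A rough sketch of the idea, not the actual `MetricQueryService` code: the 
oversized category is dropped from the current response only, and the registry 
is left untouched. `OversizedDumpSketch`, `CategoryDump` and `dropIfOversized` 
are hypothetical names used purely for illustration; only `maximumFramesize` 
comes from the diff above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OversizedDumpSketch {

    /** Hypothetical stand-in for one serialized metric category (e.g. all counters). */
    static final class CategoryDump {
        final byte[] bytes;
        final int count;

        CategoryDump(byte[] bytes, int count) {
            this.bytes = bytes;
            this.count = count;
        }
    }

    /**
     * Returns the dump unchanged if it fits into the frame, otherwise an empty
     * dump for this query only. The metric registry is never modified here, so
     * the same metrics are serialized again on the next query.
     */
    static CategoryDump dropIfOversized(CategoryDump dump, long maximumFramesize) {
        if (dump.bytes.length > maximumFramesize) {
            return new CategoryDump(new byte[0], 0);
        }
        return dump;
    }

    public static void main(String[] args) {
        // Hypothetical registry of counters; it stays intact throughout.
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("numRecordsIn", 42L);
        counters.put("numRecordsOut", 17L);

        // Pretend serialization: 8 bytes per counter value.
        CategoryDump serialized =
            new CategoryDump(new byte[8 * counters.size()], counters.size());

        CategoryDump withSmallFrame = dropIfOversized(serialized, 10);
        CategoryDump withLargeFrame = dropIfOversized(serialized, 1024);

        // Reported counts for both frame sizes, then the registry size.
        System.out.println(
            withSmallFrame.count + " " + withLargeFrame.count + " " + counters.size());
    }
}
```

   With this shape, a temporary spike in the number of metrics only costs one 
empty category in one response; once the serialized size drops below the frame 
size again, reporting resumes without a restart.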
   
   WDYT?
