Sumit Agrawal created HDDS-7349:
-----------------------------------

             Summary: Flaky integration test have memory leak for 
RatisDropwizardExports
                 Key: HDDS-7349
                 URL: https://issues.apache.org/jira/browse/HDDS-7349
             Project: Apache Ozone
          Issue Type: Bug
            Reporter: Sumit Agrawal


Memory leak observed in flaky integration test causing OutOfMemory multiple 
times, and causing the cancel of the case.

1. registeration of RatisDropwizardExports is done multiple time, even its 
registered.
-- This is called by Ratis CopyOnWrite when another metrics get registered.
-- Since the track of above object is lost, no way to cleanup later

------------------stack showing registeration-----------
  at 
io.prometheus.client.CollectorRegistry.register(Lio/prometheus/client/Collector;)V
 (CollectorRegistry.java:60)
  at 
org.apache.hadoop.hdds.server.http.RatisDropwizardExports.registerDropwizard(Lorg/apache/ratis/metrics/RatisMetricRegistry;Ljava/util/Map;)V
 (RatisDropwizardExports.java:75)
  at 
org.apache.hadoop.hdds.server.http.RatisDropwizardExports.lambda$registerRatisMetricReporters$0(Ljava/util/Map;Lorg/apache/ratis/metrics/RatisMetricRegistry;)V
 (RatisDropwizardExports.java:53)
  at 
org.apache.hadoop.hdds.server.http.RatisDropwizardExports$$Lambda$384.accept(Ljava/lang/Object;)V
 (Unknown Source)
  at 
org.apache.ratis.metrics.impl.MetricRegistriesImpl.lambda$null$0(Lorg/apache/ratis/metrics/RatisMetricRegistry;Ljava/util/function/Consumer;)V
 (MetricRegistriesImpl.java:72)
  at 
org.apache.ratis.metrics.impl.MetricRegistriesImpl$$Lambda$618.accept(Ljava/lang/Object;)V
 (Unknown Source)
  at 
java.util.concurrent.CopyOnWriteArrayList.forEach(Ljava/util/function/Consumer;)V
 (CopyOnWriteArrayList.java:895)
  at 
org.apache.ratis.metrics.impl.MetricRegistriesImpl.lambda$create$1(Lorg/apache/ratis/metrics/MetricRegistryInfo;)Lorg/apache/ratis/metrics/RatisMetricRegistry;
 (MetricRegistriesImpl.java:72)
------------------------------------

2. after stop of Datanode, still ratis registeration going on causing to get 
accumulated.

This is observed that Datanode is stopped, but still Datanode having 
"ratisMetricsMap" having new metrics registered.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to