Timothee Maret created SLING-10133:
--------------------------------------

             Summary: Memory leak in MonitoringDistributionPackageBuilder
                 Key: SLING-10133
                 URL: https://issues.apache.org/jira/browse/SLING-10133
             Project: Sling
          Issue Type: Bug
            Reporter: Timothee Maret


The MonitoringDistributionPackageBuilder maintains a list of MBeans for the 
latest packages. The number of packages to be monitored is passed as the 
[queueCapacity|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/monitor/impl/MonitoringDistributionPackageBuilder.java#L49]
 via the constructor. When the queueCapacity is 0, the monitoring is disabled.

[VaultDistributionPackageBuilderFactory|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/serialization/impl/vlt/VaultDistributionPackageBuilderFactory.java#L201]
 and 
[DistributionPackageBuilderFactory|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/serialization/impl/DistributionPackageBuilderFactory.java]
 disable this feature by default. Even so, an environment that runs for 
multiple weeks without a restart and with the default configuration will 
experience a memory leak that leads to the JVM running out of memory.

The implementation has two flaws that explain the memory leak.

 
h2. #1 - Registering an MBean when the queueCapacity is 0

The code [unconditionally registers an 
MBean|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/monitor/impl/MonitoringDistributionPackageBuilder.java#L106]
 even if the queueCapacity is 0. An MBean should be registered only when the 
capacity is > 0.
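
A minimal sketch of such a guard follows. It is an illustration only, not the actual Sling code: the class GuardedMonitor, its method names, and the use of plain strings in place of real MBean registrations are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the builder; not the actual Sling class. */
class GuardedMonitor {
    private final int queueCapacity;

    // Stand-in for the MBean registrations (assumption: one entry per package).
    private final List<String> registered = new ArrayList<>();

    GuardedMonitor(int queueCapacity) {
        this.queueCapacity = queueCapacity;
    }

    /** Called once per created package in this sketch. */
    void onPackageCreated(String packageId) {
        if (queueCapacity <= 0) {
            return; // monitoring disabled: register nothing, retain nothing
        }
        registered.add(packageId);
    }

    int registeredCount() {
        return registered.size();
    }
}
```

With a capacity of 0, the sketch retains nothing no matter how many packages are created. It only shows the guard; bounding the list when the capacity is > 0 is a separate concern.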

h2. #2 - Concurrency issue when un-registering MBeans

The code [attempts to 
remove|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/b80cd8f3bae6b7875387ee7caaea271b7e9baec6/src/main/java/org/apache/sling/distribution/monitor/impl/MonitoringDistributionPackageBuilder.java#L108]
 an MBean by checking whether the size of the MBeans list equals the 
queueCapacity. This check works in a single-threaded context but falls short 
when registerDistributionPackageMBean is invoked concurrently. In the latter 
case, the check may never hold true (for example, if concurrent additions push 
the size past the capacity, an equality check no longer matches), causing the 
mBeans queue to grow indefinitely. One solution is to leverage the features of 
the LinkedBlockingDeque: create a LinkedBlockingDeque with bounded capacity and 
rely on the status returned by the offer method to decide whether an item needs 
to be removed.
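
The proposed approach could look roughly like the sketch below. It is not the actual fix: BoundedRegistry, its method names, and the onEvict callback (standing in for the place where an MBean would be unregistered) are hypothetical. Only LinkedBlockingDeque and its offer and poll methods come from the JDK.

```java
import java.util.concurrent.LinkedBlockingDeque;
import java.util.function.Consumer;

/** Hypothetical bounded registry; names are illustrative, not Sling's API. */
class BoundedRegistry<T> {
    private final LinkedBlockingDeque<T> deque;

    // Assumption: invoked for each evicted item, e.g. to unregister its MBean.
    private final Consumer<T> onEvict;

    BoundedRegistry(int capacity, Consumer<T> onEvict) {
        // LinkedBlockingDeque rejects a capacity below 1, so a caller must
        // skip the registry entirely when monitoring is disabled.
        this.deque = new LinkedBlockingDeque<>(capacity);
        this.onEvict = onEvict;
    }

    void add(T item) {
        // offer() returns false instead of blocking when the deque is full.
        while (!deque.offer(item)) {
            // Evict the oldest item; poll() returns null if another
            // thread emptied the deque in the meantime.
            T oldest = deque.poll();
            if (oldest != null) {
                onEvict.accept(oldest);
            }
        }
    }

    int size() {
        return deque.size();
    }
}
```

Because the deque itself enforces the bound, its size can never exceed the capacity, however many threads call add concurrently. The decision to evict depends on the result of offer rather than on a separate size comparison, which avoids the check-then-act race.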



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
