[
https://issues.apache.org/jira/browse/CAMEL-12969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710187#comment-16710187
]
ASF GitHub Bot commented on CAMEL-12969:
----------------------------------------
bobpaulin commented on issue #2647: CAMEL-12969: Adding ServiceReference Cache
to prevent memory leak.
URL: https://github.com/apache/camel/pull/2647#issuecomment-444518591
@davsclaus Yes, I would also like to avoid the background thread if possible.
The problem I ran into with removing the service reference on the unregister
event is that, per the OSGi spec [1] (and in the Felix implementation), the
event is fired at the beginning of the unregistration, not at the end. So if
the service is looked up after the event fires but before the service
registration is invalidated and removed from Felix's registry [2], it could be
re-cached with no way to remove it other than stopping the Camel context. This
gap between the unregistering event firing and the service actually being
removed makes it hard to invalidate the cache synchronously. The thread allows
the code to check the ungetService return value, which switches to false once
the service is actually gone. That allows the invalidation to work properly
without locking, at the cost of the extra resources allocated to the thread.
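To make the timing gap concrete, here is a minimal, self-contained sketch. It
does not use the real OSGi or Camel APIs: `ToyRegistry`, `CachingLookup`,
`stillRegistered` and `sweep` are hypothetical names invented for this
illustration. The toy registry fires its "unregistering" callbacks before it
removes the service, and a second callback stands in for a lookup that happens
to arrive inside that gap. `stillRegistered` plays the role that the
ungetService return value plays in the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class UnregisterGapSketch {

    /** Hypothetical toy registry: notifies listeners BEFORE removing the service. */
    static class ToyRegistry {
        private final Map<String, Object> services = new HashMap<>();
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void register(String name, Object service) { services.put(name, service); }
        void addUnregisteringListener(Consumer<String> listener) { listeners.add(listener); }
        Object lookup(String name) { return services.get(name); }

        /** Stand-in for the ungetService return value: false once the service is gone. */
        boolean stillRegistered(String name) { return services.containsKey(name); }

        void unregister(String name) {
            for (Consumer<String> listener : listeners) {
                listener.accept(name);   // event is fired first...
            }
            services.remove(name);       // ...the service is removed afterwards
        }
    }

    /** Hypothetical cache that invalidates an entry when the unregistering event arrives. */
    static class CachingLookup {
        final Map<String, Object> cache = new HashMap<>();
        private final ToyRegistry registry;

        CachingLookup(ToyRegistry registry) {
            this.registry = registry;
            registry.addUnregisteringListener(name -> cache.remove(name));
        }

        Object lookup(String name) {
            return cache.computeIfAbsent(name, registry::lookup);
        }

        /** Deferred clean-up: evict entries whose service is really gone. */
        void sweep() {
            cache.keySet().removeIf(name -> !registry.stillRegistered(name));
        }
    }

    /** Returns {stale entry present after the event, stale entry present after the sweep}. */
    static boolean[] demo() {
        ToyRegistry registry = new ToyRegistry();
        CachingLookup cached = new CachingLookup(registry);
        registry.register("myBean", new Object());
        cached.lookup("myBean");

        // Simulates a lookup that lands after the event fired but before removal.
        registry.addUnregisteringListener(name -> cached.lookup(name));
        registry.unregister("myBean");

        boolean staleAfterEvent = cached.cache.containsKey("myBean");
        cached.sweep();
        boolean staleAfterSweep = cached.cache.containsKey("myBean");
        return new boolean[] {staleAfterEvent, staleAfterSweep};
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("stale after event: " + result[0]);
        System.out.println("stale after sweep: " + result[1]);
    }
}
```

In this sketch the event-driven invalidation alone leaves the entry re-cached,
while the later sweep removes it. In the patch that deferred check runs on the
clean-up thread; here it is called inline only to keep the example
deterministic.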
To your point about applying only the fix that gets onContextStop called: I
believe that could be applied separately, and it would improve the situation
for users who use the lookup calls conservatively. Without the other parts
of the patch (such as the cache reintroduction and invalidation strategy), the
ConcurrentLinkedQueue$Node objects would continue to accumulate with each
lookup call. I'd prefer an approach that shields developers from the OSGi
runtime and allows them to make lookup calls as liberally as they can with the
SimpleRegistry and JndiRegistry.
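As a rough illustration of why the cache matters for the queue size, the
sketch below compares a lookup that records a reference on every call with one
that records it only on the first call per name. The class and method names
(`UncachedLookup`, `CachedLookup`, `resolve`) are hypothetical and the code is
not the OsgiServiceRegistry implementation; it only mimics the bookkeeping
pattern described in the issue.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ReferenceQueueGrowthSketch {

    /** Placeholder for fetching the service from the runtime (hypothetical). */
    static Object resolve(String name) {
        return "service:" + name;
    }

    /** Every lookup records a reference, so the queue grows with each call. */
    static class UncachedLookup {
        final Queue<String> referenceQueue = new ConcurrentLinkedQueue<>();

        Object lookup(String name) {
            referenceQueue.add(name);   // one new queue node per call
            return resolve(name);
        }
    }

    /** Only the first lookup per name records a reference. */
    static class CachedLookup {
        final Queue<String> referenceQueue = new ConcurrentLinkedQueue<>();
        private final Map<String, Object> cache = new ConcurrentHashMap<>();

        Object lookup(String name) {
            return cache.computeIfAbsent(name, key -> {
                referenceQueue.add(key);   // one queue node per distinct name
                return resolve(key);
            });
        }
    }

    public static void main(String[] args) {
        UncachedLookup uncached = new UncachedLookup();
        CachedLookup cached = new CachedLookup();
        for (int i = 0; i < 1000; i++) {
            uncached.lookup("myBean");
            cached.lookup("myBean");
        }
        System.out.println("uncached queue size: " + uncached.referenceQueue.size());
        System.out.println("cached queue size:   " + cached.referenceQueue.size());
    }
}
```

The cached variant still needs an invalidation strategy for services that go
away, which is the part of the patch that adds the complexity discussed above.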
I agree with your point about complexity and I'm open to ideas to address
the issue in a way that allows developers to use the Camel Registries in a
uniform way. Also sorry for the length of this note. I wish I could have made
it shorter but I think the issue is a bit tricky.
[1]
https://osgi.org/specification/osgi.core/7.0.0/framework.api.html#org.osgi.framework.ServiceEvent
[2]
https://github.com/apache/felix/blob/trunk/framework/src/main/java/org/apache/felix/framework/ServiceRegistry.java
(specifically the unregisterService method).
> camel-core-osgi: Slow Memory Leak in OsgiServiceRegistry
> --------------------------------------------------------
>
> Key: CAMEL-12969
> URL: https://issues.apache.org/jira/browse/CAMEL-12969
> Project: Camel
> Issue Type: Bug
> Components: camel-osgi
> Affects Versions: 2.18.0, 2.19.0, 2.20.0, 2.21.0, 2.22.0, 2.23.0
> Environment: Java 10
> Karaf 4.2.1
> Camel 2.22.0
> Reporter: Bob Paulin
> Priority: Major
> Fix For: 2.22.3, 2.24.0, 2.23.1
>
> Attachments: ServiceReferenceQueueLeak.PNG,
> ServiceReferenceQueuePostContextStop.PNG,
> ServiceReferenceQueuePreContextStop.PNG, karafCamelContextStop.PNG
>
>
> The OsgiServiceRegistry has a slow memory leak in the serviceReferenceQueue.
> Currently every time a service is looked up by any method an item is added to
> the serviceReferenceQueue. This is required because of OSGi ServiceReference
> counting. However, left unchecked, the system just continues to add
> ConcurrentLinkedQueue$Node objects until memory is exhausted.
> !ServiceReferenceQueueLeak.PNG! .
>
> There is also a second problem with how the registry is being managed within
> the OsgiDefaultCamelContext. OsgiServiceRegistry currently extends
> LifecycleStrategySupport, which is supposed to unload the serviceReferenceQueue
> onContextStop. However the registry is never getting added to the
> CamelContext to manage the Lifecycle because the overridden createRegistry
> method in OsgiDefaultCamelContext is not being called. This is because the
> registry is being set in the constructor of OsgiDefaultCamelContext with
> {code:java}
> super(registry);{code}
> this calls the DefaultCamelContext implementation of createRegistry which
> does not add the registry to lifecycle management since
> {code:java}
> OsgiCamelContextHelper.wrapRegistry(this, registry, bundleContext);{code}
> is never called.
> See serviceReferenceQueue pre context stop
> !ServiceReferenceQueuePreContextStop.PNG!
> !karafCamelContextStop.PNG!
> See serviceReferenceQueue post context stop (still contains objects)
> !ServiceReferenceQueuePostContextStop.PNG!
> Both issues would have existed for some time but may have gone unnoticed
> because the leak was so slow (ConcurrentLinkedQueue$Node takes up very little
> memory). It appears the removal of the cache in
> https://issues.apache.org/jira/browse/CAMEL-9631 makes the leak occur more
> quickly.
>
> I have a patch that involves reintroducing the cache but with an invalidation
> strategy using the OSGi ServiceListener that leverages a single clean up
> thread to remain non-blocking. I'm working on an upstream adaptation and
> will post a PR for community review.
>
>