[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718514#comment-16718514
 ] 

Joseph Lynch edited comment on CASSANDRA-14922 at 12/12/18 6:43 PM:
--------------------------------------------------------------------

I think I figured it out: our version of {{jna.Native}} holds a static map 
called 
[{{options}}|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/Native.java#L104]
 which holds non-collectable references to the 
[{{InstanceClassLoader}}|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/Native.java#L1528];
 I believe this leak is what the last two references were talking about. I 
tried cleanly unloading or unregistering those, but that still doesn't remove 
the reference to the ClassLoader in the {{options}} map, so I just hacked 
together a reflection-based solution (similar to what I did for the 
{{ThreadLocal}} variables) in 
[6573176|https://github.com/jolynch/cassandra/commit/6573176ef3ed9601ce7f02602c964d478c6a5741].
 According to YourKit there are now no more strong references to the 
{{InstanceClassLoader}} instances (attached).
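
For readers unfamiliar with this kind of workaround, clearing a private static map through reflection looks roughly like the sketch below. It is a minimal illustration and not the code from the commit: {{LeakyHolder}} is a hypothetical stand-in for a library class with a private static map (it is not JNA's {{Native}}), and {{clearStaticMap}} is an illustrative helper name.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

final class StaticMapCleaner {

    // Hypothetical stand-in for a third-party class that pins objects
    // in a private static map. Not a real library class.
    static final class LeakyHolder {
        private static final Map<Object, Object> options = new HashMap<>();

        static void register(Object key, Object value) {
            options.put(key, value);
        }

        static int size() {
            return options.size();
        }
    }

    // Looks up a static Map field by name, clears it, and returns how many
    // entries were dropped. Reflection failures are rethrown unchecked.
    static int clearStaticMap(Class<?> owner, String fieldName) {
        try {
            Field field = owner.getDeclaredField(fieldName);
            field.setAccessible(true);              // bypass 'private'
            Map<?, ?> map = (Map<?, ?>) field.get(null); // null target: static field
            int removed = map.size();
            map.clear();
            return removed;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not clear " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        // Simulate a per-instance object being pinned by the static map.
        LeakyHolder.register("instance-1", new Object());
        int removed = clearStaticMap(LeakyHolder.class, "options");
        System.out.println("removed " + removed + ", left " + LeakyHolder.size());
    }
}
```

The obvious downside is that this depends on a private field name, so it can break silently when the library is upgraded.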

 !no_more_references.png! 

We leak a lot less memory with my latest patches, but for some reason the 
class loaders still aren't going away and are still retaining some heap, just 
a lot less. Something is still missing; maybe we need something like 
{{CMSClassUnloadingEnabled}}?
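
One way to probe whether a class loader is actually reclaimable, independent of a profiler, is to hold it only through a {{WeakReference}} and see whether the reference clears after GC. The sketch below is an assumption-laden illustration, not part of the patch: {{ClassLoaderLeakCheck}} and {{isCollected}} are hypothetical names, an empty {{URLClassLoader}} stands in for the real instance loader, and {{System.gc()}} is only a hint, so a {{false}} result does not by itself prove a leak.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;

final class ClassLoaderLeakCheck {

    // Requests GC up to 'attempts' times and reports whether the referent
    // has been cleared. System.gc() is advisory, so false is inconclusive.
    static boolean isCollected(WeakReference<?> ref, int attempts) {
        for (int i = 0; i < attempts && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50); // give the collector a moment to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        // Stand-in for a per-instance class loader (no URLs, no parent).
        ClassLoader loader = new URLClassLoader(new URL[0], null);
        WeakReference<ClassLoader> ref = new WeakReference<>(loader);

        boolean whileHeld = isCollected(ref, 3);
        // Touch the loader after the check so it stays strongly reachable.
        System.out.println("collected while held (" + loader + "): " + whileHeld);

        loader = null; // drop the last strong reference
        System.out.println("collected after release: " + isCollected(ref, 20));
    }
}
```

If the reference never clears for a real instance loader, something still holds it strongly; if it does clear but heap usage stays high, the retained memory is more likely elsewhere.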


was (Author: jolynch):
I think I figured it out, it turns out that our version of {{jna.Native}} holds 
a static map that contains the custom classloaders called {{options}}, which is 
what the last two references were talking about. I tried cleanly unloading or 
unregistering those but that still doesn't remove the reference to the 
ClassLoader in the {{options}} map so I just hacked some reflection based 
solution together (similar to what I did for the {{ThreadLocal}} variables) in  
[6573176|https://github.com/jolynch/cassandra/commit/6573176ef3ed9601ce7f02602c964d478c6a5741].
 According to Yourkit there are now no more strong references to the 
{{InstanceClassLoader}} instances (attached).

 !no_more_references.png! 

We leak a lot less memory with my latest patches, but ... for some reason the 
class loaders aren't going away and are still retaining some heap, just a lot 
let ... Still something missing, maybe something like 
{{CMSClassUnloadingEnabled}} or some such?

> In JVM dtests need to clean up after instance shutdown
> ------------------------------------------------------
>
>                 Key: CASSANDRA-14922
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Testing
>            Reporter: Joseph Lynch
>            Assignee: Joseph Lynch
>            Priority: Minor
>         Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in-JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: dtest runs on the larger containers and 
> local testing both continue to work, because the number of in-JVM dtests is 
> not yet high enough to cause a problem with more than 2GB of available heap. 
> However, we should fix the memory leak so that going forward we can add more 
> in-JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac291266660c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
> 2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac291266660c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.



