[
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737711#comment-16737711
]
Benedict commented on CASSANDRA-14922:
--------------------------------------
bq. can we wait for Alex to see the latest diff though... I've changed the
patch a bit since he last looked.
Sure thing. In that case I'll start the rebase tomorrow. I've also pushed my
one nit from a quick look through
[here|https://github.com/belliottsmith/cassandra/tree/14922] for Alex to look
at; otherwise I would have simply ninja'd it in (with a comment here, of
course). It just uses the {{HintsBuffer.free}} method instead of directly
invoking {{DirectByteBuffer.cleaner().clean()}}.
bq. Regarding the backport, I am slightly concerned about the NativeLibrary
changes being backported in their current form.
Thanks for highlighting this. I'll be sure to take a close look at the
behaviour on each version we backport to. I expect there will be other places
that need similar treatment to what you've done here, as well, so I need to
double check anyway.
bq. I think the Soft references are coming from
java.io.ObjectStreamClass$Caches.localDescs, but the object serde we're doing
in InvokableInstance is a bit beyond my JVM skills I'm afraid.
No worries at all, thanks very much for reproducing this information here for
posterity. If we ever want to clean this up, it would probably be easiest to
simply avoid ser/deser entirely (or use custom ser/deser), but your approach is
a much more suitable compromise for now. Thanks again also for all the
investigative work to plug these gaps.
> In JVM dtests need to clean up after instance shutdown
> ------------------------------------------------------
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png,
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png,
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png,
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1],
> [example
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1])
> because we use a small container (medium) for unit tests by default and the
> in JVM dtests are leaking a few hundred megabytes of memory per test right
> now. This is not a big deal yet: dtest runs on the larger containers, as
> well as local testing, continue to work because the number of in JVM
> dtests is not yet high enough to cause a problem with more than 2GB of
> available heap. However, we should fix the memory leak so that going forward
> we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I
> believe that we have a few potential issues that are leading to the leaks:
> 1. The
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac291266660c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
> method is not successfully cleaning up all the metrics created by the
> {{CassandraMetricsRegistry}}
> 2. The
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac291266660c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
> method is not waiting for all the instances to finish shutting down and
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
> does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do
> not leak memory.
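
Points 1 and 2 in the description above come down to two things: each instance
drops its metrics on shutdown, and the cluster's close blocks until every
instance has finished. A minimal Java sketch of that shape follows.
FakeInstance, FakeCluster and the metric names are hypothetical stand-ins, not
the real Instance, TestCluster or CassandraMetricsRegistry code, and the
one-minute timeout is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ShutdownSketch {

    // Hypothetical stand-in for one in-JVM instance that owns some metrics.
    public static final class FakeInstance {
        private final Map<String, Object> metrics = new ConcurrentHashMap<>();

        public FakeInstance(int id) {
            metrics.put("table.reads." + id, new Object());
        }

        // Drop every metric reference so nothing keeps the instance reachable.
        public void shutdown() {
            metrics.clear();
        }

        public int metricCount() {
            return metrics.size();
        }
    }

    // Hypothetical stand-in for the test cluster.
    public static final class FakeCluster implements AutoCloseable {
        private final List<FakeInstance> instances = new ArrayList<>();

        public FakeCluster(int size) {
            for (int i = 0; i < size; i++)
                instances.add(new FakeInstance(i));
        }

        // Shut instances down in parallel, then block until all have finished.
        @Override
        public void close() {
            ExecutorService executor =
                Executors.newFixedThreadPool(Math.max(1, instances.size()));
            try {
                List<CompletableFuture<Void>> pending = new ArrayList<>();
                for (FakeInstance instance : instances)
                    pending.add(CompletableFuture.runAsync(instance::shutdown, executor));
                CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]))
                                 .get(1, TimeUnit.MINUTES); // assumed timeout
            } catch (InterruptedException | ExecutionException | TimeoutException e) {
                throw new IllegalStateException("instances did not shut down cleanly", e);
            } finally {
                executor.shutdownNow();
            }
        }
    }

    // Total number of metrics still held across the cluster.
    public static int remainingMetrics(FakeCluster cluster) {
        int total = 0;
        for (FakeInstance instance : cluster.instances)
            total += instance.metricCount();
        return total;
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster(3);
        System.out.println("metrics before close: " + remainingMetrics(cluster));
        cluster.close();
        System.out.println("metrics after close: " + remainingMetrics(cluster));
    }
}
```

Because close only returns once every shutdown future has completed, a test
that creates a new cluster straight afterwards does not overlap with cleanup
still running from the previous one.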
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)