[
https://issues.apache.org/jira/browse/SOLR-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997269#comment-15997269
]
Christine Poerschke commented on SOLR-10506:
--------------------------------------------
bq. ... removing a submitted Zookeeper watcher might be pretty hard ..
Yes, you're right. Somehow I thought it could be done since leader elections
after all can be canceled (there's the ElectionWatcher class in
LeaderElector.java) but that works differently.
Okay, so if we can't remove a watcher, can we perhaps re-use the one we've got?
I've added a commit to my
[jira/solr-10506-branch_6_5|https://github.com/cpoerschke/lucene-solr/tree/jira/solr-10506-branch_6_5]
branch to explore that possibility.
bq. ... Do you have any hints towards an existing test super class? ...
TestReload and TestConfigReload look like possibilities.
StressRamUsageEstimator.testLargeSetOfByteArrays does a before/after memory
measurement. If our test could use a large managed resource (like in your use
case) and do a couple of reloads then perhaps a very measurable difference
could be detected without the fix and a less measurable (but still non-zero)
difference could be detected with the fix, something along those lines?
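
To make the before/after idea concrete, here is a minimal plain-JDK sketch. It is not the code of StressRamUsageEstimator and it does not touch Solr: the class name, the helper names and the 1 MB byte arrays standing in for a large managed resource are all assumptions made for illustration. System.gc() is only a hint to the JVM, so the numbers are approximate and a real test would need generous thresholds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: compares heap growth for a "leaky" and a "clean" reload.
public class ReloadMemorySketch {

    // Approximate used heap; System.gc() is only a hint, so call it a few times.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {
            System.gc();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the given action `reloads` times and returns the change in used heap.
    static long heapGrowthAfter(Runnable reload, int reloads) {
        long before = usedHeapBytes();
        for (int i = 0; i < reloads; i++) {
            reload.run();
        }
        return usedHeapBytes() - before;
    }

    public static void main(String[] args) {
        // Stand-in for the unfixed case: every "reload" leaves 1 MB reachable.
        List<byte[]> leaked = new ArrayList<>();
        long leaky = heapGrowthAfter(() -> leaked.add(new byte[1 << 20]), 13);

        // Stand-in for the fixed case: the 1 MB becomes garbage right away.
        long clean = heapGrowthAfter(() -> {
            byte[] tmp = new byte[1 << 20];
            tmp[0] = 1;
        }, 13);

        System.out.println("growth with leak:    " + leaky + " bytes");
        System.out.println("growth without leak: " + clean + " bytes");
        // Keep `leaked` reachable until after both measurements.
        System.out.println("retained arrays:     " + leaked.size());
    }
}
```

In an actual Solr test the Runnable would presumably trigger a collection reload instead of allocating arrays, and the assertion would compare the two growth figures rather than print them.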
> Possible memory leak upon collection reload
> -------------------------------------------
>
> Key: SOLR-10506
> URL: https://issues.apache.org/jira/browse/SOLR-10506
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Server
> Affects Versions: 6.5
> Reporter: Torsten Bøgh Köster
> Attachments: solr_collection_reload_13_cores.png,
> solr_gc_path_via_zk_WatchManager.png
>
>
> Upon manual Solr Collection reloading, references to the closed {{SolrCore}}
> are not fully removed by the garbage collector as a strong reference to the
> {{ZkIndexSchemaReader}} is held in a ZooKeeper {{Watcher}} that watches for
> schema changes.
> In our case, this leads to a massive memory leak as managed resources are
> still referenced by the closed {{SolrCore}}. Our Solr cloud environment
> utilizes rather large managed resources (synonyms, stopwords). To reproduce,
> we fired our environment up and reloaded the collection 13 times. As a result
> we fully exhausted our heap. A closer look with the Yourkit profiler revealed
> 13 {{SolrCore}} instances that were still strongly reachable from the garbage
> collection root (see screenshot 1).
> Each {{SolrCore}} instance holds a single path with strong references to the
> gc root via a {{Watcher}} in {{ZkIndexSchemaReader}} (see screenshot 2). The
> {{ZkIndexSchemaReader}} registers a close hook in the {{SolrCore}} but the
> ZooKeeper {{Watcher}} is not removed upon core close.
> We supplied a GitHub Pull Request
> (https://github.com/apache/lucene-solr/pull/197) that extracts the ZooKeeper
> {{Watcher}} as a static inner class. To eliminate the memory leak, the schema
> reader is held inside a {{WeakReference}} and the reference is explicitly
> removed on core close.
> Initially I wanted to supply a test case but unfortunately did not find a
> good starting point ...
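
The approach described in the issue text above (a static watcher class that holds the schema reader through a WeakReference and drops it on core close) might look roughly like the following sketch. This is not the code from the pull request: the Watcher interface here is a dependency-free stand-in for org.apache.zookeeper.Watcher, SchemaReader is a hypothetical stand-in for ZkIndexSchemaReader, and all method names are assumptions.

```java
import java.lang.ref.WeakReference;

// Hypothetical sketch of a watcher that does not pin its schema reader in memory.
public class WeakWatcherSketch {

    // Stand-in for org.apache.zookeeper.Watcher so the sketch needs no dependencies.
    interface Watcher {
        void process(String event);
    }

    // Stand-in for ZkIndexSchemaReader; only counts schema refreshes.
    static class SchemaReader {
        int updates = 0;

        void updateSchema() {
            updates++;
        }
    }

    // Static nested class: it carries no implicit reference to an enclosing instance.
    static class SchemaWatcher implements Watcher {
        private volatile WeakReference<SchemaReader> readerRef;

        SchemaWatcher(SchemaReader reader) {
            this.readerRef = new WeakReference<>(reader);
        }

        @Override
        public void process(String event) {
            WeakReference<SchemaReader> ref = readerRef;
            SchemaReader reader = (ref == null) ? null : ref.get();
            if (reader != null) {
                reader.updateSchema();
            }
            // Otherwise the reader was discarded or collected: ignore the event.
        }

        // Intended to be called from the core's close hook.
        void discardReader() {
            readerRef = null;
        }
    }

    public static void main(String[] args) {
        SchemaReader reader = new SchemaReader();
        SchemaWatcher watcher = new SchemaWatcher(reader);

        watcher.process("schema-changed");   // forwarded to the reader
        watcher.discardReader();             // simulates the close hook
        watcher.process("schema-changed");   // silently ignored

        System.out.println("updates seen by reader: " + reader.updates);
    }
}
```

Even if the watcher object itself stays registered with ZooKeeper, it then only keeps a small wrapper alive rather than the reader and everything reachable from it.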