[ 
https://issues.apache.org/jira/browse/SOLR-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985007#comment-15985007
 ] 

Christine Poerschke commented on SOLR-10506:
--------------------------------------------

Hi Torsten, thanks for finding and analysing this issue, and for putting up a 
fix for it.

I've just tried to send you a pull request with two small suggestions from my 
[jira/solr-10506-branch_6_5|https://github.com/cpoerschke/lucene-solr/tree/jira/solr-10506-branch_6_5] 
branch, but something in the settings didn't allow it. Anyhow, the commit is 
there at the top of the branch ...

bq. ... To eliminate the memory leak, the schema reader is held inside a 
`WeakReference` and the reference is explicitly removed on core close. ...

As I read the code and patch, this will allow the ZkIndexSchemaReader (and 
SolrCore?) to be garbage collected but the Watcher would stay around, right? 
And ideally we'd also wish for the Watcher to not stay around ...
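
To make the point concrete, here is a minimal, self-contained sketch of the pattern as I understand it. It is not the code from the pull request: {{Watcher}} and {{SchemaReader}} below are hypothetical stand-ins for the ZooKeeper {{Watcher}} and {{ZkIndexSchemaReader}}, and {{discardReaderReference}} is an assumed name for whatever the close hook calls.

```java
import java.lang.ref.WeakReference;

public class WeakWatcherSketch {

    /** Hypothetical stand-in for the ZooKeeper Watcher callback interface. */
    interface Watcher {
        void process(String event);
    }

    /** Hypothetical stand-in for ZkIndexSchemaReader. */
    static class SchemaReader {
        int updates = 0;

        void updateSchema() {
            updates++;
        }
    }

    /**
     * Static nested class, so it carries no implicit reference to an
     * enclosing instance; the reader is only weakly reachable through it.
     */
    static class SchemaWatcher implements Watcher {
        private final WeakReference<SchemaReader> readerRef;

        SchemaWatcher(SchemaReader reader) {
            this.readerRef = new WeakReference<>(reader);
        }

        @Override
        public void process(String event) {
            SchemaReader reader = readerRef.get();
            if (reader == null) {
                // Reader was discarded or collected: nothing to update.
                // Note that this watcher object itself is still alive.
                return;
            }
            reader.updateSchema();
        }

        /** Assumed to be called from the core's close hook. */
        void discardReaderReference() {
            readerRef.clear();
        }
    }

    public static void main(String[] args) {
        SchemaReader reader = new SchemaReader();
        SchemaWatcher watcher = new SchemaWatcher(reader);
        watcher.process("schema changed");   // forwarded to the reader
        watcher.discardReaderReference();    // simulates core close
        watcher.process("schema changed");   // ignored, but watcher remains
        System.out.println("updates seen by reader: " + reader.updates);
    }
}
```

In this sketch the reader (and whatever it references) becomes collectable after close, while the small watcher object remains registered, which is the residue described above.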

bq. ... Initially I wanted to supply a test case but unfortunately did not find 
a good starting point ...

Hmm, yeah, tricky. Building upon the above "but the watcher stays around" 
observation, perhaps something like this could work:
{code}
# start instance without cores
# determine baseline number of (managed-schema) watchers
# create a core
# reload the core a couple of times
# delete the core
# determine final number of (managed-schema) watchers
# test that no 'extra' watchers are still around
{code}
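
The shape of that check could be sketched roughly as follows. This is only a simulation of the idea, not a Solr test: {{WatchRegistry}} and {{FakeCore}} are hypothetical stand-ins for the managed-schema watches and for a core, and a real test would have to obtain the watcher counts from ZooKeeper by some means I have not worked out here.

```java
import java.util.ArrayList;
import java.util.List;

public class WatcherLeakCheckSketch {

    /** Hypothetical stand-in for the watches registered on managed-schema. */
    static class WatchRegistry {
        private final List<Object> watchers = new ArrayList<>();

        void register(Object watcher) {
            watchers.add(watcher);
        }

        void unregister(Object watcher) {
            watchers.remove(watcher);
        }

        int count() {
            return watchers.size();
        }
    }

    /** Hypothetical core that registers one schema watcher when created. */
    static class FakeCore {
        private final WatchRegistry registry;
        private final boolean removeOnClose;
        private final Object watcher = new Object();

        FakeCore(WatchRegistry registry, boolean removeOnClose) {
            this.registry = registry;
            this.removeOnClose = removeOnClose;
            registry.register(watcher);
        }

        void close() {
            if (removeOnClose) {
                registry.unregister(watcher);
            }
        }
    }

    /** Runs create / reload / delete and returns watchers left above baseline. */
    static int extraWatchersAfterLifecycle(boolean removeOnClose, int reloads) {
        WatchRegistry registry = new WatchRegistry();   // instance without cores
        int baseline = registry.count();                // baseline watcher count
        FakeCore core = new FakeCore(registry, removeOnClose);  // create a core
        for (int i = 0; i < reloads; i++) {             // reload a couple of times
            FakeCore reloaded = new FakeCore(registry, removeOnClose);
            core.close();
            core = reloaded;
        }
        core.close();                                   // delete the core
        return registry.count() - baseline;             // 'extra' watchers
    }

    public static void main(String[] args) {
        System.out.println("extra (watchers removed on close): "
            + extraWatchersAfterLifecycle(true, 3));
        System.out.println("extra (watchers left behind): "
            + extraWatchersAfterLifecycle(false, 3));
    }
}
```

The test would pass only when the number of extra watchers is zero; in the simulation, leaving watchers behind yields one leftover per core instance created.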

> Possible memory leak upon collection reload
> -------------------------------------------
>
>                 Key: SOLR-10506
>                 URL: https://issues.apache.org/jira/browse/SOLR-10506
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Server
>    Affects Versions: 6.5
>            Reporter: Torsten Bøgh Köster
>         Attachments: solr_collection_reload_13_cores.png, 
> solr_gc_path_via_zk_WatchManager.png
>
>
> Upon manual Solr Collection reloading, references to the closed {{SolrCore}} 
> are not fully removed by the garbage collector as a strong reference to the 
> {{ZkIndexSchemaReader}} is held in a ZooKeeper {{Watcher}} that watches for 
> schema changes.
> In our case, this leads to a massive memory leak as managed resources are 
> still referenced by the closed {{SolrCore}}. Our Solr cloud environment 
> utilizes rather large managed resources (synonyms, stopwords). To reproduce, 
> we fired our environment up and reloaded the collection 13 times. As a result 
> we fully exhausted our heap. A closer look with the YourKit profiler revealed 
> 13 {{SolrCore}} instances, still holding strong references to the garbage 
> collection root (see screenshot 1).
> Each {{SolrCore}} instance holds a single path with strong references to the 
> gc root via a `Watcher` in `ZkIndexSchemaReader` (see screenshot 2). The 
> {{ZkIndexSchemaReader}} registers a close hook in the {{SolrCore}} but the 
> ZooKeeper {{Watcher}} is not removed upon core close.
> We supplied a GitHub pull request 
> (https://github.com/apache/lucene-solr/pull/190) that extracts the ZooKeeper 
> `Watcher` as a static inner class. To eliminate the memory leak, the schema 
> reader is held inside a `WeakReference` and the reference is explicitly 
> removed on core close.
> Initially I wanted to supply a test case but unfortunately did not find a 
> good starting point ...



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
