[ 
https://issues.apache.org/jira/browse/SOLR-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035222#comment-16035222
 ] 

Christine Poerschke commented on SOLR-10506:
--------------------------------------------

bq. ... to have as a separate issue ...
bq. ... preference would be to open a separate issue for that ...

Sounds good.

I just returned to this. Maybe picking it up on a Friday evening was a mistake 
and it will all be clearer next week, but I am struggling to convince myself 
that the proposed removal of the watcher re-creation in 
_ZkIndexSchemaReader.command()_ is appropriate. The existing comment on the 
method says
{code}
  /**
   * Called after a ZooKeeper session expiration occurs; need to re-create the watcher and update the current
   * schema from ZooKeeper.
   */
{code}
and the _ZkController.addOnReconnectListener(OnReconnect listener)_ method has 
the comment
{code}
  /**
   * Add a listener to be notified once there is a new session created after a ZooKeeper session expiration occurs;
   * in most cases, listeners will be components that have watchers that need to be re-created.
   */
{code}
Intuitively, "we got disconnected, so we need to re-create our watchers, 
since the watchers we had previously belonged to the connection that was 
lost" seems plausible. Equally, "we registered watches with the zkclient, so 
wouldn't it be nice for the zkclient to take care of the watcher lifecycle 
across disconnects?" is not implausible. I need to go and check the ZK docs, 
but not today.
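
For reference, the listener pattern that the two Javadoc comments describe can be sketched roughly as below. This is only an illustration of the current behaviour under discussion, not Solr code: {{ReconnectListener}}, {{ReconnectRegistry}} and {{ReconnectingSchemaReader}} are hypothetical stand-ins for {{OnReconnect}}, the listener registry on {{ZkController}} and {{ZkIndexSchemaReader}}, and the counters replace the real ZooKeeper calls.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for Solr's OnReconnect callback. */
interface ReconnectListener {
    void command();
}

/** Hypothetical stand-in for the listener registry kept by ZkController. */
class ReconnectRegistry {
    private final List<ReconnectListener> listeners = new CopyOnWriteArrayList<>();

    void addOnReconnectListener(ReconnectListener listener) {
        listeners.add(listener);
    }

    /** Assumed to be invoked once a new session exists after an expiration. */
    void onNewSession() {
        for (ReconnectListener listener : listeners) {
            listener.command();
        }
    }
}

/** Hypothetical schema reader that re-creates its watcher on every new session. */
class ReconnectingSchemaReader implements ReconnectListener {
    int watchersCreated = 0;
    int schemaReads = 0;

    ReconnectingSchemaReader() {
        createWatcherAndReadSchema();
    }

    private void createWatcherAndReadSchema() {
        watchersCreated++; // a real implementation would register a ZooKeeper watch here
        schemaReads++;     // and re-read the schema from ZooKeeper here
    }

    @Override
    public void command() {
        createWatcherAndReadSchema();
    }
}
```

The open question above is whether the {{command()}} step is still needed, or whether the client already carries the watch over to the new session; the sketch takes no position on that.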


> Possible memory leak upon collection reload
> -------------------------------------------
>
>                 Key: SOLR-10506
>                 URL: https://issues.apache.org/jira/browse/SOLR-10506
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Server
>    Affects Versions: 6.5
>            Reporter: Torsten Bøgh Köster
>            Assignee: Christine Poerschke
>         Attachments: solr_collection_reload_13_cores.png, 
> solr_gc_path_via_zk_WatchManager.png
>
>
> Upon manual Solr Collection reloading, references to the closed {{SolrCore}} 
> are not fully removed by the garbage collector as a strong reference to the 
> {{ZkIndexSchemaReader}} is held in a ZooKeeper {{Watcher}} that watches for 
> schema changes.
> In our case, this leads to a massive memory leak as managed resources are 
> still referenced by the closed {{SolrCore}}. Our Solr cloud environment 
> utilizes rather large managed resources (synonyms, stopwords). To reproduce, 
> we fired up our environment and reloaded the collection 13 times. As a 
> result we fully exhausted our heap. A closer look with the YourKit profiler 
> revealed 13 {{SolrCore}} instances, each still strongly reachable from the 
> garbage collection root (see screenshot 1).
> Each {{SolrCore}} instance holds a single path with strong references to the 
> gc root via a {{Watcher}} in {{ZkIndexSchemaReader}} (see screenshot 2). The 
> {{ZkIndexSchemaReader}} registers a close hook in the {{SolrCore}} but the 
> ZooKeeper {{Watcher}} is not removed upon core close.
> We supplied a GitHub Pull Request 
> (https://github.com/apache/lucene-solr/pull/197) that extracts the ZooKeeper 
> {{Watcher}} as a static inner class. To eliminate the memory leak, the schema 
> reader is held inside a {{WeakReference}} and the reference is explicitly 
> removed on core close.
> Initially I wanted to supply a test case but unfortunately did not find a 
> good starting point ...
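
The approach summarised in the description above (a static nested watcher that holds the schema reader through a {{WeakReference}}, cleared on core close) might look roughly like this. It is a minimal sketch, not the code from the pull request: {{SimpleWatcher}} is a hypothetical stand-in for ZooKeeper's {{Watcher}} interface so that the example compiles without ZooKeeper, and {{SchemaReaderSketch}}, {{updateSchema()}} and {{discardReaderReference()}} are illustrative names.

```java
import java.lang.ref.WeakReference;

/** Hypothetical stand-in for org.apache.zookeeper.Watcher. */
interface SimpleWatcher {
    void process(String event);
}

/** Hypothetical stand-in for ZkIndexSchemaReader. */
class SchemaReaderSketch {
    int updateCount = 0;

    void updateSchema() {
        updateCount++; // a real reader would re-read the schema from ZooKeeper
    }

    /**
     * Static nested class: unlike an anonymous or inner class, it carries no
     * implicit strong reference to the enclosing reader instance.
     */
    static class SchemaWatcher implements SimpleWatcher {
        private final WeakReference<SchemaReaderSketch> readerRef;

        SchemaWatcher(SchemaReaderSketch reader) {
            this.readerRef = new WeakReference<>(reader);
        }

        /** Intended to be called from the core's close hook. */
        void discardReaderReference() {
            readerRef.clear();
        }

        @Override
        public void process(String event) {
            SchemaReaderSketch reader = readerRef.get();
            if (reader == null) {
                return; // reader was discarded or collected; nothing to update
            }
            reader.updateSchema();
        }
    }
}
```

Even if the watcher object itself stays registered for a while, it then no longer keeps the reader (and, through it, the closed core) strongly reachable.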



