[
https://issues.apache.org/jira/browse/SOLR-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996583#comment-15996583
]
ASF GitHub Bot commented on SOLR-10550:
---------------------------------------
GitHub user tboeghk opened a pull request:
https://github.com/apache/lucene-solr/pull/198
[SOLR-10550] Improve FileFloatSource eviction // reduce FileFloatSource
memory footprint
As a follow-up to `SOLR-10506`, we found another possible memory leak in
Solr. The values generated from an `ExternalFileField` are cached in a static
cache inside `FileFloatSource`. That cache holds both an `IndexReader` and
the `FileFloatSource`s loaded using that `IndexReader`.
Cache eviction is left to the internally used `WeakHashMap`, or a full
eviction can be triggered via URL. We are dealing with large synonym files and
word lists stored in managed resources. Those are tied to the `SolrCore` as
described in `SOLR-10506`. We're also using `ExternalFileField`s whose
`FileFloatSource`s are cached in said static cache. Each `FileFloatSource`
holds a strong (transitive) reference to the `SolrCore` it was created for.
After a couple of collection reloads, the eviction mechanism of the
`WeakHashMap` only kicks in when the heap is close to exhaustion. The patch
attached adds a mechanism to evict cache entries created in the context of a
`SolrCore` upon its close, using a close hook in the
`ExternalFileFieldReloader`. It furthermore adds a static cache reset method
for all entries bound to a given `IndexReader`. I'm not sure whether the added
cache resets are too aggressive or executed too often; I'd like to leave that
to the experts.
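To illustrate the idea described above, here is a minimal, self-contained
sketch. It is not the code from the patch and does not use any Solr classes:
`ReaderScopedCache`, `DemoCore`, `resetFor` and `addCloseHook` are hypothetical
names, and a plain `Object` stands in for the `IndexReader`. It only shows the
general pattern of a static, weakly keyed cache that is evicted explicitly from
a close hook instead of waiting for the garbage collector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical stand-in for a static cache keyed by reader (not Solr code). */
final class ReaderScopedCache {
    // Outer map has weak keys (the reader); inner map is field name -> values.
    private static final Map<Object, Map<String, float[]>> CACHE = new WeakHashMap<>();

    private ReaderScopedCache() {}

    static synchronized float[] get(Object reader, String field) {
        Map<String, float[]> perReader = CACHE.get(reader);
        return perReader == null ? null : perReader.get(field);
    }

    static synchronized void put(Object reader, String field, float[] values) {
        CACHE.computeIfAbsent(reader, r -> new HashMap<>()).put(field, values);
    }

    /** Drops every entry bound to the given reader. */
    static synchronized void resetFor(Object reader) {
        CACHE.remove(reader);
    }

    /** Drops all entries (the "full eviction" case). */
    static synchronized void resetAll() {
        CACHE.clear();
    }

    static synchronized int readerCount() {
        return CACHE.size();
    }
}

/** Hypothetical core that runs registered hooks when it is closed. */
final class DemoCore implements AutoCloseable {
    private final List<Runnable> closeHooks = new ArrayList<>();
    final Object reader = new Object(); // stand-in for the core's reader

    void addCloseHook(Runnable hook) {
        closeHooks.add(hook);
    }

    @Override
    public void close() {
        closeHooks.forEach(Runnable::run);
    }
}

final class EvictionDemo {
    public static void main(String[] args) {
        DemoCore core = new DemoCore();
        ReaderScopedCache.put(core.reader, "popularity", new float[] {1.0f, 2.5f});
        // Evict this core's entries when the core closes, not when the GC gets to it.
        core.addCloseHook(() -> ReaderScopedCache.resetFor(core.reader));

        System.out.println("cached before close: "
            + (ReaderScopedCache.get(core.reader, "popularity") != null));
        core.close();
        System.out.println("cached after close: "
            + (ReaderScopedCache.get(core.reader, "popularity") != null));
    }
}
```

The point of the explicit `resetFor` call is that a `WeakHashMap` only weakens
its keys; the values stay strongly referenced by the map until the entry is
removed, so anything large reachable from a value stays on the heap until then.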
N.B.: I did this second PR for the same issue to separate code changes for
both SOLR-10506 and SOLR-10550 which I maintained on the same fork branch :-/
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shopping24/lucene-solr branch_6_5__SOLR-10550
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/lucene-solr/pull/198.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #198
----
commit 024e950f840b8466272014e4594e198560760c79
Author: Torsten Bøgh Köster <[email protected]>
Date: 2017-04-21T12:51:21Z
clear cached field sources on core close
----
> Improve FileFloatSource eviction // reduce FileFloatSource memory footprint
> ---------------------------------------------------------------------------
>
> Key: SOLR-10550
> URL: https://issues.apache.org/jira/browse/SOLR-10550
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Server
> Affects Versions: 6.5
> Reporter: Torsten Bøgh Köster
> Attachments: solr_filefloatsource.patch
>
>
> As a follow-up to {{SOLR-10506}}, we found another possible memory leak in
> Solr. The values generated from an {{ExternalFileField}} are cached in a
> static cache inside {{FileFloatSource}}. That cache holds both an
> {{IndexReader}} and the {{FileFloatSource}}s loaded using that
> {{IndexReader}}.
> Cache eviction is left to the internally used {{WeakHashMap}}, or a full
> eviction can be triggered via URL. We are dealing with large synonym files
> and word lists stored in managed resources. Those are tied to the
> {{SolrCore}} as described in {{SOLR-10506}}. We're also using
> {{ExternalFileField}}s whose {{FileFloatSource}}s are cached in said static
> cache. Each {{FileFloatSource}} holds a strong (transitive) reference to the
> {{SolrCore}} it was created for.
> After a couple of collection reloads, the eviction mechanism of the
> {{WeakHashMap}} only kicks in when the heap is close to exhaustion. The patch
> attached adds a mechanism to evict cache entries created in the context of a
> {{SolrCore}} upon its close, using a close hook in the
> {{ExternalFileFieldReloader}}. It furthermore adds a static cache reset
> method for all entries bound to a given {{IndexReader}}. I'm not sure whether
> the added cache resets are too aggressive or executed too often; I'd like to
> leave that to the experts ;-)