[
https://issues.apache.org/jira/browse/SOLR-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682471#comment-13682471
]
Hoss Man commented on SOLR-4872:
--------------------------------
bq. You don't know what you are talking about.
I know enough to know that I don't understand TokenStream reuse well enough to
have an opinion about your opinions/concerns relating to Tokenizer.close() --
which is why I didn't want to talk about that and suggested it was a distinct
issue that should be discussed separately.
I also know enough to know that for a lot of users, caching some data
"once per-thread per-field and reused across multiple readers" isn't what they
want -- if it was, then classes like StopFilterFactory wouldn't need to bother
keeping track of "CharArraySet stopWords" from a parsed file and passing it to
each instance the factory creates; they could just let the filter parse the
file in its constructor.
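To illustrate the pattern described above, here is a minimal sketch of a factory that parses its word list once and hands the same set to every instance it creates. It deliberately avoids the real Lucene classes: SketchStopFilter and SketchStopFilterFactory are hypothetical stand-ins, and a plain Set<String> takes the place of CharArraySet.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a stop filter; this is not the Lucene StopFilter API.
class SketchStopFilter {
    private final Set<String> stopWords;

    SketchStopFilter(Set<String> stopWords) {
        this.stopWords = stopWords; // shared reference; nothing is parsed here
    }

    // Returns true when the token should be kept.
    boolean accept(String token) {
        return !stopWords.contains(token.toLowerCase(Locale.ROOT));
    }

    Set<String> stopWords() {
        return stopWords;
    }
}

// Hypothetical stand-in for a factory that owns the parsed data.
class SketchStopFilterFactory {
    private final Set<String> stopWords;
    private int parseCount = 0;

    SketchStopFilterFactory(String fileContents) {
        this.stopWords = parse(fileContents); // parsed once, at factory creation
    }

    private Set<String> parse(String fileContents) {
        parseCount++;
        Set<String> words = new HashSet<>();
        for (String line : fileContents.split("\n")) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return Collections.unmodifiableSet(words);
    }

    // Every filter gets the same already-parsed set.
    SketchStopFilter create() {
        return new SketchStopFilter(stopWords);
    }

    int parseCount() {
        return parseCount;
    }
}
```

However many filters the factory creates, the file contents are parsed only once; the alternative of parsing in the filter's constructor would repeat that work per instance.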
Some users _may_ want to use/cache data/services in the _factory_ and re-use
those in all instances produced by that factory, even if those instances are
for different fields (or in different threads) ... that was the point behind my
suggestion:
* fix Solr to make sure we're closing Analyzers properly
* make analysis factories (optionally?) implement Closeable so they can clean
up resources
* make Solr's TokenizerChain (analyzer) call close() on the analysis factories
it wraps
That way, if someone like Benson wants to create something similar to
SynonymFilter but backed by some RemoteSynonymNetworkService, he can
instantiate a single RemoteSynonymNetworkService in his
BensonsSynonymFilterFactory, reuse that single instance across all fields (and
all threads) that use the same BensonsSynonymFilterFactory instance (i.e., all
Solr fields and dynamic fields that reuse the same <fieldType/>), and tear down
his (single) RemoteSynonymNetworkService when
BensonsSynonymFilterFactory.close() is called.
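A rough sketch of what that lifecycle could look like follows. It is only an illustration under stated assumptions: none of these classes are the real Lucene or Solr types, SketchTokenizerChain is a hypothetical stand-in for TokenizerChain, and the "remote" service is faked with a local lowercase call.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Locale;

// Hypothetical shared resource; a real one would hold connections, caches, etc.
class RemoteSynonymNetworkService implements Closeable {
    private volatile boolean open = true;

    String canonicalize(String term) {
        if (!open) {
            throw new IllegalStateException("service already closed");
        }
        return term.toLowerCase(Locale.ROOT); // placeholder for a remote lookup
    }

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }
}

// Hypothetical filter: borrows the service, never owns or closes it.
class BensonsSynonymFilter {
    private final RemoteSynonymNetworkService service;

    BensonsSynonymFilter(RemoteSynonymNetworkService service) {
        this.service = service;
    }

    String filter(String token) {
        return service.canonicalize(token);
    }
}

// Hypothetical factory: owns the single service and tears it down in close().
class BensonsSynonymFilterFactory implements Closeable {
    private final RemoteSynonymNetworkService service = new RemoteSynonymNetworkService();

    BensonsSynonymFilter create() {
        return new BensonsSynonymFilter(service);
    }

    RemoteSynonymNetworkService service() {
        return service;
    }

    @Override
    public void close() {
        service.close();
    }
}

// Hypothetical analyzer chain: on close, closes every factory that opted in.
class SketchTokenizerChain implements Closeable {
    private final List<Object> factories;

    SketchTokenizerChain(List<Object> factories) {
        this.factories = factories;
    }

    @Override
    public void close() {
        for (Object factory : factories) {
            if (factory instanceof Closeable) {
                try {
                    ((Closeable) factory).close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }
}
```

The instanceof check is one way to keep Closeable optional, so existing factories that hold no resources would not need to change.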
> Allow schema analysis object factories to be cleaned up properly when the
> core shuts down
> -----------------------------------------------------------------------------------------
>
> Key: SOLR-4872
> URL: https://issues.apache.org/jira/browse/SOLR-4872
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 4.3
> Reporter: Benson Margulies
> Attachments: solr-4872.patch, solr-4872.patch
>
>
> I have a need, in a TokenizerFactory or TokenFilterFactory, to have a shared
> cache that is cleaned up when the core is torn down.
> There is no 'close' protocol on these things, and Solr rejects analysis
> components that are SolrCoreAware.
> Possible solutions:
> # add a close protocol to these factories and make sure it gets called at
> core shutdown.
> # allow these items to be 'core-aware'.
> # invent some notion of 'schema-lifecycle-aware'.