[
https://issues.apache.org/jira/browse/SOLR-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011549#comment-15011549
]
Hoss Man commented on SOLR-8311:
--------------------------------
The motivation for filing this issue was SOLR-8280, where I realized that even
though {{SimilarityFactory}} was an allowed {{SolrCoreAware}} API, there were
situations when dealing with managed schema that would result in a new
{{SimilarityFactory}} getting inited at run time without ever having
{{inform(SolrCore)}} called...
{quote}
The root problem seems to be that when using the SolrResourceLoader to create
newInstances of objects, the loader is tracking what things are SolrCoreAware,
ResourceLoaderAware, and/or SolrInfoMBean. Then, just before the SolrCore
finishes initializing itself, it calls a method on SolrResourceLoader to take
appropriate action to inform those instances (and/or add them to the MBean
registry).
The problem happens when any new instances are created by the
SolrResourceLoader _after_ the SolrCore is up and running -- it currently has a
{{live}} boolean it uses to just flat out ignore whether or not these instances
are SolrCoreAware, ResourceLoaderAware, and/or SolrInfoMBean, meaning that
nothing in the call stack ever informs them about the SolrCore.
It looks like SOLR-4658 included a bit of a hack workaround for the
ResourceLoaderAware schema elements (see IndexSchema's constructor, which calls
{{loader.inform(loader);}})...
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/IndexSchema.java?r1=1463182&r2=1463181&pathrev=1463182
...this seems really sketchy because it causes *any* ResourceLoaderAware plugins
inited so far by the core to be {{inform(ResourceLoader)}}ed once the first
IndexSchema is created -- even though that's not supposed to happen until much
later in the SolrCore constructor, just before the CountDownLatch is released.
What it does do, however, is ensure that when a new schema gets loaded later (by
the REST API, or a schemaless update processor), any ResourceLoaderAware
fieldtypes/analyzers are good to go -- but that doesn't do anything to help
SolrCoreAware plugins like SimilarityFactory.
{quote}
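To make the {{live}} flag problem concrete, here is a minimal sketch of the behavior described in the quote above. It is not the real SolrResourceLoader code: {{CoreAware}}, {{TrackingLoader}}, {{RecordingPlugin}} and the String standing in for the SolrCore are all hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for SolrCoreAware; the "core" is just a name here.
interface CoreAware {
    void inform(String coreName);
}

// Hypothetical loader that mimics the tracking behavior described above.
class TrackingLoader {
    private final List<CoreAware> waitingForCore = new ArrayList<>();
    private boolean live = false;

    <T> T newInstance(Supplier<T> constructor) {
        T obj = constructor.get();
        // Before the core is live, remember aware instances for a later inform().
        if (!live && obj instanceof CoreAware) {
            waitingForCore.add((CoreAware) obj);
        }
        // Once live, the instance is returned untracked: nothing will inform it.
        return obj;
    }

    // Called once, just before the core finishes initializing.
    void informAll(String coreName) {
        for (CoreAware aware : waitingForCore) {
            aware.inform(coreName);
        }
        waitingForCore.clear();
        live = true;
    }
}

// Hypothetical plugin that records whether it was ever informed.
class RecordingPlugin implements CoreAware {
    String informedWith = null;

    @Override
    public void inform(String coreName) {
        informedWith = coreName;
    }
}

class LiveFlagDemo {
    public static void main(String[] args) {
        TrackingLoader loader = new TrackingLoader();
        RecordingPlugin early = loader.newInstance(RecordingPlugin::new);
        loader.informAll("core1");
        RecordingPlugin late = loader.newInstance(RecordingPlugin::new);
        System.out.println("early informed with: " + early.informedWith);
        System.out.println("late informed with: " + late.informedWith);
    }
}
```

In this sketch the instance created before {{informAll}} is informed, while the one created afterwards never is, which is the situation a runtime schema change puts a new SimilarityFactory in.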
This issue also led to the discovery that {{SimilarityFactory.inform(SolrCore)}}
already had special handling because even on startup it _had_ to be called
before any other {{SolrCoreAware}} impls might be informed by
{{SolrResourceLoader}}, in case they tried to access the core's searcher (which
depends on the Similarity)...
{quote}
* There was already a special kludge for SolrCoreAware SimFactories in
SolrCore.initSchema
** looks like this was originally for ensuring that the SimFactories were usable
when other SolrCoreAware things (like listeners) got informed of the SolrCore
and tried to use the SolrIndexSearcher (which depended on the sim)
So I think the most straightforward solution to the problem
(SimilarityFactory-ies that implement SolrCoreAware playing nice with managed
schema) is to refactor that existing kludge from SolrCore.initSchema to
SolrCore.setLatestSchema
{quote}
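A rough sketch of the idea in that quote, informing the similarity factory every time a schema is installed rather than only at startup, might look like the following. This is an illustration under assumptions, not the actual SolrCore code: {{SketchCore}}, {{SketchSchema}}, {{CoreAwareFactory}} and {{CountingFactory}} are hypothetical names.

```java
// Hypothetical stand-in for a SolrCoreAware SimilarityFactory.
interface CoreAwareFactory {
    void inform(SketchCore core);
}

// Hypothetical schema that only carries its similarity factory.
class SketchSchema {
    final Object similarityFactory;

    SketchSchema(Object similarityFactory) {
        this.similarityFactory = similarityFactory;
    }
}

// Hypothetical core: the inform step lives in setLatestSchema, so it runs
// for the startup schema and for any schema swapped in at run time.
class SketchCore {
    private SketchSchema schema;

    void setLatestSchema(SketchSchema replacement) {
        Object factory = replacement.similarityFactory;
        // Inform before the schema becomes visible to anything else.
        if (factory instanceof CoreAwareFactory) {
            ((CoreAwareFactory) factory).inform(this);
        }
        this.schema = replacement;
    }

    SketchSchema getLatestSchema() {
        return schema;
    }
}

// Hypothetical factory that counts how often it was informed.
class CountingFactory implements CoreAwareFactory {
    int informs = 0;

    @Override
    public void inform(SketchCore core) {
        informs++;
    }
}

class SetLatestSchemaDemo {
    public static void main(String[] args) {
        SketchCore core = new SketchCore();
        CountingFactory atStartup = new CountingFactory();
        CountingFactory atRuntime = new CountingFactory();
        core.setLatestSchema(new SketchSchema(atStartup));
        core.setLatestSchema(new SketchSchema(atRuntime));
        System.out.println(atStartup.informs + " " + atRuntime.informs);
    }
}
```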
SOLR-8280 also contained some discussion about the problem of trying to make a
general fix for this in SolrResourceLoader...
HOSS:
{quote}
I'm attaching a work in progress patch where I attempted to fix the underlying
problem with SolrResourceLoader by having it keep a reference to the SolrCore
it's tied to, such that any new instances created after that would be
immediately informed of the SolrCore/ResourceLoader. This fixes some of the
tests I mentioned before in this issue that have problems with
SchemaSimilarityFactory but causes other failures in other existing tests that
reload the schema – because any FieldType that is ResourceLoader aware is now
being "informed" of the loader as soon as it's instantiated – before even basic
init() methods are called. Which makes sense in hindsight – my whole approach
here is flawed because the contract is supposed to be that the init methods will
always be called first, and any (valid) inform methods will be called at some
point after that once the core/loader is available, but before the instance is
used ... calling "new" then "inform" then "init" is madness.
I honestly don't know if there is a sane way to solve this problem in the
general case...
{quote}
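The ordering contract described in that quote (init first, inform later, use last) can be illustrated with a small sketch. {{StopwordsPlugin}} and its map-based arguments are hypothetical and are not a real Solr class; the point is only that {{inform}} typically depends on state that {{init}} sets.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plugin: init() records which resource to load, inform() loads it.
class StopwordsPlugin {
    private String fileName;   // set by init()
    private String contents;   // set by inform()

    void init(Map<String, String> args) {
        fileName = args.get("words");
    }

    // The map stands in for a resource loader: resource name -> contents.
    void inform(Map<String, String> resources) {
        if (fileName == null) {
            // "new" then "inform" then "init" ends up here.
            throw new IllegalStateException("inform() called before init()");
        }
        contents = resources.get(fileName);
    }

    String contents() {
        return contents;
    }
}

class InitThenInformDemo {
    public static void main(String[] unused) {
        Map<String, String> args = new HashMap<>();
        args.put("words", "stop.txt");
        Map<String, String> resources = new HashMap<>();
        resources.put("stop.txt", "a,an,the");

        StopwordsPlugin plugin = new StopwordsPlugin();
        plugin.init(args);          // 1. configure
        plugin.inform(resources);   // 2. hand over the loader
        System.out.println(plugin.contents());  // 3. use
    }
}
```

Informing a freshly constructed instance before {{init}} has run fails in this sketch, which mirrors the test failures described for the work in progress patch.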
Alan:
{quote}
I have a half-implemented patch hanging around somewhere that tried to clean
this up a bit. I think the root problem is that there are two circumstances in
which we're using SolrResourceLoader, a) during core initialization when we
need to call init() immediately, but wait to call inform() until after the
loading latch has been released, and then b) to create new objects once the
core is up and serving queries. I tried to split this out into two separate SRL
implementations, one of which is private to SolrCore and used only in the
constructor, and does the call-init-and-then-delay-inform dance, and the other
of which is returned by SolrCore.getResourceLoader() and inits() and informs()
before it returns. To be honest though, I get so confused by the code paths
here that I'm not sure whether or not that would help in this case...
{quote}
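The split Alan describes could be sketched roughly as two loaders with different timing for {{inform}}. This is only an illustration of the idea, not his patch: {{Plugin}}, {{StartupLoader}}, {{LiveLoader}} and {{LoggingPlugin}} are hypothetical names, and a String stands in for the SolrCore.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plugin contract: init() first, inform() some time later.
interface Plugin {
    void init();
    void inform(String coreName);
}

// Case (a): used only while the core is being constructed.
// init() runs immediately; inform() waits until the latch is released.
class StartupLoader {
    private final List<Plugin> pending = new ArrayList<>();

    <T extends Plugin> T create(T plugin) {
        plugin.init();
        pending.add(plugin);
        return plugin;
    }

    void releaseLatch(String coreName) {
        for (Plugin plugin : pending) {
            plugin.inform(coreName);
        }
        pending.clear();
    }
}

// Case (b): handed out once the core is live.
// Both init() and inform() run before the instance is returned.
class LiveLoader {
    private final String coreName;

    LiveLoader(String coreName) {
        this.coreName = coreName;
    }

    <T extends Plugin> T create(T plugin) {
        plugin.init();
        plugin.inform(coreName);
        return plugin;
    }
}

// Hypothetical plugin that logs the order of lifecycle calls.
class LoggingPlugin implements Plugin {
    final List<String> calls = new ArrayList<>();

    @Override
    public void init() {
        calls.add("init");
    }

    @Override
    public void inform(String coreName) {
        calls.add("inform:" + coreName);
    }
}

class TwoLoadersDemo {
    public static void main(String[] args) {
        StartupLoader startup = new StartupLoader();
        LoggingPlugin early = startup.create(new LoggingPlugin());
        startup.releaseLatch("core1");
        LoggingPlugin late = new LiveLoader("core1").create(new LoggingPlugin());
        System.out.println(early.calls + " " + late.calls);
    }
}
```

Either way each instance sees init before inform; the two loaders differ only in when the inform call happens.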
> SolrCoreAware and ResourceLoaderAware lifecycle is fragile - particularly with
> objects that can be created after SolrCore is live
> --------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-8311
> URL: https://issues.apache.org/jira/browse/SOLR-8311
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
>
> In general, the situation of when/how {{ResourceLoaderAware}} &
> {{SolrCoreAware}} instances are "informed" of the ResourceLoader & SolrCore
> is very kludgy and involves a lot of special cases.
> For objects initialized _before_ the SolrCore goes "live",
> {{SolrResourceLoader}} tracks these instances internally, and calls
> {{inform()}} on all of them -- but for instances created _after_ the SolrCore
> is live (ex: schema pieces created via runtime REST calls),
> {{SolrResourceLoader}} does nothing to ensure they are later informed (and
> typically can't because that must happen after whatever type specific 'init'
> logic takes place). So there is a lot of special case handling to call
> {{inform}} methods sprinkled throughout the code.
> This issue serves as a reference point to track/link various comments on the
> situation, and to cite in comments warning developers about how finicky it is
> to muck with the list of SolrCoreAware & ResourceLoaderAware allowed
> implementations.