[
https://issues.apache.org/jira/browse/SOLR-16203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris M. Hostetter updated SOLR-16203:
--------------------------------------
Attachment: SOLR-16203_test.patch
Status: Open (was: Open)
I'm attaching a trivial test patch to {{ResolveAnalyzerByNameTest.java}} that
demonstrates the problem
(the existing test only confirms the analyzer exists – this patch shows that
attempting to _use_ the analyzers will fail in exciting ways).
Here are some examples of what these failures can look like with synonym or
stop filters at index or query time...
{noformat}
> org.apache.solr.common.SolrException: Exception writing document id (null) to the index; possible analysis error.
>     at __randomizedtesting.SeedInfo.seed([444AABB6B9EF0110:6E75A04D77A5D7B3]:0)
>     at app//org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:326)
>     at app//org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:71)
...
> Caused by:
> java.lang.NullPointerException
>     at org.apache.lucene.analysis.StopFilter.accept(StopFilter.java:97)
>     at org.apache.lucene.analysis.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:52)
>     at org.apache.lucene.analysis.miscellaneous.WordDelimiterGraphFilter.incrementToken(WordDelimiterGraphFilter.java:349)
...
{noformat}
{noformat}
> java.lang.NullPointerException
>     at org.apache.lucene.analysis.synonym.SynonymGraphFilterFactory.create(SynonymGraphFilterFactory.java:130)
>     at org.apache.solr.analysis.TokenizerChain.createComponents(TokenizerChain.java:132)
>     at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:120)
>     at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:193)
>     at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:256)
>     at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:527)
>     at org.apache.solr.parser.QueryParser.newFieldQuery(QueryParser.java:66)
>     at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:1121)
>     at org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:855)
>     at org.apache.solr.parser.QueryParser.Term(QueryParser.java:452)
>     at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:291)
>     at org.apache.solr.parser.QueryParser.Query(QueryParser.java:171)
>     at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:141)
>     at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:274)
>     at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:51)
>     at org.apache.solr.search.QParser.getQuery(QParser.java:188)
>     at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:168)
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:384)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:207)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2866)
{noformat}
...other factories may fail in other weird ways.
> Using SPI lookups of analysis components results in factories that don't get
> ResourceLoaderAware.inform called on them (When using
> ClassicIndexSchemaFactory)
> -------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-16203
> URL: https://issues.apache.org/jira/browse/SOLR-16203
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 9.0
> Reporter: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-16203_test.patch
>
>
> Ok ... this is a weird one ...
> SOLR-13593 modified {{FieldTypePluginLoader}} to add the ability for analysis
> factories to be resolved by SPI name (ex: {{name="stop"}}) instead of by
> "short class name" (ex: {{class="solr.StopFilterFactory"}}).
> The code that deals with this just calls (for example)
> {{TokenFilterFactory.forName(name, params);}} and uses the resulting factory
> "as is" – but this means that nothing calls {{inform(ResourceLoader)}}
> on any factories that implement {{ResourceLoaderAware}} (such as
> {{StopFilterFactory}} or {{SynonymGraphFilterFactory}}).
> In the "short class name" code path (the only option prior to 9.0), the
> {{SolrResourceLoader}} is used to initialize the factory, and
> {{SolrResourceLoader.newInstance(...)}} takes responsibility for calling
> {{factory.inform(this)}} on everything it instantiates that implements
> {{ResourceLoaderAware}} (as well as some other checks, like
> {{SolrCoreAware}}).
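
To make the difference between the two code paths concrete, here is a minimal, self-contained sketch. It does not use the real Lucene or Solr classes: {{LoaderAware}}, {{StopFactory}}, {{forName}} and {{newInstance}} are simplified stand-ins invented for illustration, and the loader is reduced to a plain string. It only models the pattern described above, where a by-name lookup returns the factory untouched while the loader-style path calls {{inform(...)}} on anything that is loader-aware.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Simplified stand-ins only; these are NOT the real Lucene/Solr types.
final class SpiLookupSketch {

    /** Stand-in for a ResourceLoaderAware-style callback interface. */
    interface LoaderAware {
        void inform(String loaderName);
    }

    /** Stand-in for a factory that needs inform(...) before it is usable. */
    static final class StopFactory implements LoaderAware {
        boolean informed = false;

        @Override
        public void inform(String loaderName) {
            informed = true; // a real factory would load its resources here
        }
    }

    /** Hypothetical SPI-style registry mapping a short name to a constructor. */
    private static final Map<String, Supplier<StopFactory>> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("stop", StopFactory::new);
    }

    /** Models the name="..." path: the factory is returned "as is". */
    static StopFactory forName(String name) {
        return REGISTRY.get(name).get();
    }

    /** Models the class="..." path: the loader informs what it instantiates. */
    static <T> T newInstance(Supplier<T> constructor, String loaderName) {
        T instance = constructor.get();
        if (instance instanceof LoaderAware) {
            ((LoaderAware) instance).inform(loaderName);
        }
        return instance;
    }

    public static void main(String[] args) {
        StopFactory byName = forName("stop");
        StopFactory byLoader = newInstance(StopFactory::new, "loader");
        System.out.println("by name:   informed=" + byName.informed);
        System.out.println("by loader: informed=" + byLoader.informed);
    }
}
```

In this model the only difference between the two factories is whether anyone called {{inform(...)}}, which is the discrepancy the description attributes to the SPI name path.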
> This discrepancy means that when using the {{name="foo"}} syntax, many
> factories won't be fully initialized – or fail on invalid input – during
> schema initialization. In the case of things like {{StopFilterFactory}} or
> {{SynonymGraphFilterFactory}}, the problem will manifest as some type of
> runtime error when the factory's {{create(...)}} method is called as part of
> creating a new index or query Analyzer.
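
The deferred nature of the failure can be sketched the same way. The class below is again a hypothetical stand-in, not {{StopFilterFactory}} itself, and it assumes (without having checked the Lucene source) that the state left unset is something like a word set that only {{inform(...)}} populates. Constructing the factory succeeds either way; the {{NullPointerException}} only appears when the factory is actually used.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in illustrating a failure that is deferred until first use.
final class DeferredFailureSketch {

    static final class StopFactory {
        // Assumption for illustration: this stays null until inform(...) runs.
        private Set<String> stopWords;

        void inform(Set<String> loadedWords) {
            stopWords = loadedWords;
        }

        /** Returns true if the token survives the stop filter. */
        boolean accept(String token) {
            // Throws NullPointerException if inform(...) was never called.
            return !stopWords.contains(token);
        }
    }

    /** Reports whether using the factory blows up with an NPE. */
    static boolean failsAtUse(StopFactory factory) {
        try {
            factory.accept("the");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StopFactory skipped = new StopFactory(); // "schema init" succeeds silently
        StopFactory informed = new StopFactory();
        informed.inform(Collections.singleton("the"));

        System.out.println("inform skipped, fails at use: " + failsAtUse(skipped));
        System.out.println("informed, fails at use:       " + failsAtUse(informed));
    }
}
```

Nothing in this sketch complains at construction time, which mirrors why the problem would not surface during schema initialization and instead shows up as a runtime error at index or query time.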
> ----
> _*...BUT...*_
> ----
> This problem only seems to manifest itself when using
> {{ClassicIndexSchemaFactory}} -- which is why it's not readily apparent when
> using the default configset, or something like {{bin/solr -e techproducts}}
> (but it is trivial to reproduce in test cases, since almost every "test"
> solrconfig uses {{ClassicIndexSchemaFactory}}).
> I have no idea _why_ using {{ManagedIndexSchemaFactory}} doesn't manifest the
> same problem -- it should be using the same {{FieldTypePluginLoader}} under
> the covers -- but clearly something specific to {{ManagedIndexSchema}} is
> taking responsibility for calling {{ResourceLoaderAware.inform(...)}}.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)