Hi,

It is indeed a null stopwords "CharArraySet" being passed in to the StopFilter constructor:
https://github.com/apache/lucene/blob/branch_9_8/lucene/core/src/java/org/apache/lucene/analysis/StopFilter.java#L39
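
To illustrate the failure mode without pulling in Lucene, here is a minimal sketch. It assumes, judging from the "NullPointerException: stopWords" message and the Objects.requireNonNull frame in the trace, that the constructor guards its argument with a named null check. DemoStopFilter and tryCreate are hypothetical stand-ins, not Lucene classes.

```java
import java.util.Objects;
import java.util.Set;

public class NullStopWordsDemo {

    // Hypothetical stand-in for a filter that requires a non-null stop set.
    static final class DemoStopFilter {
        private final Set<String> stopWords;

        DemoStopFilter(Set<String> stopWords) {
            // The second argument becomes the exception message.
            this.stopWords = Objects.requireNonNull(stopWords, "stopWords");
        }

        boolean isStopWord(String token) {
            return stopWords.contains(token);
        }
    }

    // Returns "ok" if construction succeeds, otherwise a description of the NPE.
    public static String tryCreate(Set<String> stopWords) {
        try {
            new DemoStopFilter(stopWords);
            return "ok";
        } catch (NullPointerException e) {
            return "NullPointerException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCreate(Set.of())); // empty set: prints "ok"
        System.out.println(tryCreate(null));     // null: prints "NullPointerException: stopWords"
    }
}
```

In this sketch an empty set is accepted and only a null reference fails, which suggests the word list was never assigned rather than loaded with zero entries.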

If you flip back to class="solr.StopFilterFactory", does it work then?
Does your stopwords.txt file contain any stopwords or is the file empty?
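
A quick way to answer the second question is to count the effective entries in the file. The sketch below assumes the usual word-list convention that blank lines and lines starting with "#" are ignored; the class name and the default path are hypothetical, so pass the real path as the first argument.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class StopwordsFileCheck {

    // Counts lines that are neither blank nor "#" comments (assumed format).
    public static long countEntries(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                .count();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default location; adjust to the core's conf directory.
        Path file = Path.of(args.length > 0 ? args[0] : "conf/stopwords.txt");
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        System.out.println(file + ": " + countEntries(lines) + " stopword entries");
    }
}
```

A count of zero would mean the file exists but contributes no stopwords.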

There may be a subtle difference in classloader / resourceLoader behavior when 
the filter is loaded with "class" vs "name". If so, this is likely a bug to be investigated.

Jan

> On 29 Dec 2023, at 09:23, Danilo Tomasoni <tomas...@cosbi.eu> wrote:
> 
> Hello all again,
> I have a problem indexing new documents in my upgraded Solr version (from 
> 8.11 to 9.4).
> I changed the solrconfig.xml to adhere to the recent syntax:
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
> was changed to
>         <filter name="stop" ignoreCase="true" words="stopwords.txt" />
> 
> The core loads correctly, but when I try to index a document I see an error 
> in the logs
> 
> 2023-12-29 07:46:38.191 ERROR (qtp2035381640-19) [ x:COSBIBioIndexTest t:localhost-41] o.a.s.h.RequestHandlerBase Client exception => org.apache.solr.common.SolrException: Exception writing document id PUBMEDPMC8101124 to the index; possible analysis error.
>         at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:335)
> ...
> Caused by: java.lang.NullPointerException: stopWords
>         at java.util.Objects.requireNonNull(Objects.java:246) ~[?:?]
>         at org.apache.lucene.analysis.StopFilter.<init>(StopFilter.java:39) ~[?:?]
>         at org.apache.lucene.analysis.core.StopFilter.<init>(StopFilter.java:43) ~[?:?]
>         at org.apache.lucene.analysis.core.StopFilterFactory.create(StopFilterFactory.java:91) ~[?:?]
>         at org.apache.solr.analysis.TokenizerChain.createComponents(TokenizerChain.java:132) ~[?:?]
>         at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:120) ~[?:?]
>         at org.apache.lucene.analysis.AnalyzerWrapper.createComponents(AnalyzerWrapper.java:120) ~[?:?]
>         at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:193) ~[?:?]
>         at org.apache.lucene.document.Field.tokenStream(Field.java:491) ~[?:?]
>         at org.apache.lucene.index.IndexingChain$PerField.invertTokenStream(IndexingChain.java:1162) ~[?:?]
>         at org.apache.lucene.index.IndexingChain$PerField.invert(IndexingChain.java:1146) ~[?:?]
>         at org.apache.lucene.index.IndexingChain.processField(IndexingChain.java:697) ~[?:?]
>         at org.apache.lucene.index.IndexingChain.processDocument(IndexingChain.java:576) ~[?:?]
>         at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:242) ~[?:?]
>         at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:432) ~[?:?]
>         at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1545) ~[?:?]
>         at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1521) ~[?:?]
>         at org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1062) ~[?:?]
>         at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) ~[?:?]
>         at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:374) ~[?:?]
>         at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:311) ~[?:?]
> 
> 
> The file 'stopwords.txt' is present at core/conf/stopwords.txt.
> 
> What is the issue here?
> Thank you for your help and patience
> D
> 
> Danilo Tomasoni
> Data Scientist & Software Engineer
> +39 0464 808845
> tomas...@cosbi.eu <mailto:tomas...@cosbi.eu>
> www.cosbi.eu
