Re: Solr SynonymGraphFilterFactory use analyzer defined in schema

Markus Jelsma Wed, 22 Dec 2021 15:23:44 -0800

Oh, about those 'strange stemmer-induced irregularities' I mentioned:
these can be mitigated by using the KeywordRepeatFilter in both the
index and query analysis chains.


For example, some languages' stemmers treat the -s suffix as a plural
ending and remove it. This is bad news for a "pils,bier" synonym set.
In Dutch, pilsen becomes pils, but singular pils can become pil, and a
pil (a pill) is not a beer.

The KeywordRepeatFilter is your friend here, but it will increase the
size of your index a lot, because the unstemmed form of each token is
kept next to the stemmed one.

The repeat filter also helps with finding terms that, without it,
become indistinguishable from unrelated terms that the stemmer has
reduced to the same form. It is highly recommended!
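
As a rough illustration, a field type that keeps both forms could look
something like the sketch below. The field type name and the choice of
the Snowball stemmer for Dutch are assumptions made for the example,
and the synonym filter is left out on purpose; adapt it to your own
chain.

```xml
<!-- Sketch only: the field type name and the stemmer are assumptions. -->
<fieldType name="text_nl_repeat" class="solr.TextField" positionIncrementGap="100">
  <!-- An <analyzer> without a type attribute is used for both index and query. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Emits every token twice; one copy is marked as a keyword. -->
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <!-- Stems only the copy that is not marked as a keyword. -->
    <filter class="solr.SnowballPorterFilterFactory" language="Dutch"/>
    <!-- Drops the extra copy when stemming left the token unchanged. -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
```

With a chain like this, the original "pils" stays in the index next to
whatever the stemmer makes of it, so a query for pils can still match
the unstemmed form. The RemoveDuplicatesTokenFilterFactory at the end
limits the index growth a little, since tokens the stemmer does not
change are stored only once.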

Regards,
Markus



2021-12-23 0:13 GMT+01:00, Markus Jelsma <[email protected]>:
> Hello Robert,
>
> I checked the code: the factory looks for classes on the classpath,
> and analyzers defined in Solr's schema are not loaded as such. Maybe
> there is a fancy trick, but I wouldn't bet on it.
>
> Creating pure Lucene-based analyzer classes is easy, but it takes some
> work to set it all up. Otherwise I would suggest configuring the
> stemmer filter before the synonym filter, and passing all terms in the
> synonym file through a stemmer and/or filter sequence when compiling
> the terms of the synonym file.
>
> We use the latter trick, and it works just fine. Except, of course,
> for strange stemmer-induced irregularities.
>
> Regards,
> Markus
>
>
> 2021-12-21 16:06 GMT+01:00, Robert Wenig <[email protected]>:
>> Hello everybody
>>
>> TL;DR: Is it possible to set the analyzer attribute of the
>> SynonymGraphFilterFactory to an analyzer which is defined in the
>> schema.xml?
>>
>> For reference, please see my question on Stack Overflow:
>> https://stackoverflow.com/questions/70437100/solr-synonymgraphfilterfactory-use-analyzer-defined-in-schema
>>
>> Best regards.
>>
>
