Thanks for sharing; it looks like a nice set of synonyms!

It's good that you already apply them at search time rather than at index time.

In that case, you should not use the FlattenGraphFilter, because
SynonymGraphFilter will produce a correct graph (unlike SynonymFilter)
and the Lucene query parsers (not sure about Solr's query parser fork)
will correctly detect the graph and create the right query.
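
To illustrate why a correct graph matters, here is a minimal, self-contained Java sketch. It does not use Lucene's classes: Token, example() and paths() are hypothetical names, and the "dns" -> "domain name system" synonym is an assumed example, not taken from the thesaurus discussed here. It only models how position increment and position length together define the paths a query parser could read from a token graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these are NOT Lucene classes; all names here are hypothetical.
final class TokenGraphSketch {

    /** A term plus the two attributes that place it in a token graph. */
    static final class Token {
        final String term;
        final int posInc; // positions advanced relative to the previous token
        final int posLen; // number of positions this token spans

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    /** Assumed analysis of "dns server" with the synonym dns -> domain name system. */
    static List<Token> example() {
        return Arrays.asList(
            new Token("domain", 1, 1),
            new Token("dns", 0, 3), // spans the same three positions as "domain name system"
            new Token("name", 1, 1),
            new Token("system", 1, 1),
            new Token("server", 1, 1));
    }

    /** Lists every path from the first node to the last; optionally ignores posLen. */
    static List<String> paths(List<Token> tokens, boolean honorPosLen) {
        int n = tokens.size();
        int[] from = new int[n];
        int[] to = new int[n];
        int pos = -1; // so that posInc=1 on the first token lands on node 0
        int last = 0;
        for (int i = 0; i < n; i++) {
            Token t = tokens.get(i);
            pos += t.posInc;
            from[i] = pos;
            to[i] = pos + (honorPosLen ? t.posLen : 1);
            last = Math.max(last, to[i]);
        }
        List<String> out = new ArrayList<>();
        walk(tokens, from, to, 0, last, "", out);
        return out;
    }

    // Depth-first walk: follow every token that starts at the current node.
    private static void walk(List<Token> tokens, int[] from, int[] to,
                             int node, int last, String prefix, List<String> out) {
        if (node == last) {
            out.add(prefix);
            return;
        }
        for (int i = 0; i < tokens.size(); i++) {
            if (from[i] == node) {
                String term = tokens.get(i).term;
                walk(tokens, from, to, to[i], last,
                     prefix.isEmpty() ? term : prefix + " " + term, out);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(paths(example(), true));  // [domain name system server, dns server]
        System.out.println(paths(example(), false)); // [domain name system server, dns name system server]
    }
}
```

With position length honored, the two paths are exactly the two phrases one would want to search for. With it ignored, which is roughly what happens when a graph is squashed into a flat sequence of positions, "dns" appears to be followed by "name", and the second path no longer corresponds to any real phrase.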

Mike McCandless

http://blog.mikemccandless.com


On Tue, Feb 7, 2017 at 7:46 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de> wrote:
> Years ago (2007) I installed the Eurovoc Thesaurus to work with our
> search engine for multilingual search (terms and phrases in 22 languages).
>
> http://www.ub.uni-bielefeld.de/~befehl/base/solr/InsideBase_eurovocThesaurus.html
>
> The synonyms.txt file is 8.8 MB in size and, compiled to an FST, yields over
> 300,000 n-to-m mappings due to permutation.
> A single term/token can produce several single- and multi-word synonyms,
> and multi-word terms/tokens can likewise produce single- and multi-word synonyms.
> Position increment and position length are handled correctly.
> The original search term and its direct synonyms are (or can be) boosted.
>
> I will look into SynonymGraphFilter and FlattenGraphFilter to see how they
> compare to my development.
>
> Regards
> Bernd
>
>
> On 07.02.2017 at 12:34, Michael McCandless wrote:
>> That's great that multi-token synonyms are working for you; can you
>> describe how you use them?
>>
>> This blog post describes some of the problems:
>> http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
>>
>> I'm working on another blog post to describe the recent changes ...
>> should be out in maybe a week or so.
>>
>> Anyway, to just keep doing what you are doing today, you should switch
>> to SynonymGraphFilter followed by FlattenGraphFilter: it will produce the
>> same tokens as the current SynonymFilter, which means it will necessarily
>> still be buggy in the multi-token case.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Tue, Feb 7, 2017 at 6:07 AM, Bernd Fehling
>> <bernd.fehl...@uni-bielefeld.de> wrote:
>>> I just tried Solr 6.4.1 and noticed that SynonymFilterFactory is
>>> deprecated, as reported in the logs.
>>>
>>> I hope that this is just to note that there is also an alternative
>>> SynonymGraphFilterFactory now available.
>>>
>>> And _not_ that SynonymFilterFactory will disappear, because it has run my
>>> multi-word synonym thesaurus for years like a charm.
>>> I hate to reinvent the wheel.
>>>
>>> Regards
>>> Bernd
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>
>>
>
> --
> *************************************************************
> Bernd Fehling                    Bielefeld University Library
> Dipl.-Inform. (FH)                LibTec - Library Technology
> Universitätsstr. 25                  and Knowledge Management
> 33615 Bielefeld
> Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de
>
> BASE - Bielefeld Academic Search Engine - www.base-search.net
> *************************************************************
>
>
