SynonymMap.Builder.add method
I am trying to understand the add method documented here:
https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymMap.Builder.html

> public void add(CharsRef input, CharsRef output, boolean includeOrig)
>
> Add a phrase->phrase synonym mapping. Phrases are character sequences where words are separated with character zero (U+0000). Empty words (two U+0000s in a row) are not allowed in the input nor the output!
>
> Parameters:
> input - input phrase
> output - output phrase
> includeOrig - true if the original should be included

Does that mean that if the search string contains the input expression, it looks for all the output expressions and treats them as equivalent?

Best regards
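The phrase format quoted above (words separated by U+0000, no empty words) can be sketched with plain strings. This is only an illustration of the encoding, not Lucene code: the class PhraseEncoding and its methods are hypothetical names. Lucene itself provides a static helper, SynonymMap.Builder.join(String[], CharsRefBuilder), that builds such a phrase into a CharsRef for you.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: models the phrase encoding
// described in the Javadoc using plain Java strings.
final class PhraseEncoding {
    // Word separator inside a phrase: character zero (U+0000).
    static final char SEP = '\u0000';

    // Join words into one phrase, rejecting empty words, which would
    // otherwise produce two separators in a row.
    static String join(List<String> words) {
        for (String w : words) {
            if (w.isEmpty()) {
                throw new IllegalArgumentException("empty words are not allowed");
            }
        }
        return String.join(String.valueOf(SEP), words);
    }

    // Split a phrase back into its words; limit -1 keeps trailing empties
    // so that a malformed phrase is visible to the caller.
    static List<String> split(String phrase) {
        return Arrays.asList(phrase.split(String.valueOf(SEP), -1));
    }
}
```

For example, PhraseEncoding.join(Arrays.asList("wi", "fi")) gives a five-character string: "wi", one U+0000, then "fi".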
Re: SynonymMap
OK, so it deduplicates the rules when dedup is set to true.

Best

On 9/10/18 1:11 PM, Michael McCandless wrote:
> The SynonymMap.Builder constructor takes a dedup parameter to tell it what to do in that case (when input and output are identical across added rules).
>
> Mike McCandless
> http://blog.mikemccandless.com
>
> On Thu, Sep 6, 2018 at 2:06 PM, Baris Kazar wrote:
>> Hi,
>> how does SynonymMap deal with repeated values?
>> Best regards

To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
Re: SynonymGraphFilter
Are there any examples of this? I think it would be nice if the Javadocs had an example for this passage:

> However, if you use this during indexing, you must follow it with FlattenGraphFilter to squash tokens on top of one another like SynonymFilter, because the indexer can't directly consume a graph. To get fully correct positional queries when your synonym replacements are multiple tokens, you should instead apply synonyms using this TokenFilter at query time and translate the resulting graph to a TermAutomatonQuery e.g. using TokenStreamToTermAutomatonQuery.

Does "multiple tokens" mean a synonym with multiple equivalents, or a synonym made up of multiple words? This is not clear to me.

Best regards

On 9/10/18 3:15 PM, baris.ka...@oracle.com wrote:
> https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymGraphFilter.html
>
> Does this mean I don't have to repeat it in the search analyzer when I do this at indexing time?
>
> Best regards
SynonymGraphFilter
https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymGraphFilter.html

Does this mean I don't have to repeat it in the search analyzer when I do this at indexing time?

Best regards
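One way to reason about the question above is a toy model of index-time expansion. It is a sketch under loud assumptions: it handles single-word synonyms only, it keeps the original token (as with includeOrig set to true), and ToyIndexTimeSynonyms with its methods is a hypothetical name, not a Lucene class. It does not model the multi-token case that the SynonymGraphFilter Javadoc warns about.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not Lucene code: each position in the "index"
// holds the original token plus any synonyms stacked on top of it.
final class ToyIndexTimeSynonyms {
    // Expand tokens at index time; rules are one-directional (input -> outputs).
    static List<Set<String>> expand(List<String> tokens,
                                    Map<String, List<String>> synonyms) {
        List<Set<String>> positions = new ArrayList<>();
        for (String token : tokens) {
            Set<String> stacked = new LinkedHashSet<>();
            stacked.add(token); // keep the original token
            stacked.addAll(synonyms.getOrDefault(token, Collections.emptyList()));
            positions.add(stacked);
        }
        return positions;
    }

    // A plain single-term query with no query-time synonym expansion.
    static boolean matches(List<Set<String>> indexed, String queryTerm) {
        for (Set<String> terms : indexed) {
            if (terms.contains(queryTerm)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, with the single rule tv -> television, a document containing "tv" is found by a plain query for "television" without any query-time synonyms. A document containing only "television" is not found by a query for "tv" unless the reverse rule is also added.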
Re: SynonymMap
The SynonymMap.Builder constructor takes a dedup parameter to tell it what to do in that case (when input and output are identical across added rules).

Mike McCandless
http://blog.mikemccandless.com

On Thu, Sep 6, 2018 at 2:06 PM, Baris Kazar wrote:
> Hi,
> how does SynonymMap deal with repeated values?
> Best regards
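The effect of the dedup flag described above can be sketched with a toy rule store. This is an assumption-laden model, not Lucene's implementation: ToySynonymRules and its methods are hypothetical names, and phrases are plain strings rather than CharsRef values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not Lucene code: stores input -> outputs rules
// and optionally drops rules whose input and output were already added.
final class ToySynonymRules {
    private final boolean dedup;
    private final Map<String, List<String>> rules = new LinkedHashMap<>();

    ToySynonymRules(boolean dedup) {
        this.dedup = dedup;
    }

    void add(String input, String output) {
        List<String> outputs = rules.computeIfAbsent(input, k -> new ArrayList<>());
        if (dedup && outputs.contains(output)) {
            return; // identical rule (same input, same output): keep only one
        }
        outputs.add(output);
    }

    List<String> outputsFor(String input) {
        return rules.getOrDefault(input, new ArrayList<>());
    }
}
```

Adding the rule tv -> television twice leaves one stored output when the store is created with dedup set to true, and two when it is created with dedup set to false.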