SynonymMap.Builder.add method

2018-09-10 Thread baris . kazar

I am trying to understand the add method here:

https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymMap.Builder.html


public void add(CharsRef input,
                CharsRef output,
                boolean includeOrig)

Add a phrase->phrase synonym mapping. Phrases are character sequences
where words are separated with character zero (U+0000). Empty words (two
U+0000s in a row) are not allowed in the input nor the output!

Parameters:
input - input phrase
output - output phrase
includeOrig - true if the original should be included


Does that mean that if the search string contains the input phrase, it
looks up all the output phrases and treats them as equivalent?
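
For illustration, here is a minimal plain-Java sketch of the phrase
encoding the quoted Javadoc describes: words joined with U+0000, with
empty words rejected. The class and method names (PhraseEncoding, join)
are hypothetical and are not part of Lucene; in Lucene itself the static
helper SynonymMap.Builder.join(String[] words, CharsRefBuilder reuse)
should serve this purpose.

```java
// Hypothetical helper for illustration only; not Lucene code.
final class PhraseEncoding {
    // The Javadoc says words in a phrase are separated by character zero.
    static final char WORD_SEPARATOR = '\u0000';

    // Joins words into one phrase string, rejecting empty words
    // (which would produce two separators in a row).
    static String join(String... words) {
        if (words.length == 0) {
            throw new IllegalArgumentException("a phrase needs at least one word");
        }
        StringBuilder phrase = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (words[i].isEmpty()) {
                throw new IllegalArgumentException("empty words are not allowed");
            }
            if (i > 0) {
                phrase.append(WORD_SEPARATOR);
            }
            phrase.append(words[i]);
        }
        return phrase.toString();
    }
}
```

With this sketch, PhraseEncoding.join("domain", "name", "system") yields a
single string holding three words, which is the shape that add expects
for its input and output arguments.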


Best regards



Re: SynonymMap

2018-09-10 Thread baris . kazar

OK, so it deduplicates the rules if that parameter is set to true.

Best


On 9/10/18 1:11 PM, Michael McCandless wrote:

The SynonymMap.Builder constructor takes a dedup parameter to tell it what
to do in that case (when input and output are identical across added rules).
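
To make the dedup behaviour concrete, here is a toy model in plain Java.
It is only a sketch of the idea (identical input/output pairs are stored
once when dedup is true) and is not how Lucene implements it; the class
ToySynonymRules and its methods are hypothetical names. In Lucene the
flag is the one passed as new SynonymMap.Builder(true).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; not Lucene code.
final class ToySynonymRules {
    private final boolean dedup;
    private final Map<String, List<String>> outputsByInput = new LinkedHashMap<>();

    ToySynonymRules(boolean dedup) {
        this.dedup = dedup;
    }

    // Records input -> output; with dedup enabled, an identical rule
    // (same input, same output) is kept only once.
    void add(String input, String output) {
        List<String> outputs = outputsByInput.computeIfAbsent(input, k -> new ArrayList<>());
        if (dedup && outputs.contains(output)) {
            return;
        }
        outputs.add(output);
    }

    // Returns a copy of the outputs recorded for the given input.
    List<String> outputsFor(String input) {
        return new ArrayList<>(outputsByInput.getOrDefault(input, new ArrayList<>()));
    }
}
```

Adding the rule tv -> television twice leaves one output when the model
is built with dedup set to true, and two when it is set to false.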

Mike McCandless

http://blog.mikemccandless.com

On Thu, Sep 6, 2018 at 2:06 PM, Baris Kazar  wrote:


Hi,
How does SynonymMap deal with repeated values?
Best regards

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org








Re: SynonymGraphFilter

2018-09-10 Thread baris . kazar
Are there any examples of this? I think it would be nice if the Javadocs
had an example of this:


However, if you use this during indexing, you must follow it with 
FlattenGraphFilter to squash tokens on top of one another like 
SynonymFilter, because the indexer can't directly consume a graph. To 
get fully correct positional queries when your synonym replacements are 
multiple tokens, you should instead apply synonyms using this 
TokenFilter at query time and translate the resulting graph to a 
TermAutomatonQuery e.g. using TokenStreamToTermAutomatonQuery.


Does "multiple tokens" mean a synonym with multiple equivalents?

Or does it mean a synonym made up of multiple words?

This is not clear to me.
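
The two readings can be told apart with a small plain-Java sketch. As I
read the quoted Javadoc, "multiple tokens" refers to a replacement that
consists of more than one token (word), for example dns -> domain name
system, and not to an input that has several alternative replacements.
The class SynonymShapes and its methods are hypothetical names used only
for this illustration; each replacement is modelled as an array of words.

```java
import java.util.List;

// Hypothetical helper for illustration only; not Lucene code.
final class SynonymShapes {
    // "Multiple equivalents": how many alternative replacements one input has.
    static int equivalents(List<String[]> replacements) {
        return replacements.size();
    }

    // "Multiple tokens": true if any single replacement has more than one word.
    static boolean hasMultiTokenReplacement(List<String[]> replacements) {
        for (String[] words : replacements) {
            if (words.length > 1) {
                return true;
            }
        }
        return false;
    }
}
```

Under this model, tv -> {television, telly} has two equivalents but no
multi-token replacement, while dns -> {domain name system} has a single
equivalent that is a multi-token replacement. Only the second case is
the one the Javadoc warns about for positional queries.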

Best regards


On 9/10/18 3:15 PM, baris.ka...@oracle.com wrote:
https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymGraphFilter.html 



Does this mean I don't have to repeat it in the search analyzer when I
do this at indexing time?


Best regards








SynonymGraphFilter

2018-09-10 Thread baris . kazar

https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymGraphFilter.html

Does this mean I don't have to repeat it in the search analyzer when I do
this at indexing time?


Best regards






Re: SynonymMap

2018-09-10 Thread Michael McCandless
The SynonymMap.Builder constructor takes a dedup parameter to tell it what
to do in that case (when input and output are identical across added rules).

Mike McCandless

http://blog.mikemccandless.com

On Thu, Sep 6, 2018 at 2:06 PM, Baris Kazar  wrote:

> Hi,
> How does SynonymMap deal with repeated values?
> Best regards