Re: SynonymGraphFilter

baris . kazar Thu, 13 Sep 2018 06:28:38 -0700

Thanks Michael. I think this answers my questions.

Best regards



On 9/12/18 8:23 PM, Michael Sokolov wrote:

Usually one will apply synonyms either at index time or at query time,
but not both. I think you will get the most correct behavior, respecting
the synonym graph structure, with query-time synonyms.

Index-time synonyms may give better performance, but at the cost of some
overlap among token positions that results from the need for flattening,
as in the quote you provided. If you use only query-time synonyms there is
no need to flatten.
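
To make the trade-off concrete, here is a minimal toy sketch in plain Java.
It does not use Lucene and is not how FlattenGraphFilter works internally;
it only models the general idea that an index records token positions but
not position lengths. The class SynonymGraphToy, its Edge type, and the
"dns" / "domain name service" rule are all assumptions made up for
illustration.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only (hypothetical names, no Lucene classes involved).
public class SynonymGraphToy {

    // One token seen as an edge in a token graph: it leaves node `from`
    // and arrives at node `to`. A single-word token spans one node; the
    // token "dns" below spans three, because it stands for three words.
    public static final class Edge {
        public final String term;
        public final int from;
        public final int to;

        public Edge(String term, int from, int to) {
            this.term = term;
            this.from = from;
            this.to = to;
        }
    }

    // Assumed input "dns is down" with the assumed rule
    // dns -> domain name service, keeping the original token.
    public static List<Edge> exampleGraph() {
        return Arrays.asList(
            new Edge("dns", 0, 3),
            new Edge("domain", 0, 1),
            new Edge("name", 1, 2),
            new Edge("service", 2, 3),
            new Edge("is", 3, 4),
            new Edge("down", 4, 5));
    }

    // Graph-aware phrase match: each term must start at the node where
    // the previous term ended.
    public static boolean matchesGraph(List<Edge> graph, String... phrase) {
        if (phrase.length == 0) {
            return false;
        }
        for (Edge first : graph) {
            if (first.term.equals(phrase[0]) && chain(graph, first.to, phrase, 1)) {
                return true;
            }
        }
        return false;
    }

    private static boolean chain(List<Edge> graph, int node, String[] phrase, int i) {
        if (i == phrase.length) {
            return true;
        }
        for (Edge e : graph) {
            if (e.from == node && e.term.equals(phrase[i]) && chain(graph, e.to, phrase, i + 1)) {
                return true;
            }
        }
        return false;
    }

    // Position-only phrase match: the end node of each token is ignored,
    // so terms only need to sit at consecutive start positions.
    public static boolean matchesPositionsOnly(List<Edge> graph, String... phrase) {
        if (phrase.length == 0) {
            return false;
        }
        for (Edge first : graph) {
            if (!first.term.equals(phrase[0])) {
                continue;
            }
            boolean ok = true;
            for (int i = 1; i < phrase.length && ok; i++) {
                ok = hasTermAt(graph, phrase[i], first.from + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    private static boolean hasTermAt(List<Edge> graph, String term, int position) {
        for (Edge e : graph) {
            if (e.from == position && e.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Edge> g = exampleGraph();
        System.out.println("graph 'dns is':       " + matchesGraph(g, "dns", "is"));
        System.out.println("positions 'dns is':   " + matchesPositionsOnly(g, "dns", "is"));
        System.out.println("graph 'dns name':     " + matchesGraph(g, "dns", "name"));
        System.out.println("positions 'dns name': " + matchesPositionsOnly(g, "dns", "name"));
    }
}
```

In this toy, the graph-aware match accepts "dns is" and rejects "dns name",
while the position-only match does the opposite. That is the kind of
positional error that keeping the full graph at query time avoids.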

On Thu, Sep 13, 2018, 12:59 AM <baris.ka...@oracle.com> wrote:

Any examples on the following note on the Javadocs at

https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymGraphFilter.html


Quoted from the above url:

However, if you use this during indexing, you must follow it with
FlattenGraphFilter to squash tokens on top of one another like
SynonymFilter, because the indexer can't directly consume a graph. To
get fully correct positional queries when your synonym replacements are
multiple tokens, you should instead apply synonyms using this
TokenFilter at query time and translate the resulting graph to a
TermAutomatonQuery e.g. using TokenStreamToTermAutomatonQuery.

End of quote


This will make the code really hard to maintain if we separate synonyms
based on the number of tokens.

Any suggestions please?

Best regards




On 9/11/18 1:45 PM, baris.ka...@oracle.com wrote:

Mike,

Great article, thanks for that. I was thinking about exactly this
reverse mapping when I was writing this question. I guess it would be
nicer if Lucene applied both mappings when one is requested, or offered
a parameter to activate this double mapping.


My next question is: can a synonym be separated by spaces?

One last question on this: should I repeat this at both index and
query time?

Best regards

On 9/11/18 1:39 PM, Michael McCandless wrote:

Try reading the blog post I wrote about token stream graphs?

http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html


Mike McCandless

http://blog.mikemccandless.com


On Tue, Sep 11, 2018 at 1:35 PM, <baris.ka...@oracle.com> wrote:

Any comments please?

Thanks


On 9/10/18 5:07 PM, baris.ka...@oracle.com wrote:

Any examples on this? I think it would be nice if the Javadocs had an
example on this:

However, if you use this during indexing, you must follow it with
FlattenGraphFilter to squash tokens on top of one another like
SynonymFilter, because the indexer can't directly consume a graph. To
get fully correct positional queries when your synonym replacements are
multiple tokens, you should instead apply synonyms using this
TokenFilter at query time and translate the resulting graph to a
TermAutomatonQuery e.g. using TokenStreamToTermAutomatonQuery.

Does "multiple tokens" mean a synonym with multiple equivalents, or does
it mean a synonym made up of multiple words?

This is not clear to me.

Best regards


On 9/10/18 3:15 PM, baris.ka...@oracle.com wrote:

https://lucene.apache.org/core/6_4_1/analyzers-common/org/apache/lucene/analysis/synonym/SynonymGraphFilter.html

Does this mean I don't have to repeat it in the search analyzer when I
do this at indexing time?

Best regards

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
