Yes, you can use two different analyzers. In your case, what you can do is:

- for indexing, apply a shingle filter;
- for the query, also apply a shingle filter, but this time disable the unigrams (output_unigrams: false), so that it only generates the shingles, in your case "t1 t2" and "t2 t3".

It will match your document.

Cédric Hourcade
[email protected]
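For anyone reading this thread later, here is a rough sketch of the two-analyzer setup described above, written as Python dicts, followed by a tiny pure-Python model of why the shingles-only query still lines up with the indexed positions. The analyzer, filter and field names (shingle_index, shingle_search, shingle_all, shingle_only, body) are made up for illustration. The "index_analyzer" and "search_analyzer" keys are the ones named in this thread, so check the mapping documentation for your Elasticsearch version before copying them. The model is a simplification and not Lucene's actual implementation: it assumes two-word shingles and one term per query position.

```python
# Analysis settings: one shingle filter per side (names are hypothetical).
analysis_settings = {
    "analysis": {
        "filter": {
            # Index side: keep the single words and add two-word shingles.
            "shingle_all": {
                "type": "shingle",
                "min_shingle_size": 2,
                "max_shingle_size": 2,
                "output_unigrams": True,
            },
            # Query side: emit only the shingles.
            "shingle_only": {
                "type": "shingle",
                "min_shingle_size": 2,
                "max_shingle_size": 2,
                "output_unigrams": False,
            },
        },
        "analyzer": {
            "shingle_index": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "shingle_all"],
            },
            "shingle_search": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "shingle_only"],
            },
        },
    }
}

# Field mapping fragment using the keys discussed in this thread
# (assumption: the field is called "body").
field_mapping = {
    "body": {
        "index_analyzer": "shingle_index",
        "search_analyzer": "shingle_search",
    }
}


def shingle_tokens(words, output_unigrams=True):
    """Simplified model: return (position, term) pairs for two-word shingles."""
    tokens = []
    for i, word in enumerate(words):
        pos = i + 1  # 1-based positions, as in the quoted explanation
        if output_unigrams:
            tokens.append((pos, word))
        if i + 1 < len(words):
            # A shingle shares the position of its first word.
            tokens.append((pos, word + " " + words[i + 1]))
    return tokens


def phrase_matches(indexed, query):
    """True if the query terms occur in the index at the same relative positions.

    Assumes one term per query position (shingles-only or unigrams-only query).
    """
    if not query:
        return False
    index_set = set(indexed)
    first_pos, first_term = query[0]
    starts = [pos for pos, term in indexed if term == first_term]
    return any(
        all((start + (pos - first_pos), term) in index_set for pos, term in query)
        for start in starts
    )


doc = shingle_tokens(["t1", "t2", "t3"])
query = shingle_tokens(["t1", "t2", "t3"], output_unigrams=False)
print(doc)
print(query, phrase_matches(doc, query))
```

In this model the query becomes a two-term phrase ("t1 t2" followed by "t2 t3") instead of three single terms, and both shingles are found at consecutive positions in the indexed document.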
On Fri, Jun 20, 2014 at 12:30 PM, 陳智清 <[email protected]> wrote:
> Hello Hourcade, thanks for your response.
>
> Does that mean different values should be set for "index_analyzer" and
> "search_analyzer" (e.g. "index_analyzer": "shingle" and "search_analyzer":
> "standard")?
> What if I want to reuse the same "shingle" analyzer for both indexing and
> searching? Will the match_phrase "t1 t2 t3" still give me a match?
>
> I know that setting a different analyzer as "search_analyzer" makes the
> match_phrase "t1 t2 t3" searchable, but if I do that, then I get no benefit
> from "shingle", right? Instead I just get a bigger index.
>
> I assumed "shingle" is used for faster "match_phrase" searches. But after
> shingling, searching for a phrase of 3 tokens "t1 t2 t3" becomes searching
> for a phrase of 5 tokens, and I don't know how "shingle" arranges the
> positions for a correct phrase query. So how can "match_phrase" be faster?
> Thank you.
>
> On Friday, June 20, 2014 at 4:18:03 PM UTC+8, Cédric Hourcade wrote:
>>
>> Hello,
>>
>> Let's say you have an indexed text "t1 t2 t3" with shingles. The token
>> positions are also indexed, so you get: t1 (at pos 1), "t1 t2" (pos
>> 1), t2 (pos 2), "t2 t3" (pos 2) and t3 (pos 3).
>>
>> So if you search with a match_phrase for "t1 t2 t3" (even if it is
>> not tokenized as shingles), it will match the document, because t1,
>> t2 and t3 are considered next to each other (based on their recorded
>> positions) for this document.
>>
>> Cédric Hourcade
>> [email protected]
>>
>>
>> On Fri, Jun 20, 2014 at 7:04 AM, 陳智清 <[email protected]> wrote:
>> > How does the shingle filter work with match_phrase in the query phase?
>> >
>> > After analyzing the phrase "t1 t2 t3", the shingle filter produced five
>> > tokens:
>> > t1
>> > t2
>> > t3
>> > "t1 t2"
>> > "t2 t3"
>> >
>> > Will match_phrase still give "t1 t2 t3" a match? How does it work? Thank
>> > you.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAJQxjPMAEGK%3DSxYfoBtjgcdZYPHqAAiSPpQBjh1fvtXgkwWuLA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
