I searched online and found that the shingle token filter that ships with Elasticsearch can produce bigrams, trigrams, etc.
I want to extract skip-grams from my documents for indexing, along with single words, bigrams, and trigrams. Further searching suggested that I might have to write a custom plugin for such a tokenizer, but I could not find proper documentation for writing one. Can anyone point me to the right resources for this task? Thanks!
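For concreteness, here is a minimal sketch of the kind of output meant by skip-grams, assuming k-skip-bigrams (pairs of tokens with up to k tokens skipped between them). It is plain Python, not Elasticsearch or Lucene code; the function name `skip_bigrams` and the parameter `max_skip` are made up for illustration.

```python
from typing import List, Tuple


def skip_bigrams(tokens: List[str], max_skip: int) -> List[Tuple[str, str]]:
    """Return all token pairs with at most `max_skip` tokens skipped between them.

    Hypothetical helper for illustration only; not an Elasticsearch API.
    """
    pairs = []
    for i, left in enumerate(tokens):
        # gap = 0 yields ordinary bigrams; gap > 0 skips that many tokens.
        for gap in range(max_skip + 1):
            j = i + 1 + gap
            if j < len(tokens):
                pairs.append((left, tokens[j]))
    return pairs


if __name__ == "__main__":
    sample = "the quick brown fox".split()
    for pair in skip_bigrams(sample, max_skip=1):
        print(" ".join(pair))
```

With `max_skip=0` this reduces to ordinary bigrams, so the skip-grams are a superset of what a bigram shingle would give.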
