On Tue, Mar 3, 2015 at 1:02 PM, Sagar Shah wrote:
> Hello everyone,
>
> I am working on defining a mapping in Elasticsearch that can have a few
> fields added on the fly. I can define the types and index using dynamic
> templates, but I would like to know the difference between the following
> two options and which one is preferred over the other.
>
> I do not want to break the string down into tokens; I want to use it as a
> single complete string.
>
> Option 1. Field index: not_analyzed
> Option 2. Field index: custom analyzer with no tokenizer
>
> Are there any performance differences between the two approaches?
> How does the second option work (a custom analyzer with no tokenizer), and
> how can I create a mapping for it?

I believe if you leave the tokenizer out you get the StandardTokenizer. That
is very different from not_analyzed. not_analyzed is like "don't break this
up - don't change it at all - I'm going to search for it _exactly_ how I send
it to you". I believe the standard tokenizer does Unicode word segmentation
as described in UAX #29: http://unicode.org/reports/tr29/ . So a value like
"New York" would be split into separate word tokens rather than indexed as
one term.

Nik
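
For reference, here is a minimal sketch of how the two options might look as index-creation bodies, written as Python dicts. It assumes the Elasticsearch 1.x mapping syntax that was current at the time of this thread. The type name `doc`, the template names, and the analyzer name `whole_string` are hypothetical. Because a custom analyzer with no tokenizer is not something the thread confirms will work, the second option swaps in the built-in `keyword` tokenizer, which emits the whole input as a single token; that substitution is an assumption, not what was asked about literally.

```python
import json

# Option 1: a dynamic template that maps every new string field as
# not_analyzed (Elasticsearch 1.x syntax). The value is indexed as one
# unmodified term. "doc" and "strings_as_is" are hypothetical names.
OPTION_1 = {
    "mappings": {
        "doc": {
            "dynamic_templates": [
                {
                    "strings_as_is": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "string", "index": "not_analyzed"},
                    }
                }
            ]
        }
    }
}

# Option 2 (assumed variant): a custom analyzer that names the built-in
# "keyword" tokenizer explicitly, so the whole string becomes one token.
# Token filters could be added later; none are configured here.
OPTION_2 = {
    "settings": {
        "analysis": {
            "analyzer": {
                "whole_string": {"type": "custom", "tokenizer": "keyword"}
            }
        }
    },
    "mappings": {
        "doc": {
            "dynamic_templates": [
                {
                    "strings_whole": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "string", "analyzer": "whole_string"},
                    }
                }
            ]
        }
    },
}


def as_request_body(index_body):
    """Serialize an index definition to the JSON text of a request body."""
    return json.dumps(index_body, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(as_request_body(OPTION_1))
    print(as_request_body(OPTION_2))
```

Either dict could be sent as the body of an index-creation request. With no token filters configured, the second variant should index the same single term as not_analyzed; its main appeal is that a filter can be added to the analyzer later without changing the field mapping.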
