Re: Custom analyzer without a tokenizer

Nikolas Everett Tue, 03 Mar 2015 10:39:03 -0800

On Tue, Mar 3, 2015 at 1:02 PM, Sagar Shah <[email protected]> wrote:


> Hello everyone,
> I am working on defining a mapping in Elasticsearch that can have a few
> fields added on the fly. I can define the types & index using dynamic
> templates, but I would like to know the difference between the following
> two options and which one is preferred over the other.
> I do not want to break the string down into tokens but use it as a single
> complete string.
>
> Option 1. Field index: not_analyzed
> Option 2. Field index: custom analyzer with no tokenizer
>
> Are there any performance differences between the above two approaches?
> How does the 2nd option work (a custom analyzer with no tokenizer), and
> how can I create a mapping for it?
>
>
I believe if you leave the tokenizer out you get the StandardTokenizer.
It's very different from not_analyzed.  not_analyzed is like "don't break
this up - don't change it at all - I'm going to search for it _exactly_ how
I send it to you".  I believe the standard tokenizer does Unicode word
segmentation: http://unicode.org/reports/tr29/ .
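
To make the contrast concrete, here is a minimal Python sketch. It rests on
several assumptions: it uses the 1.x-style "string" field type, and instead
of an analyzer with no tokenizer it swaps in the built-in "keyword"
tokenizer, which emits the whole input as a single token. The names "doc",
"exact_field", "whole_field" and "whole_string" are made up for
illustration, and rough_word_split is only a crude stand-in for word
segmentation, not the real UAX #29 rules.

```python
import json
import re

# Hypothetical names for illustration only: "doc", "exact_field",
# "whole_field" and "whole_string" do not come from the thread.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                # Custom analyzer that keeps the input as one token
                # (keyword tokenizer) but lowercases it.
                "whole_string": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                # Option 1: stored and matched exactly as sent.
                "exact_field": {"type": "string", "index": "not_analyzed"},
                # Alternative: single token, normalized by the analyzer.
                "whole_field": {"type": "string", "analyzer": "whole_string"},
            }
        }
    },
}


def single_token(text):
    """Mimic not_analyzed: the whole string is the only term."""
    return [text]


def rough_word_split(text):
    """Crude stand-in for word segmentation; NOT the real UAX #29 rules."""
    return re.findall(r"\w+", text)


# Request body that could be sent when creating an index.
print(json.dumps(index_body, indent=2))
print(single_token("Quick Brown-Fox"))
print(rough_word_split("Quick Brown-Fox"))
```

The last two lines show why the two behaviours differ for exact matching:
one keeps the string whole, the other breaks it into separate words.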

Nik
