Hi, I was looking into some weird analyzer behaviour and am trying to understand why it happens and how to fix it.
Here are the tokens I get from the standard analyzer for the text "myemail@email.com:test1234":

    curl -XGET 'localhost:9200/_analyze?analyzer=standard&pretty=true' -d 'myemail@email.com:test1234'
    {
      "tokens" : [ {
        "token" : "myemail",
        "start_offset" : 0,
        "end_offset" : 7,
        "type" : "<ALPHANUM>",
        "position" : 1
      }, {
        "token" : "email.com:test1234",
        "start_offset" : 8,
        "end_offset" : 26,
        "type" : "<ALPHANUM>",
        "position" : 2
      } ]
    }

So my question is: why do I get "email.com:test1234" as a single token? Why is it not split into separate tokens at "." and ":"? And which analyzer, tokenizer, or filter can I use to get that split?

Thanks,
Igor
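
P.S. To make the expected result concrete, here is a minimal Python sketch of the split I had in mind. It is not Elasticsearch code and does not reproduce the standard analyzer. The function name `expected_tokens` is made up for illustration, and the input string is assumed to be "myemail@email.com:test1234", which is what the offsets in the output above suggest.

```python
import re


def expected_tokens(text):
    """Return every run of ASCII letters/digits as a token with its offsets.

    Illustration only: this splits on anything that is not a letter or digit,
    which is the behaviour being asked about, not what Elasticsearch does.
    """
    return [
        {"token": m.group(), "start_offset": m.start(), "end_offset": m.end()}
        for m in re.finditer(r"[A-Za-z0-9]+", text)
    ]


if __name__ == "__main__":
    # Assumed input string, reconstructed from the offsets in the analyzer output.
    for tok in expected_tokens("myemail@email.com:test1234"):
        print(tok)
```

With that input, the sketch yields four tokens ("myemail", "email", "com", "test1234") instead of the two returned by the standard analyzer.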
