Hi Zheng,

UAX29URLEmailTokenizer (UAX29UET) recognizes URLs and e-mail addresses. It does
not split them; it keeps each one as a single token.

StandardTokenizer produces two or more tokens for such an entity.

Please try them both on the analysis page and use whichever one suits your requirements.
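
To compare the candidates side by side, here is a minimal sketch of two field
types for the schema. The field type names text_email_classic and
text_email_uax29 are made up for illustration; the tokenizer and filter
factories are the ones mentioned in this thread. Adjust to your own schema.

```xml
<!-- Hypothetical field type: the chain currently in use -->
<fieldType name="text_email_classic" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.ClassicTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field type: keeps URLs and e-mail addresses as single tokens -->
<fieldType name="text_email_uax29" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

After reloading the core, pick each field type on the analysis page and paste
in a sample address to see which tokens each chain emits.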

Ahmet

On Friday, November 24, 2017, 11:46:57 AM GMT+3, Zheng Lin Edwin Yeo
<edwinye...@gmail.com> wrote:

Hi,

I am indexing email addresses into Solr via EML files. Currently, I am
using ClassicTokenizerFactory with LowerCaseFilterFactory. However, I
found that we can also use UAX29URLEmailTokenizerFactory with
LowerCaseFilterFactory.

Does anyone have any recommendation on which Tokenizer is better?

I am currently using Solr 6.5.1, and planning to upgrade to Solr 7.1.0.

Regards,
Edwin
