Hi all, I want to index email fields both as analyzed and as not-analyzed tokens, using a custom analyzer.
For example: sm...@yahoo.com and will.sm...@yahoo.com. That is, sm...@yahoo.com should be indexed as a single token as well as its analyzed tokens, in the same email field.

My existing custom analyzer:

    public class CustomSearchAnalyzer extends StopwordAnalyzerBase {

        public CustomSearchAnalyzer(Version matchVersion, Reader stopwords) throws Exception {
            super(matchVersion, loadStopwordSet(stopwords, matchVersion));
        }

        @Override
        protected Analyzer.TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
            final ClassicTokenizer src = new ClassicTokenizer(getVersion(), reader);
            src.setMaxTokenLength(ClassicAnalyzer.DEFAULT_MAX_TOKEN_LENGTH);
            TokenStream tok = new ClassicFilter(src);
            tok = new LowerCaseFilter(getVersion(), tok);
            tok = new StopFilter(getVersion(), tok, stopwords);
            tok = new ASCIIFoldingFilter(tok); // to enable accent-insensitive search
            return new Analyzer.TokenStreamComponents(src, tok) {
                @Override
                protected void setReader(final Reader reader) throws IOException {
                    src.setMaxTokenLength(ClassicAnalyzer.DEFAULT_MAX_TOKEN_LENGTH);
                    super.setReader(reader);
                }
            };
        }
    }

What I want to achieve:

1. If I search with the query "sm...@yahoo.com", records containing will.sm...@yahoo.com should not be returned.
2. I should also be able to search that field with the query "smith".
3. If possible, email values in all other fields should be detected and tokenized the same way.

How can I achieve points 1 and 2 using UAX29URLEmailTokenizer? And how can I add UAX29URLEmailTokenizer to my existing custom analyzer, without using a separate email analyzer (PerFieldAnalyzerWrapper) for the email field, so that the same tokenization applies to email terms in all fields?

- Kumaran R
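
To make the token output wanted in points 1 and 2 concrete, here is a minimal plain-Java sketch of the expansion step. It does not use any Lucene API: the class name EmailTokenSketch, the method expand, the simplified email regex, and the address jane.doe@example.com are all hypothetical and chosen only for illustration. The assumption is that in a real analyzer this logic would sit in a token filter placed after a tokenizer that keeps each email address as one token, which is what UAX29URLEmailTokenizer is meant to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene.
final class EmailTokenSketch {

    // Simplified email pattern (an assumption; real address grammar is looser).
    private static final Pattern EMAIL =
            Pattern.compile("[a-z0-9._%+-]+@[a-z0-9.-]+\\.[a-z]{2,}");

    // Returns the terms that would be indexed for one incoming token:
    // the lowercased token itself, plus its parts if it looks like an email.
    static List<String> expand(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        List<String> terms = new ArrayList<>();
        terms.add(lower); // whole token, so exact-address queries match only this address
        if (EMAIL.matcher(lower).matches()) {
            for (String part : lower.split("[^a-z0-9]+")) {
                if (!part.isEmpty() && !terms.contains(part)) {
                    terms.add(part); // analyzed parts, so a query on one name part matches
                }
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [jane.doe@example.com, jane, doe, example, com]
        System.out.println(expand("Jane.Doe@example.com"));
    }
}
```

With terms like these, a query for the whole address doe@example.com would not match a record holding jane.doe@example.com, because that exact term is never indexed for it, while a query for "doe" would match. This only holds if the query side also keeps the address as a single term.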