Hi,

Thanks for your reply.

I might have one or more email ids in a single record, so I would have to
index it with a whitespace analyzer after filtering out the email ids alone
(maybe using an email id tokenizer).

Tokenization would then happen twice (once for normal indexing and once for
the special email id field), which is costly for the content field.

Is there any way to do this efficiently? Will TeeSinkTokenFilter help in my
case?
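For what it's worth, the underlying idea here — a single tokenization pass that feeds both the normal tokens and the extra email-id parts — can be sketched in plain Java. This is only an illustration of the one-pass approach, not Lucene's actual TokenStream API; the email address and the regex below are made-up examples, and a real Lucene setup would do this inside a TokenFilter with position attributes instead of a list:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class EmailTokenSketch {
    // Simplified email pattern for illustration only; a production
    // tokenizer (e.g. the JFlex grammar) would be more precise.
    private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w.-]+");

    // Single pass over the text: every whitespace token is emitted once,
    // and email ids additionally emit their split parts (local part and
    // domain labels), so no second tokenization run is needed.
    public static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String tok : text.split("\\s+")) {
            if (tok.isEmpty()) continue;
            out.add(tok); // the original token, including the full email id
            if (EMAIL.matcher(tok).matches()) {
                for (String part : tok.split("[@.]")) {
                    if (!part.isEmpty()) out.add(part);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input; prints the full email id plus its parts.
        System.out.println(tokenize("contact user@example.com today"));
        // -> [contact, user@example.com, user, example, com, today]
    }
}
```

In Lucene terms, the same effect could be had either by a custom TokenFilter that injects the split parts at the same position as the email token, or by TeeSinkTokenFilter feeding the one shared token stream into two sink fields.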



On Tue, Dec 20, 2016 at 7:45 PM, suriya prakash <suriy...@gmail.com> wrote:

> Hi,
>
> I am using StandardAnalyzer and want to split the tokens for the email_id
> "luc...@gmail.com" into "lucene", "gmail", "com", and "luc...@gmail.com" in
> a single pass.
>
> I have already changed the JFlex grammar to split email ids into separate
> words (lucene, gmail, com). But we need to do phrase searches, which would
> not be efficient, so I want to index both the actual email id and the split
> words.
>
> Can you please help me achieve this, or let me know whether phrase search
> is efficient in this case?
>
>
> Regards,
> Suriya
>