Re: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Jamie
Steve, thanks for the contact. I believe UAX29URLEmailTokenizer tokenizes email addresses as follows: john@mycompany.com.au john.doe mycompany.com.au john doe mycompany com au com.au. We have an overridden query parser that swaps out anyaddress: with to, from, cc, bcc, etc. Inside the overri
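
For readers following the thread, the field swap Jamie describes can be pictured roughly as below. This is only a sketch under assumptions, not Jamie's parser: the class name AnyAddressSketch, the expand method, and the OR-joined output are hypothetical, the field list stops at the four fields named in the message (the original says "etc."), and no Lucene classes or term escaping are involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the overridden query parser from the thread.
class AnyAddressSketch {

    // Assumption: only the fields named in the message; the real list is longer ("etc.").
    static final List<String> ADDRESS_FIELDS = Arrays.asList("to", "from", "cc", "bcc");

    // Rewrites an "anyaddress" clause into one clause per address field, joined with OR.
    // Any other field is passed through unchanged.
    static String expand(String field, String term) {
        if (!"anyaddress".equals(field)) {
            return field + ":" + term;
        }
        List<String> clauses = new ArrayList<>();
        for (String addressField : ADDRESS_FIELDS) {
            clauses.add(addressField + ":" + term);
        }
        return "(" + String.join(" OR ", clauses) + ")";
    }

    public static void main(String[] args) {
        System.out.println(expand("anyaddress", "joe"));
    }
}
```

A real parser override would build query objects rather than strings; the string form is used here only to keep the sketch self-contained.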

Re: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Steve Rowe
Hi Jamie, What does EmailFilter do? Why is the expanded form "required for the UAX29URLEmailTokenizer"? Seems like an exact match would work on the email address alone, without the expanded components? Do you have an example of a query that reproducibly matches more documents than it shoul

RE: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Uwe Schindler
Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Jamie [mailto:ja...@mailarchiva.com] > Sent: Friday, March 28, 2014 4:41 PM > To: java-user@lucene.apache.org > Subject: Re: Lucene 4.

Re: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Jamie
I beg your pardon. It's our EmailFilter class that emits the tokens. We do it this way since users like to search using individual components of an email address, e.g. joe or mycompany.com.au. I think we may have a synchronization issue at play. I will perform some further testing and will get
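
The kind of component expansion described here might look something like the following. It is a minimal sketch, assuming the components are the local part, the domain, their dot-separated pieces, and the multi-label domain suffixes; EmailComponentSketch and expand are hypothetical names, and the actual EmailFilter (a Lucene token filter whose source is not shown in the thread) may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the EmailFilter class from the thread.
class EmailComponentSketch {

    // Expands one address into the full address plus its searchable components.
    // A LinkedHashSet keeps insertion order and drops duplicates.
    static List<String> expand(String address) {
        Set<String> tokens = new LinkedHashSet<>();
        tokens.add(address);

        int at = address.indexOf('@');
        if (at <= 0 || at == address.length() - 1) {
            return new ArrayList<>(tokens); // not of the form local@domain; emit as-is
        }

        String local = address.substring(0, at);
        String domain = address.substring(at + 1);
        tokens.add(local);
        tokens.add(domain);

        // Dot-separated pieces of the local part, e.g. "john" and "doe".
        for (String part : local.split("\\.")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }

        // Individual domain labels, e.g. "mycompany", "com", "au".
        String[] labels = domain.split("\\.");
        for (String label : labels) {
            if (!label.isEmpty()) {
                tokens.add(label);
            }
        }

        // Multi-label domain suffixes, e.g. "com.au".
        for (int i = 1; i < labels.length - 1; i++) {
            tokens.add(String.join(".", Arrays.copyOfRange(labels, i, labels.length)));
        }
        return new ArrayList<>(tokens);
    }

    public static void main(String[] args) {
        System.out.println(expand("john.doe@mycompany.com.au"));
    }
}
```

With this expansion a search for joe or mycompany.com.au can match an indexed component directly, which is the behaviour the message says users expect.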

Re: Lucene 4.7 intermittently not applying query filter

2014-03-28 Thread Steve Rowe
Jamie, UAX29URLEmailTokenizer does not emit email components as tokens; “john@mycompany.com.au” will be tokenized as “john@mycompany.com.au”, nothing more. That’s why I asked what EmailFilter does. If the filter really is ignored by Lucene, that would be a bug in Lucene. I think some