Steve
Thanks for the contact. I believe UAX29URLEmailTokenizer tokenizes email
addresses as follows: john....@mycompany.com.au john.doe
mycompany.com.au john doe mycompany com au com.au. We have an overridden
query parser that replaces the anyaddress: pseudo-field with to, from, cc, bcc, etc.
Inside the overridden query parser, we call getFieldQuery() to build the
clauses...
Query q = super.getFieldQuery(field, emailAddress, true);
if (slop != -1) {
    applySlop(q, slop);
}
clauses.add(new BooleanClause(q, BooleanClause.Occur.SHOULD));
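To make the expansion concrete, here is a minimal sketch in plain Java with no Lucene classes. It is not our actual parser: the class name AnyAddressExpansion, the methods expand() and toQueryString(), the fixed field list, and the address user@example.com are all hypothetical, and the clauses are built as strings rather than as Query objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: not the real overridden query parser,
// and no Lucene classes are involved.
class AnyAddressExpansion {

    // Assumed field list; the real parser covers "to, from, cc, bcc, etc."
    static final List<String> ADDRESS_FIELDS = Arrays.asList("to", "from", "cc", "bcc");

    // Returns one quoted "field:value" clause per concrete address field when
    // the pseudo-field is "anyaddress"; any other field yields a single clause.
    static List<String> expand(String field, String value) {
        List<String> clauses = new ArrayList<>();
        if ("anyaddress".equals(field)) {
            for (String concreteField : ADDRESS_FIELDS) {
                clauses.add(concreteField + ":\"" + value + "\"");
            }
        } else {
            clauses.add(field + ":\"" + value + "\"");
        }
        return clauses;
    }

    // Joins the clauses with spaces inside one group. With a default OR
    // operator each clause is optional, mirroring Occur.SHOULD above.
    static String toQueryString(String field, String value) {
        return "(" + String.join(" ", expand(field, value)) + ")";
    }

    public static void main(String[] args) {
        System.out.println(toQueryString("anyaddress", "user@example.com"));
    }
}
```

In the real parser each of these string clauses corresponds to one getFieldQuery() call wrapped in a BooleanClause, as in the snippet above.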
The resulting query is shown below. Sometimes, when Lucene executes it,
the filter is ignored.
I am busy trying to isolate the issue, since the code is running in a
wider system among other complexities.
Jamie
On 2014/03/28, 4:08 PM, Steve Rowe wrote:
Hi Jamie,
What does EmailFilter do?
Why is the expanded form "required for the UAX29URLEmailTokenizer"? Seems like
an exact match would work on the email address alone, without the expanded components?
Do you have an example of a query that reproducibly matches more documents than
it should, and a document that matched but shouldn’t have?
Steve