Re: WilcardQuery and memory

2007-03-09 Thread Erick Erickson
You can also use a filter. The basic idea is to construct a Lucene Filter, probably using something like RegexTermEnum/TermDocs. It's faster than you think. This, in combination with ConstantScoreQuery, should fix you right up. Several things: 1> you lose scoring with the filter part of a query w

Re: WilcardQuery and memory

2007-03-09 Thread Joe
Hi Rob, For indexing e-mail, I recommend that you tokenise the e-mail addresses into fragments and query on the fragments as whole terms rather than using wildcards. [example] Hm, for email addresses this isn't a big problem here. The real problem is the query on the body part of an email, wh

RE: WilcardQuery and memory

2007-03-09 Thread Rob Staveley (Tom)
For indexing e-mail, I recommend that you tokenise the e-mail addresses into fragments and query on the fragments as whole terms rather than using wildcards. Rather than looking for fischauto333* in (say) smtp-from, look for fischauto333 in (say) an additional field called smtp-from-fragments to
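
A rough sketch of the fragment idea, in case it helps. It only shows the splitting step: the class EmailFragments and its fragments method are made-up names, not part of Lucene, and the example.com domain is just a placeholder. It assumes you want to break the address on every character that is not a letter or digit and lower-case the result; the fragments would then go into the additional field (smtp-from-fragments in the message above) and be matched as whole terms instead of with a wildcard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not a Lucene class): turns an e-mail address into
// whole-term fragments suitable for indexing in a separate field.
final class EmailFragments {
    private EmailFragments() {}

    static List<String> fragments(String address) {
        List<String> out = new ArrayList<>();
        // Assumption: split on anything that is not a letter or digit ('@', '.', '-', '_', ...).
        for (String part : address.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty leading piece when the input starts with a separator.
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder address; the local part is the one used in the message above.
        System.out.println(fragments("fischauto333@example.com"));
    }
}
```

With fragments like these in the index, a search for fischauto333 becomes an exact term lookup, so no wildcard expansion over the term dictionary is needed. How the fragments are fed into the index (analyzer or pre-tokenised field) is left open here.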