I'm using Lucene to index MIME messages and have a couple of questions.
1) What is the best way to handle keyword fields that are repeated, such as "recipient"?
At the moment I have a for loop that does this for each recipient address:
document.add(Field.Keyword("recipient", address));
But this seems to limit query results to messages that were sent only to the person I'm searching for, so messages that also went to other recipients don't show up.
Or should I use Field.Text instead, with a custom analyzer that doesn't split email addresses, and store a single "recipients" field holding a whitespace-separated list of all the recipients?
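To make the second option concrete, here is a rough sketch in plain Java (no Lucene classes) of the behaviour I would want from the field value and the custom analyzer. The class and method names (RecipientsField, join, tokenize) are made up for illustration, and splitting purely on whitespace is only my assumption of what such an analyzer would do:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: models the single "recipients" field idea.
class RecipientsField {

    // Build the field value: all recipient addresses joined by a single space.
    static String join(List<String> addresses) {
        return String.join(" ", addresses);
    }

    // What the custom analyzer should produce: one token per address,
    // splitting only on whitespace so '@' and '.' stay inside the token.
    static List<String> tokenize(String fieldValue) {
        String trimmed = fieldValue.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<String>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }
}
```

With this, a message sent to several people would carry one token per address, and a search for any one of those addresses should match it.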
2) I also store the sender in a keyword field, but searching isn't consistent. I can find some addresses, but not others. Where should I start looking for information to help with debugging?
3) Also, how do I make sure that query terms for untokenized keyword fields don't get tokenized by QueryParser.parse()? I tried using the WhitespaceAnalyzer, but searching was still inconsistent.
Thanks in advance!
Ashley Collins
