Big fan of Lucene already. Just looking for some advice, with apologies in advance if it's already been answered on the list and I just didn't search right.
1. Let's say I want to store a term in MORE than one way: e.g., I want to store both the Soundex version of a word and the real version of the word. All of the examples of extending Analyzer return one thing (I think it's a token string). But what I want to do is something like: IF this is just a string of more than 4 characters, THEN store its literal AND Soundex versions. I'm thinking I need to do something to the TokenStream, but I'm not sure what.

2. I've got a bunch of names (aliases) associated with a single person (document): e.g., "Gary Furash", "Gary 'The Nose' Furash", "Gary Furnham". If I stick them all in the same field ("names") and search on "Gary", that document gets overly weighted, since the name shows up 3 times. So, I could just override the analyzer and only put in "Gary" once (dedupe the names), but then I lose some of the nearness stuff: that is, if a user types "Gary Furash", the document should score higher, because those words are close together.

Thanks all.

Gary Furash, MBA, PMP, Applications Manager
Maricopa County Attorney's Office
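
P.S. To make question 1 concrete, here is a rough sketch of the behavior I'm after, written in plain Java with no Lucene calls at all. Everything in it is made up for illustration: the class name DualTermSketch, the Tok holder, and the expand and soundex helpers are hypothetical, and the Soundex routine is a simplified American Soundex rather than anything Lucene ships. The one assumption that matters is that the Soundex form would sit at the same position as the literal word (a position increment of 0), so that phrase and nearness matching on the literal words would presumably still work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class DualTermSketch {
    // Hypothetical token holder: a term plus its offset from the previous token.
    static final class Tok {
        final String term;
        final int posInc;
        Tok(String term, int posInc) { this.term = term; this.posInc = posInc; }
    }

    // Soundex digit for each letter A..Z ('0' means the letter is not coded).
    private static final String CODES = "01230120022455012623010202";

    // Simplified American Soundex: first letter plus up to three digits, zero padded.
    static String soundex(String word) {
        String s = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') continue;          // ignore non-letters
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);                         // keep the first letter as-is
                prev = code;
            } else if (c != 'H' && c != 'W') {         // H and W do not separate duplicates
                if (code != '0' && code != prev) out.append(code);
                prev = code;                           // vowels reset prev to '0'
            }
        }
        while (out.length() > 0 && out.length() < 4) out.append('0');
        return out.toString();
    }

    // The rule from question 1: always emit the literal word; for words longer
    // than 4 characters, also emit the Soundex form at the SAME position.
    static List<Tok> expand(String text) {
        List<Tok> out = new ArrayList<>();
        for (String w : text.split("[^A-Za-z]+")) {
            if (w.isEmpty()) continue;
            out.add(new Tok(w.toLowerCase(Locale.ROOT), 1));
            if (w.length() > 4) out.add(new Tok(soundex(w), 0));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : expand("Gary 'The Nose' Furash")) {
            System.out.println(t.term + " (+" + t.posInc + ")");
        }
    }
}
```

With this rule, "Gary", "The" and "Nose" are stored once each, while "Furash" is stored as both "furash" and "F620" at the same position. Whether the real thing belongs in a filter wrapped around the TokenStream is exactly what I'm unsure about.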