Hi All: I've only been looking at Lucene for about a week now. I'm using 1.3 RC2.
I am searching a moderately sized repository of documents containing assembly language source code and I'm trying to implement both case sensitive and case insensitive queries (that's what the users want). I'm using the WhitespaceAnalyzer to index the files, thereby preserving the case of the tokens in the index. I'm now trying to query that index. The Analyzer used for my QueryParser is either a standard WhitespaceAnalyzer (for case sensitive) or a "LowerCaseWhitespaceAnalyzer" whose underlying tokenizer simply extends the WhitespaceTokenizer and overrides normalize to lowercase the characters (for case insensitive). I'm also setting the QueryParser's lowercaseWildcardTerms attribute appropriately. For case sensitive searches, I use a standard IndexSearcher, whose IndexReader provides tokens directly from the index (which have their case's preserved). For case insensitive searches, I believe that I want to lowercase the tokens from the index. Is this what the FilterIndexReader is for? Should I extend that Reader and override its terms(...) methods to provide lowercased terms? Or am I just on the wrong path altogether? Help? Thanks -Brent Schneeman __________________________________ Do you Yahoo!? The New Yahoo! Shopping - with improved product search http://shopping.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
