Gregor Heinrich wrote:

ad 1: MultiFieldQueryParser is what you might want: it lets you specify the
fields to run the query on. Alternatively, a common practice is to duplicate
the contents of all the separate fields in question into one additional
merged field, which lets you use QueryParser itself.



Ah, I've been testing out something similar to the latter. I've been adding multiple values under the same key. Won't this have the same effect? I've been assuming that if I do


doc.add(Field.Keyword("content", "value1"));
doc.add(Field.Keyword("content", "value2"));

and then searched the "content" field for either value, I'd get a hit, and it seems to work. This way, I figure I'd be able to differentiate between values that I want tokenized and values that I don't.

Is there a difference between this and building a StringBuffer containing all the values and storing that as a single field-value?
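For what it's worth, the two shapes can be sketched in plain Java (this is not Lucene code, just an illustration of the difference): repeated entries under one field name keep each value atomic, while a merged StringBuffer string would normally be tokenized, so a Keyword value containing whitespace matches in the first shape but not the second. Class and method names here are made up for the sketch.

```java
import java.util.List;

// Sketch only -- not Lucene. Compares (a) adding several values under one
// field name, as with repeated doc.add(Field.Keyword(...)) calls, against
// (b) one merged string built with a StringBuffer and then tokenized.
public class FieldSketch {

    // (a) multi-valued: each value stays its own untokenized entry,
    // as Field.Keyword values do
    public static boolean multiValuedHit(List<String> values, String query) {
        for (String v : values) {
            if (v.equals(query)) {
                return true;
            }
        }
        return false;
    }

    // (b) merged: concatenate everything, then match whitespace tokens,
    // roughly what a tokenized field over the merged text would see
    public static boolean mergedHit(List<String> values, String query) {
        StringBuffer merged = new StringBuffer();
        for (String v : values) {
            merged.append(v).append(' ');
        }
        for (String token : merged.toString().trim().split("\\s+")) {
            if (token.equals(query)) {
                return true;
            }
        }
        return false;
    }
}
```

For single-token values the two behave the same; the difference only shows up for values containing whitespace, which survive intact in the multi-valued shape but get split apart in the merged one.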


ad 2: Depending on the Analyzer you use, the query is normalised, i.e.,
stemmed (suffixes removed from words) and stopword-filtered (highly
frequent words removed). Have a look at StandardAnalyzer.tokenStream(...) to
see how the different filters work. The 1.3rc2 Lucene distribution includes
a Porter stemming implementation in the analysis package: PorterStemmer.
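The two normalisation steps mentioned above can be sketched in plain Java (this is not Lucene's StandardAnalyzer or PorterStemmer; the stopword list and the suffix rules below are toy stand-ins, whereas Lucene implements the full Porter algorithm):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: lowercase, drop stopwords, then crudely strip suffixes.
public class AnalyzerSketch {

    // Toy stopword list -- a real analyzer ships a much longer one
    private static final Set<String> STOPWORDS =
            new HashSet<String>(Arrays.asList("the", "a", "of", "and"));

    public static List<String> normalise(String text) {
        List<String> out = new ArrayList<String>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (STOPWORDS.contains(token)) {
                continue;                       // stopword filter
            }
            if (token.endsWith("ing") && token.length() > 6) {
                token = token.substring(0, token.length() - 3);  // toy stem
            } else if (token.endsWith("s") && token.length() > 3) {
                token = token.substring(0, token.length() - 1);  // toy stem
            }
            out.add(token);
        }
        return out;
    }
}
```

Because both documents and queries pass through the same pipeline, "searching the words" and "word search" end up sharing the same terms, which is what makes the match work.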



There's an rc2 out? Where?? I just checked the Lucene website and only see rc1.



Thanks everyone for all the quick responses!


-Mark


