Gregor Heinrich wrote:

ad 1: MultiFieldQueryParser is what you might want: it lets you specify the
fields to run the query on. Alternatively, a common practice is to duplicate
the contents of all the separate fields in question into one additional
merged field, which lets you use QueryParser itself.



Ah, I've been testing out something similar to the latter. I've been adding multiple values under the same key. Won't this have the same effect? I've been assuming that if I do


doc.add(Field.Keyword("content", "value1"));
doc.add(Field.Keyword("content", "value2"));

and then searched the "content" field for either value, I'd get a hit, and it seems to work. This way, I figure I'd be able to differentiate between values that I want tokenized and values that I don't.

Is there a difference between this and building a StringBuffer containing all the values and storing that as a single field-value?
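For what it's worth, the two shapes can be sketched in plain Java (this is not Lucene code, just an illustration of the difference): repeated entries under one field name keep each value atomic, while a merged StringBuffer string would normally be tokenized, so a Keyword value containing whitespace matches in the first shape but not the second. Class and method names here are made up for the sketch.

```java
import java.util.List;

// Sketch only -- not Lucene. Compares (a) adding several values under one
// field name, as with repeated doc.add(Field.Keyword(...)) calls, against
// (b) one merged string built with a StringBuffer and then tokenized.
public class FieldSketch {

    // (a) multi-valued: each value stays its own untokenized entry,
    // as Field.Keyword values do
    public static boolean multiValuedHit(List<String> values, String query) {
        for (String v : values) {
            if (v.equals(query)) {
                return true;
            }
        }
        return false;
    }

    // (b) merged: concatenate everything, then match whitespace tokens,
    // roughly what a tokenized field over the merged text would see
    public static boolean mergedHit(List<String> values, String query) {
        StringBuffer merged = new StringBuffer();
        for (String v : values) {
            merged.append(v).append(' ');
        }
        for (String token : merged.toString().trim().split("\\s+")) {
            if (token.equals(query)) {
                return true;
            }
        }
        return false;
    }
}
```

For single-token values the two behave the same; the difference only shows up for values containing whitespace, which survive intact in the multi-valued shape but get split apart in the merged one.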


ad 2: Depending on the Analyzer you use, the query is normalised, i.e.,
stemmed (suffixes removed from words) and stopword-filtered (highly
frequent words removed). Have a look at StandardAnalyzer.tokenStream(...) to
see how the different filters work. The 1.3rc2 Lucene distribution includes
a Porter stemming implementation in the analysis package: PorterStemmer.
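The two normalisation steps mentioned above can be sketched in plain Java (this is not Lucene's StandardAnalyzer or PorterStemmer; the stopword list and the suffix rules below are toy stand-ins, whereas Lucene implements the full Porter algorithm):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: lowercase, drop stopwords, then crudely strip suffixes.
public class AnalyzerSketch {

    // Toy stopword list -- a real analyzer ships a much longer one
    private static final Set<String> STOPWORDS =
            new HashSet<String>(Arrays.asList("the", "a", "of", "and"));

    public static List<String> normalise(String text) {
        List<String> out = new ArrayList<String>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (STOPWORDS.contains(token)) {
                continue;                       // stopword filter
            }
            if (token.endsWith("ing") && token.length() > 6) {
                token = token.substring(0, token.length() - 3);  // toy stem
            } else if (token.endsWith("s") && token.length() > 3) {
                token = token.substring(0, token.length() - 1);  // toy stem
            }
            out.add(token);
        }
        return out;
    }
}
```

Because both documents and queries pass through the same pipeline, "searching the words" and "word search" end up sharing the same terms, which is what makes the match work.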



There's an rc2 out? Where?? I just checked the Lucene website and only see rc1.



Thanks everyone for all the quick responses!


-Mark


