On Wednesday 30 October 2002 20:58, Michael McDonald wrote:
> Is there a way to arrange indexing and searching so that when searching
> for "Lucene", the term "Lucene" would be given more boost than the term
> "lucene", and ideally "lucene" would have more boost than "LUCENE"?

Use an analyzer that keeps the original case for both indexing and
querying, then query e.g. like this:

    Lucene^10 lucene^8 LUCENE^6

You want different weights per term, and you can't influence these
directly in the index. Therefore you'll have to query with different
term weights.

A problem arises when there are 100 documents mentioning Lucene and one
document mentioning LUCENE. With the above query, the LUCENE document
will likely get the highest score, because the rarer term gets a larger
inverse document frequency factor. So you'll have to adapt the weights
in the query by using the scoring formula and correcting for the number
of documents containing each term. You can get these counts from
IndexReader.docFreq().

And you'll have to do that for each casing of the queried term, i.e. up
to 2 ** (length of term) times, skipping the ones with zero frequency.

Kind regards,
Ype
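
P.S. The weighting described above could be sketched roughly as follows. This is only an illustration in Python, not Lucene code: the `doc_freq` dict stands in for calls to IndexReader.docFreq(), `preferred_weight` is a hypothetical ranking of the casings, and the correction assumes the score is roughly proportional to boost * idf^2 with idf = 1 + ln(numDocs / (docFreq + 1)), as in Lucene's classic TF-IDF similarity. Term frequency and query normalization are ignored.

```python
import math
from itertools import product


def case_variants(term):
    """Yield every distinct upper/lower casing of term (up to 2 ** len(term))."""
    choices = [sorted({ch.lower(), ch.upper()}) for ch in term]
    for combo in product(*choices):
        yield "".join(combo)


def preferred_weight(variant, typed):
    """Hypothetical preference: as typed > all lowercase > all uppercase > rest."""
    if variant == typed:
        return 10.0
    if variant == typed.lower():
        return 8.0
    if variant == typed.upper():
        return 6.0
    return 4.0


def idf(doc_freq, num_docs):
    """Assumed idf, modelled on Lucene's classic similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def build_query(typed, doc_freq, num_docs):
    """Build a boosted query string, dividing out the assumed idf^2 factor.

    doc_freq maps each indexed casing to its document count; it is a
    stand-in for IndexReader.docFreq().
    """
    clauses = []
    for variant in case_variants(typed):
        df = doc_freq.get(variant, 0)
        if df == 0:
            continue  # skip casings that never occur in the index
        boost = preferred_weight(variant, typed) / idf(df, num_docs) ** 2
        clauses.append(f"{variant}^{boost:.3f}")
    return " ".join(clauses)


if __name__ == "__main__":
    # 100 documents mention "Lucene", one mentions "LUCENE", 1000 in total.
    print(build_query("Lucene", {"Lucene": 100, "LUCENE": 1}, 1000))
```

With the counts from the example, the rare LUCENE ends up with a much smaller boost than Lucene, which compensates for its larger idf. Whether dividing by idf^2 is the right correction depends on the Similarity actually in use, so treat the formula as a starting point.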
