Clemens Marschner wrote:
> 1. I think the new document boost is missing, isn't it?
> With that it should be something like
>
> score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t)
>           * coord_q_d * boost_d
>
> Is that correct?

Almost. This should actually be boost_d * boost_d_t, the boost factor for the
document multiplied by the boost for t's field in d.

> 2. If I like the score to be independent of the number of terms in the
> document (regarding them as essentially constant), is it enough to leave out
> the norm_d_t factor?

Yes. Note however that the quantity called 'norm' in the code is now
frequently actually norm_d_t * boost_t * boost_d_t. This quantity is now
computed at index time and stored in the norms file.

> I have seen that a norm factor between 0 and 255 is read with
> IndexReader.norms() in TermScorer.score(). Is that the one?

Yes, although see my note above.

> From what I further understand (and from digging in Witten/Moffat/Bell) the
> norm_q factor is not calculated, since it stays the same for one query.

Lucene calculates it anyway. It's cheap to compute: it is multiplied together
with the term boost and idf once per query term, then this weight is used in
subsequent computations.

Doug
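
For anyone following the thread, here is a rough Python sketch of the formula as discussed above. It is not Lucene's code, just the arithmetic written out. The names TermMatch, query_weight and score are made up for illustration. Two assumptions: boost_d_t is applied per term inside the sum (it depends on t's field), and norm_d_t is treated as a divisor, as in the quoted formula. The use_length_norm flag shows what leaving out the norm_d_t factor would do.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TermMatch:
    """Per-term inputs to the score (hypothetical container, not a Lucene class)."""
    tf_q: float       # frequency of term t in the query
    tf_d: float       # frequency of term t in document d
    idf_t: float      # inverse document frequency of t
    boost_t: float    # query-time boost of t
    norm_d_t: float   # length normalisation divisor for t's field in d
    boost_d_t: float  # boost of t's field in d


def query_weight(m: TermMatch, norm_q: float) -> float:
    # The per-term weight described in the reply: term boost and idf combined
    # with the query norm. A real scorer would compute this once per query
    # term and reuse it, rather than recomputing it for every document.
    return m.tf_q * m.idf_t * m.boost_t / norm_q


def score(matches: List[TermMatch], norm_q: float, coord_q_d: float,
          boost_d: float, use_length_norm: bool = True) -> float:
    total = 0.0
    for m in matches:
        # Assumption: the field boost is applied per term, inside the sum.
        divisor = m.norm_d_t if use_length_norm else 1.0
        field_factor = m.boost_d_t / divisor
        total += query_weight(m, norm_q) * m.tf_d * m.idf_t * field_factor
    return total * coord_q_d * boost_d


if __name__ == "__main__":
    matches = [TermMatch(tf_q=1.0, tf_d=2.0, idf_t=2.0,
                         boost_t=1.0, norm_d_t=4.0, boost_d_t=1.0)]
    print(score(matches, norm_q=2.0, coord_q_d=1.0, boost_d=3.0))
    print(score(matches, norm_q=2.0, coord_q_d=1.0, boost_d=3.0,
                use_length_norm=False))
```

With the single match shown, the two calls print 3.0 and 12.0: dropping norm_d_t removes the division by 4 and nothing else changes.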