Thanks, Yonik, for the reply. I've got just a couple more questions: 1) Why does the explanation print so many times?
2) Since my query is made up of multiple terms, how do I know what term "x" is referring to?

On 3/3/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> I think Lucene in Action does a good job of it.
> There is also a formula given in the javadoc for DefaultSimilarity
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
> See my comments below (inline)
>
> On 3/2/06, Eugene <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > I'm not sure how to interpret the result of the toString method of
> > Explanation. I'm trying to see the values of each component of the
> > Default Similarity formula for a particular query and a doc. Given
> > below is a sample of my Explanation output. Many thanks if anyone could
> > help explain some of the values or direct me to a place that does so.
> >
> > Explanation = 0.683103 = product of:
> >   1.7077575 = sum of:
> >     0.184242 = weight(Contents:x in 78), product of:
> >       0.13565542 = queryWeight(Contents:x), product of:
> the queryWeight is query-specific... it will have the same value
> for all documents matching the query.
> >         2.509232 = idf(docFreq=85)
> inverse document frequency... term "x" appears in 85 documents.
> >         0.054062527 = queryNorm
> queryNorm is a normalization factor... 1/sqrt(sum of all query weights
> squared)
>
> If you had a boost, it would also be multiplied into the queryWeight
> at this point.
> >       1.3581617 = fieldWeight(Contents:x in 78), product of:
> fieldWeight components are document specific.
> >         1.7320508 = tf(termFreq(Contents:x)=3)
> "x" appears 3 times in the field for this document
> >         2.509232 = idf(docFreq=85)
> same as the previous idf factor - 85 documents contain "x"
> >         0.3125 = fieldNorm(field=Contents, doc=78)
> the norm is calculated at index time... it's the length normalization
> factor (1/sqrt(num tokens in this field)) multiplied by any boost on the
> field or document.
>
> >     0.184242 = weight(Contents:x in 78), product of:
> >       0.13565542 = queryWeight(Contents:x), product of:
> >         2.509232 = idf(docFreq=85)
> >         0.054062527 = queryNorm
> >       1.3581617 = fieldWeight(Contents:x in 78), product of:
> >         1.7320508 = tf(termFreq(Contents:x)=3)
> >         2.509232 = idf(docFreq=85)
> >         0.3125 = fieldNorm(field=Contents, doc=78)
> >     0.26218253 = weight(Contents:y in 78), product of:
> >       0.16182467 = queryWeight(Contents:y), product of:
> >         2.9932873 = idf(docFreq=52)
> >         0.054062527 = queryNorm
> >       1.6201642 = fieldWeight(Contents:y in 78), product of:
> >         1.7320508 = tf(termFreq(Contents:y)=3)
> >         2.9932873 = idf(docFreq=52)
> >         0.3125 = fieldNorm(field=Contents, doc=78)
> >
> -Yonik

--
Regards,
Eugene
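
P.S. For anyone following the thread, the per-term numbers in the quoted output can be reproduced by hand. Below is a minimal sketch in Python (not Lucene code) that multiplies out the factors the way the Explanation tree lists them. The function name term_weight and its parameter names are made up for illustration. tf is taken as sqrt(termFreq), which agrees with the 1.7320508 printed for termFreq=3. The idf, queryNorm and fieldNorm values are copied straight from the output rather than recomputed, since the index size and field lengths are not given here.

```python
import math


def term_weight(idf, query_norm, term_freq, field_norm):
    """Rebuild one weight(...) node from the factors printed beneath it.

    Hypothetical helper for checking the arithmetic only; it is not part
    of Lucene and does not look anything up in an index.
    """
    # queryWeight = idf * queryNorm (a boost, if any, would multiply in here)
    query_weight = idf * query_norm
    # tf = sqrt(termFreq), matching tf(termFreq=3) = 1.7320508 in the output
    tf = math.sqrt(term_freq)
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * idf * field_norm
    # weight = queryWeight * fieldWeight
    return query_weight, field_weight, query_weight * field_weight


# Factors copied from the quoted Explanation for doc 78
x = term_weight(idf=2.509232, query_norm=0.054062527, term_freq=3, field_norm=0.3125)
y = term_weight(idf=2.9932873, query_norm=0.054062527, term_freq=3, field_norm=0.3125)

for name, (qw, fw, w) in (("Contents:x", x), ("Contents:y", y)):
    print(f"{name}: queryWeight={qw:.8f} fieldWeight={fw:.7f} weight={w:.8f}")
```

The results can be compared against the weight(Contents:x in 78) and weight(Contents:y in 78) lines above. The "sum of" and outer "product of" lines are left out on purpose, because the sample of the output shown in this thread does not include every factor that goes into them.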