That is strange... did you re-index or change the index? If so, you might want to verify that docid=3454 still corresponds to the same document you queried earlier.
-Yonik

On Wed, Jun 11, 2008 at 1:09 PM, Brendan Grainger <[EMAIL PROTECTED]> wrote:
> I've just changed the stemming algorithm slightly and am running a few tests
> against the old stemmer versus the new stemmer. I did a query for 'hanger'
> and using the old stemmer I get the following scoring for a document with
> the title: Converter Hanger Assembly Replacement
>
> 6.4242806 = (MATCH) sum of:
>   2.5697122 = (MATCH) max of:
>     0.2439919 = (MATCH) weight(markup_t:hanger in 3454), product of:
>       0.1963516 = queryWeight(markup_t:hanger), product of:
>         6.5593724 = idf(docFreq=6375, numDocs=1655591)
>         0.02993451 = queryNorm
>       1.2426275 = (MATCH) fieldWeight(markup_t:hanger in 3454), product of:
>         1.7320508 = tf(termFreq(markup_t:hanger)=3)
>         6.5593724 = idf(docFreq=6375, numDocs=1655591)
>         0.109375 = fieldNorm(field=markup_t, doc=3454)
>     2.5697122 = (MATCH) weight(title_t:hanger^2.0 in 3454), product of:
>       0.5547002 = queryWeight(title_t:hanger^2.0), product of:
>         2.0 = boost
>         9.265229 = idf(docFreq=425, numDocs=1655591)
>         0.02993451 = queryNorm
>       4.6326146 = (MATCH) fieldWeight(title_t:hanger in 3454), product of:
>         1.0 = tf(termFreq(title_t:hanger)=1)
>         9.265229 = idf(docFreq=425, numDocs=1655591)
>         0.5 = fieldNorm(field=title_t, doc=3454)
>   3.8545685 = (MATCH) max of:
>     0.12199595 = (MATCH) weight(markup_t:hanger^0.5 in 3454), product of:
>       0.0981758 = queryWeight(markup_t:hanger^0.5), product of:
>         0.5 = boost
>         6.5593724 = idf(docFreq=6375, numDocs=1655591)
>         0.02993451 = queryNorm
>       1.2426275 = (MATCH) fieldWeight(markup_t:hanger in 3454), product of:
>         1.7320508 = tf(termFreq(markup_t:hanger)=3)
>         6.5593724 = idf(docFreq=6375, numDocs=1655591)
>         0.109375 = fieldNorm(field=markup_t, doc=3454)
>     3.8545685 = (MATCH) weight(title_t:hanger^3.0 in 3454), product of:
>       0.8320503 = queryWeight(title_t:hanger^3.0), product of:
>         3.0 = boost
>         9.265229 = idf(docFreq=425, numDocs=1655591)
>         0.02993451 = queryNorm
>       4.6326146 = (MATCH) fieldWeight(title_t:hanger in 3454), product of:
>         1.0 = tf(termFreq(title_t:hanger)=1)
>         9.265229 = idf(docFreq=425, numDocs=1655591)
>         0.5 = fieldNorm(field=title_t, doc=3454)
>
> Using the new stemmer I get:
>
> 5.621245 = (MATCH) sum of:
>   2.248498 = (MATCH) max of:
>     0.24399184 = (MATCH) weight(markup_t:hanger in 3454), product of:
>       0.19635157 = queryWeight(markup_t:hanger), product of:
>         6.559371 = idf(docFreq=6375, numDocs=1655589)
>         0.029934512 = queryNorm
>       1.2426274 = (MATCH) fieldWeight(markup_t:hanger in 3454), product of:
>         1.7320508 = tf(termFreq(markup_t:hanger)=3)
>         6.559371 = idf(docFreq=6375, numDocs=1655589)
>         0.109375 = fieldNorm(field=markup_t, doc=3454)
>     2.248498 = (MATCH) weight(title_t:hanger^2.0 in 3454), product of:
>       0.5547002 = queryWeight(title_t:hanger^2.0), product of:
>         2.0 = boost
>         9.265228 = idf(docFreq=425, numDocs=1655589)
>         0.029934512 = queryNorm
>       4.0535374 = (MATCH) fieldWeight(title_t:hanger in 3454), product of:
>         1.0 = tf(termFreq(title_t:hanger)=1)
>         9.265228 = idf(docFreq=425, numDocs=1655589)
>         0.4375 = fieldNorm(field=title_t, doc=3454)
>   3.372747 = (MATCH) max of:
>     0.12199592 = (MATCH) weight(markup_t:hanger^0.5 in 3454), product of:
>       0.09817579 = queryWeight(markup_t:hanger^0.5), product of:
>         0.5 = boost
>         6.559371 = idf(docFreq=6375, numDocs=1655589)
>         0.029934512 = queryNorm
>       1.2426274 = (MATCH) fieldWeight(markup_t:hanger in 3454), product of:
>         1.7320508 = tf(termFreq(markup_t:hanger)=3)
>         6.559371 = idf(docFreq=6375, numDocs=1655589)
>         0.109375 = fieldNorm(field=markup_t, doc=3454)
>     3.372747 = (MATCH) weight(title_t:hanger^3.0 in 3454), product of:
>       0.83205026 = queryWeight(title_t:hanger^3.0), product of:
>         3.0 = boost
>         9.265228 = idf(docFreq=425, numDocs=1655589)
>         0.029934512 = queryNorm
>       4.0535374 = (MATCH) fieldWeight(title_t:hanger in 3454), product of:
>         1.0 = tf(termFreq(title_t:hanger)=1)
>         9.265228 = idf(docFreq=425, numDocs=1655589)
>         0.4375 = fieldNorm(field=title_t, doc=3454)
>
> The thing that is perplexing is that the fieldNorm for the title_t field is
> different in each of the explanations, ie: the fieldNorm using the old
> stemmer is: 0.5 = fieldNorm(field=title_t, doc=3454). For the new stemmer
> 0.4375 = fieldNorm(field=title_t, doc=3454). I ran the title through both
> stemmers and get the same number of tokens produced. I do no index time
> boosting on the title_t field. I am using DefaultSimilarity in both
> instances. So I figured the calculated fieldNorm would be:
>
> field boost * lengthNorm = 1 * 1/sqrt(4) = 0.5
>
> I wouldn't have thought that changing the stemmer would have any impact on
> the fieldNorm in this case. Any insight? Please kick me over to the lucene
> list if you feel this isn't appropriate here.
>
> Regards
> Brendan
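
One way to sanity-check the two fieldNorm values quoted above: Lucene does not store the norm as a full float but squeezes it into a single byte, so only a coarse set of values can come back out at query time. The sketch below is a Python re-implementation of that one-byte encoding as I understand it (modelled on Lucene's SmallFloat 3-bit-mantissa scheme). The function names `float_to_byte315`, `byte315_to_float` and `stored_field_norm` are hypothetical helpers for illustration, not Lucene API, and the assumption that DefaultSimilarity uses lengthNorm = 1/sqrt(numTerms) comes from the formula in the message itself.

```python
import math
import struct


def float_to_byte315(f):
    """Encode a float into one byte: 3 mantissa bits, 5 exponent bits (sketch)."""
    # Reinterpret the 32-bit float as a signed int to get at its raw bits.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21  # keep sign, exponent and the top 3 mantissa bits
    low = (63 - 15) << 3
    if small <= low:
        return 0 if bits <= 0 else 1  # zero/negative, or underflow
    if small >= low + 0x100:
        return 255  # overflow: clamp to the largest code
    return small - low


def byte315_to_float(b):
    """Decode the one-byte code back into a float (sketch)."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_field_norm(num_terms, boost=1.0):
    """fieldNorm as it would read back after the byte round trip.

    Assumes lengthNorm = 1/sqrt(num_terms), as in the message above.
    """
    return byte315_to_float(float_to_byte315(boost / math.sqrt(num_terms)))


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, stored_field_norm(n))
```

Under these assumptions a four-term field reads back as exactly 0.5, while a five-term field (1/sqrt(5) is about 0.447) is truncated down to 0.4375. So the new value would be consistent with title_t having been indexed with five tokens rather than four for doc 3454, or with docid 3454 now pointing at a different document after the re-index. That is only one reading of the numbers, not a confirmed diagnosis.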