Hi.

We are using Nutch with a Solr backend. I have some questions about the field boost Nutch applies when indexing documents. I can't find the numbers documented anywhere, but it seems that Nutch is not using the default values.
When the document is indexed by Nutch, I get this result when searching for the url:

0.0014793393 = fieldWeight(url:"super secret url" in 22), product of:
  1.0 = tf(phraseFreq=1.0)
  32.31666 = idf(url: www=7327 host=321 com=7327 something=2456 something=2 something=44 704290075=1)
  4.5776367E-5 = fieldNorm(field=url, doc=22)

After retrieving the document from the Solr index and writing it back with the default field boost of 1.0, I get these values:

9.874598 = fieldWeight(url:"super secret url" in 0), product of:
  1.0 = tf(phraseFreq=1.0)
  31.598713 = idf(url: www=7328 host=322 com=7328 something=2457 something=3 something=45 704290075=2)
  0.3125 = fieldNorm(field=url, doc=0)

As you can see, fieldNorm has changed significantly. The fieldNorm is calculated using this formula:

  fieldNorm = document boost * field boost * (1/sqrt(terms in field))

The document boost is the same in both cases, and so is the number of terms in the field. The only thing that may have changed is the field boost.

So the question is: what index-time field boost does Nutch use?

-- 
Ole-Martin Mørk
http://twitter.com/olemartin
http://flickr.com/olemartin
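
As a rough sanity check on the numbers above, the ratio of the two fieldNorm values can be computed directly. This is only a sketch: it assumes, as stated above, that document boost and term count are identical in both cases, so that the ratio reflects the field boost alone. The function and variable names are made up for illustration and are not part of Nutch, Solr or Lucene. Lucene also stores norms with reduced precision, so the result should be treated as approximate.

```python
# fieldNorm values copied from the two explain outputs above.
NORM_NUTCH = 4.5776367e-5    # document as indexed by Nutch
NORM_DEFAULT = 0.3125        # same document rewritten with field boost 1.0


def implied_boost_ratio(norm_a: float, norm_b: float) -> float:
    """Return norm_a / norm_b.

    With fieldNorm = doc boost * field boost * (1/sqrt(terms)), and with
    doc boost and term count assumed equal, this ratio is the ratio of
    the two field boosts.
    """
    if norm_b == 0:
        raise ValueError("norm_b must be non-zero")
    return norm_a / norm_b


ratio = implied_boost_ratio(NORM_NUTCH, NORM_DEFAULT)

# Cross-check that fieldWeight = tf * idf * fieldNorm matches the reported scores.
weight_nutch = 1.0 * 32.31666 * NORM_NUTCH
weight_default = 1.0 * 31.598713 * NORM_DEFAULT

print(f"implied boost relative to 1.0: {ratio:.6g}")
print(f"recomputed fieldWeight (Nutch):   {weight_nutch:.6g}")
print(f"recomputed fieldWeight (default): {weight_default:.6g}")
```

Under those assumptions the ratio comes out at roughly 1.46e-4, i.e. the norm written by Nutch is about 6800 times smaller than the one produced with a field boost of 1.0.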