Doug
Terry Steichen wrote:
Doug,
(1) No, I did *not* boost the pub_date field, either in the indexing process or in the query itself.
(2) And, each pub_date field of each document (which is in XML format) contains only one instance of the date string.
(3) And only the pub_date field itself is indexed. There are other attributes of this field that may contain the date string, but they aren't indexed - that is, they are not included in the instantiated Document class.
Regards,
Terry
----- Original Message -----
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, September 17, 2003 5:51 PM
Subject: Re: Lucene Scoring Behavior
Terry Steichen wrote:
0.03125 = fieldNorm(field=pub_date, doc=90992)
1.0 = fieldNorm(field=pub_date, doc=90970)
It looks like the fieldNorms are what differ, not the IDFs. Each fieldNorm is the product of the document and/or field boost and 1/sqrt(numTerms), where numTerms is the number of terms in the "pub_date" field of the document. Thus if each document is assigned only one date, and you didn't boost the field or the document when you indexed it, this should be 1.0. But if the document has two dates, then this would be 1/sqrt(2). Or if you boosted this document's pub_date field, then this will reflect whatever boost you provided.
So, did you boost anything when indexing? Or could a single document have two or more different values for pub_date? Either would explain this.
Doug
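
For reference, the fieldNorm arithmetic Doug describes above can be sketched in a few lines of Java. This is only an illustration of the formula as stated in the message (boost times 1/sqrt(numTerms)); the class and method names are hypothetical, not Lucene APIs, and the sketch ignores any rounding Lucene may apply when it stores norms in the index.

```java
public class FieldNormSketch {

    // fieldNorm as described in the message: boost * 1/sqrt(numTerms).
    // Hypothetical helper for illustration, not part of Lucene.
    public static float fieldNorm(float boost, int numTerms) {
        return (float) (boost / Math.sqrt(numTerms));
    }

    // Inverts the same formula: the number of terms that would produce
    // the observed norm, given an assumed boost.
    public static long impliedNumTerms(float observedNorm, float boost) {
        double ratio = boost / observedNorm;
        return Math.round(ratio * ratio);
    }

    public static void main(String[] args) {
        // One date per document, no boost.
        System.out.println("1 term, boost 1.0:  " + fieldNorm(1.0f, 1));
        // Two dates in the same field, no boost.
        System.out.println("2 terms, boost 1.0: " + fieldNorm(1.0f, 2));
        // Term count implied by the norm reported for doc 90992, assuming no boost.
        System.out.println("terms implied by 0.03125: " + impliedNumTerms(0.03125f, 1.0f));
    }
}
```

Under this formula alone, a norm of 0.03125 (that is, 1/32) would correspond either to roughly 1024 terms in the field with no boost, or to a single term with a boost of 1/32. Neither matches a document with one unboosted pub_date value.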
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
