Re: [htdig] score

Gilles Detillieux Fri, 06 Jul 2001 09:14:02 -0700
According to Malcolm Austen:
> On Mon, 2 Jul 2001, Michael Maria wrote:
> + I am curious as to how htdig determines the score for any returned
> + items.  Is it as simple as the frequency of the searched for term?
> 
> In order to sort things out in my own mind ready for a debate that seems
> to have fizzled out here I collected together all the bits about scores
> from the documentation. This is for 3.1.5 ...
> 
> http://wwwsearch.ox.ac.uk/scores.html

That's a great writeup, and I agree with your recommendations.  We may want
to consider changing the defaults in upcoming releases.

I do want to correct one inaccuracy in the document, though.  You say:

   There is another factor that appears to be undocumented. This is the
   'location' of the word relative to the start of the document. This
   starts at 1000 and tails off, one per word, until it drops to zero
   1000 words into the document. I can't justify this, I'm not even sure
   the authors can now as it will be done differently in the next release
   (sometime, maybe). However, that's how it is for the time being.

It actually doesn't tail off one per word.  Rather, the factor of 0-1000
gives the word's distance from the end of the document in tenths of a
percent, so it doesn't actually hit zero except maybe for the last word
or so of large documents.
I would tend to agree that this isn't a great idea, but I'm not sure I
should take that out for 3.1.6.  Maybe another config attribute?
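
To make the difference concrete, here is a rough sketch of the two readings.
It is not taken from the ht://Dig source: the function names are made up for
illustration, and it assumes positions are counted in words, whereas the real
code may well measure offsets some other way.

```python
def location_factor_per_word(word_index):
    """Reading in the write-up: starts at 1000, drops one per word,
    and reaches zero 1000 words into the document.
    (Hypothetical helper, for illustration only.)"""
    return max(0, 1000 - word_index)


def location_factor_proportional(word_index, total_words):
    """Reading described in this reply: distance from the end of the
    document, in tenths of a percent (0-1000).
    Assumes word counts and integer truncation; both are assumptions."""
    remaining = total_words - word_index
    return 1000 * remaining // total_words


if __name__ == "__main__":
    # Word 250 of a 5000-word document under each reading.
    print(location_factor_per_word(250))
    print(location_factor_proportional(250, 5000))
    # Last word of a short (50-word) and a long (5000-word) document.
    print(location_factor_proportional(49, 50))
    print(location_factor_proportional(4999, 5000))
```

Under the proportional reading the factor depends on document length, so
with truncation it only reaches zero at the very end of a document longer
than 1000 words.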

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
