According to Ted Stresen-Reuter:
> Any idea why pdf and Word files are consistently ranked higher than html 
> files (which have keyword meta tags, TITLE tags, and H1 tags with closer 
> matches)?

Not really, but you're not the first person to report it.
In past cases it has usually come down to the word appearing many more
times in the text of the PDF or Word document than in the HTML files.
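
To illustrate how raw word counts can swamp tag-based boosts, here is a
minimal sketch of an additive scoring model. It is not ht://Dig's actual
algorithm: the field names, the weights and the occurrence counts are all
hypothetical, chosen only to show the effect.

```python
# Toy additive scoring model, NOT ht://Dig's actual algorithm.
# All field names and weights below are hypothetical.
FIELD_WEIGHTS = {"title": 5.0, "heading": 4.0, "keywords": 3.0, "text": 1.0}


def toy_score(occurrences, weights=FIELD_WEIGHTS):
    """Sum weight * count over each field where the search word appears."""
    return sum(weights[field] * count for field, count in occurrences.items())


# An HTML page with the word in its title, h1 and meta keywords,
# plus two mentions in the body text (made-up counts).
html_page = {"title": 1, "heading": 1, "keywords": 1, "text": 2}

# A long PDF with the word only in body text, but 40 times (made-up count).
pdf_file = {"text": 40}

html_score = toy_score(html_page)
pdf_score = toy_score(pdf_file)
print(html_score, pdf_score)
```

With these made-up numbers the PDF wins despite having no boosted fields,
simply because the word occurs so often in its body text.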

Is this still with a recent 3.2.0b4 snapshot, or have you gone back to
3.1.6 now?  Another scoring quirk in 3.1.x is that words near the start
of a document are weighted more heavily than words near the end.  Mind
you, meta tags, titles and h1 tags tend to be near the start, so they
should already score well in 3.1.x.
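
As a rough picture of that position effect, the sketch below uses a simple
linear falloff. The falloff formula is an assumption made for illustration
only; it is not taken from the 3.1.x source, and the function names are
hypothetical.

```python
# Toy position-weighted scoring, NOT the real 3.1.x formula.
# The linear falloff below is an assumption for illustration.


def position_weight(index, total_words):
    """Weight 1.0 for the first word, falling linearly toward 0 at the end."""
    return 1.0 - index / total_words


def position_weighted_score(words, term):
    """Add up the position weight of every occurrence of term in words."""
    total = len(words)
    return sum(
        position_weight(i, total) for i, word in enumerate(words) if word == term
    )


# Same word, same count, same document length; only the position differs.
word_at_start = ["htdig"] + ["filler"] * 99
word_at_end = ["filler"] * 99 + ["htdig"]

start_score = position_weighted_score(word_at_start, "htdig")
end_score = position_weighted_score(word_at_end, "htdig")
print(start_score > end_score)
```

Under this assumption a single early occurrence outscores a single late
one, which is why titles and headings near the top would tend to benefit.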

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html