Hi,

I am noticing this too in a 3.2.0b4 snapshot: even PDFs with only one
occurrence of the search term seem to rank much higher than HTML files with
many occurrences. We can't seem to get to the bottom of this.

Todd

Todd Hooge
Director of Website Development
Metamend Software & Design Ltd.
http://www.metamend.com/


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Gilles Detillieux
Sent: Thursday, October 03, 2002 12:38 PM
To: Ted Stresen-Reuter
Cc: htdig
Subject: Re: [htdig] pdf and word files ranked higher than html files


According to Ted Stresen-Reuter:
> Any idea why pdf and Word files are consistently ranked higher than html
> files (which have keyword meta tags, TITLE tags, and H1 tags with closer
> matches)?

Not really, but you're not the first person to complain about it.
I think in the past it's usually boiled down to the fact that the word
appears many more times in the text of the PDF or Word document than
in the HTML files.

Is this still with a recent 3.2.0b4 snapshot, or have you gone back to
3.1.6 now?  Another scoring quirk in 3.1.x is that words near the start
of a document are ranked higher than words near the end.  Mind you, meta
tags, titles and h1 tags tend to be near the start, so they should be
ranked high in 3.1.x.
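
To make the two effects described above concrete, here is a toy sketch of
how raw occurrence counts and a "words near the start count more" weighting
can interact. This is not ht://Dig's actual scoring code: the function
name `toy_score`, the `position_bias` parameter, and the linear weighting
are all assumptions made up for illustration.

```python
# Toy model only: NOT ht://Dig's real scoring algorithm.
def toy_score(words, term, position_bias=0.0):
    """Sum a weight for every occurrence of `term` in `words`.

    position_bias == 0.0 counts each occurrence equally (pure frequency).
    position_bias > 0.0 makes earlier occurrences worth more (hypothetical).
    """
    n = len(words)
    score = 0.0
    for i, word in enumerate(words):
        if word.lower() == term:
            # Weight falls linearly from 1 + position_bias at the first
            # word toward 1 at the end of the document.
            score += 1.0 + position_bias * (1.0 - i / n)
    return score


# Short HTML-like page: one hit, right at the start (e.g. in the title).
html_doc = "widget overview and pricing".split()
# Long PDF-like text: five hits, all near the end.
pdf_doc = ("intro " * 50 + "widget " * 5).split()

for bias in (0.0, 2.0, 10.0):
    print(bias, toy_score(html_doc, "widget", bias),
          toy_score(pdf_doc, "widget", bias))
```

In this toy model the longer document wins on frequency alone, and the
short page only overtakes it once the hypothetical position bias is large.
It does not explain the case of a PDF with a single occurrence outranking
HTML files with many.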

--
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to
<[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html


