At 10:21 PM -0500 7/26/00, David Gibbs wrote:
>I am indexing a large mailing list archive with a lot of keywords, 
>acronyms, and technical terms (I'm not sure this makes a difference).
>
>When I do a search, almost always first file it returns is 
>completely irrelevant... it has nothing to do with the search terms. 
>The 2nd file, however, is appropriate for the search.

Is the first file usually the same file? Is it a file that might be 
linked from a number of different places? You probably want to change 
your backlink_factor attribute in that case.

<http://www.htdig.org/attrs.html#backlink_factor>
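
As a rough sketch, the change would go in your htdig.conf (I'm assuming 
the default config file name, and the value below is only an 
illustration, not a recommendation). As I understand the documentation, 
setting the factor to 0 takes backlinks out of the scoring entirely, so 
it's a quick way to test whether they are the culprit:

    # htdig.conf (assumed file name)
    # illustrative value: 0 should disable backlink weighting
    backlink_factor: 0

If that fixes the ordering, you can try small non-zero values to bring 
some of the weighting back.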

Unfortunately, mailing list archives can wreak havoc on any sort of 
"importance weighting," which is the idea behind the backlink_factor. 
Almost no matter how you do it, the index pages get too much weight.

If fiddling with the backlink_factor doesn't help, some more 
specifics about the pages might make it easier to give a better 
suggestion.

Cheers,

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
