>have not been applied to ht://Dig 3.1.0b1; I applied it manually and
>recompiled htdig and reran rundig. My databases shrank to their normal
>size; no more duplicates;-) Please include this patch in your next
>release.
The reason I did not apply this patch to the 3.1.0b1 release is that it
only applies to local indexing. I didn't want to announce "elimination
of duplicate files" until I had a patch ready for HTTP access as well
(which I don't have yet).
I also don't think direct elimination is the correct approach. I'd rather
*detect* duplicates and store multiple URLs for each page. This is the
approach used by other search engines for mirrors, so I think this should
be the approach for ht://Dig too.
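The detect-and-alias idea could be sketched roughly like this (a minimal illustration only, not ht://Dig code; the `DuplicateIndex` class and its names are invented): hash each page's content, and when two URLs hash the same, record the second URL as an alias of the first instead of dropping the page.

```python
import hashlib

class DuplicateIndex:
    """Hypothetical sketch: map a digest of page content to every URL
    that serves that content, so mirrors become aliases of one page
    rather than being discarded."""

    def __init__(self):
        # digest -> list of URLs whose content produced that digest
        self.urls_by_digest = {}

    def add(self, url, content):
        """Record a fetched page; return True if it duplicates a known page."""
        digest = hashlib.sha1(content.encode()).hexdigest()
        urls = self.urls_by_digest.setdefault(digest, [])
        urls.append(url)
        return len(urls) > 1

    def mirrors_of(self, content):
        """All URLs known to serve this exact content."""
        digest = hashlib.sha1(content.encode()).hexdigest()
        return self.urls_by_digest.get(digest, [])

index = DuplicateIndex()
index.add("http://wso.williams.edu/a.html", "<html>same page</html>")
is_dup = index.add("http://mirror.example.org/a.html", "<html>same page</html>")
```

After the second `add`, `is_dup` is true and `mirrors_of` lists both URLs, so a search result could show one document with its alternate locations.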
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.