>have not been applied to ht://Dig 3.1.0b1; I applied it manually and
>recompiled htdig and reran rundig. My databases shrank to their normal
>size; no more duplicates;-) Please include this patch in your next
>release.
The reason I did not apply this patch to the 3.1.0b1 release is that it
only applies to local indexing. I didn't want to announce "elimination
of duplicate files" until I had a patch ready for HTTP access as well
(which I don't have yet).
I also don't think direct elimination is the correct approach. I'd rather
*detect* duplicates and store multiple URLs for each page. This is the
approach used by other search engines for mirrors, so I think this should
be the approach for ht://Dig too.
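The detect-and-alias idea could be sketched roughly like this (a minimal illustration only, not ht://Dig code; the `DuplicateIndex` class and its names are invented): hash each page's content, and when two URLs hash the same, record the second URL as an alias of the first instead of dropping the page.

```python
import hashlib

class DuplicateIndex:
    """Hypothetical sketch: map a digest of page content to every URL
    that serves that content, so mirrors become aliases of one page
    rather than being discarded."""

    def __init__(self):
        # digest -> list of URLs whose content produced that digest
        self.urls_by_digest = {}

    def add(self, url, content):
        """Record a fetched page; return True if it duplicates a known page."""
        digest = hashlib.sha1(content.encode()).hexdigest()
        urls = self.urls_by_digest.setdefault(digest, [])
        urls.append(url)
        return len(urls) > 1

    def mirrors_of(self, content):
        """All URLs known to serve this exact content."""
        digest = hashlib.sha1(content.encode()).hexdigest()
        return self.urls_by_digest.get(digest, [])

index = DuplicateIndex()
index.add("http://wso.williams.edu/a.html", "<html>same page</html>")
is_dup = index.add("http://mirror.example.org/a.html", "<html>same page</html>")
```

After the second `add`, `is_dup` is true and `mirrors_of` lists both URLs, so a search result could show one document with its alternate locations.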
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.