On Fri, 10 Sep 1999, Nick O'Brien wrote:
> Date: Fri, 10 Sep 1999 15:13:20 +0100 (GMT Daylight Time)
> From: Nick O'Brien <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: [htdig] htdig and symbolic links
>
>
> Hi,
>
> We are implementing htdig (v3.1.2 + the patch kit on Solaris 2.6) on our
> main web server. One comment we have had is that there are a lot of
> duplicate search results pointing to the same web pages. This is usually
> caused by having several different Unix symbolic links pointing to the
> same directory/file in the web document tree.
>
> Is there any way we can prevent the indexing of these duplicates? I see
> from the mailing list archives that for previous versions of htdig there
> were patches to fix this issue but they are not available for the current
> version.
>
> I see from the bug database the latest advice is to eliminate symbolic
> links - however for many practical reasons it is not possible for us to
> do this.
>
>
> Is it for example possible to configure htdig to index our URLs via the
> filesystem instead of HTTP (i.e. using local_urls) and to ignore the
> symbolic links?
>
> How are people on the list working round this problem? Or is this an
> unresolved bug I will need to (re)log with the htdig developers?
Our site is in the same boat as yours; I use the same old patch written
for version 3.0.8b2, but I reapply it by hand at every new release.
You can get it from:
ftp://sol.ccsf.cc.ca.us/htdig-patches/3.0.8b2/Retriever.cc.0
Then, with an ugly, extensive set of local_urls entries for each and every
symbolic link in the site :( I manage to suppress duplicates,
quadruplicates, and multuplicates ;)
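
For anyone who has to build such a list by hand, a rough sketch of how the
symbolic links could be enumerated is below. It is only a starting point
under assumptions, not the patch and not my actual configuration: the
function names, the document root and the www.example.com base URL are all
made up for illustration, and the "URL-prefix=path" layout of each entry
should be checked against the local_urls documentation for your htdig
version before use.

```python
import os


def find_symlinks(doc_root):
    """Return sorted (link path relative to doc_root, resolved target) pairs."""
    doc_root = os.path.realpath(doc_root)
    found = []
    # followlinks=False: symlinked directories are reported but not entered,
    # so each link is seen once and link loops cannot cause endless walking.
    for dirpath, dirnames, filenames in os.walk(doc_root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                rel = os.path.relpath(path, doc_root)
                found.append((rel, os.path.realpath(path)))
    return sorted(found)


def local_urls_entries(doc_root, base_url):
    """Format each symlink as 'URL-prefix=path' (assumed entry layout)."""
    entries = []
    for rel, target in find_symlinks(doc_root):
        url = base_url.rstrip("/") + "/" + rel.replace(os.sep, "/")
        if os.path.isdir(target):
            # Directory aliases get a trailing slash on both sides.
            url += "/"
            target += "/"
        entries.append(url + "=" + target)
    return entries
```

Calling local_urls_entries("/path/to/htdocs", "http://www.example.com/")
returns one string per symbolic link, which can then be reviewed and pasted
into the config file. Links whose targets coincide are the aliases that
produce the duplicate hits.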
Boy, do I look forward to 3.2, which is promised to take care of the
menace of duplicates.
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah [EMAIL PROTECTED]