On Mon, 24 Sep 2001, Evaldas wrote:

> Date: Mon, 24 Sep 2001 22:04:13 +0100
> From: Evaldas <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: [htdig] Eliminating repeating search results
> 
> 
> Hi,
> 
> I have a Perl script that uses 'path_info' to display the appropriate
> record from a database. htdig indexes the pages fine, so no
> problem with that.
> 
> The problem is that there can be different URLs which actually display
> the same page, e.g.:
> http://domain.com/one/two/A
> http://domain.com/one/three/A
> http://domain.com/one/four/A
> 
> These all display the same record from a database. htsearch lists
> them as separate results. What I would like is to eliminate all
> repeating pages and display only one of them.
> 
> Is this possible with htdig?

That depends on which version of htdig you use, and at what patch level.
If you use the unpatched 3.1.5, you can apply the following patch (read
its documentation carefully):

 ftp://ftp.ccsf.org/htdig-patches/3.1.5/htdig-3.1.5.aarmstrong.README
 ftp://ftp.ccsf.org/htdig-patches/3.1.5/htdig-3.1.5.aarmstrong.tar.gz

Then add lines like the following to your htdig configuration file and
re-index:
-------------------------------8<---------------------------------
 url_rewrite_rules:       \
 http://domain.com/one/two/(.*)  http://domain.com/one/one/\\1 \
 http://domain.com/one/three/(.*)  http://domain.com/one/one/\\1 \
 http://domain.com/one/four/(.*)  http://domain.com/one/one/\\1 
-------------------------------8<---------------------------------
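
To see what rules like these do, here is a rough Python sketch of the
mapping. It is only an illustration, not how htdig implements it: the
names REWRITE_RULES and canonicalize are made up for this example, and
stopping at the first matching rule is an assumption of the sketch (for
these three rules the result is the same either way). Note that Python
writes the backreference as \1 where the htdig configuration uses \\1.

```python
import re

# Hypothetical (pattern, replacement) pairs mirroring the rules above.
REWRITE_RULES = [
    (r"http://domain.com/one/two/(.*)", r"http://domain.com/one/one/\1"),
    (r"http://domain.com/one/three/(.*)", r"http://domain.com/one/one/\1"),
    (r"http://domain.com/one/four/(.*)", r"http://domain.com/one/one/\1"),
]


def canonicalize(url):
    """Return url rewritten by the first matching rule, or unchanged."""
    for pattern, replacement in REWRITE_RULES:
        rewritten, count = re.subn(pattern, replacement, url)
        if count:
            return rewritten
    return url


urls = [
    "http://domain.com/one/two/A",
    "http://domain.com/one/three/A",
    "http://domain.com/one/four/A",
]

# All three URLs collapse to one canonical URL, so only one entry remains.
unique = sorted({canonicalize(u) for u in urls})
print(unique)  # ['http://domain.com/one/one/A']
```

Since every variant is rewritten to the same URL before it is stored,
the index ends up with a single entry for the record instead of three.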

Alternatively, you can wait for 3.1.6, soon to be released, which will
support URL rewriting without a patch; then add similar lines to your
htdig configuration file and re-index.

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        [EMAIL PROTECTED]


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
