Re: [htdig] Different domains?

David Adams Fri, 28 Jul 2000 04:28:09 -0700
Quoting Ken Convery <[EMAIL PROTECTED]>:

> ht://dig looks like a great tool for maintainers of intranets and/or several
> internet web servers.  I have a question about its application to something
> we are trying to do here.  We are developing relationships with a few other
> online companies and want to make content from their sites available by link
> on our site.  We are thinking we can use ht://dig to index those other sites
> so we can search out and display the pertinent information on our site in
> summary form and provide the link to a specific page on their site for more
> information.
> 
> In a nutshell: can ht://dig index other web servers specified outside my
> domain or network?

Yes.  I maintain a "local community" index which now covers almost a thousand 
servers (real and virtual), most of them commercial.

I would recommend that you access them through a proxy, specified by the 
http_proxy: statement in the configuration file.
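
As an illustration only (proxy.example.com and port 3128 are placeholders, 
not a real proxy), such a line might read:

http_proxy:     http://proxy.example.com:3128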

> 
> If so, would we need anything other than http to reach these other servers,
> or any special access such as file system privileges?
> 

No, but https servers are a special case; I can't answer for them.

> Secondly, are there any problems with sites that generate content
> dynamically?  Or will ht://dig simply look at static HTML pages or other
> static documents?
> 

Dynamic pages usually index without difficulty, but problems can occur.  
The exclude_urls: statement is intended to keep troublesome URLs out of the 
index.  In my case I only have

exclude_urls:   &referer=

I suggest caution, adding sites one by one to your search list, and keeping
max_hop_count and server_max_docs low at first.
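
A rough sketch of such a cautious starting configuration follows.  The host 
name and the numeric limits are placeholders chosen for illustration, not 
recommended values:

# Hypothetical partner site; add further sites one at a time
start_url:        http://www.partner-one.example/
# Stay within the sites listed in start_url
limit_urls_to:    ${start_url}
# Follow links at most 2 hops away from the starting document
max_hop_count:    2
# Index at most 50 documents from any one server
server_max_docs:  50
exclude_urls:     &referer=

Once a site indexes cleanly, the limits can be raised and the next site 
added to start_url.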

> 
> Thank you very much
> Ken Convery
> Avian Pilot Systems Inc.
> 


David Adams
<[EMAIL PROTECTED]>
Computing Services
Southampton University

------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.