First off, thanks for the response.

1) I was trying 3.2 because I thought searching multiple DBs would make this easier. Should I be using 3.1? I did have to do some creative setup to get around 3.2 not having search_rewrite_rules support.
2) I read the htdig man page about -m. The URL file should be URLs delimited by spaces, correct? Does htdig recursively spider these URLs? How do limit_urls_to and exclude_urls affect this?

3) I wasn't trying to run them concurrently, just trying not to choke the system by letting it deal with smaller chunks.

4) I compiled from source. I will look again for the docs.

5) Thanks again for the URLs.

-Ry

-----Original Message-----
From: Geoff Hutchison [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, July 02, 2002 12:09 PM
To: Rylan W. Hazelton
Cc: [EMAIL PROTECTED]
Subject: Re: [htdig] htdig 3.2 LARGE site

On Tue, 2 Jul 2002, Rylan W. Hazelton wrote:

1) I'm curious why you're using 3.2. Indexing speed at the moment is certainly slower than 3.1--it's indexing and storing a significantly large amount of information. Plus, it's assembling the databases on-the-fly rather than requiring the separate htmerge step.

2) You should also take a look at the -m flag to htdig. This will only index a set of URLs and do nothing else. (Valid for 3.1.6 and 3.2 betas.)

3) This depends on how much load your server and CGIs can handle. If you think the server can handle indexing two sets at once, this will be faster. If you'd have to do one set, then another, etc. then this will definitely be slower.

4) The installation you have should have full documentation. From a source .tar.gz, it will be in htdoc/. If you installed from a binary package, it should include docs as well. Beyond that, see:
<http://www.htdig.org/dev/htdig-3.2/>

5) To search multiple DBs at the same time, you'll need to set up "collections." You should specify multiple config names to htsearch, separated by "|" characters. You could also specify one "master" config with a collection_names attribute.
<http://www.htdig.org/dev/htdig-3.2/attrs.html#collection_names>

These are standard Last-Modified: headers:
http://www.w3.org/Protocols/HTTP/Object_Headers.html#last-modified

But in order for the stored date to be useful for speeding indexing, the server/CGI would need to recognize the If-Modified-Since: headers:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
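
As a rough illustration of the If-Modified-Since point above, here is a minimal Python sketch of the check a server or CGI script would need to make before it can answer "not modified" and let the indexer skip a page. The function name conditional_status and the sample dates are made up for this example. It is not ht://Dig code, only the general HTTP conditional-GET logic, and it assumes the script knows when its content last changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the client's stored copy is still current, else 200.

    last_modified     -- timezone-aware datetime of the last content change
    if_modified_since -- raw If-Modified-Since header value, or None if absent
    """
    if not if_modified_since:
        return 200  # no condition sent: always deliver the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        # Treat a date without zone information as UTC (assumption).
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution, so drop microseconds.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200


# Hypothetical page that last changed the day before the indexing run.
page_changed = datetime(2002, 7, 1, 9, 30, tzinfo=timezone.utc)

# The value the server would send as Last-Modified, which an indexer
# could store and send back later as If-Modified-Since.
stored_header = format_datetime(page_changed, usegmt=True)

print(conditional_status(page_changed, stored_header))  # unchanged page
print(conditional_status(page_changed, None))           # first visit
```

If the CGI always answers 200 regardless of the header, the stored date cannot save any work and every page is fetched in full on each run.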

