First off, thanks for the response.

1) I was trying 3.2 because I thought searching multiple DBs would
make this easier.  Should I be using 3.1?  I did have to do some
creative setup to get around 3.2 not having search_rewrite_rules
support.

2) I read the htdig man page about -m.  The URL file should be URLs
delimited by spaces, correct?  Does htdig recursively spider these URLs?
How do limit_urls_to and exclude_urls affect this?

3) I wasn't trying to run them concurrently, just trying not to choke
the system by letting it deal with smaller chunks.

4) I compiled from source.  I will look again for the docs.

5) Thanks again for the urls.
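
For what it's worth, the "smaller chunks" idea in point 3 could be
sketched roughly as below. This is only an illustration: chunk_urls and
render_url_file are hypothetical helper names, and writing one URL per
line is an assumption on my part (the delimiter htdig expects in the -m
file is exactly what point 2 asks about).

```python
from typing import Iterable, List


def chunk_urls(urls: Iterable[str], chunk_size: int) -> List[List[str]]:
    """Split URLs into consecutive lists of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunks: List[List[str]] = []
    current: List[str] = []
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank entries
        current.append(url)
        if len(current) == chunk_size:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)  # final, possibly shorter, chunk
    return chunks


def render_url_file(chunk: List[str]) -> str:
    """Render one chunk as file contents for a later 'htdig -m <file>' run.

    Assumption: one URL per line. Adjust the separator if the htdig
    documentation for your version says otherwise.
    """
    return "\n".join(chunk) + "\n"
```

Each rendered chunk would be saved to its own file and indexed in a
separate pass, one after another.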

-Ry

-----Original Message-----
From: Geoff Hutchison [mailto:[EMAIL PROTECTED]] 
Sent: Tuesday, July 02, 2002 12:09 PM
To: Rylan W. Hazelton
Cc: [EMAIL PROTECTED]
Subject: Re: [htdig] htdig 3.2 LARGE site

On Tue, 2 Jul 2002, Rylan W. Hazelton wrote:

1) I'm curious why you're using 3.2. Indexing speed at the moment is
certainly slower than 3.1--it's indexing and storing a significantly
larger amount of information. Plus, it's assembling the databases
on-the-fly rather than requiring the separate htmerge step.

2) You should also take a look at the -m flag to htdig. This will only
index a set of URLs and do nothing else. (Valid for 3.1.6 and 3.2 betas.)

3) This depends on how much load your server and CGIs can handle. If you
think the server can handle indexing two sets at once, this will be
faster. If you'd have to do one set, then another, etc. then this will
definitely be slower.

4) The installation you have should have full documentation. From a
source .tar.gz, it will be in htdoc/. If you installed from a binary
package, it should include docs as well. Beyond that,
see: <http://www.htdig.org/dev/htdig-3.2/>

5) To search multiple DBs at the same time, you'll need to set up
"collections." You should specify multiple config names to htsearch,
separated by "|" characters. You could also specify one "master" config
with a collection_names attribute.

<http://www.htdig.org/dev/htdig-3.2/attrs.html#collection_names>
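
As a rough sketch of the "|"-separated form, a search URL for htsearch
could be built like this. The host, CGI path, and config names are
made-up placeholders, build_htsearch_url is a hypothetical helper, and
it assumes htsearch is reached as a CGI with its usual "config" and
"words" form fields.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_htsearch_url(base_url: str, config_names: Sequence[str], words: str) -> str:
    """Build a query URL that names several configs joined by '|'."""
    if not config_names:
        raise ValueError("at least one config name is required")
    params = {
        "config": "|".join(config_names),  # e.g. "site_a|site_b"
        "words": words,
    }
    # urlencode percent-encodes '|' as %7C and spaces as '+'.
    return base_url + "?" + urlencode(params)


url = build_htsearch_url(
    "http://example.com/cgi-bin/htsearch",  # placeholder location
    ["site_a", "site_b"],                   # placeholder config names
    "large site",
)
```

The alternative is a single "master" config that lists the same names
in collection_names, so the search form only has to name one config.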

These are standard Last-Modified: headers:
http://www.w3.org/Protocols/HTTP/Object_Headers.html#last-modified

But in order for the stored date to be useful for speeding indexing, the
server/CGI would need to recognize the If-Modified-Since: headers:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
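
On the server/CGI side, the check amounts to comparing the document's
modification time with the date in the request header. A minimal
sketch, not tied to any particular server: conditional_status and
last_modified_header are hypothetical names, and last_modified is
assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def last_modified_header(last_modified: datetime) -> str:
    """Format a UTC datetime as an HTTP date for Last-Modified:."""
    return format_datetime(last_modified, usegmt=True)


def conditional_status(last_modified: datetime,
                       if_modified_since: Optional[str]) -> int:
    """Return 304 if the document is unchanged since the given date, else 200."""
    if not if_modified_since:
        return 200  # no conditional header: send the full document
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # Not Modified: the indexer can skip this document
    return 200
```

A CGI that always answers 200 forces every document to be fetched and
parsed again on each update run, which is where the time goes on a
large site.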

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/






