Re: [htdig] Maintaining a database

Geoff Hutchison Sat, 13 Apr 2002 14:11:59 -0700


On Thursday, April 11, 2002, at 04:04  PM, Conor Stapleton wrote:


> If htdig is ended prematurely and then restarted, does it begin from 
> scratch, or does the database it had created just get bigger?

You don't mention what version you're using. For versions before 3.1.6 
(or the 3.2 betas), it's a bad idea to stop htdig prematurely--the 
database will usually be corrupted.

If you run htdig again (i.e. there's a database there and you don't use 
the -i flag), it will pick up the old URLs, check if they've changed 
from the version in the database and index any new URLs it finds.

> Has anyone any experience as to whether I should split these tasks into 
> maybe 16 / 32 / 64 smaller sets of URLs and schedule these to run, with 
> an htmerge at the end of each?

It depends on how good your network connection is and how fast the 
servers return data. Indexing can often flood your connection, in which 
case splitting into smaller sets will only add overhead. If you have a 
network monitor of some sort or are willing to look at your 'net 
connection lights, you'll have some idea of your bandwidth utilization.
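
As a rough illustration of what "bandwidth utilization" means here, 
the Python sketch below turns a byte count from a network monitor into 
a fraction of the link's capacity. The function name and the sample 
numbers are made up for the example; they are not measurements from 
any real htdig run.

```python
def link_utilization(bytes_transferred, seconds, link_bits_per_second):
    """Fraction of the link's capacity used over the sampling period."""
    bits_transferred = bytes_transferred * 8
    return bits_transferred / (seconds * link_bits_per_second)


# Hypothetical sample: 45 MB moved in 60 seconds on a 10 Mbit/s link.
utilization = link_utilization(45_000_000, 60, 10_000_000)
print(f"utilization: {utilization:.0%}")
```

A value close to 100% while a single htdig run is going suggests the 
connection is already the bottleneck, so splitting the job would not 
be expected to speed anything up.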

If you're just adding in new URLs to a current database, you may also 
find the -m flag to htdig useful:
<http://www.htdig.org/htdig.html>
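
A minimal sketch of the kinds of run discussed in this message, written 
as a Python helper that only builds the command line and does not 
execute anything. The -i and -m flags are the ones mentioned above; 
the -c (config file) flag and the exact behaviour noted in the comments 
are my reading of the htdig documentation linked above, so check them 
against your version. The function name and file paths are 
hypothetical.

```python
def build_htdig_command(config_path, url_file=None, initial=False):
    """Return the argument list for an htdig run (nothing is executed)."""
    cmd = ["htdig", "-c", config_path]
    if initial:
        # -i: ignore the existing database and index from scratch.
        cmd.append("-i")
    if url_file is not None:
        # -m: minimal run, limited to the URLs listed in url_file.
        cmd.extend(["-m", url_file])
    return cmd


# Update run: no -i, so the existing database is reused.
update_cmd = build_htdig_command("/etc/htdig/htdig.conf")

# Minimal run: only the new URLs listed in a file (hypothetical path).
minimal_cmd = build_htdig_command("/etc/htdig/htdig.conf",
                                  url_file="new_urls.txt")

print(" ".join(update_cmd))
print(" ".join(minimal_cmd))
```

The resulting lists could be passed to subprocess.run() once the paths 
match your installation.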

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
