On Thursday, April 11, 2002, at 04:04 PM, Conor Stapleton wrote:
> If htdig is ended prematurely and then restarted, does it begin from
> scratch, as the database it had created just gets bigger?

You don't mention what version you're using. For versions before 3.1.6 (or the 3.2 betas), it's a bad idea to stop htdig prematurely: the database will usually be corrupted. If you run htdig again (i.e. there's a database there and you don't use the -i flag), it will pick up the old URLs, check whether they've changed from the version in the database, and index any new URLs it finds.

> Has anyone any experience as to whether I should split these tasks into
> maybe 16 / 32 / 64 smaller sets of URLs and schedule these to run, with
> an htmerge at the end of each?

It depends on how good your network connection is and how fast the servers return data. Indexing can often flood your connection, in which case splitting into smaller sets will only add overhead. If you have a network monitor of some sort, or are willing to look at your 'net connection lights, you'll have some idea of your bandwidth utilization.

If you're just adding new URLs to a current database, you may also find the -m flag to htdig useful:
<http://www.htdig.org/htdig.html>

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
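
As a rough illustration of the update run described above, here is a minimal Python sketch that assembles the command lines for an incremental dig followed by htmerge. It rests on assumptions: the helper names (`build_update_commands`, `run_update`) and the file paths are hypothetical, htmerge is assumed to be the merge step as in the 3.1.x series, and the exact behavior of -m should be checked against the htdig documentation for your version.

```python
import subprocess


def build_update_commands(config, new_urls_file=None):
    """Return argv lists for an update dig followed by htmerge.

    The -i flag is deliberately not passed, so htdig works from the
    existing database rather than starting from scratch. If
    new_urls_file is given, it is handed to htdig with -m.
    """
    dig = ["htdig", "-c", str(config)]
    if new_urls_file is not None:
        dig += ["-m", str(new_urls_file)]
    merge = ["htmerge", "-c", str(config)]
    return [dig, merge]


def run_update(config, new_urls_file=None):
    """Run the commands in order, stopping at the first failure."""
    for argv in build_update_commands(config, new_urls_file):
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical paths; adjust to the local installation.
    for argv in build_update_commands("/etc/htdig/htdig.conf", "new_urls.txt"):
        print(" ".join(argv))
```

Building the argument lists separately from running them makes it easy to print and inspect the commands before letting a scheduled job execute them.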

