According to Matthew Nuzum:
> Ht://Dig uses up nearly a gig of my bandwidth every month.  It tends to
> lap itself if I run it daily, so I've started running it every other
> day.  Otherwise it runs great.
> 
> I am now faced with the task of mirroring sites, in addition to
> indexing them with my search engine.  I shudder to think of the
> resources this will use, both on the web servers being mirrored/indexed
> and in my bandwidth.  Disk space is not a big concern to me, though.
> 
> Is it possible to create a mirror of a site using the information in
> ht://dig's databases so that I can save the extra effort of mirroring?
> 
> I was using rsync to keep my bandwidth low, but now I need to switch to
> something that works like wget so that I can get static html snapshots
> instead of the actual cgi/php/asp source pages.

Just to add to what Geoff and Torsten have already written, you should be
aware that indexing, mirroring or caching dynamic content (cgi/php/asp)
will tend to be a high-bandwidth proposition. Such pages normally come
with no Last-Modified header, so the client has nothing to put in an
If-Modified-Since request and has to treat the page as "new" and fetch it
in full on every run. Static HTML pages don't have that problem: the
server can answer "304 Not Modified" and send no body at all.
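
To make the difference concrete, here is a rough sketch in Python of the
conditional-GET exchange. It is only an illustration of the HTTP mechanism,
not how ht://Dig or wget implement it; the function names server_status and
revalidation_headers are made up for this example, and the "server" is a
simplified stand-in that only looks at Last-Modified.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def revalidation_headers(cached_last_modified):
    """Headers a crawler could send on a repeat visit (hypothetical helper).

    cached_last_modified is the Last-Modified value remembered from the
    previous fetch, as a UTC datetime, or None if the page never sent one.
    """
    if cached_last_modified is None:
        return {}  # dynamic page: nothing to revalidate against
    return {"If-Modified-Since": format_datetime(cached_last_modified, usegmt=True)}


def server_status(last_modified, if_modified_since):
    """Simplified server-side decision: 304 if unchanged, otherwise 200.

    last_modified is None for a page generated on the fly with no
    Last-Modified header (the usual cgi/php/asp case).
    """
    if last_modified is None or if_modified_since is None:
        return 200  # no basis for comparison: full page is sent again
    since = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so ignore microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304  # headers only, no body transferred
    return 200


# Example: a static file last changed on 1 June 2002.
static_mtime = datetime(2002, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
headers = revalidation_headers(static_mtime)
print(server_status(static_mtime, headers.get("If-Modified-Since")))  # static page
print(server_status(None, headers.get("If-Modified-Since")))          # dynamic page
```

The static page can be answered with a bodiless 304 on every later run,
while the dynamic page costs a full transfer each time, which is where the
bandwidth goes.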

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html