Re: re-mirror + no-clobber

2008-10-25 Thread Micah Cowan

Jonathan Elsas wrote:
...
 I've issued the command
 
 wget -nc -r -l inf -H -D www.example.com,www2.example.com http://www.example.com
 
 but I get the message:
 
 
 file 'www.example.com/index.html' already there; not retrieving.
 
 
 and the process exits. According to the man page, files with the .html
 suffix will be loaded off disk and parsed, but this does not appear to
 be happening. Am I missing something?

Yes. Wget has to download the files before they can be loaded from the
disk and parsed. When it encounters a file that already exists at a given
location, it has no way to know that the file corresponds to the one it's
trying to download. Timestamping with -N, rather than -nc, may be closer
to what you want.
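
As a rough, untested sketch of that suggestion: the snippet below takes the
command from the original post and only swaps -nc for -N. The helper name
mirror_cmd is made up for illustration, and the hosts are the placeholders
from the post. With -N, Wget still asks the server about each file, but it
should skip files whose remote copy is not newer than the local one. Since
-m already turns on -r, -N and -l inf, adding -H -D ... to the original
-m command should amount to much the same thing.

```shell
# Build the re-mirror command with -N (timestamping) in place of -nc.
# mirror_cmd is a hypothetical helper; the hosts are placeholders from the post.
mirror_cmd() {
    printf 'wget -N -r -l inf -H -D %s %s\n' \
        "www.example.com,www2.example.com" "http://www.example.com"
}

# Dry run: print the command so it can be checked before use.
mirror_cmd
# To run it for real: eval "$(mirror_cmd)"
```

Printing the command first makes it easy to confirm the domain list before
starting a crawl that spans hosts.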

I'm open to suggestions on clarifying the documentation.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/


re-mirror + no-clobber

2008-10-24 Thread Jonathan Elsas

Hi --

I'm using wget 1.10.2.

I'm trying to mirror a web site with the following command:

wget -m http://www.example.com

After this process finished, I realized that I also needed pages from
a subdomain (e.g. www2).


To re-start the mirror process without downloading the same pages  
again, I've issued the command


wget -nc -r -l inf -H -D www.example.com,www2.example.com http://www.example.com

but I get the message:


file 'www.example.com/index.html' already there; not retrieving.


and the process exits. According to the man page, files with the .html
suffix will be loaded off disk and parsed, but this does not appear to
be happening. Am I missing something?


Thanks in advance for your help.