Re: wget timestamping (-N) bug/feature?

Ian Abbott Sat, 04 Aug 2001 08:50:05 -0700
On 4 Aug 2001, at 3:25, Bao, Jiangcheng wrote:

> Suppose I have page a.html, which has a link to b.html. If a is not
> changed, and b is changed. When I process a, I have no way to check a so
> that I can process b too, without downloading a. -N will cause a not to be
> downloaded, but not processed either, so change of b will be ignored. If I
> will -nc, then a will be processed, and b too, but b's change will still
> be ignored. If a new page c is linked into b now, then c won't get
> noticed.
> 
> Is this a feature or it's a bug? Thanks.

What version of wget are you using? I've just tried your scenario 
in wget 1.7 and it worked correctly. I used the -N and -r options, 
but you could just use the -m option instead, which combines these 
two options (timestamp checks and recursion).

If you use the -nc option, then no new versions of the pages that 
are already stored locally will be downloaded, so you won't see the 
changed versions and (if using recursion) wget will process the old, 
existing versions. In this case, wget doesn't need to request 
anything at all from the web server for files it already has.
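
To make the -nc case concrete, here is a rough Python sketch of the 
behaviour described above. It is only an illustration, not wget's 
actual code: LinkCollector, links_under_no_clobber and the fetch 
callback are all hypothetical names made up for this example.

```python
import os
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_under_no_clobber(local_path, fetch):
    """Return the links to follow for one page, -nc style.

    If local_path already exists it is reused and fetch() is never
    called, so links come from the old copy. Otherwise fetch() (a
    hypothetical callback returning the page as bytes) is used once.
    """
    if not os.path.exists(local_path):
        data = fetch()
        with open(local_path, "wb") as f:
            f.write(data)
    with open(local_path, encoding="utf-8", errors="replace") as f:
        parser = LinkCollector()
        parser.feed(f.read())
    return parser.links
```

The point is that when the file is already there, the server is 
never contacted, so a link that only exists in the new version of 
the page cannot be discovered.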

If you use the -N option (or -m) and the local file exists, wget 
will at first ask the web server for just the HTTP headers, rather 
than the whole document. The "Last-Modified" and "Content-Length" 
HTTP response headers will be compared with the last-modified time 
and length of the local file. If the local file does not exist, the 
lengths are different, or the local file is older than the new 
content, another request will be sent to the web server to retrieve 
the whole document.
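
The decision described above can be sketched roughly as follows in 
Python. This is a minimal illustration of the rule as stated here, 
not wget's real implementation: needs_download is a hypothetical 
name, and the choice to fetch when the server sends no 
Last-Modified header is an assumption of this sketch.

```python
import os
from email.utils import parsedate_to_datetime


def needs_download(local_path, last_modified, content_length):
    """Return True if the whole document should be retrieved.

    last_modified and content_length are the raw header values
    (strings) from the headers-only request, or None if absent.
    """
    # No local copy: always fetch.
    if not os.path.exists(local_path):
        return True

    st = os.stat(local_path)

    # Lengths differ: fetch.
    if content_length is not None and int(content_length) != st.st_size:
        return True

    # Assumption: with no Last-Modified header there is nothing to
    # compare against, so fetch.
    if last_modified is None:
        return True

    # Local file older than the remote content: fetch.
    remote_mtime = parsedate_to_datetime(last_modified).timestamp()
    return st.st_mtime < remote_mtime
```

For example, a local a.html with the same length as the remote 
copy and a modification time no older than Last-Modified gives 
False, so only the headers are transferred for that page.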

So when you say -N will cause files that are not downloaded to not be 
processed, you are incorrect, at least as far as wget 1.7 is 
concerned.