Re: wget 1.9 - behaviour change in recursive downloads

2003-10-07 Thread Jochen Roderburg
Quoting Hrvoje Niksic [EMAIL PROTECTED]:

 Jochen Roderburg [EMAIL PROTECTED] writes:
 
  Quoting Hrvoje Niksic [EMAIL PROTECTED]:
 
  It's a feature.  `-A zip' means `-A zip', not `-A zip,html'.  Wget
  downloads the HTML files only because it absolutely has to, in order
  to recurse through them.  After it finds the links in them, it deletes
  them.
 
  Hmm, so it has really been an undetected error over all the years
  ;-) ?
 
 s/undetected/unfixed/
 
 At least I've always considered it an error.  I didn't know people
 depended on it.

Well, *depend* is a rather strong expression for that ;-)
It has always worked that way and I got used to it. I never really thought
about whether it was correct, because I had a use for it. So I was astonished
when these files suddenly disappeared.

As I already wrote, I will now list the HTML files explicitly in the accept
list. I think the worst that can happen is that I get a few more of them than
before.

Perhaps the whole thing could be mentioned in the documentation of the
accept/reject options. Currently there is only this sentence:

 Note that these two options do not affect the downloading of HTML
 files; Wget must load all the HTMLs to know where to go at
 all--recursive retrieval would make no sense otherwise.

J. Roderburg





wget 1.9 - behaviour change in recursive downloads

2003-10-03 Thread Jochen Roderburg

Hi,

I've found a situation where the new version 1.9beta behaves differently from
earlier versions. I'm not sure whether this is a corrected error or a new bug;
personally I would prefer the old behaviour.

When I do a recursive download with an accept list like

  wget -r -l1 -nd -A zip http://some.host.com/index.htm

it downloads the index.htm file and all the zip files mentioned therein.
With older versions, the start file index.htm itself is still there at the end.

Version 1.9 downloads index.htm and deletes it immediately with the message:

  Removing index.htm since it should be rejected.

The recursion is then done correctly.

Best Regards,

Jochen Roderburg
ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10 Tel.:   +49-221/478-7024
D-50931 Koeln   E-Mail: [EMAIL PROTECTED]
Germany




Re: wget 1.9 - behaviour change in recursive downloads

2003-10-03 Thread Hrvoje Niksic
It's a feature.  `-A zip' means `-A zip', not `-A zip,html'.  Wget
downloads the HTML files only because it absolutely has to, in order
to recurse through them.  After it finds the links in them, it deletes
them.
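
The behaviour described above can be illustrated with a small toy model. This is only a sketch of the accept-list idea, not Wget's actual code: the helper names `is_accepted` and `files_kept` are hypothetical, and the matching rule (an entry is a file name suffix unless it contains a wildcard character) is a simplifying assumption.

```python
from fnmatch import fnmatch


def is_accepted(filename, accept):
    """Toy accept-list check (hypothetical helper, not Wget's implementation).

    `accept` is a comma-separated list. An entry containing a wildcard
    character is treated as a pattern, any other entry as a plain suffix.
    """
    for entry in accept.split(","):
        if any(ch in entry for ch in "*?[]"):
            if fnmatch(filename, entry):
                return True
        elif filename.endswith(entry):
            return True
    return False


def files_kept(downloaded, accept):
    """Return the files that survive once recursion through the HTML is done."""
    return [name for name in downloaded if is_accepted(name, accept)]


# index.htm has to be fetched so that its links can be followed,
# but it is only kept if the accept list matches it.
downloaded = ["index.htm", "a.zip", "b.zip"]
kept_zip = files_kept(downloaded, "zip")
kept_both = files_kept(downloaded, "zip,htm")
print(kept_zip)
print(kept_both)
```

In this model, an accept list of `zip` drops index.htm after it has been used for recursion, while `zip,htm` keeps it. The suffix `htm` is used here only because the start page in the example is named index.htm.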


Re: wget 1.9 - behaviour change in recursive downloads

2003-10-03 Thread Jochen Roderburg
Quoting Hrvoje Niksic [EMAIL PROTECTED]:

 It's a feature.  `-A zip' means `-A zip', not `-A zip,html'.  Wget
 downloads the HTML files only because it absolutely has to, in order
 to recurse through them.  After it finds the links in them, it deletes
 them.

Hmm, so it has really been an undetected error over all the years ;-) ?

OK, I see. I will check whether adding html explicitly in my scripts helps. I
like to keep those files because they show me the date when the last change
occurred in a directory.

Regards, J.Roderburg






Re: wget 1.9 - behaviour change in recursive downloads

2003-10-03 Thread Fred Holmes
At 12:05 PM 10/3/2003, Hrvoje Niksic wrote:

 It's a feature.  `-A zip' means `-A zip', not `-A zip,html'.  Wget
 downloads the HTML files only because it absolutely has to, in order
 to recurse through them.  After it finds the links in them, it deletes
 them.

How about a switch to keep the .html file, similar to the -nr switch that
keeps the .listing file for ftp downloads?



Re: wget 1.9 - behaviour change in recursive downloads

2003-10-03 Thread Hrvoje Niksic
Jochen Roderburg [EMAIL PROTECTED] writes:

 Quoting Hrvoje Niksic [EMAIL PROTECTED]:

 It's a feature.  `-A zip' means `-A zip', not `-A zip,html'.  Wget
 downloads the HTML files only because it absolutely has to, in order
 to recurse through them.  After it finds the links in them, it deletes
 them.

 Hmm, so it has really been an undetected error over all the years
 ;-) ?

s/undetected/unfixed/

At least I've always considered it an error.  I didn't know people
depended on it.