Re: wget 1.9 - behaviour change in recursive downloads
Quoting Hrvoje Niksic <[EMAIL PROTECTED]>:

> Jochen Roderburg <[EMAIL PROTECTED]> writes:
>
> > Quoting Hrvoje Niksic <[EMAIL PROTECTED]>:
> >
> >> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
> >> downloads the HTML files only because it absolutely has to, in order
> >> to recurse through them. After it finds the links in them, it deletes
> >> them.
> >
> > Hmm, so it has really been an undetected error over all the years
> > ;-) ?
>
> s/undetected/unfixed/
>
> At least I've always considered it an error. I didn't know people
> depended on it.

Well, *depend* is a rather strong expression for that ;-)

It always worked that way and I got used to it. I never really thought about whether it was correct or not, because I had a use for it. So I was astonished when these files suddenly disappeared.

As I wrote already, I will list them explicitly from now on. I think the worst that will happen is that I get a few more of them than before.

Perhaps the whole thing could be mentioned in the documentation of the accept/reject options. Currently there is only this sentence there:

>> Note that these two options do not affect the downloading of HTML
>> files; Wget must load all the HTMLs to know where to go at
>> all--recursive retrieval would make no sense otherwise.

J. Roderburg
Re: wget 1.9 - behaviour change in recursive downloads
Jochen Roderburg <[EMAIL PROTECTED]> writes:

> Quoting Hrvoje Niksic <[EMAIL PROTECTED]>:
>
>> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
>> downloads the HTML files only because it absolutely has to, in order
>> to recurse through them. After it finds the links in them, it deletes
>> them.
>
> Hmm, so it has really been an undetected error over all the years
> ;-) ?

s/undetected/unfixed/

At least I've always considered it an error. I didn't know people depended on it.
Re: wget 1.9 - behaviour change in recursive downloads
At 12:05 PM 10/3/2003, Hrvoje Niksic wrote:

> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
> downloads the HTML files only because it absolutely has to, in order
> to recurse through them. After it finds the links in them, it deletes
> them.

How about a switch to keep the .html file, similar to the -nr switch that keeps the .listing file for FTP downloads?
Re: wget 1.9 - behaviour change in recursive downloads
Quoting Hrvoje Niksic <[EMAIL PROTECTED]>:

> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
> downloads the HTML files only because it absolutely has to, in order
> to recurse through them. After it finds the links in them, it deletes
> them.

Hmm, so it has really been an undetected error over all the years ;-) ?

OK, I see. If adding html explicitly in my scripts helps, I will do that. I like to keep those files because they show me the date when the last change occurred in a directory.

Regards, J. Roderburg
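For reference, the workaround discussed here could look roughly like the sketch below. It assumes the start page ends in `.htm`, as in the original report, so the suffix added to the accept list is `htm` rather than `html`. The function name `fetch_zips_keep_index` is made up for illustration, and the URL is the placeholder from the report.

```shell
#!/bin/sh
# Sketch only: keep the start page by naming its suffix in the accept list.
# fetch_zips_keep_index is a hypothetical helper name, not part of Wget.
fetch_zips_keep_index() {
    # -r -l1      recurse, but only one level deep
    # -nd         do not recreate the remote directory tree
    # -A zip,htm  accept files whose names end in "zip" or "htm"
    wget -r -l1 -nd -A zip,htm "$1"
}

# Example call, commented out because the host is only a placeholder:
# fetch_zips_keep_index http://some.host.com/index.htm
```

As noted elsewhere in the thread, the likely side effect is that a few more HTML files are kept than before.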
Re: wget 1.9 - behaviour change in recursive downloads
It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget downloads the HTML files only because it absolutely has to, in order to recurse through them. After it finds the links in them, it deletes them.
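The rule described above can be modeled roughly as follows. This is a simplified sketch, not Wget's actual code: the function name `is_accepted` is invented, and it assumes the accept list is a comma-separated list of plain suffixes matched against the end of the file name (wildcard patterns are ignored here).

```shell
#!/bin/sh
# Simplified model (an assumption, not Wget's source): a downloaded file is
# kept only if its name ends with one of the suffixes in the accept list.
is_accepted() {
    file=$1
    old_ifs=$IFS
    IFS=,                      # split the accept list on commas
    for suffix in $2; do
        case $file in
            *"$suffix") IFS=$old_ifs; return 0 ;;
        esac
    done
    IFS=$old_ifs
    return 1
}

# With `-A zip', the start page is fetched for its links and then removed.
for f in index.htm a.zip; do
    if is_accepted "$f" "zip"; then
        echo "keep $f"
    else
        echo "remove $f"
    fi
done
```

Run as is, the loop prints `remove index.htm` and then `keep a.zip`. Passing `zip,htm` as the second argument would keep both files.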
wget 1.9 - behaviour change in recursive downloads
Hi,

I've found a situation where the new version 1.9beta behaves differently from earlier versions. I'm not sure whether this is a corrected error or a new bug; personally, I would prefer the old behaviour.

When I do a recursive download with an accept list like

    wget -r -l1 -nd -A zip http://some.host.com/index.htm

it downloads the index.htm file and all the zip files mentioned therein. With older versions, the start file index.htm itself is still there at the end. Version 1.9 downloads index.htm and deletes it immediately with the message

    Removing index.htm since it should be rejected.

The recursion is then done correctly.

Best Regards,

Jochen Roderburg
ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10          Tel.: +49-221/478-7024
D-50931 Koeln                E-Mail: [EMAIL PROTECTED]
Germany