Re: wget 1.9 - behaviour change in recursive downloads
Jochen Roderburg <[EMAIL PROTECTED]> writes:

> Quoting Hrvoje Niksic <[EMAIL PROTECTED]>:
>
>> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
>> downloads the HTML files only because it absolutely has to, in order
>> to recurse through them. After it finds the links in them, it deletes
>> them.
>
> Hmm, so it has really been an undetected error over all the years
> ;-) ?

s/undetected/unfixed/

At least I've always considered it an error. I didn't know people depended on it.
Re: wget 1.9 - behaviour change in recursive downloads
At 12:05 PM 10/3/2003, Hrvoje Niksic wrote:

> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
> downloads the HTML files only because it absolutely has to, in order
> to recurse through them. After it finds the links in them, it deletes
> them.

How about a switch to keep the .html file, similar to the -nr switch that keeps the .listing file for ftp downloads?
Re: wget 1.9 - behaviour change in recursive downloads
Quoting Hrvoje Niksic <[EMAIL PROTECTED]>:

> It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget
> downloads the HTML files only because it absolutely has to, in order
> to recurse through them. After it finds the links in them, it deletes
> them.

Hmm, so it has really been an undetected error over all the years ;-) ?

OK, I see. If adding html explicitly in my scripts helps, I'll do that. I like to keep those files because they show me the date when the last change occurred in a directory.

Regards, J.Roderburg
Re: mswindows.h patch
Thanks for the patch, I've now applied it with the following ChangeLog entry:

2003-10-03  Gisle Vanem  <[EMAIL PROTECTED]>

        * connect.c: And don't include them here.

        * mswindows.h: Include winsock headers here.

However, I've postponed applying the part that changes `-d'. I agree that `-d' could stand improvement, but let's wait with that until 1.9 is released.
Re: some wget patches against beta3
Thanks for the contribution. Note that a slightly more correct place to send the patch is the <[EMAIL PROTECTED]> list, followed by people with a keener interest in development. Also, you should send at least a short explanation of what each patch is supposed to do and why one should apply it. (Except in the case of really short, self-explanatory patches, of course.) As for the Polish translation, translations are normally handled through the Translation Project. The TP robot is currently down, but I assume it will be back up soon, and then we'll submit the POT file and update the translations /en masse/.
Re: wget 1.9 - behaviour change in recursive downloads
It's a feature. `-A zip' means `-A zip', not `-A zip,html'. Wget downloads the HTML files only because it absolutely has to, in order to recurse through them. After it finds the links in them, it deletes them.
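To make the rule concrete, here is a minimal sketch in C of the decision described above. It is not wget's actual code: the function names (`has_suffix`, `accepted`, `is_html`, `must_download`, `keep_after_recursion`) are invented for illustration, and recognising HTML by file name alone is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if NAME ends with SUFFIX. */
static bool has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* True if NAME matches any entry of the NULL-terminated accept list. */
static bool accepted(const char *name, const char *const *accept)
{
    assert(name != NULL && accept != NULL);
    for (; *accept != NULL; accept++)
        if (has_suffix(name, *accept))
            return true;
    return false;
}

/* Assumption: a file counts as HTML if its name says so. */
static bool is_html(const char *name)
{
    return has_suffix(name, ".html") || has_suffix(name, ".htm");
}

/* HTML has to be fetched to find links, whatever the accept list says. */
static bool must_download(const char *name, const char *const *accept)
{
    return accepted(name, accept) || is_html(name);
}

/* Once the links are extracted, only accepted files stay on disk. */
static bool keep_after_recursion(const char *name, const char *const *accept)
{
    return accepted(name, accept);
}
```

With an accept list of { "zip", NULL }, index.htm is downloaded but not kept. Adding "htm" (or "html") to the list makes `keep_after_recursion` return true for the start page as well, which corresponds to spelling out `-A zip,html' on the command line.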
mswindows.h patch
Regarding my run_with_timeout() patch, I forgot the following patch to mswindows.h (which isn't included in util.c).

In my forthcoming patches for IPv6, we need to use the correct Winsock headers. To avoid ifdef clutter throughout the .c-files, I've put them in mswindows.h. So the .c-files should never include the Winsock headers directly, but only need network headers like this:

#ifndef WINDOWS
# include
# include
...
#endif
#include "wget.h"

The above includes sysdep.h, which in turn includes mswindows.h.

--- CVS-latest/src/mswindows.h Tue Sep 30 23:24:36 2003
+++ src/mswindows.h Fri Oct 03 16:57:57 2003
@@ -30,6 +30,37 @@
 #ifndef MSWINDOWS_H
 #define MSWINDOWS_H
 
+#ifndef WGET_H
+#error Include mswindows.h inside or after "wget.h"
+#endif
+
+#ifndef WIN32_LEAN_AND_MEAN
+#define WIN32_LEAN_AND_MEAN /* Prevent inclusion of in */
+#endif
+
+#include
+
+/* Use the correct winsock header; includes only on
+ * Watcom/MingW. We cannot use for IPv6. Using getaddrinfo() requires
+ *
+ */
+#if defined(ENABLE_IPV6) || defined(HAVE_GETADDRINFO)
+# include
+# include
+#else
+# include
+#endif
+
+#ifndef EAI_SYSTEM
+#define EAI_SYSTEM -1 /* value doesn't matter */
+#endif
+
+/* Must include because of 'stat' define below. */
+#include
+
+/* Missing in several .c files. Include here. */
+#include
+
 /* Apparently needed for alloca(). */
 #include
 
@@ -81,8 +112,6 @@
 # define mkdir(a, b) mkdir(a)
 #endif /* __BORLANDC__ */
 
-#include
-
 /* Declarations of various socket errors: */
 
@@ -136,5 +164,21 @@
 char *ws_mypath (void);
 void ws_help (const char *);
 void windows_main_junk (int *, char **, char **);
+
+/* Things needed for IPv6; missing in . */
+#ifdef ENABLE_IPV6
+ #ifndef HAVE_NTOP
+ extern const char *inet_ntop (int af, const void *src, char *dst, size_t size);
+ #endif
+ #ifndef HAVE_PTON
+ extern int inet_pton (int af, const char *src, void *dst);
+ #endif
+#endif /* ENABLE_IPV6 */

Defining WIN32_LEAN_AND_MEAN also makes it compile much faster.
I think it would be handy to have 'opt.debug' in levels of verbosity. I.e. '-dd' gives a more chatty wget. Or should it be '-vv'? I'm a bit confused about the distinction between those options.

I propose we add this macro to wget.h:

# define DEBUGN(level,x) do { if (opt.debug >= (level)) \
                              DEBUGP (x); } while (0)

And patch init.c:

@@ -85,6 +85,7 @@
 CMD_DECLARE (cmd_boolean);
 CMD_DECLARE (cmd_bytes);
 CMD_DECLARE (cmd_directory_vector);
+CMD_DECLARE (cmd_increment);
 CMD_DECLARE (cmd_lockable_boolean);
 CMD_DECLARE (cmd_number);
 CMD_DECLARE (cmd_number_inf);
@@ -129,7 +128,7 @@
   { "cookies", &opt.cookies, cmd_boolean },
   { "cutdirs", &opt.cut_dirs, cmd_number },
 #ifdef DEBUG
-  { "debug", &opt.debug, cmd_boolean },
+  { "debug", &opt.debug, cmd_increment },
 #endif
   { "deleteafter", &opt.delete_after, cmd_boolean },
   { "dirprefix", &opt.dir_prefix,cmd_directory },
@@ -632,6 +631,17 @@
     }
 
   *(int *)closure = bool_value;
+  return 1;
+}
+
+/* Increment a value from VAL to CLOSURE. COM is ignored,
+   except for error messages. */
+static int
+cmd_increment (const char *com, const char *val, void *closure)
+{
+  int tmp;
+  if (cmd_boolean(com,val,&tmp))
+    (*(int*)closure)++;
   return 1;
 }

Wadda you think? AFAIK only wget.texi should be updated. Add this to @item -d:

  To get increased verbosity turn up the debug-level by repeating
  this option. E.g. @samp{-dd} or @samp{--debug --debug}.

And one last patch (close -> CLOSE):

--- CVS-latest/src/connect.c Mon Sep 22 15:55:22 2003
+++ src/connect.c Thu Oct 02 16:52:33 2003
@@ -37,9 +37,7 @@
 #endif
 
 #include
-#ifdef WINDOWS
-# include
-#else
+#ifndef WINDOWS
 # include
 # include
 # include
@@ -201,7 +199,7 @@
   wget_sockaddr_set_address (&bsa, ip_default_family, 0, &bind_address);
   if (bind (sock, &bsa.sa, sockaddr_len ()))
     {
-      close (sock);
+      CLOSE (sock);
       sock = -1;
       goto out;
     }
@@ -211,7 +209,7 @@
   if (connect_with_timeout (sock, &sa.sa, sockaddr_len (),
                             opt.connect_timeout) < 0)
     {
-      close (sock);
+      CLOSE (sock);
       sock = -1;
       goto out;
     }

--
--gv
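For anyone who wants to try the idea outside the wget tree, the leveled-debug proposal above can be sketched as a small standalone C fragment. Everything in it is a stand-in under stated assumptions: `opt` here is a one-field struct rather than wget's real options struct, `debug_emit`, `messages_printed` and `apply_debug_arg` are hypothetical names, and DEBUGN takes a plain string instead of the parenthesised argument list that DEBUGP expects.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for wget's option struct; only the debug level is modelled. */
static struct { int debug; } opt;

/* Hypothetical counter so the effect of each level can be observed. */
static int messages_printed;

static void debug_emit(const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "DEBUG: %s\n", msg);
    messages_printed++;
}

/* Emit MSG only when the current debug level is at least LEVEL. */
#define DEBUGN(level, msg) \
    do { if (opt.debug >= (level)) debug_emit(msg); } while (0)

/* Raise the level once for "--debug", or once per 'd' in a cluster
   made only of 'd's, such as "-d" or "-dd". Anything else, for
   example "-nd", is left alone. */
static void apply_debug_arg(const char *arg)
{
    assert(arg != NULL);
    if (strcmp(arg, "--debug") == 0) {
        opt.debug++;
        return;
    }
    if (arg[0] != '-' || arg[1] == '\0')
        return;
    size_t n = strspn(arg + 1, "d");
    if (arg[1 + n] == '\0')     /* the cluster is nothing but 'd's */
        opt.debug += (int)n;
}
```

Feeding it "-d" enables DEBUGN(1, ...) messages only; a second "-d" (or "-dd" in one go, or "--debug --debug") also enables DEBUGN(2, ...). A counter is used rather than a boolean so that existing `if (opt.debug)` style checks keep working unchanged.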
some wget patches against beta3
Hi,

Here are a few patches against test3:

http://cvs.pld-linux.org/cgi-bin/cvsweb/SOURCES/wget-ac.patch?rev=1.4
(some autoconf 2.5x things)

http://cvs.pld-linux.org/cgi-bin/cvsweb/SOURCES/wget-pl.patch?rev=1.3
(Polish translation update)

--
Arkadiusz Miśkiewicz    CS at FoE, Wroclaw University of Technology
arekm.pld-linux.org AM2-6BONE, 1024/3DB19BBD, arekm(at)ircnet, PLD/Linux
wget 1.9 - behaviour change in recursive downloads
Hi,

I've found a situation where the new version 1.9beta behaves differently from earlier versions. I'm not sure whether this is a corrected error or a new bug; personally, I would prefer the old behaviour.

When I do a recursive download with an accept list like

  wget -r -l1 -nd -A zip http://some.host.com/index.htm

it downloads the index.htm file and all the zip files mentioned therein. With older versions the start file index.htm itself stays there in the end. Version 1.9 downloads index.htm and deletes it immediately with the message

  Removing index.htm since it should be rejected.

The recursion is then done correctly.

Best Regards,

Jochen Roderburg
ZAIK/RRZK
University of Cologne
Robert-Koch-Str. 10        Tel.:   +49-221/478-7024
D-50931 Koeln              E-Mail: [EMAIL PROTECTED]
Germany