Re: FAQ needed (was wget: relative link to non-relative)
On 2002-10-17 12:16 -0600, Daniel Webb wrote: Also, concerning the mailing list, I am not interested in using a kludgy web-based interface to an email archive. Where are the mbox download links? Amen. -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Possible bug : hosts spanned by default
I've just had a recursive wget do something unexpected : it spanned hosts even though I didn't give the -H option. The command was :

  wget -r -l20 http://www.modcan.com/page2.html

http://www.modcan.com/pg2_main.html contains a link to www.paypal.com, and that link was followed. That was Wget 1.8.2 (the 1.8.2-5 Debian package). Have you ever seen this behaviour ? -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
Re: wget tries to print the file prn.html
On 2002-09-20 08:15 +0200, Dominic Chambers wrote: I am using wget 1.82 on Win2K SP2, and wget froze on the fifth 1.8.2. downloaded file 'prn.html' using the command line: wget -r -l0 -A htm,html,png,gif,jpg,jpeg --no-parent http://java.sun.com/products/jlf/at/book About twenty seconds after it stops, I get Windows complaining that there is no available printer (I don't have one), and canceling the job does not cause wget to resume processing. If I remember correctly (it's been a long time), DOS knows when you're trying to access a device by looking at the *basename* (minus path and extension) of the file. As of MS-DOS 6.22, the list of reserved names was AUX, COM{1,2,3,4}, CON, LPT{1,2,3}, NUL and PRN. I'm not sure what the list is in the various incarnations of Windows, nor if it's set in stone (could new reserved names be added by loading drivers ?). -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
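The basename rule described above can be sketched in C as follows. This is only an illustration: the function name is_dos_reserved_name is made up for this example, the list is the MS-DOS 6.22 one quoted in the message, and treating everything after the first dot as the extension is an assumption. It is not how Wget or Windows actually implement the check.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Reserved device names as of MS-DOS 6.22, per the list in the message. */
static const char *const dos_reserved[] = {
  "AUX", "COM1", "COM2", "COM3", "COM4", "CON",
  "LPT1", "LPT2", "LPT3", "NUL", "PRN"
};

/* Hypothetical helper: return 1 if the basename of PATH, minus
   directory and extension, is a reserved device name
   (case-insensitive), else 0. */
int is_dos_reserved_name(const char *path)
{
  const char *base = path;
  const char *p;
  size_t len, i, j;

  /* Skip past the last '/' or '\\' to find the basename. */
  for (p = path; *p != '\0'; p++)
    if (*p == '/' || *p == '\\')
      base = p + 1;

  /* Assumption: the stem ends at the first '.', or at end of string. */
  len = strcspn(base, ".");

  for (i = 0; i < sizeof dos_reserved / sizeof dos_reserved[0]; i++) {
    const char *name = dos_reserved[i];
    if (strlen(name) != len)
      continue;
    for (j = 0; j < len; j++)
      if (toupper((unsigned char) base[j]) != name[j])
        break;
    if (j == len)
      return 1;
  }
  return 0;
}
```

With such a check, a downloader could rename prn.html to something harmless before opening it for writing.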
Re: Apology for absence
On 2002-07-26 01:59 +0200, Hrvoje Niksic wrote: Only the bare minimum of characters should be encoded. The ones that come to mind are '/' (illegal), '~' (rm -r ~foo dangerous), '*' and '?' (used in wildcards), control characters 0-31 (controls), and chars 128-159 (non-printable).

<lobbying> While quoting / is mandatory, I'm not sure it's a good idea to quote ~ * ? and control or non-ASCII characters. In fact, the more I think of it, the more I'm convinced we shouldn't... </lobbying> -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
HTML served over FTP
I'm trying to snarf a web site that is served over FTP. wget -r doesn't work probably because Wget doesn't parse HTML documents retrieved with FTP (which is reasonable). Is there a sort of --follow-html option to force Wget to parse HTML documents served over FTP and follow the links, as if they came from HTTP ? -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
Re: Feature Request: Stop on error from input url file list.
On 2002-06-29 21:09 -0400, Dang P. Tran wrote: I use the -i option to download files from an url list. The server I use have a password that change often. When I have a large list if the password change while I'm downloading and give 401 error, I want wget stop to prevent hammering the site with bad password.

A workaround :

  $ echo '#!/bin/sh' > wrapper
  $ echo 'wget "$@" || kill $PPID' >> wrapper
  $ chmod +x wrapper
  $ xargs -n10 ./wrapper < urllist

If for whatever reason Wget exits with a non-zero status, xargs is killed. Thus the server will be hit at most 9 times too many. -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: wget and javascript links
On 2002-05-14 13:01 -0400, Kevin Murphy wrote: However, I am trying to suck a particular site which relies excessively on javascript'ed links, e.g. via window.open, sometimes wrapped in function calls. I realize that in general this an intractable problem, but is anybody aware of a partial solution? Some people expressed interest in having a Javascript interpreter included in Wget but AFAIK no one actually did it. Someone pointed out that Javascript code is often simple enough that one could write a script to parse it and extract the links. -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: ScanMail Message: To Recipient virus found or matched file blocki ng setting.
On 2002-04-19 11:21 +0200, Hrvoje Niksic wrote: There are now fewer spams than there were (I know because I get the ones that get caught in the net), but we're not quite there yet. We will be, though.

In case this is of any use to you, these procmail recipes block at least 3/4 of the Asian-language spam I get :

  :0
  * ^Subject: .*±¤[-_ :]*°í
  spam

  :0
  * ^Subject: =\?euc-kr\?
  spam

  :0
  * ^Subject: =\?ks_c_5601-1987\?
  spam

  :0
  * ^Subject: .*[æÃÁÆÇÏÎÑõÚýÝÞ±¹³º¼¾¥¶®·µ].*[æÃÁÆÇÏÎÑõÚýÝÞ±¹³º¼¾¥¶®·µ].*[æÃÁÆÇÏÎÑõÚýÝÞ±¹³º¼¾¥¶®·µ]
  spam

-- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
Re: Proposal for despamming the list
On 2002-04-14 05:00 +0200, Hrvoje Niksic wrote: The moderators are informed about each message that awaits moderation; that alert would contain a URL they can visit and approve or reject the mail, at their discretion. The web interface is not necessary. Listar, for instance, just forwards the dubious mails to the moderator. Approving the message is done by replying to listar (actually forwarding to somelist-repost@somedomain, but you get the idea). -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: Current download speed in progress bar
On 2002-04-10 01:14 +0200, Hrvoje Niksic wrote: Andre Majorel [EMAIL PROTECTED] writes: I find it very annoying when a downloader plays yoyo with the remaining time. IMHO, remaining time is by nature a long term thing and short term jitter should not cause it to go up and down. Agreed wholeheartedly, but how would you *implement* a non-jittering ETA? I'm not sure you can, but using the average speed will at least low-pass filter out most of the jittering. Do you think it makes sense the way 1.8.1 does it, i.e. to calculate the ETA from the average speed? Yes. -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
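For illustration, computing the ETA from the average speed since the start of the transfer could look like the sketch below. The function name eta_from_average and its parameters are hypothetical; this is not the code Wget 1.8.1 uses, only the arithmetic the discussion implies.

```c
/* Hypothetical helper: estimate the remaining time in seconds from
   the average speed over the whole transfer so far.  Returns -1.0
   when no estimate is possible yet (nothing received, no time
   elapsed, or an inconsistent total). */
double eta_from_average(double bytes_done, double bytes_total,
                        double elapsed_secs)
{
  double avg_speed;

  if (bytes_done <= 0.0 || elapsed_secs <= 0.0 || bytes_total < bytes_done)
    return -1.0;

  /* Average over the whole transfer: a short stall or burst only
     moves this value slightly, which is what damps the jitter. */
  avg_speed = bytes_done / elapsed_secs;
  return (bytes_total - bytes_done) / avg_speed;
}
```

The trade-off is that a genuine, lasting change in speed is reflected in the ETA only slowly, since the whole history is weighted equally.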
Re: Referrer Faking and other nifty features
On 2002-04-03 08:50 -0500, Dan Mahoney, System Admin wrote: 1) referrer faking (i.e., wget automatically supplies a referrer based on the, well, referring page) It is the --referer option, see (wget)HTTP Options, from the Info documentation. Yes, that allows me to specify _A_ referrer, like www.aol.com. When I'm trying to help my users mirror their old angelfire pages or something like that, very often the link has to come from the same directory. I'd like to see something where when wget follows a link to another page, or another image, it automatically supplies the URL of the page it followed to get there. Is there a way to do this? Somebody already asked for this and AFAICT, there's no way to do that. 3) Multi-threading. I suppose you mean downloading several URIs in parallel. No, wget doesn't support that. Sometimes, however, one may start several wget in parallel, thanks to the shell (the & operator on Bourne shells). No, I mean downloading multiple files from the SAME uri in parallel, instead of downloading files one-by-one-by-one (thus saving time on a fast pipe). This doesn't make sense to me. When downloading from a single server, the bottleneck is generally either the server or the link ; in either case, there's nothing to win by attempting several simultaneous transfers. Unless there are several servers at the same IP and the bottleneck is the server, not the link ? -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
Re: OK, time to moderate this list
On 2002-03-22 04:08 +0100, Hrvoje Niksic wrote: May I suggest that you set a filter that prevents postings to the list unless the poster is a subscriber. That filter should forward the mail to the admins to allow them to pass the mail through if suitable. Do you volunteer to do the work? I don't mean to be flippant here -- I often don't have time to do maintenance for weeks, and I would like the list to be alive even when the admin is not available. A simple rule to reject any message whose subject contains more than, say, fifty percent of non-ASCII characters would effortlessly block most of the spam. -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
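A rough sketch of the fifty-percent rule, in C. The name subject_mostly_non_ascii is invented for this example, and it counts bytes rather than characters, which is an assumption: a multi-byte encoding or a MIME-encoded subject (=?euc-kr?...) would need decoding first.

```c
#include <stddef.h>

/* Hypothetical filter: return 1 if more than half of the bytes in
   SUBJECT are outside the 7-bit ASCII range, else 0. */
int subject_mostly_non_ascii(const char *subject)
{
  size_t total = 0, non_ascii = 0;
  const unsigned char *p;

  for (p = (const unsigned char *) subject; *p != '\0'; p++) {
    total++;
    if (*p > 127)
      non_ascii++;
  }
  /* "More than fifty percent": strictly greater than half. */
  return total > 0 && non_ascii * 2 > total;
}
```

In practice the same test is probably easier to express as a regular expression in the list software or in procmail; the C version only makes the threshold explicit.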
Re: Incorrect 'beautification' of URL?
On 2002-03-05 11:41 +0100, Philipp Thomas wrote: When requesting a URL like http://tmp.logix.cz/slash.xp , wget shortens this to http://tmp.logix.cz/slash.xp/. All Browsers I tested (Opera 6b1, Mozilla 0.9.8, Konqueror 2.9.2) pass this URL as given. So the question is, why wget (1.8.1) does what it does

Presumably because the author thought that both URLs are equivalent. To my surprise, RFC 1945 seems to agree with you. It says :

  URI         = ( absoluteURI | relativeURI ) [ "#" fragment ]
  absoluteURI = scheme ":" *( uchar | reserved )
  relativeURI = net_path | abs_path | rel_path
  net_path    = "//" net_loc [ abs_path ]
  abs_path    = "/" rel_path
  rel_path    = [ path ] [ ";" params ] [ "?" query ]
  path        = fsegment *( "/" segment )
  fsegment    = 1*pchar
  segment     = *pchar

Which I understand to mean that a segment can be empty, which in turn could be interpreted as stating that the trailing slashes in slash.xp are significant. That said, setting up a web site to rely on empty path segments strikes me as a creative way of looking for problems. :-) Why is it important to you ? -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
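To make the "a segment can be empty" reading concrete, here is a small C sketch that counts the empty segments in a relative path under the grammar quoted above. The function count_empty_segments is hypothetical and deliberately simple: it stops at the first ';' or '?' and does not validate pchar.

```c
/* Hypothetical helper: count empty "segment" productions in
   REL_PATH, following path = fsegment *( "/" segment ) with
   segment = *pchar.  A segment is empty when a '/' is immediately
   followed by another '/', by the params/query delimiter, or by
   the end of the string. */
int count_empty_segments(const char *rel_path)
{
  int count = 0;
  const char *p;

  for (p = rel_path; *p != '\0' && *p != ';' && *p != '?'; p++)
    if (*p == '/'
        && (p[1] == '/' || p[1] == '\0' || p[1] == ';' || p[1] == '?'))
      count++;
  return count;
}
```

Two paths that differ only in their number of empty segments are distinct strings under this grammar, which is why collapsing them can change what the server sees.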
Re: KB or kB
On 2002-02-08 08:54 +0100, Hrvoje Niksic wrote: Wget currently uses KB as abbreviation for kilobyte. In a Debian bug report someone suggested that kB should be used because it is more correct. The reporter however failed to cite the reference for this, and a search of the web has proven inconclusive. Does someone understand the spelling issues involved enough to point out the correct spelling and back it up with arguments?

The applicable standard is the SI (Système International) established by the CGPM (Conférence Générale des Poids et Mesures). It defines the metric system units (s, m, V, g, etc.) and the following prefixes for multiples and submultiples :

  yocto  y   10**-24
  zepto  z   10**-21
  atto   a   10**-18
  femto  f   10**-15
  pico   p   10**-12
  nano   n   10**-9
  micro  µ   10**-6
  milli  m   10**-3
  centi  c   10**-2
  deci   d   10**-1
  deca   da  10**1
  hecto  h   10**2
  kilo   k   10**3
  mega   M   10**6
  giga   G   10**9
  tera   T   10**12
  peta   P   10**15
  exa    E   10**18
  zetta  Z   10**21
  yotta  Y   10**24

Capital K is not a prefix, it's the SI abbreviation for the temperature unit, the kelvin (note : lower case k) named after Lord Kelvin. So it's definitely kB for kilobyte. Whether that means 1000 bytes or 1024 bytes is another issue. Regardless, KB is incorrect. As are mb, mB, gb and gB, by the way. -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
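As a sketch of what SI-conformant output might look like in code, the C function below formats a byte count with the prefixes k, M, G and T. The name format_si_bytes is invented here, and the choice of powers of 1000 is an assumption, since the message leaves the 1000-versus-1024 question open. It relies on snprintf(), which is C99.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical formatter: write BYTES into BUF using SI prefixes.
   Assumption: 1 kB = 1000 bytes (the message leaves this open). */
void format_si_bytes(double bytes, char *buf, size_t buflen)
{
  /* Lower-case k, upper-case M, G, T, as in the SI table. */
  static const char *const units[] = { "B", "kB", "MB", "GB", "TB" };
  size_t i = 0;

  while (bytes >= 1000.0 && i < sizeof units / sizeof units[0] - 1) {
    bytes /= 1000.0;
    i++;
  }
  if (i == 0)
    snprintf(buf, buflen, "%.0f B", bytes);     /* whole bytes */
  else
    snprintf(buf, buflen, "%.1f %s", bytes, units[i]);
}
```

For example, passing 1500 yields the string 1.5 kB, never 1.5 KB.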
Re: Noise ratio getting a bit high?
On 2002-01-29 22:02 +0100, Hrvoje Niksic wrote: But that was just an example. The actual reasoning for allowing non-subscriber posting boils down to three reasons: 1. I believe it is the right thing to do. I personally hate allegedly supportive mailing lists that require me to subscribe before asking a question. I don't want to subscribe, dammit, I just want to ask something. I respectfully disagree. If we can spend the time to read and answer the poster's question, the poster can spend five minutes to subscribe/unsubscribe. For reference, see the netiquette item on posting to newsgroups and asking for replies by email. 2. It allows the discussion to extend to non-subscribers. You can simply Cc a person to a discussion pertinent to him, and he will be able to respond to the list. 3. It allows the mails from [EMAIL PROTECTED] to be rerouted to this list. Yup. I am aware that in this matter, as well as in the infamous `Reply-To' debate, this list lies in the minority. But that is not a sufficient reason to back down and let the spammers win. Right now, [EMAIL PROTECTED] is providing free relaying for spammers to all its subscribers. <sarcasm>If this is not letting the spammers win, I wonder what is.</sarcasm> If you have a spam-fighting suggestion that does *not* include disallowing non-subscriber postings, I am more than willing to listen. Mmm... What would you think of having the list software automatically add a special header (say X-Non-Subscriber) to every mail sent by a non-subscriber ? -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
Re: Noise ratio getting a bit high?
On 2002-01-28 14:33 -0500, Thomas Reinke wrote: Is anyone else not finding the noise ratio (i.e. spam) a bit high here? A bit *low* you mean ? You bet. I sympathize with the effort required to lightly moderate, but might I recommend that _something_ be done to rid us all of this spam? It's getting to be irritating enough that I'm tempted to drop off the list, which I'd just as soon not do - wget is a fantastic little tool that I'd just as soon stay involved with actively, if possible. Setting up a spam filter requires some effort on the part of the list master. If the list master is too busy, a quick fix is preventing non-subscribers from posting. That can usually be done by flipping a bit in the config of the list software. But what about [EMAIL PROTECTED], then ? -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: stdout
On 2002-01-25 14:01 +0100, Jens Röder wrote: for wget I would suggest a switch that allows to send the output directly to stdout. It would be easier to use it in pipes. Does wget ... 2>&1 | command solve your problem ? -- André Majorel URL:http://www.teaser.fr/~amajorel/ std::disclaimer (Not speaking for my employer);
Re: Can not build wget-1.8 under SunOS-4.1.4
On 2001-12-16 19:02 +0100, Hrvoje Niksic wrote: Andre Majorel [EMAIL PROTECTED] writes: On 2001-12-15 07:37 +0100, Hrvoje Niksic wrote: Is there a good fallback value of RAND_MAX for systems that don't bother to define it? The standard (SUS2) says : The value of the {RAND_MAX} macro will be at least 32767. c9x says the same, but there is a subtle difference between that statement and the information I actually need. A SUS-conformant system will not present a problem because it will define RAND_MAX anyway. The information I need is what RAND_MAX should fall back to on the traditional Unix systems that have rand(), but don't bother to define RAND_MAX. Online SunOS manuals are not very helpful -- the one at http://www.freebsd.org/cgi/man.cgi?query=rand&sektion=3&manpath=SunOS+4.1.3 can't even seem to decide whether RAND_MAX is 2^31-1 or 2^15-1, and there is no mention of RAND_MAX or of an include file that might define it.

5th edition, 6th edition, 7th edition and System III all returned 0-32767. As RAND_MAX didn't exist at the time, plenty of code must have been written that assumed 0-32767. For that reason I think it unlikely that anybody ever wrote an implementation of rand() that returned less than 0-32767. I believe that a default value of 32767 is safe. Not optimal, but safe. Apparently, not all 32-bit systems use 2**31 - 1 : according to one clcm-er, MSVC defines RAND_MAX as 32767. -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
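A minimal sketch of the fallback discussed above, assuming 32767 as the default. The helper random_fraction is hypothetical and is not taken from Wget's source; it reduces rand() modulo RAND_MAX + 1 so that the result stays in range even on a system whose rand() returns more than the fallback suggests.

```c
#include <stdlib.h>

/* Fallback for traditional systems that have rand() but do not
   define RAND_MAX.  32767 is the minimum the standards guarantee. */
#ifndef RAND_MAX
# define RAND_MAX 32767
#endif

/* Hypothetical helper: return a pseudo-random value in [0, 1). */
double random_fraction(void)
{
  /* The modulo is a no-op when RAND_MAX is accurate; if the fallback
     underestimates the real range it keeps only the low-order bits,
     which on old implementations are of poorer quality. */
  unsigned int r = (unsigned int) rand() % ((unsigned int) RAND_MAX + 1u);

  return r / ((double) RAND_MAX + 1.0);
}
```

This matches the "not optimal, but safe" assessment: the range is never exceeded, at the cost of discarding bits where the real RAND_MAX is larger.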
Re: Can not build wget-1.8 under SunOS-4.1.4
On 2001-12-15 07:37 +0100, Hrvoje Niksic wrote: Is there a good fallback value of RAND_MAX for systems that don't bother to define it? The standard (SUS2) says : The value of the {RAND_MAX} macro will be at least 32767. -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: Wget 1.8-beta2 now available
On 2001-12-01 23:30 +0100, Hrvoje Niksic wrote: Here is the next 1.8 beta. Please test it if you can -- try compiling it on your granma's Ultrix box, run it on your niece's flashy web site, see if cookies work, etc. Get it from: ftp://gnjilux.srk.fer.hr/pub/unix/util/wget/.betas/wget-1.8-beta2.tar.gz

Success:
- Debian GNU/Linux woody, 80x86, GCC 2.95.4
- Solaris 7, SPARC, GCC 2.95.2

Failure:
- HP-UX 10.0, PA-RISC, GCC 3.0.1

Problem #1 :

  gcc -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c connect.c
  connect.c: In function `test_socket_open':
  connect.c:190: warning: passing arg 2 of `select' from incompatible pointer type
  connect.c: In function `select_fd':
  connect.c:283: warning: passing arg 2 of `select' from incompatible pointer type
  connect.c:283: warning: passing arg 3 of `select' from incompatible pointer type
  connect.c:283: warning: passing arg 4 of `select' from incompatible pointer type

(These are just warnings.)

Problem #2 :

  gcc -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c host.c
  host.c: In function `lookup_host':
  host.c:258: `h_errno' undeclared (first use in this function)
  host.c:258: (Each undeclared identifier is reported only once
  host.c:258: for each function it appears in.)

Apparently, h_errno is not declared at all under HP-UX (ie. find /usr/include -follow -type f | xargs grep h_errno turns up nothing). Declaring h_errno (extern int h_errno;) fixes the problem. I suppose we need something like :

  #if HPUX
  extern int h_errno;
  #endif

Problem #3 :

  gcc -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c snprintf.c
  snprintf.c: In function `dopr':
  snprintf.c:311: `short int' is promoted to `int' when passed through `...'
  snprintf.c:311: (so you should pass `int' not `short int' to `va_arg')
  snprintf.c:323: `short unsigned int' is promoted to `int' when passed through `...'
  snprintf.c:335: `short unsigned int' is promoted to `int' when passed through `...'
  snprintf.c:349: `short unsigned int' is promoted to `int' when passed through `...'

GCC has become very annoying with that sort of thing... I did the suggested changes and the error messages vanished.

- OSF/1 4.0, alpha, DEC C 5.6

Problem #1 :

  cc -std1 -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O -Olimit 2000 -c host.c
  cc: Error: host.c, line 221: In the initializer for lst[0], tmpstore does not have a constant address, but occurs in a context that requires an address constant. This is an extension of the language.
  char *lst[] = { tmpstore, NULL };
  --^

The error message is misleading, IMO. The real problem is that we're initialising an auto array, which is something C does not support, at least not C89/C90. The following patch silences the compiler :

  diff -ur wget-1.8-beta2/src/host.c wget-1.8-beta2_aym/src/host.c
  --- wget-1.8-beta2/src/host.c Fri Nov 30 11:50:29 2001
  +++ wget-1.8-beta2_aym/src/host.c Mon Dec 3 16:30:58 2001
  @@ -218,7 +218,7 @@
     if ((int)addr != -1)
       {
         char tmpstore[IP4_ADDRESS_LENGTH];
  -      char *lst[] = { tmpstore, NULL };
  +      char *lst[2];
   
         /* ADDR is defined to be in network byte order, which is what
            this returns, so we can just copy it to STORE_IP. However,
  @@ -232,6 +232,8 @@
         offset = 0;
   #endif
         memcpy (tmpstore, (char *)addr + offset, IP4_ADDRESS_LENGTH);
  +      lst[0] = tmpstore;
  +      lst[1] = NULL;
         return address_list_new (lst);
       }

Problem #2 : There is also this shit. Take a deep breath :

  cc: Warning: snprintf.c, line 128: In this declaration, type signed long long is a language extension.
  LLONG value, int base, int min, int max, int flags);
  ---^
  cc: Warning: snprintf.c, line 170: In this declaration, type signed long long is a language extension.
  LLONG value;
  --^
  cc: Warning: snprintf.c, line 315: In this statement, type signed long long is a language extension.
  value = va_arg (args, LLONG);
  --^
  cc: Warning: snprintf.c, line 315: In this statement, type signed long long is a language extension.
  value = va_arg (args, LLONG);
  --^
  cc: Warning: snprintf.c, line 315: In this statement, type signed long long is a language extension.
  value = va_arg (args, LLONG);
  --^
  cc: Warning: snprintf.c, line 315: In this statement, type signed long long is a language extension.
Re: Wget 1.8-beta2 now available
On 2001-12-03 18:30 +0100, Hrvoje Niksic wrote: Andre Majorel [EMAIL PROTECTED] writes: gcc -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/local/etc/wgetrc\" -DLOCALEDIR=\"/usr/local/share/locale\" -O2 -Wall -Wno-implicit -c connect.c connect.c: In function `test_socket_open': connect.c:190: warning: passing arg 2 of `select' from incompatible pointer type connect.c: In function `select_fd': connect.c:283: warning: passing arg 2 of `select' from incompatible pointer type connect.c:283: warning: passing arg 3 of `select' from incompatible pointer type connect.c:283: warning: passing arg 4 of `select' from incompatible pointer type (These are just warnings.) And weird ones, too. These arguments are of type pointer to fd_set. What would HPUX like to see there? HP-UX 10 wants (int *). However it defines fd_set as struct { long[]; } so it works anyway. HP-UX 10 is wrong. SUS2 (and POSIX ?) say (fd_set *). HP-UX 11 has it right. I suppose the best thing to do is to ignore those warnings. I think I'll use something like: #ifndef h_errno extern int h_errno; #endif h_errno is not necessarily a macro ! What do you think of Maciej's proposal ? Two questions here: * Does HPUX really not have snprintf()? It sounds weird that a modern OS wouldn't have it. I find describing HP-UX 10 as a modern OS mildly amusing. :-) I completely disagree with your perception that snprintf() is to be taken for granted. It's only since C99 that it's part of C. But to answer your question, no, HP-UX doesn't have it (neither in the headers nor in libc). * short int is promoted to int, ok. Does that go for all the architectures, or just some? Should I simply replace short int with int to get it to compile? Yes, replace short int and unsigned short by int. It's not architecture specific, the same thing happened to me on x86. GCC 2.95 doesn't care, GCC 2.96 and 3.0 complain. The error message is misleading, IMO. The real problem is that we're initialising an auto array, which is something C does not support, at least not C89/C90. Indeed. I wonder why I thought that was legal C. Ok, I'll apply your patch. Not enough caffeine. :-) -- André Majorel URL:http://www.teaser.fr/~amajorel/ (Not speaking for my employer, etc.)
Re: Wget 1.8-beta2 now available
On 2001-12-01 23:30 +0100, Hrvoje Niksic wrote: Here is the next 1.8 beta. Please test it if you can -- try compiling it on your granma's Ultrix box, run it on your niece's flashy web site, see if cookies work, etc. Get it from: ftp://gnjilux.srk.fer.hr/pub/unix/util/wget/.betas/wget-1.8-beta2.tar.gz Success: - NCR MP-RAS 3.0, x86, NCR High Performance C Compiler R3.0c - FreeBSD 4.0, x86, GCC 2.95.2 Thanks ! -- André Majorel URL:http://www.teaser.fr/~amajorel/ (Not speaking for my employer, etc.)
Re: Wget 1.8-beta2 now available
On 2001-12-03 19:16 +0100, Hrvoje Niksic wrote: I find describing HP-UX 10 as a modern OS mildly amusing. :-) How old is it? I used to work on HPUX 9, and I'm not old by most definitions of the word. Around 1995. I completely disagree with your perception that snprintf() is to be taken for granted. It's only since C99 that it's part of C. It's been a part of C since C99, that's true. But Wget relies on a lot of functionality not strictly in C, from alloca to the socket interface. Also, snprintf has become a big security thing recently, when a number of exploits was based on overflowing a buffer written to by sprintf. The pressure on vendors might be responsible for some of them being unusually swift in providing the function. But yes, I know I can't take it for granted, hence the provided replacement. Yes, I'm with you on that. We have exactly the same problems as you here and I for one wish snprintf() had been there from the start. But to answer your question, no HP-UX doesn't have it (neither in the headers nor in libc). HPUX 11 doesn't have it either? Interesting. HP-UX 10 doesn't but HP-UX 11 has it, according to docs.hp.com. The work you did on the list of already downloaded URLs seems to have been effective ; Wget's long-standing tendency to forget files in recursive downloads appears to be gone. A million thanks to Hrvoje, the contributors and the testers. -- André Majorel URL:http://www.teaser.fr/~amajorel/ (Not speaking for my employer, etc.)
Re: Wget 1.8-beta3 now available
On 2001-12-03 21:55 +0100, Hrvoje Niksic wrote: Bugfixes since 1.8-beta2. Please test it from clean compilation on Unix (Windows and MacOS are known not to compile without modifications when SSL is used.) Get it from: ftp://gnjilux.srk.fer.hr/pub/unix/util/wget/.betas/wget-1.8-beta3.tar.gz

This one compiles on all platforms.

  Solaris 7           OK
  FreeBSD 4.0         OK
  HP-UX 10            OK
  MP-RAS 3.0          OK
  Debian Linux woody  OK
  OSF/1 4.0           OK

Beautiful. :-) -- André Majorel URL:http://www.teaser.fr/~amajorel/ (Not speaking for my employer, etc.)
1.7.1-pre1 on NCR MP-RAS: success
Executive summary: complete success. On NCR MP-RAS, Wget 1.7.1-pre1 configured and compiled fine, and passed a few simple tests. The -lnsl/-lsocket and MAP_FAILED problems seen with previous versions did not occur. No SSL library is installed on the system. ./configure --with-ssl detected that correctly. The resulting executable worked fine with HTTP. For https: URLs, it prints Unknown/unsupported protocol and exits. A binary made with plain ./configure without --with-ssl exhibits the same exact behaviour. Should you need the logs, they're at http://www.teaser.fr/~amajorel/mpras/jp/wget-1.7.1-pre1.config.log.gz http://www.teaser.fr/~amajorel/mpras/jp/wget-1.7.1-pre1.config.log.with-ssl.gz Thanks to everyone involved. -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: wget-1.7 does not compile with glibc1 (libc5)
On 2001-06-08 17:57 -0400, Parsons, Donald wrote: Previous versions up to 1.6 compiled fine.

  cd src && make CC='gcc' CPPFLAGS='' DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/etc/wgetrc\" -DLOCALEDIR=\"/usr/share/locale\"' CFLAGS='-O2 -fomit-frame-pointer -march=pentium -mcpu=pentium -pipe' LDFLAGS='-s' LIBS='' prefix='/usr' exec_prefix='/usr' bindir='/usr/bin' infodir='/usr/info' mandir='/usr/man' manext='1'
  make[1]: Entering directory `/usr/src/wget-1.7/src'
  gcc -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\"/usr/etc/wgetrc\" -DLOCALEDIR=\"/usr/share/locale\" -O2 -fomit-frame-pointer -march=pentium -mcpu=pentium -pipe -c utils.c
  utils.c: In function `read_file':
  utils.c:980: `MAP_FAILED' undeclared (first use in this function)
  utils.c:980: (Each undeclared identifier is reported only once
  utils.c:980: for each function it appears in.)
  make[1]: *** [utils.o] Error 1
  make[1]: Leaving directory `/usr/src/wget-1.7/src'
  make: *** [src] Error 2

Quick and dirty fix : insert the following in utils.c before the reference to MAP_FAILED :

  #ifndef MAP_FAILED
  # define MAP_FAILED ((void *) -1)
  #endif

-- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: Wget 1.7-pre1 available for testing
On 2001-06-06 12:47 +0200, Jan Prikryl wrote: Jan Prikryl [EMAIL PROTECTED] writes: It seems that -lsocket is not found as it requires -lnsl for linking. -lnsl is not detected as it does not contain `gethostbyname()' function. That's weird. What does libnsl contain if not gethostbyname()? It seems to contain `gethostname()' ... see the config.log submitted in one of the previous emails. But it's a very long distance shot: if, after adding -lsocket -lnsl everything works correctly and if with -lsocket only the linker complains about missing 'yp_*()' functions and also missing `gethostname()' and `getdomainname()', I think it's likely that these functions are defined in -lnsl. Of course, if -lnsl has built in dependency on some other library, the situation might be completely different. I've put the output of nm for libsocket and libnsl at http://www.teaser.fr/~amajorel/mpras/libnsl.so.nm.gz http://www.teaser.fr/~amajorel/mpras/libsocket.so.nm.gz -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: Wget 1.7-pre1 available for testing
On 2001-06-02 20:50 +0200, Andre Majorel wrote: On 2001-06-02 17:30 +0200, Hrvoje Niksic wrote: - The empty LIBS problem remains (add -lsocket -lnsl). Do you have a config.log for this? Wget's configure tries hard to determine whether `-lsocket' and `-lnsl' are needed, and this seems to work on Solaris. Can you see why it fails on your machine? The problem seems to be in autoconf. From my attempts at compiling v1.6 on the same system : checking for gethostbyname in -lnsl... no checking for socket in -lsocket... no when in fact they are there. I don't have access to the machine until Tuesday. I'll post the config.log then. Sorry. Tuesday is today. config.log for 1.6 and 1.7-pre1 attached. 1.7 is identical to 1.7-pre1. -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
Re: SVR4 compile error
On 2001-05-26 11:10 +0200, Hrvoje Niksic wrote: Andre Majorel [EMAIL PROTECTED] writes: Compiling Wget 1.6 on an SVR4 derivative (NCR MP-RAS 3.0), I got this strange error: I think the problem is that Wget 1.6 tried to force strict ANSI mode out of the compiler. Try running make like this: make CC=cc CFLAGS=-g See if it compiles then. After removing -Xc from $(CC) and adding -lsocket -lnsl to $(LIBS), it compiled. I guess autoconf has not been given much testing on this platform. :-) The binary seems fine. Is there a central repository for wget binaries ? -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
SVR4 compile error
Compiling Wget 1.6 on an SVR4 derivative (NCR MP-RAS 3.0), I got this strange error: # make CONFIG_FILES= CONFIG_HEADERS=src/config.h ./config.status creating src/config.h src/config.h is unchanged generating po/POTFILES from ./po/POTFILES.in creating po/Makefile cd src make CC='cc -Xc -D__EXTENSIONS__' CPPFLAGS='' DEFS='-DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\' CFLAGS='-O' LDFLAGS='' LIBS='' prefix='/usr/local' exec_prefix='/usr/local' bindir='/usr/local/bin' infodir='/usr/local/info' mandir='/usr/local/man' manext='1' cc -Xc -D__EXTENSIONS__ -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\ -O -c cmpt.c NCR High Performance C Compiler R3.0c (c) Copyright 1994-98, NCR Corporation (c) Copyright 1987-98, MetaWare Incorporated cc -Xc -D__EXTENSIONS__ -I. -I. -DHAVE_CONFIG_H -DSYSTEM_WGETRC=\/usr/local/etc/wgetrc\ -DLOCALEDIR=\/usr/local/share/locale\ -O -c connect.c NCR High Performance C Compiler R3.0c (c) Copyright 1994-98, NCR Corporation (c) Copyright 1987-98, MetaWare Incorporated E /usr/include/arpa/inet.h,L66/C19(#164): in_addr_t |Symbol declaration is inconsistent with a previous declaration |at /usr/include/netinet/in.h,L47/C27. E /usr/include/arpa/inet.h,L67/C19(#164): in_port_t |Symbol declaration is inconsistent with a previous declaration |at /usr/include/netinet/in.h,L46/C28. E /usr/include/arpa/inet.h,L68/C1(#164): in_addr_t |Symbol declaration is inconsistent with a previous declaration |at /usr/include/netinet/in.h,L47/C27. E /usr/include/arpa/inet.h,L69/C1(#164): in_port_t |Symbol declaration is inconsistent with a previous declaration |at /usr/include/netinet/in.h,L46/C28. w (#657): (info) How referenced files were included: |File /usr/include/netinet/in.h from connect.c. |File /usr/include/arpa/inet.h from connect.c. 4 user errors 1 warning *** Error code 4 (bu21) make: fatal error. *** Error code 1 (bu21) make: fatal error. 
I find it strange that there would be more than one definition for in_addr_t and in_port_t. Does someone understand what's going on and how to fix it ? The output of configure: # ./configure creating cache ./config.cache configuring for GNU Wget 1.6 checking host system type... i586-ncr-sysv4.3.03 checking whether make sets ${MAKE}... yes checking for a BSD compatible install... ./install-sh -c checking for gcc... no checking for cc... cc checking whether the C compiler (cc ) works... yes checking whether the C compiler (cc ) is a cross-compiler... no checking whether we are using GNU C... no checking whether cc accepts -g... no checking how to run the C preprocessor... /lib/cpp checking for AIX... no checking for cc option to accept ANSI C... -Xc -D__EXTENSIONS__ checking for function prototypes... yes checking for working const... yes checking for size_t... yes checking for pid_t... yes checking whether byte ordering is bigendian... no checking size of long... 4 checking size of long long... 8 checking for string.h... yes checking for stdarg.h... yes checking for unistd.h... yes checking for sys/time.h... yes checking for utime.h... yes checking for sys/utime.h... yes checking for sys/select.h... yes checking for sys/utsname.h... yes checking for pwd.h... yes checking for signal.h... yes checking whether time.h and sys/time.h may both be included... yes checking return type of signal handlers... void checking for struct utimbuf... yes checking for working alloca.h... yes checking for alloca... yes checking for strdup... yes checking for strstr... yes checking for strcasecmp... no checking for strncasecmp... no checking for gettimeofday... yes checking for mktime... yes checking for strptime... yes checking for strerror... yes checking for snprintf... yes checking for vsnprintf... yes checking for select... yes checking for signal... yes checking for symlink... yes checking for access... yes checking for isatty... yes checking for uname... 
yes checking for gethostname... no checking for gethostbyname... no checking for gethostbyname in -lnsl... no checking for socket in -lsocket... no checking whether NLS is requested... yes language catalogs: cs da de el et fr gl hr it ja nl no pl pt_BR ru sk sl sv zh checking for msgfmt... msgfmt checking for xgettext... : checking for gmsgfmt... msgfmt checking for locale.h... yes checking for libintl.h... no checking for gettext... no checking for gettext in -lintl... no gettext not found; disabling NLS checking for makeinfo... no checking for emacs... no checking for xemacs... no updating cache ./config.cache creating ./config.status creating Makefile creating src/Makefile creating
Re: output to standard error?
On 2001-03-20 00:25 +0100, Hrvoje Niksic wrote: "Eddy Thilleman" [EMAIL PROTECTED] writes: Wget sends its output to standard error. Why is that? "It seemed like a good idea." The rationale behind it is that Wget's "output" is not real output, more a progress indication thingie. The real output is when you specify `-O -', and that goes to stdout. Francois Pinard once suggested that Wget prints its progress output to stdout, except when `-O -' is specified, when progress should go to stderr. Shrug. Anyone who wants to capture the output of a program for unattended operation (which is what I think Eddy wants) generally has to catch both stdout and stderr anyway. So does it matter much how much of it goes to stdout vs. stderr ? If you're doing wget 2>&1, there's no surprise. If your shell is command.com, you might see things differently. ;-) -- André Majorel [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
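The point about unattended operation can be sketched as follows. This is only an illustration in Python, not anything taken from Wget: a hypothetical stand-in child process (started with `sys.executable`) writes a progress line to stderr and a payload line to stdout, and the parent either folds both into one stream, which is roughly what `2>&1` does in a Bourne-style shell, or keeps them apart. The names `CHILD`, `run_merged` and `run_separate` are made up for this example.

```python
import subprocess
import sys

# Stand-in for a program that, like Wget, reports progress on stderr
# and writes its real output on stdout. Hypothetical child, not Wget.
CHILD = (
    "import sys\n"
    "sys.stderr.write('progress: 100%\\n')\n"
    "sys.stdout.write('payload\\n')\n"
)


def run_merged():
    """Run the child with stderr folded into stdout (like 2>&1)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


def run_separate():
    """Run the child and keep the two streams apart."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    # A log for unattended use gets everything, whichever stream it came from.
    print(run_merged(), end="")
```

With the merged variant, a script that logs the run does not need to care which stream a given line was written to; the separate variant is what you would want when stdout carries a document (as with `-O -') and must not be mixed with progress messages.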
Patch: new option --ignore-size
I'm mirroring a very large tree locally. As the tree is larger than the local filesystem, I periodically stop wget, save what I've downloaded on CD-ROM, truncate the saved files to 0 and then start wget -N -r again to get more files. Unfortunately, wget checks not only the mtime but also the size of the local files, so it starts downloading the truncated files again. This patch adds the --ignore-size option which prevents this. When this option is present, wget will not retrieve the remote file again as long as the local file exists and is more recent, even if its size is not the same as the remote file. The patch has been posted to wget-patches. It's also available at URL:http://www.teaser.fr/~amajorel/wget/. I will write a documentation patch if you think the patch is worth including in the distribution. -- André Majorel Work: [EMAIL PROTECTED] Home: [EMAIL PROTECTED] http://www.teaser.fr/~amajorel/
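The decision described above can be sketched roughly like this. It is not the patch (that was posted to wget-patches) and not Wget's code, just a small Python model of the check as described in this message. `FileInfo`, `should_download` and the `ignore_size` parameter are hypothetical names, and treating equal timestamps as "up to date" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    mtime: int  # modification time, seconds since the epoch
    size: int   # length in bytes


def should_download(local: Optional[FileInfo],
                    remote: FileInfo,
                    ignore_size: bool = False) -> bool:
    """Model of the -N check described above; not Wget's actual code."""
    if local is None:
        # No local copy at all: always fetch.
        return True
    if local.mtime < remote.mtime:
        # Remote file is newer than the local one: fetch.
        # Assumption: equal timestamps count as up to date.
        return True
    if not ignore_size and local.size != remote.size:
        # Without --ignore-size, a size mismatch forces a new download,
        # which is what hits files truncated to 0 bytes.
        return True
    return False


if __name__ == "__main__":
    remote = FileInfo(mtime=1000, size=5000)
    truncated = FileInfo(mtime=2000, size=0)  # saved to CD-ROM, then emptied
    print(should_download(truncated, remote))
    print(should_download(truncated, remote, ignore_size=True))
```

In this model a file truncated after being archived is newer than the remote copy but has the wrong size, so it is only skipped when `ignore_size` is set; a local file that is older than the remote one is fetched either way.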