wget ipv6 patch
here is my first patch to improve ipv6 support of wget.

please notice that the code compiles, but is still buggy and will probably not work. i am sending this preliminary patch only to gather feedback from wget developers and to coordinate with other developers who are working on ipv6 support for wget.

so, i am asking you: what do you think of these changes?

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi
Deep Space 6 - IPv6 with Linux    http://www.deepspace6.net
Ferrara Linux User Group          http://www.ferrara.linux.it

Attachment: wget-ipv6.diff.bz2 (binary data)
Re: some wget patches against beta3
Martin v. Löwis writes:

Why do you think the scheme is narrow-minded?

Because 1.9-beta3 seems to be a problem.

    VERSION = ('[.0-9]+-?b[0-9]+'
               '|[.0-9]+-?dev[0-9]+'
               '|[.0-9]+-?pre[0-9]+'
               '|[.0-9]+-?rel[0-9]+'
               '|[.0-9]+[a-z]?'
               '|[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')

But that's narrow. Why support 1.9-b3, but not 1.9-beta3 or 1.9-alpha3, or 1.9-rc10? Those and similar version schemes are in wide use.

That's really bad. But what's even worse is that something or someone silently changed beta3 to b3 in the POT, and then failed to perform the same change for my translation, which caused it to get dropped without notice.

Nothing should get dropped without a notice. [...]

I now understand that this could have been an exception due to the outage. But that's how it happened. I sent the translation -- twice -- and it got dropped. Karl told me to resend the translation with a 1.9-b3 version (which I'd never heard of before), so I naturally assumed that the submission had been dropped because of the version.

Now, since UMontreal has changed the translation@ alias, it might be that some messages were lost during the outage; this is unfortunate, but difficult to correct, as we cannot find out which messages might have been lost. Fortunately, most translators know to get a message back from the robot for all submissions, so if they don't get one, they resend.

Note that I did resend, but to no avail. My first attempt contained a MIME attachment, which I then found out the robot didn't understand. My second attempt was from po-mode, which should have produced a valid message, except for the version.
Re: wget ipv6 patch
Mauro Tortonesi writes:

so, i am asking you: what do you think of these changes?

Overall they look very good! Judging from the patch, a large piece of the work seems to be in an unexpected place: the FTP code.

Here are some remarks I got looking at the patch.

It inadvertently undoes the latest fnmatch move.

I still don't understand the choice to use sockaddr and sockaddr_storage in application code. They result in needless casts and (to me) incomprehensible code. For example, this cast:

    (unsigned char *)(&addr->addr_v4.s_addr)

would not be necessary if the address were defined as unsigned char[4].

I don't understand the new PASSIVE flag to lookup_host.

In lookup_host, the comment says that you don't need to call getaddrinfo_with_timeout, but then you call getaddrinfo_with_timeout. An oversight?

You removed this code:

    -  /* ADDR is defined to be in network byte order, which is what
    -     this returns, so we can just copy it to STORE_IP.  However,
    -     on big endian 64-bit architectures the value will be stored
    -     in the *last*, not first four bytes.  OFFSET makes sure that
    -     we copy the correct four bytes.  */
    -  int offset = 0;
    -#ifdef WORDS_BIGENDIAN
    -  offset = sizeof (unsigned long) - sizeof (ip4_address);
    -#endif

But the reason the code is there is that inet_aton is not present on all architectures, whereas inet_addr is. So I used only inet_addr in the IPv4 case, and inet_addr stupidly returned `long', which requires some contortions to copy into a uchar[4] on 64-bit machines. (I see that inet_addr returns `in_addr_t' these days.)

If you intend to use inet_aton without checking, there should be a fallback implementation in cmpt.c.

I note that you elided TYPE from ip_address if ENABLE_IPV6 is not defined. That (I think) results in code duplication in some places, because the code effectively has to handle the IPv4 case twice:

    #ifdef ENABLE_IPV6
      switch (addr->type)
        {
        case IPv6:
          ... IPv6 handling ...
          break;
        case IPv4:
          ... IPv4 handling ...
          break;
        }
    #else
      ... IPv4 handling because TYPE is not present without ENABLE_IPV6 ...
    #endif

If it would make your life easier to add TYPE in the !ENABLE_IPV6 case, so you can write it more compactly, by all means do it. By more compactly I mean code like this:

      switch (addr->type)
        {
    #ifdef ENABLE_IPV6
        case IPv6:
          ... IPv6 handling ...
          break;
    #endif
        case IPv4:
          ... IPv4 handling ...
          break;
        }
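A minimal sketch of the compact form suggested above, assuming TYPE is kept in the structure even in IPv4-only builds. The layout of `ip_address`, the enumerator names, and the helper `pretty_print_address` are hypothetical and are not taken from the patch; only the placement of the `#ifdef` inside a single `switch` follows the message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical layout: TYPE is always present, so IPv4-only builds
   can share the same switch statement as IPv6-enabled builds.  */
enum ip_type { IPV4_ADDRESS, IPV6_ADDRESS };

typedef struct {
  enum ip_type type;
  union {
    struct in_addr ipv4;
#ifdef ENABLE_IPV6
    struct in6_addr ipv6;
#endif
  } u;
} ip_address;

/* Format ADDR as text into BUF.  Returns BUF on success and NULL on
   failure or when the address type is not compiled in.  */
static const char *
pretty_print_address (const ip_address *addr, char *buf, size_t bufsize)
{
  assert (addr != NULL && buf != NULL);
  switch (addr->type)
    {
#ifdef ENABLE_IPV6
    case IPV6_ADDRESS:
      return inet_ntop (AF_INET6, &addr->u.ipv6, buf, (socklen_t) bufsize);
#endif
    case IPV4_ADDRESS:
      return inet_ntop (AF_INET, &addr->u.ipv4, buf, (socklen_t) bufsize);
    default:
      return NULL;
    }
}
```

The IPv4 branch is written once; the only cost is one extra enum member per stored address in IPv4-only builds.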
Re: [PATCH] wget-1.8.2: Portability, plus EBCDIC patch
On Tue, Oct 07, 2003 at 06:06:59PM +0200, Hrvoje Niksic wrote:

Martin, thanks for the patch and the detailed report. Note that it might have made more sense to apply the patch to the latest CVS version, which is somewhat different from 1.8.2.

What must I set CVSROOT to?

I'm really not sure whether to add this patch. On the one hand, it's nice to support as many architectures as possible. But on the other hand, most systems are ASCII. All the systems I've ever seen or worked on have been ASCII.

Right; that is exactly what makes it so hard for those who must work on EBCDIC systems: nobody supports them, and most available software is proprietary. So, getting a patch (even if only distributed as-is, e.g., in contrib/ebcdic.patch) is a valuable help for those who don't have it (yet).

I am fairly certain that I would not be able to support EBCDIC in the long run and that, unless someone were to continually support EBCDIC, the existing support would bitrot away. Is anyone on the Wget list using an EBCDIC system?

How can they if they don't have the patch? It only works if the socket talks ASCII on the network, and that is what the patch solves ;-)

Martin
--
Fujitsu Siemens
Fon: +49-89-636-46021, FAX: +49-89-636-47655
81730 Munich, Germany
problem with 302 server response parsing
I use Wget 1.8.2. When I try to retrieve a page with the '-nc' option and the server returns a 302 with a new URL, wget does not check that new URL against the '-nc' rule: it downloads the file and overwrites the existing one. I think wget does not apply the command-line options when it parses the server's response headers. Is this a bug?
Re: wget ipv6 patch
On Wed, 8 Oct 2003, Hrvoje Niksic wrote:

Mauro Tortonesi writes:

so, i am asking you: what do you think of these changes?

Overall they look very good! Judging from the patch, a large piece of the work seems to be in an unexpected place: the FTP code.

yes, i have added support for LPRT and LPSV, and refactored existing code. i still have to work on the code, but the main problem probably remains the duplication of ftp_port and ftp_pasv, which have two different versions (one for the IPv6-enabled case and the other for the IPv4-only case).

Here are some remarks I got looking at the patch. It inadvertently undoes the latest fnmatch move.

sorry. i am working on an old wget cvs release. i will get up-to-date with the latest cvs changes ASAP.

I still don't understand the choice to use sockaddr and sockaddr_storage in application code. They result in needless casts and (to me) incomprehensible code.

well, using sockaddr_storage is the right way (TM) to write IPv6 enabled code ;-)

quoting RFC3493 section 3.10:

    One simple addition to the sockets API that can help application
    writers is the struct sockaddr_storage. This data structure can
    simplify writing code that is portable across multiple address
    families and platforms. This data structure is designed with the
    following goals.

    - Large enough to accommodate all supported protocol-specific
      address structures.

    - Aligned at an appropriate boundary so that pointers to it can be
      cast as pointers to protocol specific address structures and used
      to access the fields of those structures without alignment
      problems.

    The sockaddr_storage structure contains field ss_family which is of
    type sa_family_t. When a sockaddr_storage structure is cast to a
    sockaddr structure, the ss_family field of the sockaddr_storage
    structure maps onto the sa_family field of the sockaddr structure.
    When a sockaddr_storage structure is cast as a protocol specific
    address structure, the ss_family field maps onto a field of that
    structure that is of type sa_family_t and that identifies the
    protocol's address family.

using a union like:

    struct wget_sockaddr {
        struct sockaddr;
        struct sockaddr_in;
        struct sockaddr_in6;
    };

is not an elegant solution, and is probably not safe because of compiler alignments. see the chapter about struct sockaddr_storage in:

http://www.kame.net/newsletter/19980604

For example, this cast:

    (unsigned char *)(&addr->addr_v4.s_addr)

would not be necessary if the address were defined as unsigned char[4].

in_addr is the correct structure to store ipv4 addresses. using in_addr instead of unsigned char[4] makes it much easier to copy or compare ipv4 addresses. moreover, you don't have to care about the integer size on 64-bit architectures.

I don't understand the new PASSIVE flag to lookup_host.

well, that's a problem. to get a socket address suitable for bind(2), you must call getaddrinfo with the AI_PASSIVE flag set. for instance, if you call:

    getaddrinfo(NULL, "ftp", &hints, &res)

with the AI_PASSIVE flag, you get the :: port 21 and 0.0.0.0 port 21 socket addresses, while calling getaddrinfo without the AI_PASSIVE flag returns the ::1 port 21 and 127.0.0.1 port 21 addresses.

the passive flag for lookup_host is a very inelegant hack, but i haven't found a way to get rid of it, yet. any suggestion?

In lookup_host, the comment says that you don't need to call getaddrinfo_with_timeout, but then you call getaddrinfo_with_timeout. An oversight?

You removed this code:

    -  /* ADDR is defined to be in network byte order, which is what
    -     this returns, so we can just copy it to STORE_IP.  However,
    -     on big endian 64-bit architectures the value will be stored
    -     in the *last*, not first four bytes.  OFFSET makes sure that
    -     we copy the correct four bytes.  */
    -  int offset = 0;
    -#ifdef WORDS_BIGENDIAN
    -  offset = sizeof (unsigned long) - sizeof (ip4_address);
    -#endif

But the reason the code is there is that inet_aton is not present on all architectures, whereas inet_addr is. So I used only inet_addr in the IPv4 case, and inet_addr stupidly returned `long', which requires some contortions to copy into a uchar[4] on 64-bit machines. (I see that inet_addr returns `in_addr_t' these days.)

If you intend to use inet_aton without checking, there should be a fallback implementation in cmpt.c.

are there __REALLY__ systems which do not support inet_aton? their ISVs should be ashamed of themselves...

however, yours seemed to me an ugly hack, so i have temporarily removed it. as you say, it would probably be better to provide a fallback implementation of inet_aton in cmpt.c.

I note that you elided TYPE from ip_address if ENABLE_IPV6 is not defined. That (I think) results in code duplication in some places, because the code effectively has to handle the IPv4 case twice:

    #ifdef
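The AI_PASSIVE behaviour described in this message can be sketched as follows. This is only an illustration under stated assumptions: `lookup_bind_addresses` and `is_wildcard` are hypothetical helpers, not functions from the patch, and the service is passed as the numeric string "21" (with AI_NUMERICSERV) rather than "ftp" so that the example does not depend on the local services database.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: resolve the local addresses for SERVICE with a
   NULL node name.  With PASSIVE set, getaddrinfo returns wildcard
   addresses suitable for bind(2); without it, loopback addresses.
   Returns getaddrinfo's result code (0 on success).  The caller
   releases *RES with freeaddrinfo.  */
static int
lookup_bind_addresses (const char *service, int passive,
                       struct addrinfo **res)
{
  struct addrinfo hints;

  assert (service != NULL && res != NULL);
  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;          /* both IPv4 and IPv6 */
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_NUMERICSERV | (passive ? AI_PASSIVE : 0);
  return getaddrinfo (NULL, service, &hints, res);
}

/* Return nonzero if SA is the IPv4 or IPv6 wildcard address.  */
static int
is_wildcard (const struct sockaddr *sa)
{
  if (sa->sa_family == AF_INET)
    return ((const struct sockaddr_in *) sa)->sin_addr.s_addr
           == htonl (INADDR_ANY);
  if (sa->sa_family == AF_INET6)
    return IN6_IS_ADDR_UNSPECIFIED
           (&((const struct sockaddr_in6 *) sa)->sin6_addr);
  return 0;
}
```

Calling `lookup_bind_addresses ("21", 1, &res)` and then `lookup_bind_addresses ("21", 0, &res)` shows the difference the message describes; the order in which IPv4 and IPv6 entries are returned depends on the platform.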
Re: wget ipv6 patch
Mauro Tortonesi writes:

I still don't understand the choice to use sockaddr and sockaddr_storage in application code. They result in needless casts and (to me) incomprehensible code.

well, using sockaddr_storage is the right way (TM) to write IPv6 enabled code ;-)

Not when the only thing you need is storing the result of a DNS lookup. I've seen the RFC, but I don't agree with it in the case of Wget. In fact, even the RFC states that the data structure is merely a help for writing portable code across multiple address families and platforms. Wget doesn't aim for AF independence, and the alternatives are at least as good for platform independence.

For example, this cast:

    (unsigned char *)(&addr->addr_v4.s_addr)

would not be necessary if the address were defined as unsigned char[4].

in_addr is the correct structure to store ipv4 addresses. using in_addr instead of unsigned char[4] makes it much easier to copy or compare ipv4 addresses. moreover, you don't have to care about the integer size on 64-bit architectures.

An IPv4 address is nothing more than a 32-bit quantity. I don't see anything incorrect about using unsigned char[4] for that, and that works perfectly fine on 64-bit architectures.

Besides, you seem to be willing to cache the string representation of an IP address. Why is it acceptable to work with a char *, but unacceptable to work with unsigned char[4]? I simply don't see that in_addr is helping anything in host.c's code base.

I don't understand the new PASSIVE flag to lookup_host.

well, that's a problem. to get a socket address suitable for bind(2), you must call getaddrinfo with the AI_PASSIVE flag set.

Why? The current code seems to get by without it. There must be a way to get at the socket address without calling getaddrinfo.

are there __REALLY__ systems which do not support inet_aton? their ISVs should be ashamed of themselves...

Those systems are very old, possibly predating the very invention of inet_aton.

If it would make your life easier to add TYPE in the !ENABLE_IPV6 case, so you can write it more compactly, by all means do it. By more compactly I mean code like this: [...]

that's a question i was going to ask you. i supposed you were against adding the type member to ip_address in the IPv4-only case,

Maintainability is more important than saving a few bytes per cached IP address, especially since I don't expect the number of cache entries to ever be large enough to make a difference. (If someone downloads from so many addresses that the hash table sizes become a problem, the TYPE member will be the least of his problems.)

P.S. please notice that by caching the string representation of IP addresses instead of their network representation, the code could become much more elegant and simple.

You said that before, but I don't quite understand why that's the case. It's certainly not the case for IPv4.
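A small sketch of the unsigned char[4] representation argued for above, assuming a modern system where inet_addr returns a 32-bit `in_addr_t` in network byte order, so that no offset is needed on 64-bit big-endian machines. The helpers `store_ipv4` and `same_ipv4` are hypothetical; the type name `ip4_address` is borrowed from the removed code quoted earlier in the thread.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

typedef unsigned char ip4_address[4];

/* Parse dotted-quad TEXT into OUT.  Returns 1 on success, 0 on
   failure.  inet_addr signals failure with (in_addr_t) -1, so the
   valid address 255.255.255.255 is rejected as well; that is a known
   limitation of inet_addr.  */
static int
store_ipv4 (const char *text, ip4_address out)
{
  in_addr_t addr;

  assert (text != NULL && out != NULL);
  addr = inet_addr (text);
  if (addr == (in_addr_t) -1)
    return 0;
  /* ADDR is in network byte order, so its bytes in memory are already
     in the order a.b.c.d regardless of host endianness.  */
  memcpy (out, &addr, sizeof (ip4_address));
  return 1;
}

/* Compare two stored addresses byte by byte.  */
static int
same_ipv4 (const ip4_address a, const ip4_address b)
{
  return memcmp (a, b, sizeof (ip4_address)) == 0;
}
```

Copying and comparing then reduce to memcpy and memcmp on four bytes, which is the point made in the message about not needing in_addr for host.c.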
Re: wget ipv6 patch
Mauro Tortonesi wrote:

are there __REALLY__ systems which do not support inet_aton? their ISVs should be ashamed of themselves...

Solaris, for example. IIRC inet_aton isn't in any document which claims to be a standard.

however, yours seemed to me an ugly hack, so i have temporarily removed it. as you say, it would be probably better to provide a fallback implementation of inet_aton in cmpt.c.

But standards define inet_pton, which can do what inet_aton does, so that should be checked for before using the fallback implementation.

--
Yes, I am an agent of Satan, but my duties are largely ceremonial.
Error: wget for Windows.
I am trying to use wget for Windows and get this message:

The ordinal 508 could not be located in the dynamic link library LIBEAY32.dll.

This is the command I am using:

    wget http://www.website.com --http-user=username --http-passwd=password

I have the LIBEAY32.dll file in the same folder as wget. What could be wrong?

Thanks in advance.
Suhas
Re: Error: wget for Windows.
Hi Suhas!

I am trying to use wget for Windows and get this message: The ordinal 508 could not be located in the dynamic link library LIBEAY32.dll.

You are very probably using the wrong version of the SSL files. Take a look at http://xoomer.virgilio.it/hherold/

Herold has nicely rearranged the links to the wget binaries and the SSL binaries. As you can see, different wget versions need different SSL versions. Just download the matching SSL files, everything else should then be easy :)

Jens

This is the command I am using:

    wget http://www.website.com --http-user=username --http-passwd=password

I have the LIBEAY32.dll file in the same folder as wget. What could be wrong?

Thanks in advance.
Suhas
Re: some wget patches against beta3
Martin v. Löwis writes:

Hrvoje Niksic writes:

    VERSION = ('[.0-9]+-?b[0-9]+'
               '|[.0-9]+-?dev[0-9]+'
               '|[.0-9]+-?pre[0-9]+'
               '|[.0-9]+-?rel[0-9]+'
               '|[.0-9]+[a-z]?'
               '|[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')

But that's narrow. Why support 1.9-b3, but not 1.9-beta3 or 1.9-alpha3, or 1.9-rc10? Those and similar version schemes are in wide use.

Are you requesting the addition of these three formats?

Yes, please.

To be clear: it would be ideal if the Robot didn't care about versioning at all. But if it really has to, then it should support versioning schemes in wide use.
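For illustration, the three requested formats could be folded into the pattern roughly as follows. This is a sketch in C with POSIX extended regular expressions, not the Robot's actual code (the quoted VERSION looks like a Python string). The anchoring with `^` and `$` is an assumption, since the thread does not show how the Robot applies the pattern, and `version_is_accepted` is a hypothetical name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* The quoted alternatives, with beta, alpha and rc added next to the
   existing b/dev/pre/rel suffixes.  Whole-string anchoring is an
   assumption made for this sketch.  */
static const char version_pattern[] =
  "^([.0-9]+-?(b|beta|alpha|rc|dev|pre|rel)[0-9]+"
  "|[.0-9]+[a-z]?"
  "|[0-9]{4}-[0-9]{2}-[0-9]{2})$";

/* Return 1 if VERSION matches the extended pattern, 0 otherwise
   (including when the pattern fails to compile).  */
static int
version_is_accepted (const char *version)
{
  regex_t re;
  int matched;

  assert (version != NULL);
  if (regcomp (&re, version_pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  matched = regexec (&re, version, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With this pattern, 1.9-b3, 1.9-beta3, 1.9-alpha3 and 1.9-rc10 are all accepted, while a suffix outside the list, such as 1.9-gamma, is still rejected.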