ipv6 patch

2003-11-19 Thread Herold Heiko
Attached a little patch needed for current cvs in order to compile on
windows nt 4 (any system without IPV6 really).

Changelog:
connect.c (socket_has_inet6): don't use AF_INET6 without ENABLE_IPV6
main.c (main): don't test opt.ipv[46]_only without ENABLE_IPV6

Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax



20031119.diff
Description: Binary data


Re: ipv6 patch

2003-11-19 Thread Hrvoje Niksic
Herold Heiko [EMAIL PROTECTED] writes:

 Attached a little patch needed for current cvs in order to compile
 on windows nt 4 (any system without IPV6 really).

Thanks.  Note that the function isn't even called when IPv6 is
unavailable, so you can feel free to wrap the entire function in
#ifdef ENABLE_IPV6.
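
A minimal sketch of that suggestion, assuming a POSIX socket API. The body of socket_has_inet6 here is a guess at what such a probe might look like, not the code in connect.c, and ipv6_usable is a hypothetical wrapper added only so callers need no #ifdef of their own:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#ifdef ENABLE_IPV6
/* Only compiled when IPv6 support is configured, so AF_INET6 is never
   referenced on systems that lack it. */
static int
socket_has_inet6 (void)
{
  static int supported = -1;    /* -1 means "not probed yet" */
  if (supported == -1)
    {
      int sock = socket (AF_INET6, SOCK_STREAM, 0);
      supported = (sock >= 0);
      if (sock >= 0)
        close (sock);
    }
  return supported;
}
#endif /* ENABLE_IPV6 */

/* Hypothetical caller-side wrapper: always available, and reports
   "no IPv6" when the build has none. */
int
ipv6_usable (void)
{
#ifdef ENABLE_IPV6
  return socket_has_inet6 ();
#else
  return 0;
#endif
}
```

Built without -DENABLE_IPV6, the translation unit never mentions AF_INET6 at all, which is the point of the patch.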


P.S.
Please send or Cc patches to the wget-patches list.


Re: ipv6 patch

2003-11-19 Thread Gisle Vanem
Herold Heiko [EMAIL PROTECTED] said:

 Attached a little patch needed for current cvs in order to compile on
 windows nt 4 (any system without IPV6 really).

FYI, Wget/IPv6 on Windows does work somewhat; getaddrinfo()
is able to resolve a host to its IPv6 address(es). But getnameinfo()
isn't able to convert it back to a presentation address. Same behaviour
as Linux w/o an IPv6 stack AFAICS.

Due to lack of inet_ntop() on Windows, I used this instead:

  struct sockaddr_in6 addr6;
  addr6.sin6_family = AF_INET6;
  memcpy (&addr6.sin6_addr, address, sizeof(addr6.sin6_addr));
  if (getnameinfo ((const struct sockaddr*)&addr6, sizeof(addr6),
                   buf, size, NULL, 0, NI_NUMERICHOST) == 0)
    return (buf);
  return (NULL);

but Wget doesn't check return value of inet_ntop(). Hint hint.
So the trace looks a bit weird:

wget -d6 ftp://ftp.deepspace6.net/
Resolving ftp.deepspace6.net... seconds 0.00, ,
Caching ftp.deepspace6.net = 
Connecting to ftp.deepspace6.net||:21... failed: Address family not supported
Connecting to ftp.deepspace6.net||:21... failed: Address family not supported

Note Wget tries twice; once for each address in the list.
A little refinement would be to stop trying if the address-families
are the same.
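
One way that refinement could look, as a rough sketch rather than Wget's connect loop: remember every address family that failed with EAFNOSUPPORT and skip later addresses of that family. struct addr_entry, connect_fn, connect_first_usable and the refuse_inet6 stub are all hypothetical names made up for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

#define MAX_FAMILIES 4

/* Hypothetical, simplified stand-in for an entry in the address list. */
struct addr_entry { int family; };

/* Hypothetical connect callback: 0 on success, -1 with errno set. */
typedef int (*connect_fn) (const struct addr_entry *);

/* Try each address in order.  Once a family fails with EAFNOSUPPORT,
   skip the remaining addresses of that family.  Returns the index of
   the first address that connected, or -1.  *ATTEMPTS counts the
   connection attempts actually made. */
int
connect_first_usable (const struct addr_entry *list, size_t count,
                      connect_fn try_connect, int *attempts)
{
  int unsupported[MAX_FAMILIES];
  size_t n_unsupported = 0;
  size_t i, j;

  *attempts = 0;
  for (i = 0; i < count; i++)
    {
      int skip = 0;
      for (j = 0; j < n_unsupported; j++)
        if (unsupported[j] == list[i].family)
          skip = 1;
      if (skip)
        continue;

      ++*attempts;
      if (try_connect (&list[i]) == 0)
        return (int) i;
      if (errno == EAFNOSUPPORT && n_unsupported < MAX_FAMILIES)
        unsupported[n_unsupported++] = list[i].family;
    }
  return -1;
}

/* Illustration-only stub: pretends the host has no IPv6 stack. */
int
refuse_inet6 (const struct addr_entry *a)
{
  if (a->family == AF_INET6)
    {
      errno = EAFNOSUPPORT;
      return -1;
    }
  return 0;
}
```

With two IPv6 addresses followed by one IPv4 address, the stub causes one failed IPv6 attempt, the second IPv6 address is skipped, and the IPv4 address connects.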

--gv



Re: ipv6 patch

2003-11-19 Thread Hrvoje Niksic
Gisle Vanem [EMAIL PROTECTED] writes:

 Due to lack of inet_ntop() on Windows, I used this instead:

   struct sockaddr_in6 addr6;
   addr6.sin6_family = AF_INET6;
   memcpy (&addr6.sin6_addr, address, sizeof(addr6.sin6_addr));
   if (getnameinfo ((const struct sockaddr*)&addr6, sizeof(addr6),
                    buf, size, NULL, 0, NI_NUMERICHOST) == 0)
     return (buf);
   return (NULL);

 but Wget doesn't check return value of inet_ntop(). Hint hint.

I wasn't aware that inet_ntop could really fail.  Why did getaddrinfo
return the address if I can't print it?
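
For reference, inet_ntop is specified to return NULL on failure (for example with EAFNOSUPPORT for an unknown family), so a caller can guard against the empty output shown in the trace. A small sketch, assuming POSIX headers; pretty_print_address and its placeholder string are hypothetical, not Wget names:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: format ADDR (of family AF) into BUF.  If
   inet_ntop fails, BUF gets a visible placeholder instead of being
   left empty or uninitialized.  Always returns BUF. */
const char *
pretty_print_address (int af, const void *addr, char *buf, size_t size)
{
  if (inet_ntop (af, addr, buf, (socklen_t) size) == NULL)
    snprintf (buf, size, "<unprintable address>");
  return buf;
}
```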

 So the trace looks a bit weird:

wget -d6 ftp://ftp.deepspace6.net/
 Resolving ftp.deepspace6.net... seconds 0.00, ,
 Caching ftp.deepspace6.net = 
 Connecting to ftp.deepspace6.net||:21... failed: Address family not supported
 Connecting to ftp.deepspace6.net||:21... failed: Address family not supported

 Note Wget tries twice; once for each address in the list.
 A little refinement would be to stop trying if the address-families
 are the same.

getaddrinfo shouldn't even return IPv6 addresses if AF_INET6 is not
supported.  There is code that tries to handle this case, but it
obviously fails on Windows.  Are you using the latest CVS?


Re: ipv6 patch

2003-11-19 Thread Gisle Vanem
  but Wget doesn't check return value of inet_ntop(). Hint hint.
 
 I wasn't aware that inet_ntop could really fail.  Why did getaddrinfo
 return the address if I can't print it?

getaddrinfo() on Win-XP seems to be a thin wrapper over the DNS
client which resolves AAAA records fine. But getnameinfo() seems
to rely on some deeper IPv6 stuff being installed.

So inet_ntop() can fail if coded using getnameinfo() as I described.
Therefore I adapted Paul Vixie's inet_ntop() which works w/o IPv6
installed.
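
To illustrate why a formatter of this kind needs no IPv6 support from the OS: converting the 16 raw address bytes to text is pure string work. The sketch below is not Paul Vixie's code and is deliberately simpler; it prints all eight groups and does no "::" zero compression. format_ipv6_uncompressed is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the 16 raw bytes of an IPv6 address as eight colon-separated
   hexadecimal groups, without "::" compression.  Returns BUF, or NULL
   if BUF is too small. */
const char *
format_ipv6_uncompressed (const unsigned char addr[16], char *buf,
                          size_t size)
{
  size_t used = 0;
  int i;

  for (i = 0; i < 8; i++)
    {
      unsigned int group =
        ((unsigned int) addr[2 * i] << 8) | addr[2 * i + 1];
      int n = snprintf (buf + used, size - used,
                        i ? ":%x" : "%x", group);
      if (n < 0 || (size_t) n >= size - used)
        return NULL;            /* output would be truncated */
      used += (size_t) n;
    }
  return buf;
}
```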

 getaddrinfo shouldn't even return IPv6 addresses if AF_INET6 is not
 supported.  There is code that tries to handle this case, but it
 obviously fails on Windows.  Are you using the latest CVS?
 
Compiled from yesterday's CVS. Note, I used '-6', so socket_has_inet6()
is bypassed which is okay IMHO.

Your statement "getaddrinfo shouldn't even return IPv6 ..."
contradicts what you wrote earlier:

[EMAIL PROTECTED]
As to why my system resolves IPv6 addresses in the first place -- good
question.  But it's a more or less default Red Hat Linux 9 setting,
I'm sure I won't be the only one with this problem.

So Win-XP and RH9 do pretty much the same.

--gv



Re: ipv6 patch

2003-11-19 Thread Hrvoje Niksic
Gisle Vanem [EMAIL PROTECTED] writes:

  but Wget doesn't check return value of inet_ntop(). Hint hint.
 
 I wasn't aware that inet_ntop could really fail.  Why did
 getaddrinfo return the address if I can't print it?

 getaddrinfo() on Win-XP seems to be a thin wrapper over the DNS
 client which resolves AAAA records fine. But getnameinfo() seems to
 rely on some deeper IPv6 stuff being installed.

 So inet_ntop() can fail if coded using getnameinfo() as I described.

I still think it's weird; getnameinfo is supposed to be the reverse of
getaddrinfo -- if one resolves the AAAA records, the other should grok
IPv6 addresses.  But I guess we have to live with it anyway.

 Note, I used '-6', so socket_has_inet6() is bypassed which is okay
 IMHO.

But why are you explicitly using -6 if you don't have IPv6 fully
installed?  It sounds like a case for the proverbial
doctor,-doctor,-it-hurts-when-I-do-that,-so-don't-do-that-then
response.  :-)

 Your statement "getaddrinfo shouldn't even return IPv6 ..."
 contradicts what you wrote earlier:
 [EMAIL PROTECTED]
 As to why my system resolves IPv6 addresses in the first place -- good
 question.  But it's a more or less default Red Hat Linux 9 setting,
 I'm sure I won't be the only one with this problem.

I wrote that before socket_has_inet6.  In other words, the fix for the
problem that the quote refers to is the socket_has_inet6 test on
systems without AI_ADDRCONFIG.

A more accurate form of the first statement would be "getaddrinfo, as
used by Wget by default, shouldn't even return IPv6 ..."  Therefore
it's not in contradiction, but in accordance with the second quote.

 So Win-XP and RH9 do pretty much the same.

Well, inet_ntop at least works on RH 9!


recv and the MSG_PEEK flag

2003-11-19 Thread Hrvoje Niksic
Does anyone know whether the MSG_PEEK flag can be relied upon?  I'd
like to use peeking to get rid of the ad hoc rbuf layer used in Wget
since time immemorial.

Peeking would require additional work under SSL, but I think I know
how to make it work.  But I'm more worried about TCP/IP stacks that
don't support MSG_PEEK, such as is allegedly the case on BEOS.

Does anyone have more information about this?  When did MSG_PEEK
appear in the socket API, and how widely is it available?


P.S.
I know that peeking is inefficient in general, but I plan to use it
only for reading HTTP headers and individual lines of output from FTP
servers, not for request bodies.
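
To make the idea concrete, here is a minimal sketch of line reading with MSG_PEEK on a plain (non-SSL) socket, assuming the stack supports the flag. peek_read_line is a hypothetical name, not an existing Wget function.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Read at most one line from FD into BUF without consuming any bytes
   past the newline.  The data is first peeked at, then exactly the
   bytes up to and including '\n' are read for real.  If the peeked
   data holds no newline, everything peeked is consumed and the caller
   is expected to call again.  Returns the number of bytes stored in
   BUF (not NUL-terminated), 0 on EOF, -1 on error. */
ssize_t
peek_read_line (int fd, char *buf, size_t size)
{
  ssize_t peeked, want;
  char *nl;

  peeked = recv (fd, buf, size, MSG_PEEK);
  if (peeked <= 0)
    return peeked;

  nl = memchr (buf, '\n', (size_t) peeked);
  want = nl ? (nl - buf) + 1 : peeked;
  return recv (fd, buf, (size_t) want, 0);
}
```

Each line costs two recv calls, which is the inefficiency mentioned in the P.S.; in exchange, no user-space read buffer has to carry leftover bytes between calls.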


problem with LF/CR etc.

2003-11-19 Thread Peter GILMAN


hello.

i have run into a problem while using wget: when viewing a web page with
html like this:

   <a href="images/IMG_01
   .jpg"><img src="images/tnIMG_01
   .jpg"></a>

browsers (i tested with mozilla and IE) can handle the line breaks in
the urls (presumably stripping them out), but wget chokes on the
linefeeds and carriage returns; it inserts them into the urls, and then
(naturally) fails with a 404:

   --17:17:31--  http://www.someurl.tld/images/IMG_01%0A.jpg
           => `www.someurl.tld/images/IMG_01
   .jpg'
   Connecting to www.someurl.tld[10.0.0.40]:80... connected.
   HTTP request sent, awaiting response... 404 Not Found
   17:17:31 ERROR 404: Not Found.

i've run into variants of this problem in several different places; is
there a way to handle situations like this with wget?

technical details: i am using wget 1.8.2, and my command-line invocation
is typically a very simple:

   wget -m -A jpg http://www.someurl.tld

any tips/clues would be much appreciated.

NOTE: please cc me in any replies, as i am not currently subscribed to
the list.  thanks!

- pete gilman


RE: problem with LF/CR etc.

2003-11-19 Thread Post, Mark K
That is _really_ ugly, and perhaps immoral.  Make it an option, if you must.
Certainly don't make it the default behavior.

Shudder


Mark Post

-----Original Message-----
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 19, 2003 4:59 PM
To: Peter GILMAN
Cc: [EMAIL PROTECTED]
Subject: Re: problem with LF/CR etc.


Peter GILMAN [EMAIL PROTECTED] writes:

 i have run into a problem while using wget: when viewing a web page with
 html like this:

<a href="images/IMG_01
.jpg"><img src="images/tnIMG_01
.jpg"></a>

Eek!  Are people really doing that?  This is news to me.

 browsers (i tested with mozilla and IE) can handle the line breaks
 in the urls (presumably stripping them out), but wget chokes on the
 linefeeds and carriage returns; it inserts them into the urls, and
 then (naturally) fails with a 404:
[...]

So, Wget should squash all newlines?  It's not hard to implement, but
it feels kind of ... unclean.
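
For what it's worth, the squashing itself is only a few lines. The sketch below is an illustration, not a patch against Wget's HTML parser; squash_newlines is a hypothetical name, and whether it runs by default or only behind an option is left open.

```c
/* Remove every CR and LF from the NUL-terminated string S, in place.
   Other whitespace is left alone.  Returns S. */
char *
squash_newlines (char *s)
{
  char *src = s, *dst = s;

  for (; *src; src++)
    if (*src != '\r' && *src != '\n')
      *dst++ = *src;
  *dst = '\0';
  return s;
}
```

Applied to the attribute value from the report, "images/IMG_01" followed by a line break and ".jpg" becomes "images/IMG_01.jpg", so the request no longer carries %0A.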