Re: Does AI_ADDRCONFIG really work?

2005-04-20 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 there is another possible solution. reordering the addresses returned by 
 getaddrinfo so that IPv4 addresses are at the beginning of the list.

Will that cause problems in some setups?  I thought there was an RFC
that mandated that the order of records returned by getaddrinfo be
respected.  The documentation of lookup_host even promises to return
the addresses in the same order they were received from
getaddrinfo/gethostbyname.


Re: too many users

2005-04-20 Thread Hrvoje Niksic
Leonid [EMAIL PROTECTED] writes:

 Yes, wget 1.9.1 considers a failure to connect a fatal error and
 abandons further retry attempts. I have submitted a patch several
 times to fix this and similar problems. Presumably, it will be
 included in the future wget 1.11. If you need the fix now, you can
 find the patch against wget-1.10-alpha1 at
 http://software.lpetrov.net/wget-1.10-persistent.patch or
 alternatively an entire tarball in
 http://software.lpetrov.net/wget-pet/

I reviewed the persistent patch and I think it's a very good idea --
Wget has needed something like that for a long time.

The patch as it stands IMHO does require one change: I don't think
things like opt.ntry and printwhat should be checked in as
low-level a function as connect_to_host.  connect_to_host was meant as
a fairly straightforward wrapper around functions like
gethostbyname/socket/connect that correctly handles multiple A/ records,
IPv6, etc.  It probably shouldn't contain UI code such as retries.

The code in http.c and ftp.c has all the loops that retry connecting.
The patch should trivially modify them to check for persistence before
giving up.


Re: wget 1.10 alpha 2

2005-04-20 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 i totally agree with hrvoje here. in the worst case, we can add an
 entry in the FAQ explaining how to compile wget with those buggy
 versions of microsoft cc.

Umm.  What FAQ?  :-)


Re: IPv4 mapped addresses and -4/-6 switches

2005-04-20 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 well, to defend myself, i have to say that nc6 handles the -4 and -6
 switches by simply setting the ai_family member of the hints struct
 to be passed to getaddrinfo to PF_INET and PF_INET6 respectively,
 instead of the PF_UNSPEC default. so, the behaviour of -4 and -6
 switches with numeric addresses in nc6 is just a byproduct of this
 simple assignment. ;-)

But so does Wget!  Take a look at the code:

if (opt.ipv4_only)
  hints.ai_family = AF_INET;
else if (opt.ipv6_only)
  hints.ai_family = AF_INET6;
else
  {
    hints.ai_family = AF_UNSPEC;
    hints.ai_flags |= AI_ADDRCONFIG;
  }

That's why I am surprised at the different behavior.
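
To observe the behavior with numeric addresses directly, the snippet can be wrapped in a small test program. This is a sketch, not Wget's lookup code: the lookup function and its two flag parameters are hypothetical stand-ins for opt.ipv4_only and opt.ipv6_only, and the result of the mismatched case is printed, not assumed, since it may differ between resolver implementations.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical wrapper: the two flags stand in for opt.ipv4_only and
   opt.ipv6_only. */
static int
lookup (const char *host, int ipv4_only, int ipv6_only,
        struct addrinfo **res)
{
  struct addrinfo hints;

  memset (&hints, 0, sizeof hints);
  hints.ai_socktype = SOCK_STREAM;
  if (ipv4_only)
    hints.ai_family = AF_INET;
  else if (ipv6_only)
    hints.ai_family = AF_INET6;
  else
    {
      hints.ai_family = AF_UNSPEC;
      hints.ai_flags |= AI_ADDRCONFIG;
    }
  return getaddrinfo (host, NULL, &hints, res);
}

int
main (void)
{
  struct addrinfo *res = NULL;
  int err;

  /* Numeric IPv4 literal with the IPv4-only setting. */
  err = lookup ("127.0.0.1", 1, 0, &res);
  assert (err == 0 && res->ai_family == AF_INET);
  freeaddrinfo (res);

  /* Same literal with the IPv6-only setting: report what happens. */
  res = NULL;
  err = lookup ("127.0.0.1", 0, 1, &res);
  if (err != 0)
    printf ("IPv6-only lookup of 127.0.0.1 failed: %s\n", gai_strerror (err));
  else
    {
      printf ("IPv6-only lookup returned family %d\n", res->ai_family);
      freeaddrinfo (res);
    }
  return 0;
}
```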


RE: wget 1.10 alpha 2

2005-04-20 Thread Herold Heiko
(sorry for the late answer, three days of 16+ hours/day migration aren't
fun, UPS battery exploding inside the UPS almost in my face even less)


 -----Original Message-----
 From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]

 Herold Heiko [EMAIL PROTECTED] writes:
 
  do have a compiler but aren't really developers (yet) (for example
  first year CS students with old lab computer compilers).
 
 From my impressions of the Windows world, non-developers won't touch
 source code anyway -- they will simply use the binary.

I feel I must dissent. Even today I'm not exactly a developer, I certainly
wasn't when I first placed my greedy hands on wget sources (in order to add
a couple of chars to URL_UNSAFE... back in 98 i think). I just knew where I
could use a compiler and followed instructions.
I'd just like wget to still be compilable in an old setup by (growing)
newbies, for the learning value. Maybe something like a small note in the
windows/Readme instructions would be ok, as in the enclosed patch?

 The really important thing is to make sure that the source works for
 the person likely to create the binaries, in this case you.  Ideally
 he should have access to the latest compiler, so we don't have to
 cater to brokenness of obsolete compiler versions.  This is not about

I must confess I'm torn between the two options. Your point is very valid;
on the other hand, while it is still possible I'd like to continue using an
old setup, precisely because there are still plenty of those around and I'd
like to catch these problems. Unfortunately I don't have the time to test
everything on two setups, so I think I'll continue with the old one for as
long as that is easily feasible.

 Also note that there is a technical problem with your patch (if my
 reading of it is correct): it unconditionally turns on debugging,
 disregarding the command-line options.  Is it possible to save the old
 optimization options, turn off debugging, and restore the old options?
 (Borland C seems to support some sort of #pragma push to achieve
 that effect.)

It seems not, msdn mentions push only for #pragma warning, not for
#pragma optimize :(

   optimization, or with a lesser optimization level.  Ideally this
   would be done by configure.bat if it detects the broken compiler
   version.

I tried but didn't find a portable (w9x-w2x) way to do that, since in w9x we
can't easily redirect the standard error used by cl.exe.
Possibly this could be worked around by running the test from a simple perl
script; on the other hand, today perl is required (on released packages) only
in order to build the documentation, not for the binary, so adding another
dependency would be a pity.

 You mean that you cannot use later versions of C++ to produce
 Win95/Win98/NT4 binaries?  I'd be very surprised if that were the

Absolutely not; what I meant is that later versions can't be installed on older
windows operating systems. I think Visual Studio 6 is the last MS compiler
that runs even on NT4.

  Personally I feel wget should try to still support that not-so-old
  compiler platform if possible,
 
 Sure, but in this case some of the burden falls on the user of the
 obsolete platform: he has to turn off optimization to avoid a bug in
 his compiler.  That is not entirely unacceptable.

I concur, after all if a note is dropped in the windows/Readme either they
will read it, or they will stall due to OpenSSL dependencies (on by default)
anyway.

Heiko

-- 
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax



20050420.winreadme.diff
Description: Binary data


Re: wget 1.10 alpha 2

2005-04-20 Thread Hrvoje Niksic
Herold Heiko [EMAIL PROTECTED] writes:

 From my impressions of the Windows world, non-developers won't touch
 source code anyway -- they will simply use the binary.

 I feel I must dissent.

I am greatly surprised.  Do you really believe that Windows users
outside an academic environment are proficient in using the compiler?
I have never seen a home Windows installation that even contained a
compiler, the only exception being ones that belonged to professional
C or C++ developers.

The very idea that a Windows user might grab source code and compile a
package is strange.  I don't remember ever seeing a Windows program
distributed in source form.

 Even today I'm not exactly a developer, I certainly wasn't when I
 first placed my greedy hands on wget sources (in order to add a
 couple of chars to URL_UNSAFE... back in 98 i think). I just knew
 where I could use a compiler and followed instructions.  I'd just
 like wget to still be compilable in an old setup by (growing)
 newbies, for the learning value. Maybe something like a small note
 in the windows/Readme instructions would be ok, as in the enclosed
 patch?

That would be fine with me.


Re: Does AI_ADDRCONFIG really work?

2005-04-20 Thread Mauro Tortonesi
On Wednesday 20 April 2005 04:47 am, you wrote:
 Mauro Tortonesi [EMAIL PROTECTED] writes:
  there is another possible solution. reordering the addresses returned by
  getaddrinfo so that IPv4 addresses are at the beginning of the list.

 Will that cause problems in some setups?

on IPv6-only networks probably yes.

 I thought there was an RFC that mandated that the order of records returned 
 by getaddrinfo be respected.

not really. IIRC keeping the order of records returned by getaddrinfo is 
suggested but not mandatory.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
Institute of Human & Machine Cognition   http://www.ihmc.us
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: wget 1.10 alpha 2

2005-04-20 Thread Mauro Tortonesi
On Wednesday 20 April 2005 04:58 am, Hrvoje Niksic wrote:
 Mauro Tortonesi [EMAIL PROTECTED] writes:
  i totally agree with hrvoje here. in the worst case, we can add an
  entry in the FAQ explaining how to compile wget with those buggy
  versions of microsoft cc.

 Umm.  What FAQ?  :-)

the official FAQ:

http://www.gnu.org/software/wget/faq.html



Re: IPv4 mapped addresses and -4/-6 switches

2005-04-20 Thread Mauro Tortonesi
On Wednesday 20 April 2005 05:12 am, Hrvoje Niksic wrote:
 Mauro Tortonesi [EMAIL PROTECTED] writes:
  well, to defend myself, i have to say that nc6 handles the -4 and -6
  switches by simply setting the ai_family member of the hints struct
  to be passed to getaddrinfo to PF_INET and PF_INET6 respectively,
  instead of the PF_UNSPEC default. so, the behaviour of -4 and -6
  switches with numeric addresses in nc6 is just a byproduct of this
  simple assignment. ;-)

 But so does Wget!  Take a look at the code:

 if (opt.ipv4_only)
   hints.ai_family = AF_INET;
 else if (opt.ipv6_only)
   hints.ai_family = AF_INET6;
 else
   {
     hints.ai_family = AF_UNSPEC;
     hints.ai_flags |= AI_ADDRCONFIG;
   }

 That's why I am surprised at the different behavior.

sorry, i forgot to tell you that when the -6 switch is used, nc6 also sets the
IPV6_V6ONLY option for the PF_INET6 socket used in the communication. that's
why IPv4-mapped addresses are rejected.



Re: wget 1.10 alpha 2

2005-04-20 Thread Mauro Tortonesi
On Wednesday 20 April 2005 05:55 am, Herold Heiko wrote:
 (sorry for the late answer, three days of 16+ hours/day migration aren't
 fun, UPS battery exploding inside the UPS almost in my face even less)

  -----Original Message-----
  From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
 
  Herold Heiko [EMAIL PROTECTED] writes:
   do have a compiler but aren't really developers (yet) (for example
   first year CS students with old lab computer compilers).
 
  From my impressions of the Windows world, non-developers won't touch
  source code anyway -- they will simply use the binary.

 I feel I must dissent. Even today I'm not exactly a developer, I certainly
 wasn't when I first placed my greedy hands on wget sources (in order to add
 a couple of chars to URL_UNSAFE... back in 98 i think). I just knew where I
 could use a compiler and followed instructions.
 I'd just like wget to still be compilable in an old setup by (growing)
 newbies, for the learning value. Maybe something like a small note in the
 windows/Readme instructions would be ok, as in the enclosed patch?

publishing a separate patch on the website and including it in the tarball 
along with a note in windows/Readme is ok for me. but including an ugly 
workaround in the main sources just to support some older versions of 
microsoft c is definitely not.



Re: [Feature request] accept/reject path expresion

2005-04-20 Thread Mauro Tortonesi
On Wednesday 20 April 2005 12:38 am, Oliver Schulze L. wrote:
 Hi,
 it would be nice if you could tell wget not to download a file if the URL
 matches some pattern. The pattern could be shell-like or a regular
 expression.

 For example, if you mirror an ftp site and you don't want to download
 the directories that match the i686 pattern, you would run a command
 like this:

 wget -m -url-reject=*i686* ftp://ftp.updates.distro.org/pub/

 Then, wget will ignore these paths:
 ftp://ftp.updates.distro.org/pub/distro1/i686/
 ftp://ftp.updates.distro.org/pub/distro2/i686/
 ftp://ftp.updates.distro.org/pub/distro3/i686/

 but will download these ones:
 ftp://ftp.updates.distro.org/pub/distro1/i386/
 ftp://ftp.updates.distro.org/pub/distro1/athlon/
 ftp://ftp.updates.distro.org/pub/distro2/i386/
 ftp://ftp.updates.distro.org/pub/distro2/athlon/
 ftp://ftp.updates.distro.org/pub/distro3/i386/
 ftp://ftp.updates.distro.org/pub/distro3/athlon/

 Since you don't have the list of all directories on the ftp site, this
 feature can really help wget become an even more powerful mirror tool.

hi oliver,

for the moment the development of wget is in feature freeze for the
upcoming 1.10 release. but we are considering adding regex support in wget
1.11. stay tuned ;-)



wget 1.9.1 -- 2 GB limit -- negative filesize

2005-04-20 Thread Alexander Elgert
Greetings,

I tried to get the wikipedia DVD image from
ftp.uni-erlangen.de/pub/mirrors/wikipedia.de/wp_1_2005.iso
and there is a problem with the size handling in wget.
The filesystem is ext3 and definitely supports files larger than 2GB.

wget just aborts at 2GB-1 byte and declares the file as already retrieved.
wget --spider reports a negative filesize.


0 mandola ~> wget --spider ftp.uni-erlangen.de/pub/mirrors/wikipedia.de/wp_1_2005.iso
--17:01:21--  http://ftp.uni-erlangen.de/pub/mirrors/wikipedia.de/wp_1_2005.iso
           => `wp_1_2005.iso.1'
Resolving ftp.uni-erlangen.de... 131.188.3.71
Connecting to ftp.uni-erlangen.de[131.188.3.71]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: -1,542,565,888 [application/octet-stream]
200 OK

0 mandola ~> wget --version
GNU Wget 1.9.1

0 mandola ~/we/wikipedia/ftp.uni-erlangen.de/pub/mirrors/wikipedia.de> wget -c ftp.uni-erlangen.de/pub/mirrors/wikipedia.de/wp_1_2005.iso
--17:32:16--  http://ftp.uni-erlangen.de/pub/mirrors/wikipedia.de/wp_1_2005.iso
           => `wp_1_2005.iso'
Resolving ftp.uni-erlangen.de... 131.188.3.71
Connecting to ftp.uni-erlangen.de[131.188.3.71]:80... connected.
HTTP request sent, awaiting response... 206 Partial Content

The file is already fully retrieved; nothing to do.

0 mandola ~/we/wikipedia/ftp.uni-erlangen.de/pub/mirrors/wikipedia.de> ll wp_1_2005.iso
-rw---1 elgert   stud 2147483647 Apr 18 18:54 wp_1_2005.iso
0 mandola ~/we/wikipedia/ftp.uni-erlangen.de/pub/mirrors/wikipedia.de>


Alexander Elgert


Re: [Feature request] accept/reject path expresion

2005-04-20 Thread Oliver Schulze L.
Excellent news!
Thanks Mauro,
will be waiting ;)
Oliver

--
Oliver Schulze L.
[EMAIL PROTECTED]


Re: wget 1.9.1 -- 2 GB limit -- negative filesize

2005-04-20 Thread Mauro Tortonesi

hi alexander,

this is a known problem which is already fixed in cvs. perhaps you may want to 
try using wget 1.10-alpha2:

ftp://ftp.deepspace6.net/pub/ds6/sources/wget/wget-1.10-alpha2.tar.gz
ftp://ftp.deepspace6.net/pub/ds6/sources/wget/wget-1.10-alpha2.tar.bz2



Re: wget 1.10 alpha 2

2005-04-20 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 On Wednesday 20 April 2005 04:58 am, Hrvoje Niksic wrote:
 Mauro Tortonesi [EMAIL PROTECTED] writes:
  i totally agree with hrvoje here. in the worst case, we can add an
  entry in the FAQ explaining how to compile wget with those buggy
  versions of microsoft cc.

 Umm.  What FAQ?  :-)

 the official FAQ:

 http://www.gnu.org/software/wget/faq.html

This is the first time that I see it.  It's actually pretty good, I
like it.


Re: IPv4 mapped addresses and -4/-6 switches

2005-04-20 Thread Hrvoje Niksic
Mauro Tortonesi [EMAIL PROTECTED] writes:

 sorry, i forgot to tell you that when the -6 switch is used, nc6
 also sets the IPV6_V6ONLY option for the PF_INET6 socket used in the
 communication. that's why IPv4-mapped addresses are rejected.

Should Wget do the same?  It seems to make sense to me.


Re: cygwin wget ssl

2005-04-20 Thread Hrvoje Niksic
[ Cc'ing the Wget mailing list ]

Konrad Chan [EMAIL PROTECTED] writes:

 Hi, I was wondering if you could provide some assistance on how to
 resolve this problem.

 wget using SSL works except for this site. Any reason why and how to
 resolve?

It seems this site is sending something that the OpenSSL library
cannot handle.  For example, both Wget and curl on Linux display the
same error:

$ curl https://www.danier.com
curl: (35) error:1408F455:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac

Maybe this should be reported to the OpenSSL maintainers?


Re: wget 1.10 alpha 2

2005-04-20 Thread Mauro Tortonesi
On Wednesday 20 April 2005 02:42 pm, Hrvoje Niksic wrote:
 Mauro Tortonesi [EMAIL PROTECTED] writes:
  On Wednesday 20 April 2005 04:58 am, Hrvoje Niksic wrote:
  Mauro Tortonesi [EMAIL PROTECTED] writes:
   i totally agree with hrvoje here. in the worst case, we can add an
   entry in the FAQ explaining how to compile wget with those buggy
   versions of microsoft cc.
 
  Umm.  What FAQ?  :-)
 
  the official FAQ:
 
  http://www.gnu.org/software/wget/faq.html

 This is the first time that I see it.  It's actually pretty good, I
 like it.

yes, i like it very much too. it will need an update after the release of 
1.10, though.
